From aarchiba at physics.mcgill.ca Mon Aug 1 01:20:16 2011 From: aarchiba at physics.mcgill.ca (Anne Archibald) Date: Mon, 1 Aug 2011 01:20:16 -0400 Subject: [SciPy-User] deconvolution of 1-D signals In-Reply-To: <1312159888.10394.YahooMailRC@web113403.mail.gq1.yahoo.com> References: <1312159888.10394.YahooMailRC@web113403.mail.gq1.yahoo.com> Message-ID: I realize this discussion has gone rather far afield from efficient 1D deconvolution, but we do a funny thing in radio interferometry, and I'm curious whether this is normal for other kinds of deconvolution as well. In radio interferometry we obtain our images convolved with the so-called "dirty beam", a convolution kernel that has a nice narrow peak but usually a chaos of monstrous sidelobes often only marginally smaller than the main lobe. We use a different regularization condition to do our deconvolution: we treat the underlying image as a modest collection of point sources. (One can see why this appeals to astronomers.) Through an iterative process (the "CLEAN" algorithm and its many descendants) we obtain an estimate of this underlying image. But we very rarely actually work with this image directly. We normally convolve it with a sort of idealized version of our kernel without all the sidelobes. This then gives an image one might have obtained from a normal telescope the size of the interferometer array. (Apart from all the CLEAN artifacts.) What I'm wondering is, is this final step of convolving with an idealized version of the kernel standard practice elsewhere? >From one point of view it could just be parochiality, astronomers being so accustomed to smudgy images that we have to convert anything else to this format. But I think that at the least it softens the effect of the rather strict regularization assumption behind CLEAN - which amounts to "no extended sources". It probably makes us less sensitive to shortcuts in CLEAN implementations. I think, though, that this trick may be useful for many applications of deconvolution. Rather than try to translate the image from the observed kernel to some ideal Dirac-delta kernel, this tries to convert it from the observed kernel to a similar but simpler kernel; one would expect the impact of a deconvolution artifact to be related to the magnitude of the difference between kernels. In terms of 1D Fourier deconvolution, this is saying, after deconvolution, that we don't really need all those high frequencies amplified so much anyway, and smoothing them back down with a nice clean easy-to-understand kernel. In these terms, in fact, it makes perfect sense to use a wider kernel than necessary for this smoothing if one is interested in larger-scale features. Anne On 31 July 2011 20:51, David Baddeley wrote: > Hi Ralf, > I do a reasonable?amount?of (2 & 3D) deconvolution of microscopy images and > the method I use depends quite a lot on the exact properties of the signal. > You can usually get away with fft based convolutions even if your signal is > not periodic as long as your kernel is significantly smaller than the signal > extent. > As Joe mentioned, for a noisy signal convolving with the inverse or > performing fourier domain division doesn't work as you end up amplifying > high frequency noise components. You thus need some form of regularisation. > The thresholding of fourier components that Joe suggests does this, but you > might also want to explore more sophisticated options, the simplest of which > is probably Wiener filtering > (http://en.wikipedia.org/wiki/Wiener_deconvolution). 
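A minimal, self-contained sketch of the 1-D Wiener deconvolution idea referred to above (an illustration only, not code from this thread: it assumes a known kernel, circular/periodic boundaries and white noise, and the regularisation constant lam is a free parameter tuned by hand):

import numpy as np

def wiener_deconvolve(measured, kernel, lam):
    # FFT of the kernel, zero-padded to the signal length.
    n = len(measured)
    H = np.fft.fft(kernel, n)
    S = np.fft.fft(measured)
    # Wiener filter: conj(H) / (|H|^2 + lam^2).  Small lam approaches
    # plain inverse filtering (amplifies high-frequency noise); large
    # lam gives a smoother but more biased estimate.
    G = np.conj(H) / (np.abs(H)**2 + lam**2)
    # The result is the deconvolved estimate, up to a circular shift
    # set by where the kernel's peak sits in the padded kernel array.
    return np.real(np.fft.ifft(S * G))

As noted further down in the thread, lam is usually chosen by trial and error (or by a standard criterion such as the discrepancy principle).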
> If you've got a signal which is constrained to be positive, it's often > useful to introduce a positivity constraint on the deconvolution result > which generally means you need an iterative algorithm. The choice of > algorithm should also depend on the type of noise that is present in your > signal - ?my image data is constrained to be +ve and typically has either > Poisson or a mixture of Poisson and Gaussian noise and I use either the > Richardson-Lucy or a weighted version of ICTM (Iterative Constrained > Tikhonov-Miller) algorithm. I can provide more details of these if required. > cheers, > David > > > > ________________________________ > From: Ralf Gommers > To: SciPy Users List > Sent: Mon, 1 August, 2011 5:56:49 AM > Subject: [SciPy-User] deconvolution of 1-D signals > > Hi, > > For a measured signal that is the convolution of a real signal with a > response function, plus measurement noise on top, I want to recover the real > signal. Since I know what the response function is and the noise is > high-frequency compared to the real signal, a straightforward approach is to > smooth the measured signal (or fit a spline to it), then remove the response > function by deconvolution. See example code below. > > Can anyone point me towards code that does the deconvolution efficiently? > Perhaps signal.deconvolve would do the trick, but I can't seem to make it > work (except for directly on the output of np.convolve(y, window, > mode='valid')). > > Thanks, > Ralf > > > import numpy as np > from scipy import interpolate, signal > import matplotlib.pyplot as plt > > # Real signal > x = np.linspace(0, 10, num=201) > y = np.sin(x + np.pi/5) > > # Noisy signal > mode = 'valid' > window_len = 11. > window = np.ones(window_len) / window_len > y_meas = np.convolve(y, window, mode=mode) > y_meas += 0.2 * np.random.rand(y_meas.size) - 0.1 > if mode == 'full': > ??? xstep = x[1] - x[0] > ??? x_meas = np.concatenate([ \ > ??????? np.linspace(x[0] - window_len//2 * xstep, x[0] - xstep, > num=window_len//2), > ??????? x, > ??????? np.linspace(x[-1] + xstep, x[-1] + window_len//2 * xstep, > num=window_len//2)]) > elif mode == 'valid': > ??? x_meas = x[window_len//2:-window_len//2+1] > elif mode == 'same': > ??? x_meas = x > > # Approximating spline > xs = np.linspace(0, 10, num=500) > knots = np.array([1, 3, 5, 7, 9]) > tck = interpolate.splrep(x_meas, y_meas, s=0, k=3, t=knots, task=-1) > ys = interpolate.splev(xs, tck, der=0) > > # Find (low-frequency part of) original signal by deconvolution of smoothed > # measured signal and known window. > y_deconv = signal.deconvolve(ys, window)[0]? #FIXME > > # Plot all signals > fig = plt.figure() > ax = fig.add_subplot(111) > > ax.plot(x, y, 'b-', label="Original signal") > ax.plot(x_meas, y_meas, 'r-', label="Measured, noisy signal") > ax.plot(xs, ys, 'g-', label="Approximating spline") > ax.plot(xs[window.size//2-1:-window.size//2], y_deconv, 'k-', > ??????? 
label="signal.deconvolve result") > ax.set_ylim([-1.2, 2]) > ax.legend() > > plt.show() > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From johann.cohentanugi at gmail.com Mon Aug 1 01:50:33 2011 From: johann.cohentanugi at gmail.com (Johann Cohen-Tanugi) Date: Mon, 01 Aug 2011 07:50:33 +0200 Subject: [SciPy-User] ipython-qtconsole with current master Message-ID: <4E363EA9.60804@gmail.com> hi there, I get : (mypy)cohen at jarrett:~/sources/python/pyvault/ipython$ ipython-qtconsole Traceback (most recent call last): File "/home/cohen/sources/python/mypy/bin/ipython-qtconsole", line 5, in from pkg_resources import load_entry_point File "/home/cohen/sources/python/mypy/lib/python2.6/site-packages/distribute-0.6.10-py2.6.egg/pkg_resources.py", line 2659, in parse_requirements(__requires__), Environment() File "/home/cohen/sources/python/mypy/lib/python2.6/site-packages/distribute-0.6.10-py2.6.egg/pkg_resources.py", line 546, in resolve raise DistributionNotFound(req) pkg_resources.DistributionNotFound: ipython==0.11.dev ? Johann From ralf.gommers at googlemail.com Mon Aug 1 02:03:10 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 1 Aug 2011 08:03:10 +0200 Subject: [SciPy-User] deconvolution of 1-D signals In-Reply-To: <1312159888.10394.YahooMailRC@web113403.mail.gq1.yahoo.com> References: <1312159888.10394.YahooMailRC@web113403.mail.gq1.yahoo.com> Message-ID: On Mon, Aug 1, 2011 at 2:51 AM, David Baddeley wrote: > Hi Ralf, > > I do a reasonable amount of (2 & 3D) deconvolution of microscopy images > and the method I use depends quite a lot on the exact properties of the > signal. You can usually get away with fft based convolutions even if your > signal is not periodic as long as your kernel is significantly smaller than > the signal extent. > The kernel is typically about 5 to 15 times smaller than the signal extent, so I guess that may be problematic. > > As Joe mentioned, for a noisy signal convolving with the inverse or > performing fourier domain division doesn't work as you end up amplifying > high frequency noise components. You thus need some form of regularisation. > The thresholding of fourier components that Joe suggests does this, but you > might also want to explore more sophisticated options, the simplest of which > is probably Wiener filtering ( > http://en.wikipedia.org/wiki/Wiener_deconvolution). > I'm aware of the problems with high frequency noise. This is why I tried the spline fitting - I figured that on a spline the deconvolution would be okay because the spline is very smooth. This should be fine for my data because the noise is much higher-frequency than the underlying signal, and the SNR is high to start with. But maybe there are better ways. I looked for a Python implementation of Wiener deconvolution but couldn't find one so quickly. Is there a package out there that has it? > > If you've got a signal which is constrained to be positive, it's often > useful to introduce a positivity constraint on the deconvolution result > which generally means you need an iterative algorithm. The choice of > algorithm should also depend on the type of noise that is present in your > signal - my image data is constrained to be +ve and typically has either > Poisson or a mixture of Poisson and Gaussian noise and I use either the > Richardson-Lucy or a weighted version of ICTM (Iterative Constrained > Tikhonov-Miller) algorithm. 
I can provide more details of these if required. > > By constrained to be positive I'm guessing you mean monotonic? Otherwise I could just add a constant offset, but that's probably not what you mean. What's typically the speed penalty for an iterative method? Ralf > > cheers, > David > > > > > ------------------------------ > *From:* Ralf Gommers > *To:* SciPy Users List > *Sent:* Mon, 1 August, 2011 5:56:49 AM > *Subject:* [SciPy-User] deconvolution of 1-D signals > > Hi, > > For a measured signal that is the convolution of a real signal with a > response function, plus measurement noise on top, I want to recover the real > signal. Since I know what the response function is and the noise is > high-frequency compared to the real signal, a straightforward approach is to > smooth the measured signal (or fit a spline to it), then remove the response > function by deconvolution. See example code below. > > Can anyone point me towards code that does the deconvolution efficiently? > Perhaps signal.deconvolve would do the trick, but I can't seem to make it > work (except for directly on the output of np.convolve(y, window, > mode='valid')). > > Thanks, > Ralf > > > import numpy as np > from scipy import interpolate, signal > import matplotlib.pyplot as plt > > # Real signal > x = np.linspace(0, 10, num=201) > y = np.sin(x + np.pi/5) > > # Noisy signal > mode = 'valid' > window_len = 11. > window = np.ones(window_len) / window_len > y_meas = np.convolve(y, window, mode=mode) > y_meas += 0.2 * np.random.rand(y_meas.size) - 0.1 > if mode == 'full': > xstep = x[1] - x[0] > x_meas = np.concatenate([ \ > np.linspace(x[0] - window_len//2 * xstep, x[0] - xstep, > num=window_len//2), > x, > np.linspace(x[-1] + xstep, x[-1] + window_len//2 * xstep, > num=window_len//2)]) > elif mode == 'valid': > x_meas = x[window_len//2:-window_len//2+1] > elif mode == 'same': > x_meas = x > > # Approximating spline > xs = np.linspace(0, 10, num=500) > knots = np.array([1, 3, 5, 7, 9]) > tck = interpolate.splrep(x_meas, y_meas, s=0, k=3, t=knots, task=-1) > ys = interpolate.splev(xs, tck, der=0) > > # Find (low-frequency part of) original signal by deconvolution of smoothed > # measured signal and known window. > y_deconv = signal.deconvolve(ys, window)[0] #FIXME > > # Plot all signals > fig = plt.figure() > ax = fig.add_subplot(111) > > ax.plot(x, y, 'b-', label="Original signal") > ax.plot(x_meas, y_meas, 'r-', label="Measured, noisy signal") > ax.plot(xs, ys, 'g-', label="Approximating spline") > ax.plot(xs[window.size//2-1:-window.size//2], y_deconv, 'k-', > label="signal.deconvolve result") > ax.set_ylim([-1.2, 2]) > ax.legend() > > plt.show() > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From david_baddeley at yahoo.com.au Mon Aug 1 03:03:18 2011 From: david_baddeley at yahoo.com.au (David Baddeley) Date: Mon, 1 Aug 2011 00:03:18 -0700 (PDT) Subject: [SciPy-User] deconvolution of 1-D signals In-Reply-To: References: <1312159888.10394.YahooMailRC@web113403.mail.gq1.yahoo.com> Message-ID: <1312182198.97186.YahooMailRC@web113409.mail.gq1.yahoo.com> Hi Ralf, 5-15 times smaller would probably be fine, although you might want to watch the edges in the reconstruction - if they're at different dc levels you'll get edge artifacts (within ~ 1 kernel width of the edges). 
I'd tend to avoid spline filtering (or any form of noise reduction) before deconvolution, as this will also transform the data in a way not explained by the model you're using to deconvolve with. Weiner filtering is a 2 liner -> H = fft(kernel) deconvolved = ifftshift(ifft(fft(signal)*np.conj(H)/(H*np.conj(H) + lambda**2))) where lambda is your regularisation parameter, and white noise is assumed. There are various methods for choosing lambda optimally, but most people tend to use trial and error. Iterative methods are typically ~1-2 orders of magnitude slower than a Weiner filter, but with fast fft libraries and modern computers still quite reasonable for modest data sizes (a 3D image stack of ~ 512x512x50 pixels will tend to be done in a bit under a minute, can't really comment on 1D data, but unless your signal is very long I'd expect it to be significantly quicker). Ffts scale with O(nlogn) so will generally dramatically outperform things based on a simple convolution or filtering approaches (O(n**2)) for large n. This might make an iterative approach using ffts faster than something like scipy.signal.deconvolve if your kernel is large. cheers, David ________________________________ From: Ralf Gommers To: David Baddeley ; SciPy Users List Sent: Mon, 1 August, 2011 6:03:10 PM Subject: Re: [SciPy-User] deconvolution of 1-D signals On Mon, Aug 1, 2011 at 2:51 AM, David Baddeley wrote: Hi Ralf, > > >I do a reasonable amount of (2 & 3D) deconvolution of microscopy images and the >method I use depends quite a lot on the exact properties of the signal. You can >usually get away with fft based convolutions even if your signal is not periodic >as long as your kernel is significantly smaller than the signal extent. The kernel is typically about 5 to 15 times smaller than the signal extent, so I guess that may be problematic. > >As Joe mentioned, for a noisy signal convolving with the inverse or performing >fourier domain division doesn't work as you end up amplifying high frequency >noise components. You thus need some form of regularisation. The thresholding of >fourier components that Joe suggests does this, but you might also want to >explore more sophisticated options, the simplest of which is probably Wiener >filtering (http://en.wikipedia.org/wiki/Wiener_deconvolution). I'm aware of the problems with high frequency noise. This is why I tried the spline fitting - I figured that on a spline the deconvolution would be okay because the spline is very smooth. This should be fine for my data because the noise is much higher-frequency than the underlying signal, and the SNR is high to start with. But maybe there are better ways. I looked for a Python implementation of Wiener deconvolution but couldn't find one so quickly. Is there a package out there that has it? > >If you've got a signal which is constrained to be positive, it's often useful to >introduce a positivity constraint on the deconvolution result which generally >means you need an iterative algorithm. The choice of algorithm should also >depend on the type of noise that is present in your signal - my image data is >constrained to be +ve and typically has either Poisson or a mixture of Poisson >and Gaussian noise and I use either the Richardson-Lucy or a weighted version of >ICTM (Iterative Constrained Tikhonov-Miller) algorithm. I can provide more >details of these if required. > By constrained to be positive I'm guessing you mean monotonic? Otherwise I could just add a constant offset, but that's probably not what you mean. 
What's typically the speed penalty for an iterative method? Ralf > cheers, >David > > > > > > > > > ________________________________ From: Ralf Gommers >To: SciPy Users List >Sent: Mon, 1 August, 2011 5:56:49 AM >Subject: [SciPy-User] deconvolution of 1-D signals > > >Hi, > >For a measured signal that is the convolution of a real signal with a response >function, plus measurement noise on top, I want to recover the real signal. >Since I know what the response function is and the noise is high-frequency >compared to the real signal, a straightforward approach is to smooth the >measured signal (or fit a spline to it), then remove the response function by >deconvolution. See example code below. > >Can anyone point me towards code that does the deconvolution efficiently? >Perhaps signal.deconvolve would do the trick, but I can't seem to make it work >(except for directly on the output of np.convolve(y, window, mode='valid')). > >Thanks, >Ralf > > >import numpy as np >from scipy import interpolate, signal >import matplotlib.pyplot as plt > ># Real signal >x = np.linspace(0, 10, num=201) >y = np.sin(x + np.pi/5) > ># Noisy signal >mode = 'valid' >window_len = 11. >window = np.ones(window_len) / window_len >y_meas = np.convolve(y, window, mode=mode) >y_meas += 0.2 * np.random.rand(y_meas.size) - 0.1 >if mode == 'full': > xstep = x[1] - x[0] > x_meas = np.concatenate([ \ > np.linspace(x[0] - window_len//2 * xstep, x[0] - xstep, >num=window_len//2), > x, > np.linspace(x[-1] + xstep, x[-1] + window_len//2 * xstep, >num=window_len//2)]) >elif mode == 'valid': > x_meas = x[window_len//2:-window_len//2+1] >elif mode == 'same': > x_meas = x > ># Approximating spline >xs = np.linspace(0, 10, num=500) >knots = np.array([1, 3, 5, 7, 9]) >tck = interpolate.splrep(x_meas, y_meas, s=0, k=3, t=knots, task=-1) >ys = interpolate.splev(xs, tck, der=0) > ># Find (low-frequency part of) original signal by deconvolution of smoothed ># measured signal and known window. >y_deconv = signal.deconvolve(ys, window)[0] #FIXME > ># Plot all signals >fig = plt.figure() >ax = fig.add_subplot(111) > >ax.plot(x, y, 'b-', label="Original signal") >ax.plot(x_meas, y_meas, 'r-', label="Measured, noisy signal") >ax.plot(xs, ys, 'g-', label="Approximating spline") >ax.plot(xs[window.size//2-1:-window.size//2], y_deconv, 'k-', > label="signal.deconvolve result") >ax.set_ylim([-1.2, 2]) >ax.legend() > >plt.show() > > >_______________________________________________ >SciPy-User mailing list >SciPy-User at scipy.org >http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Aug 1 10:14:13 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 1 Aug 2011 08:14:13 -0600 Subject: [SciPy-User] deconvolution of 1-D signals In-Reply-To: References: <1312159888.10394.YahooMailRC@web113403.mail.gq1.yahoo.com> Message-ID: On Sun, Jul 31, 2011 at 11:20 PM, Anne Archibald wrote: > I realize this discussion has gone rather far afield from efficient 1D > deconvolution, but we do a funny thing in radio interferometry, and > I'm curious whether this is normal for other kinds of deconvolution as > well. > > In radio interferometry we obtain our images convolved with the > so-called "dirty beam", a convolution kernel that has a nice narrow > peak but usually a chaos of monstrous sidelobes often only marginally > smaller than the main lobe. 
We use a different regularization > condition to do our deconvolution: we treat the underlying image as a > modest collection of point sources. (One can see why this appeals to > astronomers.) Through an iterative process (the "CLEAN" algorithm and > its many descendants) we obtain an estimate of this underlying image. > But we very rarely actually work with this image directly. We normally > convolve it with a sort of idealized version of our kernel without all > the sidelobes. This then gives an image one might have obtained from a > normal telescope the size of the interferometer array. (Apart from all > the CLEAN artifacts.) > > What I'm wondering is, is this final step of convolving with an > idealized version of the kernel standard practice elsewhere? > > That's interesting. It sounds like fitting a parametric model, which yields points, followed by a smoothing that in some sense represents the error. Are there frequency aliasing problems associated with the deconvolution? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Aug 1 10:19:34 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 1 Aug 2011 08:19:34 -0600 Subject: [SciPy-User] deconvolution of 1-D signals In-Reply-To: <1312182198.97186.YahooMailRC@web113409.mail.gq1.yahoo.com> References: <1312159888.10394.YahooMailRC@web113403.mail.gq1.yahoo.com> <1312182198.97186.YahooMailRC@web113409.mail.gq1.yahoo.com> Message-ID: On Mon, Aug 1, 2011 at 1:03 AM, David Baddeley wrote: > Hi Ralf, > > 5-15 times smaller would probably be fine, although you might want to watch > the edges in the reconstruction - if they're at different dc levels you'll > get edge artifacts (within ~ 1 kernel width of the edges). I'd tend to avoid > spline filtering (or any form of noise reduction) before deconvolution, as > this will also transform the data in a way not explained by the model you're > using to deconvolve with. > > Weiner filtering is a 2 liner -> > > H = fft(kernel) > deconvolved = ifftshift(ifft(fft(signal)*np.conj(H)/(H*np.conj(H) + > lambda**2))) > > where lambda is your regularisation parameter, and white noise is assumed. > There are various methods for choosing lambda optimally, but most people > tend to use trial and error. > > Iterative methods are typically ~1-2 orders of magnitude slower than a > Weiner filter, but with fast fft libraries and modern computers still quite > reasonable for modest data sizes (a 3D image stack of ~ 512x512x50 pixels > will tend to be done in a bit under a minute, can't really comment on 1D > data, but unless your signal is very long I'd expect it to be significantly > quicker). Ffts scale with O(nlogn) so will generally dramatically outperform > things based on a simple convolution or filtering approaches (O(n**2)) for > large n. This might make an iterative approach using ffts faster than > something like scipy.signal.deconvolve if your kernel is large. > > The main problem with Weiner filtering is that it assumes that both the signal and noise are Gaussian. For instance, if you are looking for spikes in noise, the amplitudes of the spikes would have a Gaussian distribution. The Weiner filter is then the Bayesian estimate that follows from those assumptions, but those might not be the best assumptions for the data. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rigal at rapideye.de Mon Aug 1 13:47:30 2011 From: rigal at rapideye.de (Matthieu Rigal) Date: Mon, 1 Aug 2011 19:47:30 +0200 Subject: [SciPy-User] wrong output shape calculation in scipy.ndimage.interpolation.zoom Message-ID: <201108011947.30560.rigal@rapideye.de> Hi guys, I just detected a problem with the output shape calculation when running a zoom function. Sometimes it returns an odd value, here is an example : >>> import numpy >>> from scipy.ndimage.interpolation import zoom >>> aT = numpy.ones((5000,5000)) >>> aT2 = numpy.ones((556,463)) >>> aT3 = zoom(aT2, (float(aT.shape[0])/aT2.shape[0], float(aT.shape[1])/aT2.shape[1])) >>> aT3.shape (4999, 5000) Whereas adding a very little incrementation factor produces it right : >>> aT3 = zoom(aT2, (1.00001*float(aT.shape[0])/aT2.shape[0], 1.00001*float(aT.shape[1])/aT2.shape[1])) >>> aT3.shape (5000, 5000) There must be somewhere a problem with the roundings... should I file a ticket ? Regards, Matthieu RapidEye AG Molkenmarkt 30 14776 Brandenburg an der Havel Germany Follow us on Twitter! www.twitter.com/rapideye_ag Head Office/Sitz der Gesellschaft: Brandenburg an der Havel Management Board/Vorstand: Wolfgang G. Biedermann, Frederik Jung-Rothenhaeusler Chairman of Supervisory Board/Vorsitzender des Aufsichtsrates: Juergen Breitkopf Commercial Register/Handelsregister Potsdam HRB 17 796 Tax Number/Steuernummer: 048/100/00053 VAT-Ident-Number/Ust.-ID: DE 199331235 DIN EN ISO 9001 certified ************************************************************************* Diese E-Mail enthaelt vertrauliche und/oder rechtlich geschuetzte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese E-Mail irrtuemlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese E-Mail. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser E-Mail ist nicht gestattet. The information in this e-mail is intended for the named recipients only. It may contain privileged and confidential information. If you have received this communication in error, any use, copying or dissemination of its contents is strictly prohibited. Please erase all copies of the message along with any included attachments and notify RapidEye AG or the sender immediately by telephone at the number indicated on this page. From gustavo.goretkin at gmail.com Mon Aug 1 15:23:38 2011 From: gustavo.goretkin at gmail.com (Gustavo Goretkin) Date: Mon, 1 Aug 2011 15:23:38 -0400 Subject: [SciPy-User] optimize.fmin_cobyla giving nan to objective function Message-ID: I am using the Gaussian Process module in scikit-learn. It uses optimize.fmin_cobyla to find the best hyper-parameters. It looks like, though, that fmin_cobyla is, after a couple of iterations, feeding nan to the objective function. Any ideas? scipy.__version__ = '0.10.0.dev7180' Thanks, Gustavo -------------- next part -------------- An HTML attachment was scrubbed... URL: From cjordan1 at uw.edu Mon Aug 1 15:49:20 2011 From: cjordan1 at uw.edu (Christopher Jordan-Squire) Date: Mon, 1 Aug 2011 14:49:20 -0500 Subject: [SciPy-User] optimize.fmin_cobyla giving nan to objective function In-Reply-To: References: Message-ID: Could you send the code that's causing the problem? -Chris Jordan-Squire On Mon, Aug 1, 2011 at 2:23 PM, Gustavo Goretkin wrote: > I am using the Gaussian Process module in scikit-learn. It uses > optimize.fmin_cobyla to find the best hyper-parameters. It looks like, > though, that fmin_cobyla is, after a couple of iterations, feeding nan to > the objective function. 
Any ideas? > > scipy.__version__ = '0.10.0.dev7180' > > Thanks, > Gustavo > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From friedrichromstedt at gmail.com Mon Aug 1 16:22:18 2011 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Mon, 1 Aug 2011 22:22:18 +0200 Subject: [SciPy-User] deconvolution of 1-D signals In-Reply-To: References: Message-ID: Hi Ralf, 2011/7/31 Ralf Gommers : > For a measured signal that is the convolution of a real signal with a > response function, plus measurement noise on top, I want to recover the real > signal. Since I know what the response function is and the noise is > high-frequency compared to the real signal, a straightforward approach is to > smooth the measured signal (or fit a spline to it), then remove the response > function by deconvolution. See example code below. I ran across this (see below) soon ago since I'm dealing with information theory recently. It has an deconvolution example included in 1D, and it compares some different general methods in a kind-of "unified framework", as far as this exists. I found it quite informative and helpful. If you can't get access I can get it from the library in 2 weeks. The citation is: Robert L. Fry (ed.), Bayesian Inference and Maximum Entropy Methods in Science and Engineering: 21st International Workshop, Baltimore, Maryland, AIP Conf. Proc. 617 (2002) ISBN 0-7354-0063-6; ISSN 0094-243X Tutorial "Bayesian Inference for Inverse Problems" (A. Mohammad-Djafari) on page 477ff. It includes different noise models, afair, at least the structure how to deal with this. If I'm not mistaken the problem discussed there was a mass-spectrometry spectrum, so should been shot noise mainly, and of course the psf. The tutorial covers (in short) maximum entropy as well as maximum likelihood, and a combination of both (hence the "unification"). I cannot help much with this since I'm new to it myself. But I did a reasonable literature search, and this was one of the best outcomes. But as said, I was about information theory. Hope this is a useful pointer, Friedrich > Can anyone point me towards code that does the deconvolution efficiently? > Perhaps signal.deconvolve would do the trick, but I can't seem to make it > work (except for directly on the output of np.convolve(y, window, > mode='valid')). No. In fact, I don't think there is an automagical solution anywhere. :-) Good luck! From aarchiba at physics.mcgill.ca Mon Aug 1 17:07:48 2011 From: aarchiba at physics.mcgill.ca (Anne Archibald) Date: Mon, 1 Aug 2011 17:07:48 -0400 Subject: [SciPy-User] deconvolution of 1-D signals In-Reply-To: References: <1312159888.10394.YahooMailRC@web113403.mail.gq1.yahoo.com> Message-ID: On 1 August 2011 10:14, Charles R Harris wrote: > > > On Sun, Jul 31, 2011 at 11:20 PM, Anne Archibald > wrote: >> >> I realize this discussion has gone rather far afield from efficient 1D >> deconvolution, but we do a funny thing in radio interferometry, and >> I'm curious whether this is normal for other kinds of deconvolution as >> well. >> >> In radio interferometry we obtain our images convolved with the >> so-called "dirty beam", a convolution kernel that has a nice narrow >> peak but usually a chaos of monstrous sidelobes often only marginally >> smaller than the main lobe. 
We use a different regularization >> condition to do our deconvolution: we treat the underlying image as a >> modest collection of point sources. (One can see why this appeals to >> astronomers.) Through an iterative process (the "CLEAN" algorithm and >> its many descendants) we obtain an estimate of this underlying image. >> But we very rarely actually work with this image directly. We normally >> convolve it with a sort of idealized version of our kernel without all >> the sidelobes. This then gives an image one might have obtained from a >> normal telescope the size of the interferometer array. (Apart from all >> the CLEAN artifacts.) >> >> What I'm wondering is, is this final step of convolving with an >> idealized version of the kernel standard practice elsewhere? >> > > That's interesting. It sounds like fitting a parametric model, which yields > points, followed by a smoothing that in some sense represents the error. Are > there frequency aliasing problems associated with the deconvolution? It's very like fitting a parametric model, yes, except that we don't care much about the model parameters. In fact we often end up with models that have clusters of "point sources" with positive and negative emissions trying to match up with what is in reality a single point source. This can be due to inadequacies of the dirty beam model (though usually we have a decent estimate) or simply noise. In any case smoothing with an idealized main lobe makes us much less sensitive to this kind of junk. Plus if you're going to do this anyway, it can make life much easier to constrain your point sources to a grid. (As an aside, this trick - of fitting a parametric model but then extracting "observational" parameters for comparison to reduce model-sensitivity - came up with some X-ray spectral data I was looking at: you need to use a model to pull out the instrumental effects, but if you report (say) the model luminosity in a band your instrument can detect, then it doesn't much matter whether your model thinks the photons are thermal or power-law. In principle you can even do this trick with published model parameters, but you run into the problem that people don't give full covariance matrices for the fitted parameters so you get spurious uncertainties.) As far as frequency aliasing, there's not so much coming from the deconvolution, since our beam is so irregular. The actual observation samples image spatial frequencies rather badly; it's the price we pay for not having a filled aperture. So we're often simply missing information on spatial frequencies, most often the lowest ones (because there's a limit on how close you can put tracking dishes together without shadowing). But I don't think this is a deconvolution issue; in fact in situations where people are really pushing the limits of interferometry, like the millimeter-wave interferometric observations of the black hole at the center of our galaxy, you often give up on producing an image at all and fit (say) an emission model including the event horizon to the observed spatial frequencies directly. Anne From gustavo.goretkin at gmail.com Mon Aug 1 18:06:37 2011 From: gustavo.goretkin at gmail.com (Gustavo Goretkin) Date: Mon, 1 Aug 2011 18:06:37 -0400 Subject: [SciPy-User] optimize.fmin_cobyla giving nan to objective function In-Reply-To: References: Message-ID: The code depends on scikit-learn. I'll post the issue there if you think the problem is related to that module. 
My thinking is that fmin_cobyla shouldn't be feeding nan to the objective function. The code that causes the exception is gp_error.py . I made a change to one of the functions in scikit-learn, so I just included that file too. Just keep both files in the same directory. Thanks for the help. Gustavo On Mon, Aug 1, 2011 at 3:49 PM, Christopher Jordan-Squire wrote: > Could you send the code that's causing the problem? > > -Chris Jordan-Squire > > On Mon, Aug 1, 2011 at 2:23 PM, Gustavo Goretkin < > gustavo.goretkin at gmail.com> wrote: > >> I am using the Gaussian Process module in scikit-learn. It uses >> optimize.fmin_cobyla to find the best hyper-parameters. It looks like, >> though, that fmin_cobyla is, after a couple of iterations, feeding nan to >> the objective function. Any ideas? >> >> scipy.__version__ = '0.10.0.dev7180' >> >> Thanks, >> Gustavo >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Aug 1 19:48:06 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 1 Aug 2011 17:48:06 -0600 Subject: [SciPy-User] deconvolution of 1-D signals In-Reply-To: References: <1312159888.10394.YahooMailRC@web113403.mail.gq1.yahoo.com> Message-ID: On Mon, Aug 1, 2011 at 3:07 PM, Anne Archibald wrote: > On 1 August 2011 10:14, Charles R Harris > wrote: > > > > > > On Sun, Jul 31, 2011 at 11:20 PM, Anne Archibald > > wrote: > >> > >> I realize this discussion has gone rather far afield from efficient 1D > >> deconvolution, but we do a funny thing in radio interferometry, and > >> I'm curious whether this is normal for other kinds of deconvolution as > >> well. > >> > >> In radio interferometry we obtain our images convolved with the > >> so-called "dirty beam", a convolution kernel that has a nice narrow > >> peak but usually a chaos of monstrous sidelobes often only marginally > >> smaller than the main lobe. We use a different regularization > >> condition to do our deconvolution: we treat the underlying image as a > >> modest collection of point sources. (One can see why this appeals to > >> astronomers.) Through an iterative process (the "CLEAN" algorithm and > >> its many descendants) we obtain an estimate of this underlying image. > >> But we very rarely actually work with this image directly. We normally > >> convolve it with a sort of idealized version of our kernel without all > >> the sidelobes. This then gives an image one might have obtained from a > >> normal telescope the size of the interferometer array. (Apart from all > >> the CLEAN artifacts.) > >> > >> What I'm wondering is, is this final step of convolving with an > >> idealized version of the kernel standard practice elsewhere? > >> > > > > That's interesting. It sounds like fitting a parametric model, which > yields > > points, followed by a smoothing that in some sense represents the error. > Are > > there frequency aliasing problems associated with the deconvolution? > > It's very like fitting a parametric model, yes, except that we don't > care much about the model parameters. 
In fact we often end up with > models that have clusters of "point sources" with positive and > negative emissions trying to match up with what is in reality a single > point source. This can be due to inadequacies of the dirty beam model > (though usually we have a decent estimate) or simply noise. In any > case smoothing with an idealized main lobe makes us much less > sensitive to this kind of junk. Plus if you're going to do this > anyway, it can make life much easier to constrain your point sources > to a grid. > > (As an aside, this trick - of fitting a parametric model but then > extracting "observational" parameters for comparison to reduce > model-sensitivity - came up with some X-ray spectral data I was > looking at: you need to use a model to pull out the instrumental > effects, but if you report (say) the model luminosity in a band your > instrument can detect, then it doesn't much matter whether your model > thinks the photons are thermal or power-law. In principle you can even > do this trick with published model parameters, but you run into the > problem that people don't give full covariance matrices for the fitted > parameters so you get spurious uncertainties.) > > As far as frequency aliasing, there's not so much coming from the > deconvolution, since our beam is so irregular. The actual observation > samples image spatial frequencies rather badly; it's the price we pay > for not having a filled aperture. So we're often simply missing > information on spatial frequencies, most often the lowest ones > (because there's a limit on how close you can put tracking dishes > together without shadowing). But I don't think this is a deconvolution > issue; in fact in situations where people are really pushing the > limits of interferometry, like the millimeter-wave interferometric > observations of the black hole at the center of our galaxy, you often > give up on producing an image at all and fit (say) an emission model > including the event horizon to the observed spatial frequencies > directly. > > Thanks Anne, it's a good trick to know about. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From R.Springuel at umit.maine.edu Mon Aug 1 20:36:07 2011 From: R.Springuel at umit.maine.edu (R. Padraic Springuel) Date: Mon, 01 Aug 2011 20:36:07 -0400 Subject: [SciPy-User] numpy, scipy, and python 3 Message-ID: <4E374677.7010709@umit.maine.edu> I read in the readme for numpy 1.6.1 and scipy 0.9.0 that both support python 3, but I can't find and Mac installation files for either package that work with any version of python 3. Anyone know where I can get some or do I need to build from source (something I haven't done in a while, but should, theoretically, be able to do)? -- R. Padraic Springuel, PhD From wardefar at iro.umontreal.ca Mon Aug 1 20:52:08 2011 From: wardefar at iro.umontreal.ca (David Warde-Farley) Date: Mon, 1 Aug 2011 20:52:08 -0400 Subject: [SciPy-User] numpy, scipy, and python 3 In-Reply-To: <4E374677.7010709@umit.maine.edu> References: <4E374677.7010709@umit.maine.edu> Message-ID: <4F2259D3-F112-4104-8269-33E6DA5F148D@iro.umontreal.ca> On 2011-08-01, at 8:36 PM, R. Padraic Springuel wrote: > I read in the readme for numpy 1.6.1 and scipy 0.9.0 that both support > python 3, but I can't find and Mac installation files for either package > that work with any version of python 3. Anyone know where I can get > some or do I need to build from source (something I haven't done in a > while, but should, theoretically, be able to do)? 
I don't think Ralf has been building Python 3 binaries. I'm guessing that there isn't much demand, as other parts of the tool stack have yet to make the jump to Python 3 (notably matplotlib). However, OS X is probably the easiest platform on which to build NumPy. SciPy, at last glance, required you grab a certain version of gfortran (this one: http://r.research.att.com/tools/ but check the docs in case this has changed), but is otherwise a straightforward "python setup.py build && sudo python setup.py install" affair as well. Let the list know if you have problems. David From jason.heeris at gmail.com Mon Aug 1 22:19:02 2011 From: jason.heeris at gmail.com (Jason Heeris) Date: Tue, 2 Aug 2011 10:19:02 +0800 Subject: [SciPy-User] Vectorised convolution Message-ID: I'm using the scipy.signal.convolve function on an ndarray that represents independent sets of data (each set is a row). It seems that with this function I need to manually split up the rows to work on them independently, otherwise it does a 2D convolution: for idx in xrange(0, S): conv[idx] = sp.signal.convolve(inputs[idx], other, mode='full') Is there a vectorised version of this function? In other words, if I were doing an FFT I'd use np.fft.fft(inputs, axis=1) ? is it possible to do a single axis convolution on a 2D array? Cheers, Jason -------------- next part -------------- An HTML attachment was scrubbed... URL: From david_baddeley at yahoo.com.au Mon Aug 1 22:37:54 2011 From: david_baddeley at yahoo.com.au (David Baddeley) Date: Mon, 1 Aug 2011 19:37:54 -0700 (PDT) Subject: [SciPy-User] Vectorised convolution In-Reply-To: References: Message-ID: <1312252674.23043.YahooMailRC@web113401.mail.gq1.yahoo.com> try scipy.ndimage.convolve1d It doesn't seem to support mode='full' though cheers, David ________________________________ From: Jason Heeris To: SciPy Users List Sent: Tue, 2 August, 2011 2:19:02 PM Subject: [SciPy-User] Vectorised convolution I'm using the scipy.signal.convolve function on an ndarray that represents independent sets of data (each set is a row). It seems that with this function I need to manually split up the rows to work on them independently, otherwise it does a 2D convolution: for idx in xrange(0, S): conv[idx] = sp.signal.convolve(inputs[idx], other, mode='full') Is there a vectorised version of this function? In other words, if I were doing an FFT I'd use np.fft.fft(inputs, axis=1) ? is it possible to do a single axis convolution on a 2D array? Cheers, Jason -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at enthought.com Mon Aug 1 22:57:25 2011 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Mon, 1 Aug 2011 21:57:25 -0500 Subject: [SciPy-User] Vectorised convolution In-Reply-To: References: Message-ID: On Mon, Aug 1, 2011 at 9:19 PM, Jason Heeris wrote: > I'm using the scipy.signal.convolve function on an ndarray that represents > independent sets of data (each set is a row). It seems that with this > function I need to manually split up the rows to work on them independently, > otherwise it does a 2D convolution: > > ? ? for idx in xrange(0, S): > ? ? ? ? conv[idx] = sp.signal.convolve(inputs[idx], other, mode='full') > Is there a vectorised version of this function? In other words, if I were > doing an FFT I'd use?np.fft.fft(inputs, axis=1) ? is it possible to do a > single axis convolution on a 2D array? 
I show one way to do this in the following SciPy cookbook entry: http://www.scipy.org/Cookbook/ApplyFIRFilter In particular, see the second paragraph. Warren > Cheers, > Jason > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From jason.heeris at gmail.com Mon Aug 1 23:02:18 2011 From: jason.heeris at gmail.com (Jason Heeris) Date: Tue, 2 Aug 2011 11:02:18 +0800 Subject: [SciPy-User] Vectorised convolution In-Reply-To: <1312252674.23043.YahooMailRC@web113401.mail.gq1.yahoo.com> References: <1312252674.23043.YahooMailRC@web113401.mail.gq1.yahoo.com> Message-ID: On 2 August 2011 10:37, David Baddeley wrote: > try scipy.ndimage.convolve1d > > It doesn't seem to support mode='full' though > > That's easy to implement by zero padding both input arrays. Between this and Warren's answer I should be able to do it. Thanks! ? Jason -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Tue Aug 2 02:26:52 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 2 Aug 2011 08:26:52 +0200 Subject: [SciPy-User] numpy, scipy, and python 3 In-Reply-To: <4F2259D3-F112-4104-8269-33E6DA5F148D@iro.umontreal.ca> References: <4E374677.7010709@umit.maine.edu> <4F2259D3-F112-4104-8269-33E6DA5F148D@iro.umontreal.ca> Message-ID: On Tue, Aug 2, 2011 at 2:52 AM, David Warde-Farley < wardefar at iro.umontreal.ca> wrote: > On 2011-08-01, at 8:36 PM, R. Padraic Springuel wrote: > > > I read in the readme for numpy 1.6.1 and scipy 0.9.0 that both support > > python 3, but I can't find and Mac installation files for either package > > that work with any version of python 3. Anyone know where I can get > > some or do I need to build from source (something I haven't done in a > > while, but should, theoretically, be able to do)? > > I don't think Ralf has been building Python 3 binaries. I'm guessing that > there isn't much demand, as other parts of the tool stack have yet to make > the jump to Python 3 (notably matplotlib). > > I haven't. It still requires some work since bdist_mpkg (used for the 2.x binaries) doesn't support python 3.x. > However, OS X is probably the easiest platform on which to build NumPy. > SciPy, at last glance, required you grab a certain version of gfortran (this > one: http://r.research.att.com/tools/ but check the docs in case this has > changed), but is otherwise a straightforward "python setup.py build && sudo > python setup.py install" affair as well. Let the list know if you have > problems. > > That's the right site to grab gfortran from, but note that if you're on Lion you need a new binary that wasn't on the site last time I checked. Easiest is to install it through homebrew. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From jason.heeris at gmail.com Tue Aug 2 02:35:48 2011 From: jason.heeris at gmail.com (Jason Heeris) Date: Tue, 2 Aug 2011 14:35:48 +0800 Subject: [SciPy-User] Vectorised convolution In-Reply-To: References: Message-ID: On 2 August 2011 10:57, Warren Weckesser wrote: > I show one way to do this in the following SciPy cookbook entry: > Interesting ? I just tried that approach and found that it was actually slower than the looped version, which seems weird. But the convolve1d version works great (less than a tenth of the time as my loop) and the lfilter version is almost as good. 
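For reference, a small self-contained sketch of the row-wise comparison being discussed (not code from the thread; the array shapes, the odd-length kernel and mode='same' are assumptions, and reproducing mode='full' would additionally require zero-padding the rows):

import numpy as np
from scipy import ndimage, signal

inputs = np.random.rand(50, 2000)     # each row is an independent signal
kernel = np.ones(11) / 11.0           # odd-length smoothing kernel

# Row-by-row loop, as in the original code but with mode='same'.
loop_out = np.array([signal.convolve(row, kernel, mode='same')
                     for row in inputs])

# Single vectorised call along axis 1.  For an odd-length kernel with
# zero boundaries this matches the 'same'-mode loop above.
vec_out = ndimage.convolve1d(inputs, kernel, axis=1,
                             mode='constant', cval=0.0)

assert np.allclose(loop_out, vec_out)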
Thanks, Jason -------------- next part -------------- An HTML attachment was scrubbed... URL: From aarchiba at physics.mcgill.ca Tue Aug 2 02:56:10 2011 From: aarchiba at physics.mcgill.ca (Anne Archibald) Date: Tue, 2 Aug 2011 02:56:10 -0400 Subject: [SciPy-User] Vectorised convolution In-Reply-To: References: Message-ID: The blunt-instrument approach is to pad your image with zeros then flatten it. If you're careful about memory layouts this can even be done with a view. Either way you get a single long one-dimensional array you can apply unvectorized 1D convolutions to. You can then reshape the output back to two dimensions, clipping out the padding as appropriate in the process. The big drawback is that you have to pad the whole image at once, which could be a memory hog if your kernel is large. Anne On 2 August 2011 02:35, Jason Heeris wrote: > On 2 August 2011 10:57, Warren Weckesser > wrote: >> >> I show one way to do this in the following SciPy cookbook entry: > > Interesting ? I just tried that approach and found that it was actually > slower than the looped version, which seems weird. > But the convolve1d version works great (less than a tenth of the time as my > loop) and the lfilter version is almost as good. > Thanks, > Jason > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From timmichelsen at gmx-topmail.de Tue Aug 2 03:37:29 2011 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Tue, 2 Aug 2011 07:37:29 +0000 (UTC) Subject: [SciPy-User] Status of TimeSeries SciKit References: <08B2C8B1-DD0B-4D02-82F0-4CBCD304AA31@bilokon.co.uk> <7B9AF0B6-8015-4736-AE31-53725695DE40@gmail.com> <7B0D4803-D6E3-4451-B60E-957966CCC73D@gmail.com> <20110726222843.GB8920@phare.normalesup.org> <20110727141251.GB30024@phare.normalesup.org> Message-ID: > >> I agree. I already have 50% or more of the features in > >> scikits.timeseries, so this gets back to my fragmentation argument > >> (users being stuck with a confusing choice between multiple > >> libraries). Let's make it happen! > > So what needs to be done to move things forward? > > Do we need to draw up a roadmap? > > A table with functions that respond to common use cases in natual > > science, computing, and economics? > Having a place to collect concrete use cases (like your list from the > prior e-mail, but with illustrative code snippets) would be good. > You're welcome to start doing it here: > > https://github.com/wesm/pandas/wiki Here goes: https://github.com/wesm/pandas/wiki/Time-Series-Manipulation I will fill it with my stuff. Shall we file feature request directly as issues? > A good place to start, which I can do when I have some time, would be > to start moving the scikits.timeseries code into pandas. There are > several key components > > - Date and DateArray stuff, frequency implementations > - masked array time series implementations (record array and not) > - plotting > - reporting, moving window functions, etc. > > We need to evaluate Date/DateArray as they relate to numpy.datetime64 > and see what can be done. I haven't looked closely but I'm not sure if > all the convenient attribute access stuff (day, month, day_of_week, > weekday, etc.) is available in NumPy yet. I suspect it would be > reasonably straightforward to wrap DateArray so it can be an Index for > a pandas object. > > I won't have much time for this until mid-August, but a couple days' > hacking should get most of the pieces into place. 
I guess we can just > keep around the masked array classes for legacy API support and for > feature completeness. I value very much the work of Pierre and Matt. But my difficulti with the scikit was that the code is too complex. So I was only able to contribute helper functions for doc fixes. Please, lets make it happen that this effort is not a on or 3 man show but results in something whcih can be maintained by the whole community. Nevertheless, the timeseries scikit made my work more comfortable and understadable than I was able to manage with R. Regards, Timmie From pgmdevlist at gmail.com Tue Aug 2 03:49:05 2011 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 2 Aug 2011 09:49:05 +0200 Subject: [SciPy-User] Status of TimeSeries SciKit In-Reply-To: References: <08B2C8B1-DD0B-4D02-82F0-4CBCD304AA31@bilokon.co.uk> <7B9AF0B6-8015-4736-AE31-53725695DE40@gmail.com> <7B0D4803-D6E3-4451-B60E-957966CCC73D@gmail.com> <20110726222843.GB8920@phare.normalesup.org> <20110727141251.GB30024@phare.normalesup.org> Message-ID: On Aug 2, 2011, at 9:37 AM, Tim Michelsen wrote: >>>> I agree. I already have 50% or more of the features in >>>> scikits.timeseries, so this gets back to my fragmentation argument >>>> (users being stuck with a confusing choice between multiple >>>> libraries). Let's make it happen! >>> So what needs to be done to move things forward? >>> Do we need to draw up a roadmap? >>> A table with functions that respond to common use cases in natual >>> science, computing, and economics? >> Having a place to collect concrete use cases (like your list from the >> prior e-mail, but with illustrative code snippets) would be good. >> You're welcome to start doing it here: >> >> https://github.com/wesm/pandas/wiki > Here goes: > https://github.com/wesm/pandas/wiki/Time-Series-Manipulation > > I will fill it with my stuff. > Shall we file feature request directly as issues? > >> A good place to start, which I can do when I have some time, would be >> to start moving the scikits.timeseries code into pandas. There are >> several key components >> >> - Date and DateArray stuff, frequency implementations >> - masked array time series implementations (record array and not) >> - plotting >> - reporting, moving window functions, etc. >> >> We need to evaluate Date/DateArray as they relate to numpy.datetime64 >> and see what can be done. I haven't looked closely but I'm not sure if >> all the convenient attribute access stuff (day, month, day_of_week, >> weekday, etc.) is available in NumPy yet. I suspect it would be >> reasonably straightforward to wrap DateArray so it can be an Index for >> a pandas object. >> >> I won't have much time for this until mid-August, but a couple days' >> hacking should get most of the pieces into place. I guess we can just >> keep around the masked array classes for legacy API support and for >> feature completeness. > I value very much the work of Pierre and Matt. > But my difficulti with the scikit was that the code is too complex. So I was > only able to contribute helper functions for doc fixes. > Please, lets make it happen that this effort is not a on or 3 man show but > results in something whcih can be maintained by the whole community. The apparent complexity of the code comes likely from the fact that some features were coded directly in C (not even Cython) for efficiency. That, and that it relied on MaskedArray, of course ;) > Nevertheless, the timeseries scikit made my work more comfortable and > understadable than I was able to manage with R. Great ! 
That was the purpose. From warren.weckesser at enthought.com Tue Aug 2 08:47:53 2011 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Tue, 2 Aug 2011 07:47:53 -0500 Subject: [SciPy-User] Vectorised convolution In-Reply-To: References: Message-ID: On Tue, Aug 2, 2011 at 1:35 AM, Jason Heeris wrote: > On 2 August 2011 10:57, Warren Weckesser > wrote: >> >> I show one way to do this in the following SciPy cookbook entry: > > Interesting ? I just tried that approach and found that it was actually > slower than the looped version, which seems weird. > But the convolve1d version works great (less than a tenth of the time as my > loop) and the lfilter version is almost as good. That's good to know. I just updated http://www.scipy.org/Cookbook/ApplyFIRFilter to include ndimage.convolve1d. Warren > Thanks, > Jason > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From ralf.gommers at googlemail.com Tue Aug 2 09:58:48 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 2 Aug 2011 15:58:48 +0200 Subject: [SciPy-User] wrong output shape calculation in scipy.ndimage.interpolation.zoom In-Reply-To: <201108011947.30560.rigal@rapideye.de> References: <201108011947.30560.rigal@rapideye.de> Message-ID: On Mon, Aug 1, 2011 at 7:47 PM, Matthieu Rigal wrote: > Hi guys, > > I just detected a problem with the output shape calculation when running a > zoom function. > Sometimes it returns an odd value, here is an example : > >>> import numpy > >>> from scipy.ndimage.interpolation import zoom > >>> aT = numpy.ones((5000,5000)) > >>> aT2 = numpy.ones((556,463)) > >>> aT3 = zoom(aT2, (float(aT.shape[0])/aT2.shape[0], > float(aT.shape[1])/aT2.shape[1])) > >>> aT3.shape > (4999, 5000) > > Whereas adding a very little incrementation factor produces it right : > >>> aT3 = zoom(aT2, (1.00001*float(aT.shape[0])/aT2.shape[0], > 1.00001*float(aT.shape[1])/aT2.shape[1])) > >>> aT3.shape > (5000, 5000) > > There must be somewhere a problem with the roundings... should I file a > ticket > ? > The zoom function seems to round down when non-integer shapes are requested. This is more a problem with the interface than an actual bug. Your first zoom factor times the input axis size gives: >>> (float(aT.shape[0])/aT2.shape[0]) * aT2.shape[0] 4999.9999999999991 The zoom function can't know whether you want an array of size 4999 or 5000 if you pass in a zoom factor that implies an output shape of 4999.xxx. A patch for zoom to accept an `output_shape` keyword that would override the `zoom` parameter may be useful. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From gustavo.goretkin at gmail.com Tue Aug 2 15:53:46 2011 From: gustavo.goretkin at gmail.com (Gustavo Goretkin) Date: Tue, 2 Aug 2011 15:53:46 -0400 Subject: [SciPy-User] optimize.fmin_cobyla giving nan to objective function In-Reply-To: References: Message-ID: Okay here is a much simpler test case: https://gist.github.com/1121046 The print statements, for me, show that *objective* is being passed nan. Does this warrant a ticket? Should I continue this discussion on the development list? Gustavo On Mon, Aug 1, 2011 at 6:06 PM, Gustavo Goretkin wrote: > The code depends on scikit-learn. I'll post the issue there if you think > the problem is related to that module. My thinking is that fmin_cobyla > shouldn't be feeding nan to the objective function. 
> > The code that causes the exception is gp_error.py . I made a change to one > of the functions in scikit-learn, so I just included that file too. Just > keep both files in the same directory. > > Thanks for the help. > > Gustavo > > > > On Mon, Aug 1, 2011 at 3:49 PM, Christopher Jordan-Squire > wrote: > >> Could you send the code that's causing the problem? >> >> -Chris Jordan-Squire >> >> On Mon, Aug 1, 2011 at 2:23 PM, Gustavo Goretkin < >> gustavo.goretkin at gmail.com> wrote: >> >>> I am using the Gaussian Process module in scikit-learn. It uses >>> optimize.fmin_cobyla to find the best hyper-parameters. It looks like, >>> though, that fmin_cobyla is, after a couple of iterations, feeding nan to >>> the objective function. Any ideas? >>> >>> scipy.__version__ = '0.10.0.dev7180' >>> >>> Thanks, >>> Gustavo >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Tue Aug 2 16:42:25 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 2 Aug 2011 22:42:25 +0200 Subject: [SciPy-User] optimize.fmin_cobyla giving nan to objective function In-Reply-To: References: Message-ID: On Tue, Aug 2, 2011 at 9:53 PM, Gustavo Goretkin wrote: > Okay here is a much simpler test case: https://gist.github.com/1121046 > > The print statements, for me, show that *objective* is being passed nan. > Does this warrant a ticket? Should I continue this discussion on the > development list? > > Your function always returns inf, so it's not very surprising that you get a nan after a few iterations. Could happen for example if the code determines a derivative numerically, resulting in inf / inf = nan. It would be helpful if you had a realistic, self-contained example. Ralf Gustavo > > On Mon, Aug 1, 2011 at 6:06 PM, Gustavo Goretkin < > gustavo.goretkin at gmail.com> wrote: > >> The code depends on scikit-learn. I'll post the issue there if you think >> the problem is related to that module. My thinking is that fmin_cobyla >> shouldn't be feeding nan to the objective function. >> >> The code that causes the exception is gp_error.py . I made a change to one >> of the functions in scikit-learn, so I just included that file too. Just >> keep both files in the same directory. >> >> Thanks for the help. >> >> Gustavo >> >> >> >> On Mon, Aug 1, 2011 at 3:49 PM, Christopher Jordan-Squire < >> cjordan1 at uw.edu> wrote: >> >>> Could you send the code that's causing the problem? >>> >>> -Chris Jordan-Squire >>> >>> On Mon, Aug 1, 2011 at 2:23 PM, Gustavo Goretkin < >>> gustavo.goretkin at gmail.com> wrote: >>> >>>> I am using the Gaussian Process module in scikit-learn. It uses >>>> optimize.fmin_cobyla to find the best hyper-parameters. It looks like, >>>> though, that fmin_cobyla is, after a couple of iterations, feeding nan to >>>> the objective function. Any ideas? 
>>>> >>>> scipy.__version__ = '0.10.0.dev7180' >>>> >>>> Thanks, >>>> Gustavo >>>> >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>>> >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gustavo.goretkin at gmail.com Tue Aug 2 16:55:46 2011 From: gustavo.goretkin at gmail.com (Gustavo Goretkin) Date: Tue, 2 Aug 2011 16:55:46 -0400 Subject: [SciPy-User] optimize.fmin_cobyla giving nan to objective function In-Reply-To: References: Message-ID: > > >> Your function always returns inf, so it's not very surprising that you get > a nan after a few iterations. Could happen for example if the code > determines a derivative numerically, resulting in inf / inf = nan. > > It would be helpful if you had a realistic, self-contained example. > > Raln > In scikit-learn, fmin_cobyla is used to optimize some parameters of a Gaussian Process. The objective function returns inf when the parameters are such that the matrix calculations are unstable and NumPy throws a LinAlg exception. What would be a better way to handle this? My gut feeling is that an optimizer should not pass nan to the objective function, since it cannot possibly be informative. Maybe checking for nan would be inefficient. -------------- next part -------------- An HTML attachment was scrubbed... URL: From fperez.net at gmail.com Wed Aug 3 03:40:46 2011 From: fperez.net at gmail.com (Fernando Perez) Date: Wed, 3 Aug 2011 00:40:46 -0700 Subject: [SciPy-User] [ANN] IPython 0.11 is officially out In-Reply-To: References: Message-ID: On Sun, Jul 31, 2011 at 10:19 AM, Fernando Perez wrote: > Please see our release notes for the full details on everything about > this release: https://github.com/ipython/ipython/zipball/rel-0.11 And embarrassingly, that URL was for a zip download instead (copy/paste error), the detailed release notes are here: http://ipython.org/ipython-doc/rel-0.11/whatsnew/version0.11.html Sorry about the mistake... Cheers, f From ralf.gommers at googlemail.com Wed Aug 3 12:05:33 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 3 Aug 2011 18:05:33 +0200 Subject: [SciPy-User] optimize.fmin_cobyla giving nan to objective function In-Reply-To: References: Message-ID: On Tue, Aug 2, 2011 at 10:55 PM, Gustavo Goretkin < gustavo.goretkin at gmail.com> wrote: > >>> Your function always returns inf, so it's not very surprising that you >> get a nan after a few iterations. Could happen for example if the code >> determines a derivative numerically, resulting in inf / inf = nan. >> >> It would be helpful if you had a realistic, self-contained example. >> >> Raln >> > > In scikit-learn, fmin_cobyla is used to optimize some parameters of a > Gaussian Process. The objective function returns inf when the parameters are > such that the matrix calculations are unstable and NumPy throws a LinAlg > exception. What would be a better way to handle this? > Let the objective function do something sensible? Like figure out what the unstable region is and returning values that steer the optimizer away from it. 
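For instance, a minimal sketch of that idea (the 2x2 matrix below is only a hypothetical stand-in for the actual Gaussian Process computation): catch the LinAlgError inside the objective and return a large but finite penalty, so the optimizer never sees inf or nan.

import numpy as np
from scipy.optimize import fmin_cobyla

def unstable_computation(theta):
    # stand-in for the real likelihood: the Cholesky factorization fails
    # when the matrix is not positive definite (here, when abs(theta[0]) >= 1)
    c = np.linalg.cholesky(np.array([[1.0, theta[0]], [theta[0], 1.0]]))
    return -np.sum(np.log(np.diag(c)))

def objective(theta):
    try:
        return unstable_computation(theta)
    except np.linalg.LinAlgError:
        return 1e10  # large finite penalty instead of inf

xstar = fmin_cobyla(func=objective, x0=[0.5], cons=[lambda x: 2.0 - abs(x[0])])

The penalty still steers the search back toward the stable region, and there is no inf left around to turn into nan through numerical differencing.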
With a slight modification to your last test script I see that fmin_cobyla doesn't choke on receiving a first inf from the objective function (see below). If it receives infs not for a single x, but several or a range, then I'd expect it to fail. from scipy.optimize import fmin_cobyla import numpy as np def objective(x): print 'Input: ', x, ' return value: ', x + 1./x return x + 1./x def constraint1(x): return 0 xstar = fmin_cobyla(func=objective, x0=0, cons=[constraint1]) Cheers, Ralf > My gut feeling is that an optimizer should not pass nan to the objective > function, since it cannot possibly be informative. Maybe checking for nan > would be inefficient. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From keith.hughitt at gmail.com Wed Aug 3 12:19:39 2011 From: keith.hughitt at gmail.com (Keith Hughitt) Date: Wed, 3 Aug 2011 09:19:39 -0700 (PDT) Subject: [SciPy-User] numpy, scipy, and python 3 In-Reply-To: References: <4E374677.7010709@umit.maine.edu> <4F2259D3-F112-4104-8269-33E6DA5F148D@iro.umontreal.ca> Message-ID: <18374965.5317.1312388379274.JavaMail.geo-discussion-forums@yqkc21> Matplotlib actually has a Python 3 branch as well now: https://github.com/matplotlib/matplotlib-py3 I'm sure there is still work to be done, but I've used it for some basic plotting and so far it has worked well. Keith -------------- next part -------------- An HTML attachment was scrubbed... URL: From keith.hughitt at gmail.com Wed Aug 3 12:37:29 2011 From: keith.hughitt at gmail.com (Keith Hughitt) Date: Wed, 3 Aug 2011 09:37:29 -0700 (PDT) Subject: [SciPy-User] Having scipy.ndimage, etc. methods return ndarray subclass instances? Message-ID: <3536413.654.1312389449535.JavaMail.geo-discussion-forums@yqcj24> Hello, I'm currently working on creating a subclass of numpy.ndarray, and would like to ensure that other methods that work with ndarrays (e.g. http://docs.scipy.org/doc/scipy/reference/ndimage.html) return an instance of the subclass instead of an ndarray. After reading a discussion on the topic on Stack Overflow ( http://stackoverflow.com/questions/6190859/some-numpy-functions-return-ndarray-instead-of-my-subclass), I looked into adding/modifying __array_finalize__ and __array_wrap__ but neither of these appear to be called when I call a scipy.ndimage method such as median_filter. Is there a way I can extend my subclass so that I can ensure that a new subclass instance or view is returned instead of an ndarray? Any suggestions would be greatly appreciated. Thanks, Keith -------------- next part -------------- An HTML attachment was scrubbed... URL: From keith.hughitt at gmail.com Wed Aug 3 11:58:51 2011 From: keith.hughitt at gmail.com (Keith Hughitt) Date: Wed, 3 Aug 2011 08:58:51 -0700 (PDT) Subject: [SciPy-User] Having scipy.ndimage, etc. methods return ndarray subclass instances? Message-ID: <27260730.6415.1312387131805.JavaMail.geo-discussion-forums@yqbp37> Hello, I'm currently working on creating a subclass of numpy.ndarray, and would like to ensure that other methods that work with ndarrays (e.g. scipy.ndimage.* ) return an instance of the subclass instead of an ndarray. After reading a discussion on the topicon StackOverflow, I looked into adding/modifying __array_finalize__ and __array_wrap__. Neither of these appear to be called when I call a scipy.ndimage method such as median_filter . 
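For reference, here is a minimal sketch of the kind of subclass I mean (the class name and the extra attribute are only illustrative):

import numpy as np
from scipy import ndimage

class MyArray(np.ndarray):
    def __new__(cls, input_array, header=None):
        obj = np.asarray(input_array).view(cls)
        obj.header = header
        return obj

    def __array_finalize__(self, obj):
        # called for view casting and new-from-template
        if obj is None:
            return
        self.header = getattr(obj, 'header', None)

    def __array_wrap__(self, out_arr, context=None):
        # called at the end of ufuncs, but apparently not by ndimage
        return np.ndarray.__array_wrap__(self, out_arr, context)

a = MyArray(np.random.rand(16, 16), header={'origin': 'test'})
print type(a + 1)                         # MyArray, as expected
print type(ndimage.median_filter(a, 3))   # plain ndarray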
Is there a way I can extend my subclass so that I can ensure that a new subclass instance or view is returned instead of an ndarray? Any suggestions would be greatly appreciated. Thanks, Keith -------------- next part -------------- An HTML attachment was scrubbed... URL: From xavier.gnata at gmail.com Thu Aug 4 10:17:07 2011 From: xavier.gnata at gmail.com (Xavier Gnata) Date: Thu, 4 Aug 2011 16:17:07 +0200 Subject: [SciPy-User] bug in optimize.curve_fit ? Message-ID: Hi, def func(x, a, b, c): return a*np.exp(-b*x) + c x = np.linspace(0,4,50) y = func(x, 2.5, 1.3, 0.5) yn = y + 0.2*np.random.normal(size=len(x)) popt, pcov = curve_fit(func,x, yn) works vey well but if you change it to: def func(x, a, b, c): return a*np.exp(-b*x) + c x = list(np.linspace(0,4,50)) y = func(x, 2.5, 1.3, 0.5) yn = y + 0.2*np.random.normal(size=len(x)) popt, pcov = curve_fit(func, x, yn) then x is a list and we get this error: "TypeError: can't multiply sequence by non-int of type 'float'" However, according to the documentation, xdata : An N-length sequence or an (k,N)-shaped array. I understand this statement as : "either a list, a tuple or an array". Should optimize.curve_fit internally cast xdata to an array? I would think so. Xavier From guziy.sasha at gmail.com Thu Aug 4 10:26:23 2011 From: guziy.sasha at gmail.com (Oleksandr Huziy) Date: Thu, 4 Aug 2011 10:26:23 -0400 Subject: [SciPy-User] bug in optimize.curve_fit ? In-Reply-To: References: Message-ID: Hi, if you did x = list(x) it becomes a simple list, which cannot be multiplied by a number. don't do this. -- Oleksandr Huziy 2011/8/4 Xavier Gnata > Hi, > > def func(x, a, b, c): > return a*np.exp(-b*x) + c > x = np.linspace(0,4,50) > y = func(x, 2.5, 1.3, 0.5) > yn = y + 0.2*np.random.normal(size=len(x)) > popt, pcov = curve_fit(func,x, yn) > > works vey well but if you change it to: > > def func(x, a, b, c): > return a*np.exp(-b*x) + c > x = list(np.linspace(0,4,50)) > y = func(x, 2.5, 1.3, 0.5) > yn = y + 0.2*np.random.normal(size=len(x)) > popt, pcov = curve_fit(func, x, yn) > > then x is a list and we get this error: > "TypeError: can't multiply sequence by non-int of type 'float'" > > However, according to the documentation, xdata : An N-length sequence > or an (k,N)-shaped array. I understand this statement as : "either a > list, a tuple or an array". Should optimize.curve_fit internally cast > xdata to an array? I would think so. > > > > Xavier > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From xavier.gnata at gmail.com Fri Aug 5 03:29:08 2011 From: xavier.gnata at gmail.com (Xavier Gnata) Date: Fri, 05 Aug 2011 09:29:08 +0200 Subject: [SciPy-User] bug in optimize.curve_fit ? In-Reply-To: References: Message-ID: <4E3B9BC4.4000404@gmail.com> Hi, Yes but the doc ( http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html ) claims that xdata can be "An N-length sequence or an (k,N)-shaped array" IRL, I would not cast a array to a list to call optimize.curve_fit. x = list(x) was only an easy way to write a short testcase based on the exemple in the doc. Xavier > Hi, > > if you did x = list(x) it becomes a simple list, which cannot be > multiplied by a number. > don't do this. 
> > -- > Oleksandr Huziy > > > 2011/8/4 Xavier Gnata > > > Hi, > > def func(x, a, b, c): > return a*np.exp(-b*x) + c > x = np.linspace(0,4,50) > y = func(x, 2.5, 1.3, 0.5) > yn = y + 0.2*np.random.normal(size=len(x)) > popt, pcov = curve_fit(func,x, yn) > > works vey well but if you change it to: > > def func(x, a, b, c): > return a*np.exp(-b*x) + c > x = list(np.linspace(0,4,50)) > y = func(x, 2.5, 1.3, 0.5) > yn = y + 0.2*np.random.normal(size=len(x)) > popt, pcov = curve_fit(func, x, yn) > > then x is a list and we get this error: > "TypeError: can't multiply sequence by non-int of type 'float'" > > However, according to the documentation, xdata : An N-length sequence > or an (k,N)-shaped array. I understand this statement as : "either a > list, a tuple or an array". Should optimize.curve_fit internally cast > xdata to an array? I would think so. > > > > Xavier > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Fri Aug 5 04:39:14 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 5 Aug 2011 04:39:14 -0400 Subject: [SciPy-User] bug in optimize.curve_fit ? In-Reply-To: <4E3B9BC4.4000404@gmail.com> References: <4E3B9BC4.4000404@gmail.com> Message-ID: On Fri, Aug 5, 2011 at 3:29 AM, Xavier Gnata wrote: > Hi, > > Yes but the doc ( > http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html > ) claims that xdata can be "An N-length sequence or an (k,N)-shaped array" > > IRL, I would not cast a array to a list to call optimize.curve_fit. > x = list(x) was only an easy way to write a short testcase based on the > exemple in the doc. > > Xavier > >> Hi, >> >> if you did x = list(x) it becomes a simple list, which cannot be >> multiplied by a number. >> don't do this. >> >> -- >> Oleksandr Huziy >> >> >> 2011/8/4 Xavier Gnata > > >> >> ? ? Hi, >> >> ? ? def func(x, a, b, c): >> ? ? ? ?return a*np.exp(-b*x) + c >> ? ? x = np.linspace(0,4,50) >> ? ? y = func(x, 2.5, 1.3, 0.5) >> ? ? yn = y + 0.2*np.random.normal(size=len(x)) >> ? ? popt, pcov = curve_fit(func,x, yn) >> >> ? ? works vey well but if you change it to: >> >> ? ? def func(x, a, b, c): >> ? ? ? ?return a*np.exp(-b*x) + c >> ? ? x = list(np.linspace(0,4,50)) >> ? ? y = func(x, 2.5, 1.3, 0.5) >> ? ? yn = y + 0.2*np.random.normal(size=len(x)) >> ? ? popt, pcov = curve_fit(func, x, yn) >> >> ? ? then x is a list and we get this error: >> ? ? "TypeError: can't multiply sequence by non-int of type 'float'" >> >> ? ? However, according to the documentation, xdata : An N-length sequence >> ? ? or an (k,N)-shaped array. I understand this statement as : "either a >> ? ? list, a tuple or an array". Should optimize.curve_fit internally cast >> ? ? xdata to an array? I would think so. The docstring might be a bit misleading. x (xdata) is just handed through curve_fit and leastsq to your function and could be anything. The interpretation as xdata is just for the specific usecase y=f(x)+noise, but nothing in the implementation requires directly anything about x. (The only requirement is that f(x) returns a (N,) array for an x.) So, I don't think curve_fit should do any changes to x, it's an arg for the user function that the user should take care of, and a user can exploit it's flexibility. Josef >> >> >> >> ? ? 
Xavier >> ? ? _______________________________________________ >> ? ? SciPy-User mailing list >> ? ? SciPy-User at scipy.org >> ? ? http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Fri Aug 5 04:53:29 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 5 Aug 2011 04:53:29 -0400 Subject: [SciPy-User] bug in optimize.curve_fit ? In-Reply-To: References: <4E3B9BC4.4000404@gmail.com> Message-ID: On Fri, Aug 5, 2011 at 4:39 AM, wrote: > On Fri, Aug 5, 2011 at 3:29 AM, Xavier Gnata wrote: >> Hi, >> >> Yes but the doc ( >> http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html >> ) claims that xdata can be "An N-length sequence or an (k,N)-shaped array" >> >> IRL, I would not cast a array to a list to call optimize.curve_fit. >> x = list(x) was only an easy way to write a short testcase based on the >> exemple in the doc. >> >> Xavier >> >>> Hi, >>> >>> if you did x = list(x) it becomes a simple list, which cannot be >>> multiplied by a number. >>> don't do this. >>> >>> -- >>> Oleksandr Huziy >>> >>> >>> 2011/8/4 Xavier Gnata >> > >>> >>> ? ? Hi, >>> >>> ? ? def func(x, a, b, c): >>> ? ? ? ?return a*np.exp(-b*x) + c >>> ? ? x = np.linspace(0,4,50) >>> ? ? y = func(x, 2.5, 1.3, 0.5) >>> ? ? yn = y + 0.2*np.random.normal(size=len(x)) >>> ? ? popt, pcov = curve_fit(func,x, yn) >>> >>> ? ? works vey well but if you change it to: >>> >>> ? ? def func(x, a, b, c): >>> ? ? ? ?return a*np.exp(-b*x) + c >>> ? ? x = list(np.linspace(0,4,50)) >>> ? ? y = func(x, 2.5, 1.3, 0.5) >>> ? ? yn = y + 0.2*np.random.normal(size=len(x)) >>> ? ? popt, pcov = curve_fit(func, x, yn) >>> >>> ? ? then x is a list and we get this error: >>> ? ? "TypeError: can't multiply sequence by non-int of type 'float'" >>> >>> ? ? However, according to the documentation, xdata : An N-length sequence >>> ? ? or an (k,N)-shaped array. I understand this statement as : "either a >>> ? ? list, a tuple or an array". Should optimize.curve_fit internally cast >>> ? ? xdata to an array? I would think so. > > The docstring might be a bit misleading. x (xdata) is just handed > through curve_fit and leastsq to your function and could be anything. > The interpretation as xdata is just for the specific usecase > y=f(x)+noise, but nothing in the implementation requires directly > anything about x. (The only requirement is that f(x) returns a (N,) > array for an x.) for example: import numpy as np from scipy.optimize import curve_fit def func(x, a, b, c): #print b, x, type(x) return a*np.exp(-b*x.x) + c x0 = list(np.linspace(0,4,50)) class Dummy(object): def __init__(self, x): self.x = np.asarray(x) xd = Dummy(x0) y = func(xd, 2.5, 1.3, 0.5) yn = y + 0.2*np.random.normal(size=len(x0)) popt, pcov = curve_fit(func, xd, yn) print popt Josef > > So, I don't think curve_fit should do any changes to x, it's an arg > for the user function that the user should take care of, and a user > can exploit it's flexibility. > > Josef > > >>> >>> >>> >>> ? ? Xavier >>> ? ? _______________________________________________ >>> ? ? SciPy-User mailing list >>> ? ? SciPy-User at scipy.org >>> ? ? 
http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > From pav at iki.fi Fri Aug 5 04:54:08 2011 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 5 Aug 2011 08:54:08 +0000 (UTC) Subject: [SciPy-User] bug in optimize.curve_fit ? References: Message-ID: Thu, 04 Aug 2011 16:17:07 +0200, Xavier Gnata wrote: [clip] > def func(x, a, b, c): > return a*np.exp(-b*x) + c > x = list(np.linspace(0,4,50)) > y = func(x, 2.5, 1.3, 0.5) Your program fails already here -- before curve_fit. From josef.pktd at gmail.com Fri Aug 5 05:14:13 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 5 Aug 2011 05:14:13 -0400 Subject: [SciPy-User] bug in optimize.curve_fit ? In-Reply-To: References: Message-ID: On Fri, Aug 5, 2011 at 4:54 AM, Pauli Virtanen wrote: > Thu, 04 Aug 2011 16:17:07 +0200, Xavier Gnata wrote: > [clip] >> def func(x, a, b, c): >> ? ? return a*np.exp(-b*x) + c >> x = list(np.linspace(0,4,50)) >> y = func(x, 2.5, 1.3, 0.5) > > Your program fails already here -- before curve_fit. Good catch, I had spent 10 minutes looking for the bug in my version without ever checking the line number of the exception. Josef > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From R.Springuel at umit.maine.edu Sat Aug 6 18:25:22 2011 From: R.Springuel at umit.maine.edu (R. Padraic Springuel) Date: Sat, 06 Aug 2011 18:25:22 -0400 Subject: [SciPy-User] numpy, scipy, and python 3 In-Reply-To: References: Message-ID: <4E3DBF52.5040708@umit.maine.edu> Well, I've successfully built both numpy and scipy for python 3.2. I've also run the nose tests and only come up with one failed test, but it's the same test that fails on python 2.7 for me, and doesn't appear to be on a function that I've ever used. For those interested, here's the output on the failed test: > FAIL: test_expon (test_morestats.TestAnderson) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/stats/tests/test_morestats.py", line 72, in test_expon > assert_array_less(crit[:-1], A) > File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/testing/utils.py", line 869, in assert_array_less > header='Arrays are not less-ordered') > File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/testing/utils.py", line 613, in assert_array_compare > chk_same_position(x_id, y_id, hasval='inf') > File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/testing/utils.py", line 588, in chk_same_position > raise AssertionError(msg) > AssertionError: > Arrays are not less-ordered > > x and y inf location mismatch: > x: array([ 0.911, 1.065, 1.325, 1.587]) > y: array(inf) > > ---------------------------------------------------------------------- The results are identical for python 3.2 except "2.7" is replaced by "3.2" everywhere that it occurs. -- R. 
Padraic Springuel, PhD From wesmckinn at gmail.com Sun Aug 7 16:26:28 2011 From: wesmckinn at gmail.com (Wes McKinney) Date: Sun, 7 Aug 2011 16:26:28 -0400 Subject: [SciPy-User] Status of TimeSeries SciKit In-Reply-To: References: <08B2C8B1-DD0B-4D02-82F0-4CBCD304AA31@bilokon.co.uk> <7B9AF0B6-8015-4736-AE31-53725695DE40@gmail.com> <7B0D4803-D6E3-4451-B60E-957966CCC73D@gmail.com> <20110726222843.GB8920@phare.normalesup.org> <20110727141251.GB30024@phare.normalesup.org> Message-ID: On Sat, Jul 30, 2011 at 7:40 AM, Tim Michelsen wrote: >>> Since most of my code for meteorological data evaluations is based on >>> it, I would be happy to receive infomation on the conclusion and how I >>> need to adjust my code to upkeep with new developments. >> >> When it gets to that point I'd be happy to help (including looking at >> some of your existing code and data). Sorry I've been out of commission for the last week or so. > In short my process goes like: > * QC of incoming measurements data > * visualisation and statistics (basics, disribution analysis) > * reporting > * back & forcasting with other (modeled) data > * preparation of result data sets > > When it comes to QC I would need: > * check on missing dates (i.e. failure of aquisitition equipment) > * check on double dates (= failure of data logger) > * data integrity and plausability tests with certain filters/flags > > All these need to be reported on: > * data recovery > * invalid data by filter/flag type > > So far, I have been using the masked arrays. Mainly because it is heaily > ?used in the time series scikit and transfering masks from on array to > another is quite once you learned the basics. > > Would you work these items out in pandas, as well? I would need to look at code and see the concrete use cases. As with anything else, you adapt solutions to your problems based on your available tools. > P.S. Your presentation "Time series analysis in Python with statsmodels" > is really cool and has shown me good aspects about the HP filters > Thanks...still lots to do on the TSA front. The filtering work has all been Skipper's. > Regards, > Timmie > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From wesmckinn at gmail.com Sun Aug 7 16:37:05 2011 From: wesmckinn at gmail.com (Wes McKinney) Date: Sun, 7 Aug 2011 16:37:05 -0400 Subject: [SciPy-User] Status of TimeSeries SciKit In-Reply-To: References: <08B2C8B1-DD0B-4D02-82F0-4CBCD304AA31@bilokon.co.uk> <7B9AF0B6-8015-4736-AE31-53725695DE40@gmail.com> <7B0D4803-D6E3-4451-B60E-957966CCC73D@gmail.com> <20110726222843.GB8920@phare.normalesup.org> <20110727141251.GB30024@phare.normalesup.org> Message-ID: On Tue, Aug 2, 2011 at 3:37 AM, Tim Michelsen wrote: >> >> I agree. I already have 50% or more of the features in >> >> scikits.timeseries, so this gets back to my fragmentation argument >> >> (users being stuck with a confusing choice between multiple >> >> libraries). Let's make it happen! >> > So what needs to be done to move things forward? >> > Do we need to draw up a roadmap? >> > A table with functions that respond to common use cases in natual >> > science, computing, and economics? >> Having a place to collect concrete use cases (like your list from the >> prior e-mail, but with illustrative code snippets) would be good. 
>> You're welcome to start doing it here: >> >> https://github.com/wesm/pandas/wiki > Here goes: > https://github.com/wesm/pandas/wiki/Time-Series-Manipulation > > I will fill it with my stuff. > Shall we file feature request directly as issues? Cool, I will start adding things when I have some time. Feel free to file features requests as issues tagged with "Enhancement". >> A good place to start, which I can do when I have some time, would be >> to start moving the scikits.timeseries code into pandas. There are >> several key components >> >> - Date and DateArray stuff, frequency implementations >> - masked array time series implementations (record array and not) >> - plotting >> - reporting, moving window functions, etc. >> >> We need to evaluate Date/DateArray as they relate to numpy.datetime64 >> and see what can be done. I haven't looked closely but I'm not sure if >> all the convenient attribute access stuff (day, month, day_of_week, >> weekday, etc.) is available in NumPy yet. I suspect it would be >> reasonably straightforward to wrap DateArray so it can be an Index for >> a pandas object. >> >> I won't have much time for this until mid-August, but a couple days' >> hacking should get most of the pieces into place. I guess we can just >> keep around the masked array classes for legacy API support and for >> feature completeness. > I value very much the work of Pierre and Matt. > But my difficulti with the scikit was that the code is too complex. So I was > only able to contribute helper functions for doc fixes. > Please, lets make it happen that this effort is not a on or 3 man show but > results in something whcih can be maintained by the whole community. Yes, I agree. I am painfully aware of being one of the only people consistently working on the data structure front (judging from commit activity at least) but I would like to get more people involved. I'm hopeful that increasing awareness to what we're working on (e.g. I've started blogging about pandas and related things) will draw new people into the projects. > Nevertheless, the timeseries scikit made my work more comfortable and > understadable than I was able to manage with R. > > Regards, > Timmie > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From chris at simplistix.co.uk Sun Aug 7 17:22:19 2011 From: chris at simplistix.co.uk (Chris Withers) Date: Sun, 07 Aug 2011 22:22:19 +0100 Subject: [SciPy-User] getting started with arrays and matplotlib Message-ID: <4E3F020B.1000500@simplistix.co.uk> Hi All, I'm a new user returning to SciPy after quite a long break, so, a few high-level questions first: - Are there any good books or other narrative docs that cover the bulk of the core numpy stuff - Ditto, but for visualisation, particularly with matplotlib or the Enthought visualisation suites. I'm particularly interested in step-by-step docs/books with lots of examples, versus reference docs that basically need the user to know what they're looking for in a chicken and egg fashion, which was my previous experience of scipy docs... Now, the specific problem I'm looking to solve it a stacked bar chart of ticket sales for an event over time. The data I have is basically a log file of ticket sales. 
I was looking to build a 4-dimensonal array as follows, with each cell representing ticket sales for that week at that venue at that event: event: 2011 venue t-3 week t-2 week t-1 week v1 10 20 30 v2 15 30 45 event: 2010 venue t-3 week t-2 week t-1 week v1 1 2 3 v2 15 30 45 ...etc... Now, first question: what's the best way to build this array given that I may only see the arrival of a new venue a fair way through building the data structure? How can I efficiently say "please add a new row to my array", I don't know what the 4th dimension equivalent is ;-) Secondly, once I've populated this, any good examples of how to turn it into a bar chart? (the simple bar chart would be number of sales on the y-axis, weeks before the event on the x-axis, however, what I'd then like to do is split each bar into chunks for each venue's sales, if that makes sense?) Any help gratefully received! cheers, Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk From wardefar at iro.umontreal.ca Mon Aug 8 01:29:44 2011 From: wardefar at iro.umontreal.ca (David Warde-Farley) Date: Mon, 8 Aug 2011 01:29:44 -0400 Subject: [SciPy-User] getting started with arrays and matplotlib In-Reply-To: <4E3F020B.1000500@simplistix.co.uk> References: <4E3F020B.1000500@simplistix.co.uk> Message-ID: On 2011-08-07, at 5:22 PM, Chris Withers wrote: > Now, first question: what's the best way to build this array given that > I may only see the arrival of a new venue a fair way through building > the data structure? How can I efficiently say "please add a new row to > my array", I don't know what the 4th dimension equivalent is ;-) It may be worth thinking about whether an ndarray is necessarily the right way to solve this problem. For one thing, you can't append to ndarrays easily. HDF5 tables (via PyTables, for example) are more forgiving in this respect and play nice with NumPy, but there are certainly other options. > Secondly, once I've populated this, any good examples of how to turn it > into a bar chart? (the simple bar chart would be number of sales on the > y-axis, weeks before the event on the x-axis, however, what I'd then > like to do is split each bar into chunks for each venue's sales, if that > makes sense?) This might give you an example of what you need: http://matplotlib.sourceforge.net/examples/pylab_examples/bar_stacked.html but you'd be better off asking on matplotlib-users. David From josef.pktd at gmail.com Mon Aug 8 03:42:55 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 8 Aug 2011 03:42:55 -0400 Subject: [SciPy-User] rejection sampling Message-ID: I got started a bit with rejection sampling. for example scipy.stats.rdist has a very slow random number generator. for shape parameters>=2 and not very large, rejection sampling works much faster. (for shape parameter <2, the pdf of rdist is unbound and rejection sampling against uniform doesn't work) attached is just the first version to show that it works. Does someone already have a more complete version that could be shared? Josef -------------- next part -------------- A non-text attachment was scrubbed... 
Name: try_sampling_reject.py Type: text/x-python Size: 1382 bytes Desc: not available URL: From rcarpenter at wdtinc.com Tue Aug 9 17:59:26 2011 From: rcarpenter at wdtinc.com (Richard Carpenter) Date: Tue, 9 Aug 2011 21:59:26 +0000 Subject: [SciPy-User] Building from source on RHEL5 Message-ID: <0C46B1FDF194D94CB9E329DFD315DB44252939CA@rain.wdtinc.com> I am able to install ATLAS and build its shared libraries. But I can't figure out how to build the BLAS and LAPACK shared libraries. Following the installation instructions and editing the LAPACK make.inc with -fPIC, it builds a .a file. Is that really a .so file? What about the BLAS library? Thanks in advance for the help. Richard Carpenter -------------- next part -------------- An HTML attachment was scrubbed... URL: From contact at graune.org Wed Aug 10 01:59:58 2011 From: contact at graune.org (Manuel Graune) Date: Wed, 10 Aug 2011 07:59:58 +0200 Subject: [SciPy-User] calculate definite integral of sampled data Message-ID: <20110810055958.GH2924@uriel> Hi everyone, to calculate the definite integral of a function or an array of sampled data scipy provides (among others) the quad and trapz functions. So it is possible to compute e. g. the definite integral of cos(t) over some interval by doing definite_integral= scipy.integrate.quad(cos,lower_limit,upper_limit) or definite_integral= scipy.integrate.trapz(some_array). Now, if I want to plot cos(t) and the integral of cos(t) from 0 to t in a graph, the necessary array can be calculated by: @numpy.vectorize def intfunc(fnc,upper_limit): return scipy.integrate.quad(fnc,0.0,upper_limit) definite_inegral= intfunc(cos,t) which seems (whithout knowing the actual code) a bit wasteful and slow but is relatively concise. Now for my question: scipy provides e. g. the trapz-function to calculate definite integral of a complete array of sampled data. However, I have no idea how to get achieve the same as above for sampled data (apart from manually iterating in a for-loop). Is there a function somewhere which delivers an array of the definite integrals for each of the data-points in an array? Regards, Manuel -- A hundred men did the rational thing. The sum of those rational choices was called panic. Neal Stephenson -- System of the world http://www.graune.org/GnuPG_pubkey.asc Key fingerprint = 1E44 9CBD DEE4 9E07 5E0A 5828 5476 7E92 2DB4 3C99 -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available URL: From gustavo.goretkin at gmail.com Wed Aug 10 02:46:19 2011 From: gustavo.goretkin at gmail.com (Gustavo Goretkin) Date: Wed, 10 Aug 2011 02:46:19 -0400 Subject: [SciPy-User] calculate definite integral of sampled data In-Reply-To: <20110810055958.GH2924@uriel> References: <20110810055958.GH2924@uriel> Message-ID: You could try using the numpy.cumsum (standing for cumulative sum) function to accomplish this. This would give you the equivalent of a Riemann sum (the sum is approximated with rectangles, specifically I think this would be considered the midpoint Riemann sum). You should be able the accomplish the trapezoidal rule by first averaging consecutive samples and then applying the Riemann sum. Here's an example In [2]: sample_points = np.linspace(0,10,1000) In [3]: y = np.cos(sample_points) In [4]: y_midpoint = np.cumsum(y) In [5]: y_smooth = ( y[0:-1] + y[1:] ) * (.5) In [6]: y_trapezoidal = np.cumsum(y_smooth) Note that after trapezoidal integration, the array length is one fewer. 
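A rough sketch of that last step, assuming evenly spaced sample points: to get values that approximate the integral itself, the cumulative sums still have to be scaled by the sample spacing.

dx = sample_points[1] - sample_points[0]          # uniform spacing assumed
integral = np.concatenate(([0.0], dx * y_trapezoidal))
# integral[i] approximates the integral of cos from 0 to sample_points[i],
# so for this example it should closely follow np.sin(sample_points)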
In my opinion, the more elegant way to do the smoothing step is with the numpy.convolution operator. In this same way, you should be able to implement other equally-spaced quadrature rules like Simpson's rule, but I may be incorrect. Gustavo On Wed, Aug 10, 2011 at 1:59 AM, Manuel Graune wrote: > Hi everyone, > > to calculate the definite integral of a function or an array of sampled > data scipy provides (among others) the quad and trapz functions. > So it is possible to compute e. g. the definite integral of cos(t) over > some interval by doing > > definite_integral= scipy.integrate.quad(cos,lower_limit,upper_limit) > > or > > definite_integral= scipy.integrate.trapz(some_array). > > Now, if I want to plot cos(t) and the integral of cos(t) from 0 to t in > a graph, the necessary array can be calculated by: > > @numpy.vectorize > def intfunc(fnc,upper_limit): > return scipy.integrate.quad(fnc,0.0,upper_limit) > > definite_inegral= intfunc(cos,t) > > which seems (whithout knowing the actual code) a bit wasteful and slow > but is relatively concise. > > Now for my question: scipy provides e. g. the trapz-function to > calculate definite integral of a complete array of sampled data. > However, I have no idea how to get achieve the same as above for > sampled data (apart from manually iterating in a for-loop). Is there > a function somewhere which delivers an array of the definite integrals > for each of the data-points in an array? > > > Regards, > > Manuel > > -- > A hundred men did the rational thing. The sum of those rational choices was > called panic. Neal Stephenson -- System of the world > http://www.graune.org/GnuPG_pubkey.asc > Key fingerprint = 1E44 9CBD DEE4 9E07 5E0A 5828 5476 7E92 2DB4 3C99 > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cimrman3 at ntc.zcu.cz Wed Aug 10 08:55:50 2011 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Wed, 10 Aug 2011 14:55:50 +0200 Subject: [SciPy-User] ANN: SfePy 2011.3 Message-ID: <4E427FD6.5000707@ntc.zcu.cz> I am pleased to announce release 2011.3 of SfePy. Description ----------- SfePy (simple finite elements in Python) is a software for solving systems of coupled partial differential equations by the finite element method. The code is based on NumPy and SciPy packages. It is distributed under the new BSD license. Home page: http://sfepy.org Mailing lists, issue tracking: http://code.google.com/p/sfepy/ Git (source) repository: http://github.com/sfepy Documentation: http://docs.sfepy.org/doc Highlights of this release -------------------------- - major update of terms aiming at easier usage and definition while retaining original C functions - overriding problem description items on command line - improved developer guide - Primer tutorial - a step-by-step walk-through of the process to solve a simple mechanics problem For more information on this release, see http://sfepy.googlecode.com/svn/web/releases/2011.3_RELEASE_NOTES.txt (full release notes, rather long and technical). Best regards, Robert Cimrman and Contributors (*) (*) Contributors to this release (alphabetical order): Vladim?r Luke?, Maty?? 
Nov?k, Andre Smit From rajs2010 at gmail.com Wed Aug 10 09:18:28 2011 From: rajs2010 at gmail.com (Rajeev Singh) Date: Wed, 10 Aug 2011 18:48:28 +0530 Subject: [SciPy-User] Speeding up Python Again Message-ID: Hi, I was trying out the codes discussed at http://technicaldiscovery.blogspot.com/2011/07/speeding-up-python-again.html Here is a summary of my results - Computer: Desktop imsc9 aravali annapurna NumPy: 7.651419 4.219105 5.576453 4.858640 Cython: 4.259419 3.477259 3.204909 2.357819 Weave: 4.302778 * 3.298551 2.400000 Looped Fortran: 4.199148 3.414484 3.202963 2.315644 Vectorized Fortran: 3.118410 2.131966 1.512303 1.460251 pure fortran update1: 1.205727 1.964857 2.034688 1.336086 pure fortran update2: 0.600848 0.604649 0.573593 0.721339 imsc9, aravali and annapurna are HPC machines at my institute * for some reason Weave didn't compile on imsc9 Indeed there is about a factor of 7 to 12 difference between pure fortran with update2 (vectorized) and the numpy version. I should mention that I changed N to 150 in laplace_for.f90 Rajeev -------------- next part -------------- An HTML attachment was scrubbed... URL: From davclark at gmail.com Tue Aug 9 16:46:35 2011 From: davclark at gmail.com (Dav Clark) Date: Tue, 9 Aug 2011 13:46:35 -0700 Subject: [SciPy-User] getting started with arrays and matplotlib In-Reply-To: <4E3F020B.1000500@simplistix.co.uk> References: <4E3F020B.1000500@simplistix.co.uk> Message-ID: On Aug 7, 2011, at 2:22 PM, Chris Withers wrote: > Hi All, > > I'm a new user returning to SciPy after quite a long break, so, a few > high-level questions first: > > - Are there any good books or other narrative docs that cover the bulk > of the core numpy stuff > > - Ditto, but for visualisation, particularly with matplotlib or the > Enthought visualisation suites. Well, this is probably more basic than you want, but O'Reilly's "Data Analysis with Open Source Tools" is certainly a nice low-level intro for a beginner: http://oreilly.com/catalog/9780596802363 It's available on Safari Bookshelf, and also talks about using matplotlib (and R and GSL and ...). I'm unaware of any nice Chaco "narratives." > Now, first question: what's the best way to build this array given that > I may only see the arrival of a new venue a fair way through building > the data structure? How can I efficiently say "please add a new row to > my array", I don't know what the 4th dimension equivalent is ;-) You might consider doing what matlab does under the hood and just double the array when you run out of space. You can also keep a view around that restricts to just the portion of the data that's "real." > Secondly, once I've populated this, any good examples of how to turn it > into a bar chart? (the simple bar chart would be number of sales on the > y-axis, weeks before the event on the x-axis, however, what I'd then > like to do is split each bar into chunks for each venue's sales, if that > makes sense?) The book above would do a good job with this. Cheers, Dav From contact at graune.org Wed Aug 10 01:59:58 2011 From: contact at graune.org (Manuel Graune) Date: Wed, 10 Aug 2011 07:59:58 +0200 Subject: [SciPy-User] calculate definite integral of sampled data Message-ID: <20110810055958.GH2924@uriel> Hi everyone, to calculate the definite integral of a function or an array of sampled data scipy provides (among others) the quad and trapz functions. So it is possible to compute e. g. 
the definite integral of cos(t) over some interval by doing definite_integral= scipy.integrate.quad(cos,lower_limit,upper_limit) or definite_integral= scipy.integrate.trapz(some_array). Now, if I want to plot cos(t) and the integral of cos(t) from 0 to t in a graph, the necessary array can be calculated by: @numpy.vectorize def intfunc(fnc,upper_limit): return scipy.integrate.quad(fnc,0.0,upper_limit) definite_inegral= intfunc(cos,t) which seems (whithout knowing the actual code) a bit wasteful and slow but is relatively concise. Now for my question: scipy provides e. g. the trapz-function to calculate definite integral of a complete array of sampled data. However, I have no idea how to get achieve the same as above for sampled data (apart from manually iterating in a for-loop). Is there a function somewhere which delivers an array of the definite integrals for each of the data-points in an array? Regards, Manuel -- A hundred men did the rational thing. The sum of those rational choices was called panic. Neal Stephenson -- System of the world http://www.graune.org/GnuPG_pubkey.asc Key fingerprint = 1E44 9CBD DEE4 9E07 5E0A 5828 5476 7E92 2DB4 3C99 -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available URL: From jeffalstott at gmail.com Wed Aug 10 09:14:59 2011 From: jeffalstott at gmail.com (Jeff Alstott) Date: Wed, 10 Aug 2011 15:14:59 +0200 Subject: [SciPy-User] firwin behavior Message-ID: firwin is producing unreasonable filters for me, and I'm not sure if I'm misusing the code or if there is a bug. Like so: In [5]: from scipy.signal import firwin In [6]: ny = 500 In [7]: f21f80= firwin(21, [1/ny, 80/ny]); plot(f21f80); savefig('FIR21_filter80.png') Produces the attached file. In contrast, Matlab: Trial>> ny = 500 ny = 500 Trial>> [f20f80] = fir1(20, [1/ny, 80/ny]); figure; plot(f20f80) Produces the other attached file. Quite different! The filter produced by the scipy function, if used with lfilter (or if taken to Matlab to use as a filter), produces a nonsense filtering, with many high frequency artifacts. Any thoughts? This is in python3, if that matters. Thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: FIR21_filter80.png Type: image/png Size: 16858 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: matlab_FIR21_filter80.png Type: image/png Size: 8402 bytes Desc: not available URL: From warren.weckesser at enthought.com Wed Aug 10 10:39:11 2011 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Wed, 10 Aug 2011 09:39:11 -0500 Subject: [SciPy-User] firwin behavior In-Reply-To: References: Message-ID: On Wed, Aug 10, 2011 at 8:14 AM, Jeff Alstott wrote: > firwin is producing unreasonable filters for me, and I'm not sure if I'm > misusing the code or if there is a bug. Like so: > > In [5]: from scipy.signal import firwin > > In [6]: ny = 500 > > In [7]: f21f80= firwin(21, [1/ny, 80/ny]); plot(f21f80); > savefig('FIR21_filter80.png') > > Produces the attached file. > > In contrast, Matlab: > > Trial>> ny = 500 > > ny = > > ?? 500 > > Trial>> [f20f80] = fir1(20, [1/ny, 80/ny]); figure; plot(f20f80) > > Produces the other attached file. Quite different! 
The filter produced by > the scipy function, if used with lfilter (or if taken to Matlab to use as a > filter), produces a nonsense filtering, with many high frequency artifacts. > > Any thoughts? This is in python3, if that matters. By default, firwin creates a filter that passes DC (i.e. the zero frequency). To get a filter like the one produced by matlab, add the keyword argument pass_zero=False. Warren > > Thanks! > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From gokhansever at gmail.com Wed Aug 10 13:48:30 2011 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Wed, 10 Aug 2011 11:48:30 -0600 Subject: [SciPy-User] getting started with arrays and matplotlib In-Reply-To: <4E3F020B.1000500@simplistix.co.uk> References: <4E3F020B.1000500@simplistix.co.uk> Message-ID: On Sun, Aug 7, 2011 at 3:22 PM, Chris Withers wrote: > Hi All, > > I'm a new user returning to SciPy after quite a long break, so, a few > high-level questions first: > > - Are there any good books or other narrative docs that cover the bulk > of the core numpy stuff > > - Ditto, but for visualisation, particularly with matplotlib or the > Enthought visualisation suites. > > I'm particularly interested in step-by-step docs/books with lots of > examples, versus reference docs that basically need the user to know > what they're looking for in a chicken and egg fashion, which was my > previous experience of scipy docs... Somewhat an advanced data analysis book, particularly if you are interested in error analysis, and not so surprisingly powered by Python: A Student's Guide to Data and Error Analysis [http://www.cambridge.org/gb/knowledge/isbn/item5731787/] -- G?khan From paul.blelloch at ata-e.com Wed Aug 10 14:48:11 2011 From: paul.blelloch at ata-e.com (Paul Blelloch) Date: Wed, 10 Aug 2011 11:48:11 -0700 Subject: [SciPy-User] getting started with arrays and matplotlib In-Reply-To: Message-ID: <6a7c34c3f511f84f907b45e0a6dc6021@mail> I recently got Hans Petter Langtangen's "A Primer on Scientific Programming with Python." I thought that it was a good choice. It's more of a text book than a reference, but is well written. He has another book called "Python Scripting for Computational Science," which might also serve. -----Original Message----- From: scipy-user-bounces at scipy.org [mailto:scipy-user-bounces at scipy.org] On Behalf Of G?khan Sever Sent: Wednesday, August 10, 2011 10:49 AM To: SciPy Users List Subject: Re: [SciPy-User] getting started with arrays and matplotlib On Sun, Aug 7, 2011 at 3:22 PM, Chris Withers wrote: > Hi All, > > I'm a new user returning to SciPy after quite a long break, so, a few > high-level questions first: > > - Are there any good books or other narrative docs that cover the bulk > of the core numpy stuff > > - Ditto, but for visualisation, particularly with matplotlib or the > Enthought visualisation suites. > > I'm particularly interested in step-by-step docs/books with lots of > examples, versus reference docs that basically need the user to know > what they're looking for in a chicken and egg fashion, which was my > previous experience of scipy docs... 
Somewhat an advanced data analysis book, particularly if you are interested in error analysis, and not so surprisingly powered by Python: A Student's Guide to Data and Error Analysis [http://www.cambridge.org/gb/knowledge/isbn/item5731787/] -- G?khan _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user From gmane at blindgoat.org Wed Aug 10 15:59:23 2011 From: gmane at blindgoat.org (martin smith) Date: Wed, 10 Aug 2011 15:59:23 -0400 Subject: [SciPy-User] getting started with arrays and matplotlib In-Reply-To: <6a7c34c3f511f84f907b45e0a6dc6021@mail> References: <6a7c34c3f511f84f907b45e0a6dc6021@mail> Message-ID: On 8/10/2011 2:48 PM, Paul Blelloch wrote: > I recently got Hans Petter Langtangen's "A Primer on Scientific Programming with Python." I thought that it was a good choice. It's more of a text book than a reference, but is well written. He has another book called "Python Scripting for Computational Science," which might also serve. > I'd like to support the recommendation for Langtangen's book (I haven't seen the second one). I think it's an excellent combination of advanced script usage and scientific applications. - martin smith From wccarithers at lbl.gov Wed Aug 10 16:47:16 2011 From: wccarithers at lbl.gov (Bill Carithers) Date: Wed, 10 Aug 2011 13:47:16 -0700 Subject: [SciPy-User] Trouble installing scipy after upgrading to Mac OS X 10.7 aka Lion Message-ID: Hi all, When I upgraded to Lion, it wiped out my previous Python2.6 site-packages, including scipy-0.7.1. Now I?m trying to re-install the latest scipy from svn in the Python2.7 supplied from Apple. After reading the installation instructions, I followed the recommendation to use the fortran compiler (Gnu Fortran 4.2.4 for Lion) from http://r.research.att.com/tools/ which installed in Xcode 4.1. (Actually, I?m a little confused by this since, even though the installation said it succeeded, I couldn?t find it. Also when I did ?gfortran ?version? from the command line, it returned ?i686-apple-darwin11-gfortran-4.2.1: no input files?.) When I tried to build and install with ?sudo python setup.py install?, it was humming along until it got to ARPACK the exited with exit status 1. It looks to my untrained eye as if it failed when trying to compile scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c: The terminal output for this portion of the build is appended below. Another question... Mac OS X 10.7 has an Accelerator for vecLib that includes a built-in BLAS/ATLAS. Shouldn?t the build be using this? How do I tell it where to find it? Any ideas on how to fix this? Thanks. creating build/temp.macosx-10.7-intel-2.7/scipy/sparse/linalg/eigen creating build/temp.macosx-10.7-intel-2.7/scipy/sparse/linalg/eigen/arpack creating build/temp.macosx-10.7-intel-2.7/scipy/sparse/linalg/eigen/arpack/ARPACK creating build/temp.macosx-10.7-intel-2.7/scipy/sparse/linalg/eigen/arpack/ARPACK/FWR APPERS compile options: '-Iscipy/sparse/linalg/eigen/arpack/ARPACK/SRC -I/Library/Python/2.7/site-packages/numpy-1.6.1-py2.7-macosx-10.7-intel.egg/ numpy/core/include -c' llvm-gcc-4.2: scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:4: warning: type defaults to ?int? in declaration of ?complex? scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:4: error: expected ?;?, ?,? or ?)? before ?float? 
scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:10: warning: type defaults to ?int? in declaration of ?complex? scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:10: error: expected ?;?, ?,? or ?)? before ?float? scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:16: error: expected ?;?, ?,? or ?)? before ?*? token scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:21: error: expected ?;?, ?,? or ?)? before ?*? token scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:4: warning: type defaults to ?int? in declaration of ?complex? scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:4: error: expected ?;?, ?,? or ?)? before ?float? scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:10: warning: type defaults to ?int? in declaration of ?complex? scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:10: error: expected ?;?, ?,? or ?)? before ?float? scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:16: error: expected ?;?, ?,? or ?)? before ?*? token scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:21: error: expected ?;?, ?,? or ?)? before ?*? token lipo: can't open input file: /var/tmp//ccBUPGd0.out (No such file or directory) scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:4: warning: type defaults to ?int? in declaration of ?complex? scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:4: error: expected ?;?, ?,? or ?)? before ?float? scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:10: warning: type defaults to ?int? in declaration of ?complex? scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:10: error: expected ?;?, ?,? or ?)? before ?float? scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:16: error: expected ?;?, ?,? or ?)? before ?*? token scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:21: error: expected ?;?, ?,? or ?)? before ?*? token scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:4: warning: type defaults to ?int? in declaration of ?complex? scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:4: error: expected ?;?, ?,? or ?)? before ?float? scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:10: warning: type defaults to ?int? in declaration of ?complex? scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:10: error: expected ?;?, ?,? or ?)? before ?float? scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:16: error: expected ?;?, ?,? or ?)? before ?*? token scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:21: error: expected ?;?, ?,? or ?)? before ?*? 
token lipo: can't open input file: /var/tmp//ccBUPGd0.out (No such file or directory) error: Command "llvm-gcc-4.2 -fno-strict-aliasing -fno-common -dynamic -g -Os -pipe -fno-common -fno-strict-aliasing -fwrapv -mno-fused-madd -DENABLE_DTRACE -DMACOSX -DNDEBUG -Wall -Wstrict-prototypes -Wshorten-64-to-32 -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -DENABLE_DTRACE -arch i386 -arch x86_64 -pipe -Iscipy/sparse/linalg/eigen/arpack/ARPACK/SRC -I/Library/Python/2.7/site-packages/numpy-1.6.1-py2.7-macosx-10.7-intel.egg/ numpy/core/include -c scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c -o build/temp.macosx-10.7-intel-2.7/scipy/sparse/linalg/eigen/arpack/ARPACK/FWR APPERS/veclib_cabi_c.o" failed with exit status 1 -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.felton at gmail.com Wed Aug 10 16:16:06 2011 From: chris.felton at gmail.com (Christopher Felton) Date: Wed, 10 Aug 2011 15:16:06 -0500 Subject: [SciPy-User] firwin behavior In-Reply-To: References: Message-ID: On 8/10/2011 8:14 AM, Jeff Alstott wrote: > firwin is producing unreasonable filters for me, and I'm not sure if I'm > misusing the code or if there is a bug. Like so: > > In [5]: from scipy.signal import firwin > > In [6]: ny = 500 > > In [7]: f21f80= firwin(21, [1/ny, 80/ny]); plot(f21f80); Is this a simple error, ny = 500 (integer) and 1/500 = 0, 80/500 = 0. Simply create a float ny = 500. , Note the "." then the divides will be floats. In Matlab everything is by default a double. Python not so. The version I am running encounters an error on the above, if I use floats or not, version 0.8.0. Regards, Chris > savefig('FIR21_filter80.png') > > Produces the attached file. > > In contrast, Matlab: > > Trial>> ny = 500 > > ny = > > 500 > > Trial>> [f20f80] = fir1(20, [1/ny, 80/ny]); figure; plot(f20f80) > > Produces the other attached file. Quite different! The filter produced by > the scipy function, if used with lfilter (or if taken to Matlab to use as a > filter), produces a nonsense filtering, with many high frequency artifacts. > > Any thoughts? This is in python3, if that matters. > > Thanks! > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From ralf.gommers at googlemail.com Wed Aug 10 16:55:24 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 10 Aug 2011 22:55:24 +0200 Subject: [SciPy-User] Trouble installing scipy after upgrading to Mac OS X 10.7 aka Lion In-Reply-To: References: Message-ID: On Wed, Aug 10, 2011 at 10:47 PM, Bill Carithers wrote: > Hi all, > > When I upgraded to Lion, it wiped out my previous Python2.6 site-packages, > including scipy-0.7.1. Now I?m trying to re-install the latest scipy from > svn in the Python2.7 supplied from Apple. After reading the installation > instructions, I followed the recommendation to use the fortran compiler (Gnu > Fortran 4.2.4 for Lion) from http://r.research.att.com/tools/ which > installed in Xcode 4.1. (Actually, I?m a little confused by this since, even > though the installation said it succeeded, I couldn?t find it. Also when I > did ?gfortran ?version? from the command line, it returned > ?i686-apple-darwin11-gfortran-4.2.1: no input files?.) > > Try "gfortran --version" with two dashes to get more sensible output. You have it installed. 
When I tried to build and install with ?sudo python setup.py install?, it > was humming along until it got to ARPACK the exited with exit status 1. It > looks to my untrained eye as if it failed when trying to compile > scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c: The > terminal output for this portion of the build is appended below. > > This should have been fixed by commit effa6f6 about two weeks ago, please check that your checkout is up-to-date. > Another question... Mac OS X 10.7 has an Accelerator for vecLib that > includes a built-in BLAS/ATLAS. Shouldn?t the build be using this? How do I > tell it where to find it? > The build does use this by default. It's called "Accelerate Framework". Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at enthought.com Wed Aug 10 17:02:55 2011 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Wed, 10 Aug 2011 16:02:55 -0500 Subject: [SciPy-User] firwin behavior In-Reply-To: References: Message-ID: On Wed, Aug 10, 2011 at 3:16 PM, Christopher Felton wrote: > On 8/10/2011 8:14 AM, Jeff Alstott wrote: >> firwin is producing unreasonable filters for me, and I'm not sure if I'm >> misusing the code or if there is a bug. Like so: >> >> In [5]: from scipy.signal import firwin >> >> In [6]: ny = 500 >> >> In [7]: f21f80= firwin(21, [1/ny, 80/ny]); plot(f21f80); > > > Is this a simple error, ny = 500 (integer) and 1/500 = 0, 80/500 = 0. Jeff said he is using Python 3, so the results of the divisions will be floats. Warren > > Simply create a float ny = 500. , Note the "." then the divides will be > floats. In Matlab everything is by default a double. ?Python not so. > > The version I am running encounters an error on the above, if I use > floats or not, version 0.8.0. > > Regards, > Chris > >> savefig('FIR21_filter80.png') >> >> Produces the attached file. >> >> In contrast, Matlab: >> >> Trial>> ?ny = 500 >> >> ny = >> >> ? ? 500 >> >> Trial>> ?[f20f80] = fir1(20, [1/ny, 80/ny]); figure; plot(f20f80) >> >> Produces the other attached file. Quite different! The filter produced by >> the scipy function, if used with lfilter (or if taken to Matlab to use as a >> filter), produces a nonsense filtering, with many high frequency artifacts. >> >> Any thoughts? This is in python3, if that matters. >> >> Thanks! >> >> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From wccarithers at lbl.gov Wed Aug 10 17:14:21 2011 From: wccarithers at lbl.gov (Bill Carithers) Date: Wed, 10 Aug 2011 14:14:21 -0700 Subject: [SciPy-User] Trouble installing scipy after upgrading to Mac OS X 10.7 aka Lion In-Reply-To: Message-ID: Hi Ralf, Thanks for the prompt reply. I just checked out from trunk today. The checkout version at the end said: Checked out external at revision 8716. Checked out revision 7183. Should I be using a branch to get the latest code that includes your fix ? Thanks, Bill On 8/10/11 1:55 PM, "Ralf Gommers" wrote: > > > On Wed, Aug 10, 2011 at 10:47 PM, Bill Carithers wrote: >> Hi all, >> >> When I upgraded to Lion, it wiped out my previous Python2.6 site-packages, >> including scipy-0.7.1. Now I?m trying to re-install the latest scipy from svn >> in the Python2.7 supplied from Apple. 
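Regarding the vecLib/Accelerate question answered above, one quick way to confirm which BLAS/LAPACK an installed NumPy or SciPy was actually built against is the auto-generated __config__ module. A minimal check, assuming a normal numpy.distutils-based install:

    import numpy
    import scipy

    numpy.__config__.show()   # on OS X this should mention the Accelerate/vecLib framework
    scipy.__config__.show()   # same information for the scipy build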
After reading the installation >> instructions, I followed the recommendation to use the fortran compiler (Gnu >> Fortran 4.2.4 for Lion) from http://r.research.att.com/tools/ which installed >> in Xcode 4.1. (Actually, I?m a little confused by this since, even though the >> installation said it succeeded, I couldn?t find it. Also when I did ?gfortran >> ?version? from the command line, it returned >> ?i686-apple-darwin11-gfortran-4.2.1: no input files?.) >> > Try "gfortran --version" with two dashes to get more sensible output. You have > it installed. > >> When I tried to build and install with ?sudo python setup.py install?, it was >> humming along until it got to ARPACK the exited with exit status 1. It looks >> to my untrained eye as if it failed when trying to compile >> scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c: The >> terminal output for this portion of the build is appended below. >> > This should have been fixed by commit effa6f6 about two weeks ago, please > check that your checkout is up-to-date. > > ? >> Another question... Mac OS X 10.7 has an Accelerator for vecLib that includes >> a built-in BLAS/ATLAS. Shouldn?t the build be using this? How do I tell it >> where to find it? > > The build does use this by default. It's called "Accelerate Framework". > > Cheers, > Ralf > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Wed Aug 10 17:16:37 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 10 Aug 2011 23:16:37 +0200 Subject: [SciPy-User] Trouble installing scipy after upgrading to Mac OS X 10.7 aka Lion In-Reply-To: References: Message-ID: On Wed, Aug 10, 2011 at 11:14 PM, Bill Carithers wrote: > Hi Ralf, > > Thanks for the prompt reply. I just checked out from trunk today. The > checkout version at the end said: > Checked out external at revision 8716. > Checked out revision 7183. > > Ah, with "svn" you actually meant svn:) I thought that was supposed to not even work anymore. > Should I be using a branch to get the latest code that includes your fix ? > > You should be using git: https://github.com/scipy/scipy Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From aarchiba at physics.mcgill.ca Wed Aug 10 19:42:48 2011 From: aarchiba at physics.mcgill.ca (Anne Archibald) Date: Wed, 10 Aug 2011 19:42:48 -0400 Subject: [SciPy-User] calculate definite integral of sampled data In-Reply-To: <20110810055958.GH2924@uriel> References: <20110810055958.GH2924@uriel> Message-ID: I believe that scipy.integrate.cumtrapz exists to solve this problem. There might be a cumulative Simpson's rule too. Nobody put too much effort into this because integrating a sampled function is better divided into separate interpolation (e.g. with a spline) and integration (exact for spline interpolants). I'd approach your problem with splrep and splint. Anne On 8/10/11, Manuel Graune wrote: > Hi everyone, > > to calculate the definite integral of a function or an array of sampled > data scipy provides (among others) the quad and trapz functions. > So it is possible to compute e. g. the definite integral of cos(t) over > some interval by doing > > definite_integral= scipy.integrate.quad(cos,lower_limit,upper_limit) > > or > > definite_integral= scipy.integrate.trapz(some_array). 
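As a concrete sketch of the two suggestions in this thread (cumulative trapezoid rule versus spline interpolation followed by exact spline integration), with made-up sample data purely for illustration:

    import numpy as np
    from scipy import integrate, interpolate

    x = np.linspace(0, 10, 201)
    y = np.cos(x)

    # Cumulative trapezoid rule: one definite integral per sample point.
    # The result has length len(x) - 1; entry i is the integral from x[0] to x[i+1].
    Y_trapz = integrate.cumtrapz(y, x)

    # Spline route: fit once, then evaluate the exact integral of the spline
    # from x[0] up to each sample point.
    tck = interpolate.splrep(x, y, s=0)
    Y_spline = np.array([interpolate.splint(x[0], xi, tck) for xi in x])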
> > Now, if I want to plot cos(t) and the integral of cos(t) from 0 to t in > a graph, the necessary array can be calculated by: > > @numpy.vectorize > def intfunc(fnc,upper_limit): > return scipy.integrate.quad(fnc,0.0,upper_limit) > > definite_inegral= intfunc(cos,t) > > which seems (whithout knowing the actual code) a bit wasteful and slow > but is relatively concise. > > Now for my question: scipy provides e. g. the trapz-function to > calculate definite integral of a complete array of sampled data. > However, I have no idea how to get achieve the same as above for > sampled data (apart from manually iterating in a for-loop). Is there > a function somewhere which delivers an array of the definite integrals > for each of the data-points in an array? > > > Regards, > > Manuel > > -- > A hundred men did the rational thing. The sum of those rational choices was > called panic. Neal Stephenson -- System of the world > http://www.graune.org/GnuPG_pubkey.asc > Key fingerprint = 1E44 9CBD DEE4 9E07 5E0A 5828 5476 7E92 2DB4 3C99 > -- Sent from my mobile device From chris.felton at gmail.com Wed Aug 10 19:49:34 2011 From: chris.felton at gmail.com (Christopher Felton) Date: Wed, 10 Aug 2011 18:49:34 -0500 Subject: [SciPy-User] firwin behavior In-Reply-To: References: Message-ID: On 8/10/11 4:02 PM, Warren Weckesser wrote: > On Wed, Aug 10, 2011 at 3:16 PM, Christopher Felton > wrote: >> On 8/10/2011 8:14 AM, Jeff Alstott wrote: >>> firwin is producing unreasonable filters for me, and I'm not sure if I'm >>> misusing the code or if there is a bug. Like so: >>> >>> In [5]: from scipy.signal import firwin >>> >>> In [6]: ny = 500 >>> >>> In [7]: f21f80= firwin(21, [1/ny, 80/ny]); plot(f21f80); >> >> >> Is this a simple error, ny = 500 (integer) and 1/500 = 0, 80/500 = 0. > > > Jeff said he is using Python 3, so the results of the divisions will be floats. > > Warren Thanks for the correction Warren, I know very little about Python3. Is a float the default number type or is the result of the division a float? Thanks, Chris From warren.weckesser at enthought.com Wed Aug 10 19:56:53 2011 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Wed, 10 Aug 2011 18:56:53 -0500 Subject: [SciPy-User] firwin behavior In-Reply-To: References: Message-ID: On Wed, Aug 10, 2011 at 6:49 PM, Christopher Felton wrote: > On 8/10/11 4:02 PM, Warren Weckesser wrote: > > On Wed, Aug 10, 2011 at 3:16 PM, Christopher Felton > > wrote: > >> On 8/10/2011 8:14 AM, Jeff Alstott wrote: > >>> firwin is producing unreasonable filters for me, and I'm not sure if > I'm > >>> misusing the code or if there is a bug. Like so: > >>> > >>> In [5]: from scipy.signal import firwin > >>> > >>> In [6]: ny = 500 > >>> > >>> In [7]: f21f80= firwin(21, [1/ny, 80/ny]); plot(f21f80); > >> > >> > >> Is this a simple error, ny = 500 (integer) and 1/500 = 0, 80/500 = 0. > > > > > > Jeff said he is using Python 3, so the results of the divisions will be > floats. > > > > Warren > > Thanks for the correction Warren, > > I know very little about Python3. Is a float the default number type or > is the result of the division a float? > The result of division is a float. Take a look here: http://docs.python.org/release/3.0.1/whatsnew/3.0.html#integers and click on the "PEP 0238" link for all the details. 
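To make the division point concrete: on Python 2 the cutoffs above silently truncate to integer zero, while on Python 3 (or with the future import) they become the intended normalized frequencies. A small sketch, not tied to anyone's exact session:

    from __future__ import division  # no-op on Python 3; gives Python 2 the same behaviour

    ny = 500
    print(1 / ny, 80 / ny)    # 0.002 0.16 with true division (0 0 under plain Python 2)

    ny = 500.0                # forcing a float works the same way on either version
    print(1 / ny, 80 / ny)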
Warren > > Thanks, > Chris > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Aug 10 21:26:32 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 10 Aug 2011 19:26:32 -0600 Subject: [SciPy-User] calculate definite integral of sampled data In-Reply-To: <20110810055958.GH2924@uriel> References: <20110810055958.GH2924@uriel> Message-ID: On Tue, Aug 9, 2011 at 11:59 PM, Manuel Graune wrote: > Hi everyone, > > to calculate the definite integral of a function or an array of sampled > data scipy provides (among others) the quad and trapz functions. > So it is possible to compute e. g. the definite integral of cos(t) over > some interval by doing > > definite_integral= scipy.integrate.quad(cos,lower_limit,upper_limit) > > or > > definite_integral= scipy.integrate.trapz(some_array). > > Now, if I want to plot cos(t) and the integral of cos(t) from 0 to t in > a graph, the necessary array can be calculated by: > > @numpy.vectorize > def intfunc(fnc,upper_limit): > return scipy.integrate.quad(fnc,0.0,upper_limit) > > definite_inegral= intfunc(cos,t) > > which seems (whithout knowing the actual code) a bit wasteful and slow > but is relatively concise. > > Now for my question: scipy provides e. g. the trapz-function to > calculate definite integral of a complete array of sampled data. > However, I have no idea how to get achieve the same as above for > sampled data (apart from manually iterating in a for-loop). Is there > a function somewhere which delivers an array of the definite integrals > for each of the data-points in an array? > > > Regards, > > Manuel > > How many data points do you have? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From rashed.golam at gmail.com Wed Aug 10 10:26:19 2011 From: rashed.golam at gmail.com (Md. Golam Rashed) Date: Wed, 10 Aug 2011 07:26:19 -0700 (PDT) Subject: [SciPy-User] ANN: SfePy 2011.3 In-Reply-To: <4E427FD6.5000707@ntc.zcu.cz> References: <4E427FD6.5000707@ntc.zcu.cz> Message-ID: <29614221.73.1312986379136.JavaMail.geo-discussion-forums@yqfn40> GREAT! -------------- next part -------------- An HTML attachment was scrubbed... URL: From rashed.golam at gmail.com Wed Aug 10 12:19:08 2011 From: rashed.golam at gmail.com (Md. Golam Rashed) Date: Wed, 10 Aug 2011 09:19:08 -0700 (PDT) Subject: [SciPy-User] ANN: SfePy 2011.3 In-Reply-To: <4E427FD6.5000707@ntc.zcu.cz> References: <4E427FD6.5000707@ntc.zcu.cz> Message-ID: <5166843.67.1312993148868.JavaMail.geo-discussion-forums@prec11> 58 test file(s) executed in 561.13 s, 0 failure(s) of 88 test(s) tested on win7, Intel Atom dual core. simple installation on windows followed while installing sfepy. ** I'm busy with my MS, so being irregular sometime, but concentrate fully on sfepy when i'm free. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeffalstott at gmail.com Thu Aug 11 07:36:33 2011 From: jeffalstott at gmail.com (Jeff Alstott) Date: Thu, 11 Aug 2011 13:36:33 +0200 Subject: [SciPy-User] firwin behavior In-Reply-To: References: Message-ID: Wow. The passing of the DC frequency is exactly the issue, and that default behavior is clearly shown in the documentation. I see now that given a band, the default behavior is band-stop, whereas I would expect it to be band-pass. So, that fixed it. 
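For reference, a short sketch of the behaviour being discussed (the keyword involved, pass_zero, is named in the reply quoted further down): with a two-element cutoff list firwin defaults to a design that passes DC, i.e. a band-stop filter, and pass_zero=False requests the band-pass design that matlab's fir1 produces.

    import numpy as np
    from scipy.signal import firwin

    ny = 500.0
    cutoffs = [1 / ny, 80 / ny]

    h_bandstop = firwin(21, cutoffs)                   # default: DC is passed (band-stop)
    h_bandpass = firwin(21, cutoffs, pass_zero=False)  # band-pass, like fir1(20, ...)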
What I don't understand, however, is *why* that would be default behavior. More importantly, even if that is the default behavior, the name of the pass_zero flag does not readily help a dumb user like me grok the functionality. Has there been any thought to renaming it? Thanks! On Wed, Aug 10, 2011 at 4:39 PM, Warren Weckesser < warren.weckesser at enthought.com> wrote: > On Wed, Aug 10, 2011 at 8:14 AM, Jeff Alstott > wrote: > > firwin is producing unreasonable filters for me, and I'm not sure if I'm > > misusing the code or if there is a bug. Like so: > > > > In [5]: from scipy.signal import firwin > > > > In [6]: ny = 500 > > > > In [7]: f21f80= firwin(21, [1/ny, 80/ny]); plot(f21f80); > > savefig('FIR21_filter80.png') > > > > Produces the attached file. > > > > In contrast, Matlab: > > > > Trial>> ny = 500 > > > > ny = > > > > 500 > > > > Trial>> [f20f80] = fir1(20, [1/ny, 80/ny]); figure; plot(f20f80) > > > > Produces the other attached file. Quite different! The filter produced by > > the scipy function, if used with lfilter (or if taken to Matlab to use as > a > > filter), produces a nonsense filtering, with many high frequency > artifacts. > > > > Any thoughts? This is in python3, if that matters. > > > By default, firwin creates a filter that passes DC (i.e. the zero > frequency). To get a filter like the one produced by matlab, add the > keyword argument pass_zero=False. > > Warren > > > > > > Thanks! > > > > > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guyer at nist.gov Thu Aug 11 14:19:37 2011 From: guyer at nist.gov (Jonathan Guyer) Date: Thu, 11 Aug 2011 14:19:37 -0400 Subject: [SciPy-User] Trouble installing scipy after upgrading to Mac OS X 10.7 aka Lion In-Reply-To: References: Message-ID: On Aug 10, 2011, at 5:16 PM, Ralf Gommers wrote: > Ah, with "svn" you actually meant svn:) I thought that was supposed to not even work anymore. It does work and it's confusing. I had not been following the transition closely and so was under the impression that the svn repository was being mirrored from git. It's not. It's just old. From yyc at solvcon.net Thu Aug 11 23:44:48 2011 From: yyc at solvcon.net (Yung-Yu Chen) Date: Thu, 11 Aug 2011 23:44:48 -0400 Subject: [SciPy-User] ANN: SOLVCON 0.1 Message-ID: Hello, I am pleased to announce version 0.1 of SOLVCON. SOLVCON is a Python-based, multi-physics software framework for solving first-order hyperbolic PDEs. The source tarball can be downloaded at http://bitbucket.org/yungyuc/solvcon/downloads . More information can be found at http://solvcon.net/ . This release marks a milestone of SOLVCON. Future development of SOLVCON will focus on production use. The planned directions include (i) the high-order CESE method, (ii) improving the scalability by consolidating the distributed-memory parallel code, (iii) expanding the capabilities of the existing solver kernels, and (iv) incorporating more physical processes. New features: - Glue BCs are added. A pair of collocated BCs can now be glued together to work as an internal interface. The glued BCs helps to dynamically turn on or off the BC pair. 
- ``solvcon.kerpak.cuse`` series solver kernels are changed to use OpenMP for multi-threaded computing. They were using a thread pool built-in SOLVCON for multi-threading. OpenMP makes multi-threaded functions more flexible in argument specification. - Add the ``soil/`` directory for providing building helpers for GCC 4.6.1. Note, the name ``gcc/`` is deliberately avoided for the directory, because of a bug in gcc itself (bug id 48306 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48306 ). - Add ``-j`` command line option for building dependencies in the ``ground/`` directory and the ``soil/`` directory. Note that ATLAS doesn't work with ``make -j N``. Bug-fix: - METIS changes its download URL. Modify SConstruct accordingly. -- Yung-Yu Chen http://solvcon.net/yyc/ +1 (614) 859 2436 From chris at simplistix.co.uk Fri Aug 12 01:38:15 2011 From: chris at simplistix.co.uk (Chris Withers) Date: Fri, 12 Aug 2011 06:38:15 +0100 Subject: [SciPy-User] getting started with arrays and matplotlib In-Reply-To: References: <4E3F020B.1000500@simplistix.co.uk> Message-ID: <4E44BC47.7060400@simplistix.co.uk> On 09/08/2011 21:46, Dav Clark wrote: > Well, this is probably more basic than you want, but O'Reilly's "Data Analysis with Open Source Tools" is certainly a nice low-level intro for a beginner: > > http://oreilly.com/catalog/9780596802363 > > It's available on Safari Bookshelf, and also talks about using matplotlib (and R and GSL and ...). I'm unaware of any nice Chaco "narratives." Thanks to you and everyone else for the great suggestions :-) >> Now, first question: what's the best way to build this array given that >> I may only see the arrival of a new venue a fair way through building >> the data structure? How can I efficiently say "please add a new row to >> my array", I don't know what the 4th dimension equivalent is ;-) > > You might consider doing what matlab does under the hood and just double the array when you run out of space. What's the best way to do this? cheers, Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk From klonuo at gmail.com Fri Aug 12 01:49:57 2011 From: klonuo at gmail.com (Klonuo Umom) Date: Fri, 12 Aug 2011 07:49:57 +0200 Subject: [SciPy-User] ANN: SOLVCON 0.1 In-Reply-To: References: Message-ID: Interesting package. Congratulations on making milestone release I installed it and on a first look I can't see it workflow. On web portal I found tip to follow examples that come with this package, but those aren't trivial at all; I mean lot of classes and functions comes from nowhare and it's like no walkthrough provided If I may explain myself, I got summer seminar assigment, starting from shallow water eqs to derive eq of absolute vorticity for nondivergent flow in linearized form using perturbation method. I've done it by hand, but would like to understand the process with some of Python packages if feasible, and as staring eqs are hyperbolic PDEs maybe I could use this package although it uses different method for solving. Is it good idea to try to use this package, and if answer is yes, can you maybe provide some starting point for this simple task? Thanks On Fri, Aug 12, 2011 at 5:44 AM, Yung-Yu Chen wrote: > Hello, > > I am pleased to announce version 0.1 of SOLVCON. ?SOLVCON is a Python-based, > multi-physics software framework for solving first-order hyperbolic PDEs. > > The source tarball can be downloaded at > http://bitbucket.org/yungyuc/solvcon/downloads . 
?More information can be > found at http://solvcon.net/ . > > This release marks a milestone of SOLVCON. ?Future development of SOLVCON will > focus on production use. ?The planned directions include (i) the high-order > CESE method, (ii) improving the scalability by consolidating the > distributed-memory parallel code, (iii) expanding the capabilities of the > existing solver kernels, and (iv) incorporating more physical processes. > > New features: > > - Glue BCs are added. ?A pair of collocated BCs can now be glued together to > ?work as an internal interface. ?The glued BCs helps to dynamically turn on or > ?off the BC pair. > - ``solvcon.kerpak.cuse`` series solver kernels are changed to use OpenMP for > ?multi-threaded computing. ?They were using a thread pool built-in SOLVCON for > ?multi-threading. ?OpenMP makes multi-threaded functions more flexible in > ?argument specification. > - Add the ``soil/`` directory for providing building helpers for GCC 4.6.1. > ?Note, the name ``gcc/`` is deliberately avoided for the directory, because of > ?a bug in gcc itself (bug id 48306 > ?http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48306 ). > - Add ``-j`` command line option for building dependencies in the ``ground/`` > ?directory and the ``soil/`` directory. ?Note that ATLAS doesn't work with > ?``make -j N``. > > Bug-fix: > > - METIS changes its download URL. ?Modify SConstruct accordingly. > > -- > Yung-Yu Chen > http://solvcon.net/yyc/ > +1 (614) 859 2436 > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From jrocher at enthought.com Fri Aug 12 03:00:41 2011 From: jrocher at enthought.com (Jonathan Rocher) Date: Fri, 12 Aug 2011 09:00:41 +0200 Subject: [SciPy-User] getting started with arrays and matplotlib In-Reply-To: <4E44BC47.7060400@simplistix.co.uk> References: <4E3F020B.1000500@simplistix.co.uk> <4E44BC47.7060400@simplistix.co.uk> Message-ID: Dear Chris, for documentation about the Enthought Tools Suite (open source, BSD-like licence), let me point you to http://code.enthought.com/ Specifically about 2D visualization, Chaco is good at building interactive plotting tools efficiently even with large datasets. Its entire documentation can be found at http://github.enthought.com/chaco/ The documentation is not perfect but you can definitely find lots of examples to follow in the Tutorialssection as well as in the gallery . Hope this helps. Jonathan On Fri, Aug 12, 2011 at 7:38 AM, Chris Withers wrote: > On 09/08/2011 21:46, Dav Clark wrote: > > Well, this is probably more basic than you want, but O'Reilly's "Data > Analysis with Open Source Tools" is certainly a nice low-level intro for a > beginner: > > > > http://oreilly.com/catalog/9780596802363 > > > > It's available on Safari Bookshelf, and also talks about using matplotlib > (and R and GSL and ...). I'm unaware of any nice Chaco "narratives." > > Thanks to you and everyone else for the great suggestions :-) > > >> Now, first question: what's the best way to build this array given that > >> I may only see the arrival of a new venue a fair way through building > >> the data structure? How can I efficiently say "please add a new row to > >> my array", I don't know what the 4th dimension equivalent is ;-) > > > > You might consider doing what matlab does under the hood and just double > the array when you run out of space. > > What's the best way to do this? 
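On the "double the array when you run out of space" suggestion quoted above, a bare-bones sketch; the class and names are invented purely for illustration:

    import numpy as np

    class GrowableRows(object):
        """Collect rows into a 2-D array, doubling capacity whenever it fills up."""
        def __init__(self, ncols, capacity=16):
            self.data = np.empty((capacity, ncols))
            self.n = 0

        def append(self, row):
            if self.n == self.data.shape[0]:
                # Out of space: allocate twice as much and copy the existing rows over.
                bigger = np.empty((2 * self.data.shape[0], self.data.shape[1]))
                bigger[:self.n] = self.data
                self.data = bigger
            self.data[self.n] = row
            self.n += 1

        def result(self):
            return self.data[:self.n]

    rows = GrowableRows(ncols=3)
    for i in range(100):
        rows.append([i, i ** 2, i ** 3])
    table = rows.result()   # shape (100, 3); amortized O(1) cost per append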
> > cheers, > > Chris > > -- > Simplistix - Content Management, Batch Processing & Python Consulting > - http://www.simplistix.co.uk > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Jonathan Rocher, PhD Scientific software developer Enthought, Inc. jrocher at enthought.com 1-512-536-1057 http://www.enthought.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooth29 at hotmail.com Fri Aug 12 06:02:35 2011 From: smooth29 at hotmail.com (Pawel Zmarz) Date: Fri, 12 Aug 2011 10:02:35 +0000 Subject: [SciPy-User] Data Acquisition with NIDAQmx error Message-ID: Hello scipy user community, I hope I am emailing the correct list... I'm a newbie, and I've been trying to implement the 'Data Acquisition with NIDAQmx' code from the SciPy Cookbook to use python to generate an analog signal out of my NI USB-6008 card. However, when I run the code I get the following error: RuntimeError: nidaq call failed with error -200077: 'Requested value is not a supported value for this property.' Any ideas on what this is and hot to solve it? I don't really understand the code, and I am not really sure where I specify the value to be generated... Link to the code (under Analog Generation):http://www.scipy.org/Cookbook/Data_Acquisition_with_NIDAQmx Thank you for help!!BW,paw -------------- next part -------------- An HTML attachment was scrubbed... URL: From yyc at solvcon.net Fri Aug 12 07:20:44 2011 From: yyc at solvcon.net (Yung-Yu Chen) Date: Fri, 12 Aug 2011 07:20:44 -0400 Subject: [SciPy-User] ANN: SOLVCON 0.1 In-Reply-To: References: Message-ID: Hello, On Fri, Aug 12, 2011 at 01:49, Klonuo Umom wrote: > Interesting package. Congratulations on making milestone release > > I installed it and on a first look I can't see it workflow. On web > portal I found tip to follow examples that come with this package, but > those aren't trivial at all; I mean lot of classes and functions comes > from nowhare and it's like no walkthrough provided > We have not put efforts to make the package user friendly. SOLVCON began with the idea to provide a framework to collect important supportive functionalities needed by CFD codes, to enhance the robustness and coding efficiency. The proof of concept turned out to be successful, and we realized that SOLVCON has great potentials to facilitate a new category of practices of building high-performance conservation-law solvers for high-fidelity solutions. In the time being, SOLVCON is made for experts in computational science. In the foreseeable future, our collaborators and we will make it more accurate, more scalable, and more versatile. You can find the plan for the forthcoming development at http://solvcon.net/yyc/writing/2011/solvcon_0.1.html . We hope SOLVCON can be used to renew the technology used in high-end calculations of PDEs, e.g., CFD, computational electromagnetism, etc. > If I may explain myself, I got summer seminar assigment, starting from > shallow water eqs to derive eq of absolute vorticity for nondivergent > flow in linearized form using perturbation method. I've done it by > hand, but would like to understand the process with some of Python > packages if feasible, and as staring eqs are hyperbolic PDEs maybe I > could use this package although it uses different method for solving. > Is it good idea to try to use this package, and if answer is yes, can > you maybe provide some starting point for this simple task? 
> > The package was geared up for large-scale, complex calculations. Using SOLVCON for very simple calculations would be an overkill. Unfortunately, the short answer to your question could be no. When dealing with multi-physics, we take the approach to model the underlying numerical algorithm, mathematics and physics as much as possible. The approach looks complicated at the first glance, but is actually concise from theories to implementations. We believe the conciseness or compactness is critically important for scaling SOLVCON from thousands of CPUs to hundreds of thousand of CPUs. The price for this approach is to prolong the path to generic representation of PDEs. We do hope to provide the capability to compile the PDEs written by users in a symbolic form for SOLVCON to execute automatically. But this won't happen in foreseeable future. If you want to know more about SOLVCON and its theoretical background, you can check up with my dissertation at http://solvcon.net/yyc/publications.html . with regards, Yung-Yu Chen > Thanks > > On Fri, Aug 12, 2011 at 5:44 AM, Yung-Yu Chen wrote: >> Hello, >> >> I am pleased to announce version 0.1 of SOLVCON. ?SOLVCON is a Python-based, >> multi-physics software framework for solving first-order hyperbolic PDEs. >> >> The source tarball can be downloaded at >> http://bitbucket.org/yungyuc/solvcon/downloads . ?More information can be >> found at http://solvcon.net/ . >> >> This release marks a milestone of SOLVCON. ?Future development of SOLVCON will >> focus on production use. ?The planned directions include (i) the high-order >> CESE method, (ii) improving the scalability by consolidating the >> distributed-memory parallel code, (iii) expanding the capabilities of the >> existing solver kernels, and (iv) incorporating more physical processes. >> >> New features: >> >> - Glue BCs are added. ?A pair of collocated BCs can now be glued together to >> ?work as an internal interface. ?The glued BCs helps to dynamically turn on or >> ?off the BC pair. >> - ``solvcon.kerpak.cuse`` series solver kernels are changed to use OpenMP for >> ?multi-threaded computing. ?They were using a thread pool built-in SOLVCON for >> ?multi-threading. ?OpenMP makes multi-threaded functions more flexible in >> ?argument specification. >> - Add the ``soil/`` directory for providing building helpers for GCC 4.6.1. >> ?Note, the name ``gcc/`` is deliberately avoided for the directory, because of >> ?a bug in gcc itself (bug id 48306 >> ?http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48306 ). >> - Add ``-j`` command line option for building dependencies in the ``ground/`` >> ?directory and the ``soil/`` directory. ?Note that ATLAS doesn't work with >> ?``make -j N``. >> >> Bug-fix: >> >> - METIS changes its download URL. ?Modify SConstruct accordingly. 
>> >> -- >> Yung-Yu Chen >> http://solvcon.net/yyc/ >> +1 (614) 859 2436 >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Yung-Yu Chen http://solvcon.net/yyc/ +1 (614) 859 2436 From hasslerjc at comcast.net Fri Aug 12 09:37:17 2011 From: hasslerjc at comcast.net (John Hassler) Date: Fri, 12 Aug 2011 09:37:17 -0400 Subject: [SciPy-User] Data Acquisition with NIDAQmx error In-Reply-To: References: Message-ID: <4E452C8D.2090401@comcast.net> An HTML attachment was scrubbed... URL: From pjabardo at yahoo.com.br Fri Aug 12 09:43:41 2011 From: pjabardo at yahoo.com.br (Paulo Jabardo) Date: Fri, 12 Aug 2011 06:43:41 -0700 (PDT) Subject: [SciPy-User] Data Acquisition with NIDAQmx error In-Reply-To: References: Message-ID: <1313156621.72956.YahooMailNeo@web30004.mail.mud.yahoo.com> Try using pydaqtools, a more generic data acquisition interface. http://pydaqtools.org/ ________________________________ De: Pawel Zmarz Para: scipy-user at scipy.org Enviadas: Sexta-feira, 12 de Agosto de 2011 7:02 Assunto: [SciPy-User] Data Acquisition with NIDAQmx error Hello scipy user community, I hope I am emailing the correct list... I'm a newbie, and I've been trying to implement the 'Data?Acquisition?with NIDAQmx' code from the SciPy Cookbook to use python to generate an analog signal out of my NI USB-6008 card. However, when I run the code I get the following error: RuntimeError: nidaq call failed with error -200077: 'Requested value is not a supported value for this property.' Any ideas on what this is and hot to solve it? I don't really understand the code, and I am not really sure where I specify the value to be generated... Link to the code (under Analog Generation): http://www.scipy.org/Cookbook/Data_Acquisition_with_NIDAQmx Thank you for help!! BW, paw _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From hasslerjc at comcast.net Fri Aug 12 09:52:37 2011 From: hasslerjc at comcast.net (John Hassler) Date: Fri, 12 Aug 2011 09:52:37 -0400 Subject: [SciPy-User] Data Acquisition with NIDAQmx error In-Reply-To: <1313156621.72956.YahooMailNeo@web30004.mail.mud.yahoo.com> References: <1313156621.72956.YahooMailNeo@web30004.mail.mud.yahoo.com> Message-ID: <4E453025.9070100@comcast.net> An HTML attachment was scrubbed... URL: From pearu.peterson at gmail.com Fri Aug 12 09:53:52 2011 From: pearu.peterson at gmail.com (Pearu Peterson) Date: Fri, 12 Aug 2011 16:53:52 +0300 Subject: [SciPy-User] ANN: iocbio.microscope - a Python deconvolution software Message-ID: <4E453070.5000801@cens.ioc.ee> We are proud to release a new package for deconvolving 3D microscope images iocbio.microscope. It is a part of an open-source software project iocbio from the Laboratory of Systems Biology in the Institute of Cybernetics at Tallinn Technical University (http://sysbio.ioc.ee). Iocbio.microscope software package allows to deconvolve microscope images. In addition to the deconvolution program, the package includes the set of tools that is required for processing images, estimation of point spread function (PSF) and visualizing the results. 
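As a rough illustration of the family of algorithms the announcement refers to, here is a toy, unregularized Richardson-Lucy iteration for a 1-D signal. It is only a sketch under simplified assumptions (known PSF, float data, no noise regularization) and is not the iocbio implementation, which adds the regularization described in the paper and operates on 3-D images.

    import numpy as np

    def richardson_lucy_1d(observed, psf, iterations=50, eps=1e-12):
        """Toy unregularized Richardson-Lucy deconvolution for 1-D float data."""
        psf = psf / psf.sum()
        psf_mirror = psf[::-1]
        estimate = np.ones_like(observed) * observed.mean()   # flat starting guess
        for _ in range(iterations):
            blurred = np.convolve(estimate, psf, mode='same')
            ratio = observed / (blurred + eps)                 # eps avoids division by zero
            estimate = estimate * np.convolve(ratio, psf_mirror, mode='same')
        return estimate

    # Two artificial "point sources" blurred by a Gaussian PSF, then deconvolved.
    x = np.linspace(0.0, 1.0, 200)
    truth = 1.0 * (np.abs(x - 0.3) < 0.02) + 0.5 * (np.abs(x - 0.7) < 0.02)
    psf = np.exp(-np.linspace(-3, 3, 25) ** 2)
    data = np.convolve(truth, psf / psf.sum(), mode='same')
    recovered = richardson_lucy_1d(data, psf, iterations=200)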
This software is written in Python and is released with an open-source license (BSD). Homepage: http://code.google.com/p/iocbio/wiki/IOCBioMicroscope Tutorial: http://code.google.com/p/iocbio/wiki/DeconvolutionTutorial Iocbio API documentation: http://sysbio.ioc.ee/download/software/iocbio/index.html Sources and download: http://iocbio.googlecode.com Iocbio is developed under Linux (ubuntu) but will also run under Windows (we provide installer for Windows users to ease the process of setting up the iocbio software as well as its prerequisites). Mathematical background of implemented deconvolution algorithm, notes and guidelines on selection of parameters for deconvolution and application to real-life images are described in a recent paper Laasmaa, M, Vendelin, M, Peterson, P (2011). Application of regularized Richardson-Lucy algorithm for deconvolution of confocal microscopy images. J. Microscopy. Volume 243, Issue 2 , pages 124?140, August 2011: http://onlinelibrary.wiley.com/doi/10.1111/j.1365-2818.2011.03486.x/full Pearu Peterson From contact at graune.org Fri Aug 12 12:00:17 2011 From: contact at graune.org (Manuel Graune) Date: Fri, 12 Aug 2011 18:00:17 +0200 Subject: [SciPy-User] calculate definite integral of sampled data In-Reply-To: References: <20110810055958.GH2924@uriel> Message-ID: <20110812160017.GB3741@uriel> On Wed, Aug 10, 2011 at 07:26:32PM -0600, Charles R Harris wrote: > > How many data points do you have? > Chuck > depending on the use-case about 10-20000. The solution suggested by Gustavo works pretty well for me. Just as scipy.integrate.cumtrapz does. I obiously had not read the documentation quite enough to understand what cumtrapz good for. Thanks to all. Manuel > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available URL: From klonuo at gmail.com Fri Aug 12 12:56:55 2011 From: klonuo at gmail.com (Klonuo Umom) Date: Fri, 12 Aug 2011 18:56:55 +0200 Subject: [SciPy-User] ANN: SOLVCON 0.1 In-Reply-To: References: Message-ID: Thanks for your explanation I thought it would be something like this On Fri, Aug 12, 2011 at 1:20 PM, Yung-Yu Chen wrote: > Hello, > > On Fri, Aug 12, 2011 at 01:49, Klonuo Umom wrote: >> Interesting package. Congratulations on making milestone release >> >> I installed it and on a first look I can't see it workflow. On web >> portal I found tip to follow examples that come with this package, but >> those aren't trivial at all; I mean lot of classes and functions comes >> from nowhare and it's like no walkthrough provided >> > > We have not put efforts to make the package user friendly. ?SOLVCON > began with the idea to provide a framework to collect important > supportive functionalities needed by CFD codes, to enhance the > robustness and coding efficiency. ?The proof of concept turned out to > be successful, and we realized that SOLVCON has great potentials to > facilitate a new category of practices of building high-performance > conservation-law solvers for high-fidelity solutions. > > In the time being, SOLVCON is made for experts in computational > science. ?In the foreseeable future, our collaborators and we will > make it more accurate, more scalable, and more versatile. 
?You can > find the plan for the forthcoming development at > http://solvcon.net/yyc/writing/2011/solvcon_0.1.html . ?We hope > SOLVCON can be used to renew the technology used in high-end > calculations of PDEs, e.g., CFD, computational electromagnetism, etc. > >> If I may explain myself, I got summer seminar assigment, starting from >> shallow water eqs to derive eq of absolute vorticity for nondivergent >> flow in linearized form using perturbation method. I've done it by >> hand, but would like to understand the process with some of Python >> packages if feasible, and as staring eqs are hyperbolic PDEs maybe I >> could use this package although it uses different method for solving. >> Is it good idea to try to use this package, and if answer is yes, can >> you maybe provide some starting point for this simple task? >> >> > > The package was geared up for large-scale, complex calculations. > Using SOLVCON for very simple calculations would be an overkill. > Unfortunately, the short answer to your question could be no. > > When dealing with multi-physics, we take the approach to model the > underlying numerical algorithm, mathematics and physics as much as > possible. ?The approach looks complicated at the first glance, but is > actually concise from theories to implementations. ?We believe the > conciseness or compactness is critically important for scaling SOLVCON > from thousands of CPUs to hundreds of thousand of CPUs. > > The price for this approach is to prolong the path to generic > representation of PDEs. ?We do hope to provide the capability to > compile the PDEs written by users in a symbolic form for SOLVCON to > execute automatically. ?But this won't happen in foreseeable future. > > If you want to know more about SOLVCON and its theoretical background, > you can check up with my dissertation at > http://solvcon.net/yyc/publications.html . > > with regards, > Yung-Yu Chen > >> Thanks >> >> On Fri, Aug 12, 2011 at 5:44 AM, Yung-Yu Chen wrote: >>> Hello, >>> >>> I am pleased to announce version 0.1 of SOLVCON. ?SOLVCON is a Python-based, >>> multi-physics software framework for solving first-order hyperbolic PDEs. >>> >>> The source tarball can be downloaded at >>> http://bitbucket.org/yungyuc/solvcon/downloads . ?More information can be >>> found at http://solvcon.net/ . >>> >>> This release marks a milestone of SOLVCON. ?Future development of SOLVCON will >>> focus on production use. ?The planned directions include (i) the high-order >>> CESE method, (ii) improving the scalability by consolidating the >>> distributed-memory parallel code, (iii) expanding the capabilities of the >>> existing solver kernels, and (iv) incorporating more physical processes. >>> >>> New features: >>> >>> - Glue BCs are added. ?A pair of collocated BCs can now be glued together to >>> ?work as an internal interface. ?The glued BCs helps to dynamically turn on or >>> ?off the BC pair. >>> - ``solvcon.kerpak.cuse`` series solver kernels are changed to use OpenMP for >>> ?multi-threaded computing. ?They were using a thread pool built-in SOLVCON for >>> ?multi-threading. ?OpenMP makes multi-threaded functions more flexible in >>> ?argument specification. >>> - Add the ``soil/`` directory for providing building helpers for GCC 4.6.1. >>> ?Note, the name ``gcc/`` is deliberately avoided for the directory, because of >>> ?a bug in gcc itself (bug id 48306 >>> ?http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48306 ). 
>>> - Add ``-j`` command line option for building dependencies in the ``ground/`` >>> ?directory and the ``soil/`` directory. ?Note that ATLAS doesn't work with >>> ?``make -j N``. >>> >>> Bug-fix: >>> >>> - METIS changes its download URL. ?Modify SConstruct accordingly. >>> >>> -- >>> Yung-Yu Chen >>> http://solvcon.net/yyc/ >>> +1 (614) 859 2436 >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > > > -- > Yung-Yu Chen > http://solvcon.net/yyc/ > +1 (614) 859 2436 > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From nouiz at nouiz.org Fri Aug 12 16:06:49 2011 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Fri, 12 Aug 2011 16:06:49 -0400 Subject: [SciPy-User] Theano 0.4.1 released Message-ID: =========================== Announcing Theano 0.4.1 =========================== This is an important release, with lots of new features, bug fixes and some deprecation warning. The upgrade is recommended for everybody. For those using the bleeding edge version in the mercurial repository, we encourage you to update to the `0.4.1` tag. What's New ---------- New features: * `R_op `_ macro like theano.tensor.grad * Not all tests are done yet (TODO) * Added alias theano.tensor.bitwise_{and,or,xor,not}. They are the numpy names. * Updates returned by Scan (you need to pass them to the theano.function) are now a new Updates class. That allow more check and easier work with them. The Updates class is a subclass of dict * Scan can now work in a "do while" loop style. * We scan until a condition is met. * There is a minimum of 1 iteration(can't do "while do" style loop) * The "Interactive Debugger" (compute_test_value theano flags) * Now should work with all ops (even the one with only C code) * In the past some errors were caught and re-raised as unrelated errors (ShapeMismatch replaced with NotImplemented). We don't do that anymore. * The new Op.make_thunk function(introduced in 0.4.0) is now used by constant_folding and DebugMode * Added A_TENSOR_VARIABLE.astype() as a way to cast. NumPy allows this syntax. * New BLAS GER implementation. * Insert GEMV more frequently. * Added new ifelse(scalar condition, rval_if_true, rval_if_false) Op. * This is a subset of the elemwise switch (tensor condition, rval_if_true, rval_if_false). * With the new feature in the sandbox, only one of rval_if_true or rval_if_false will be evaluated. Optimizations: * Subtensor has C code * {Inc,Set}Subtensor has C code * ScalarFromTensor has C code * dot(zeros,x) and dot(x,zeros) * IncSubtensor(x, zeros, idx) -> x * SetSubtensor(x, x[idx], idx) -> x (when x is a constant) * subtensor(alloc,...) -> alloc * Many new scan optimization * Lower scan execution overhead with a Cython implementation * Removed scan double compilation (by using the new Op.make_thunk mechanism) * Certain computations from the inner graph are now Pushed out into the outer graph. This means they are not re-comptued at every step of scan. * Different scan ops get merged now into a single op (if possible), reducing the overhead and sharing computations between the two instances GPU: * PyCUDA/CUDAMat/Gnumpy/Theano bridge and `documentation `_. 
* New function to easily convert pycuda GPUArray object to and from CudaNdarray object * Fixed a bug if you crated a view of a manually created CudaNdarray that are view of GPUArray. * Removed a warning when nvcc is not available and the user did not requested it. * renamed config option cuda.nvccflags -> nvcc.flags * Allow GpuSoftmax and GpuSoftmaxWithBias to work with bigger input. Bugs fixed: * In one case an AdvancedSubtensor1 could be converted to a GpuAdvancedIncSubtensor1 insted of GpuAdvancedSubtensor1. It probably didn't happen due to the order of optimizations, but that order is not guaranteed to be the same on all computers. * Derivative of set_subtensor was wrong. * Derivative of Alloc was wrong. Crash fixed: * On an unusual Python 2.4.4 on Windows * When using a C cache copied from another location * On Windows 32 bits when setting a complex64 to 0. * Compilation crash with CUDA 4 * When wanting to copy the compilation cache from a computer to another * This can be useful for using Theano on a computer without a compiler. * GPU: * Compilation crash fixed under Ubuntu 11.04 * Compilation crash fixed with CUDA 4.0 Know bug: * CAReduce with nan in inputs don't return the good output (`Ticket `_). * This is used in tensor.{max,mean,prod,sum} and in the grad of PermuteRowElements. * This is not a new bug, just a bug discovered since the last release that we didn't had time to fix. Deprecation (will be removed in Theano 0.5, warning generated if you use them): * The string mode (accepted only by theano.function()) FAST_RUN_NOGC. Use Mode(linker='c|py_nogc') instead. * The string mode (accepted only by theano.function()) STABILIZE. Use Mode(optimizer='stabilize') instead. * scan interface change: * The use of `return_steps` for specifying how many entries of the output scan has been depricated * The same thing can be done by applying a subtensor on the output return by scan to select a certain slice * The inner function (that scan receives) should return its outputs and updates following this order: [outputs], [updates], [condition]. One can skip any of the three if not used, but the order has to stay unchanged. * tensor.grad(cost, wrt) will return an object of the "same type" as wrt (list/tuple/TensorVariable). * Currently tensor.grad return a type list when the wrt is a list/tuple of more then 1 element. Sandbox: * MRG random generator now implements the same casting behavior as the regular random generator. Sandbox New features(not enabled by default): * New Linkers (theano flags linker={vm,cvm}) * The new linker allows lazy evaluation of the new ifelse op, meaning we compute only the true or false branch depending of the condition. This can speed up some types of computation. * Uses a new profiling system (that currently tracks less stuff) * The cvm is implemented in C, so it lowers Theano's overhead. * The vm is implemented in python. So it can help debugging in some cases. * In the future, the default will be the cvm. * Some new not yet well tested sparse ops: theano.sparse.sandbox.{SpSum, Diag, SquareDiagonal, ColScaleCSC, RowScaleCSC, Remove0, EnsureSortedIndices, ConvolutionIndices} Documentation: * How to compute the `Jacobian, Hessian, Jacobian times a vector, Hessian times a vector `_. * Slide for a 3 hours class with exercises that was done at the HPCS2011 Conference in Montreal. Others: * Logger name renamed to be consistent. * Logger function simplified and made more consistent. 
* Fixed transformation of error by other not related error with the compute_test_value Theano flag. * Compilation cache enhancements. * Made compatible with NumPy 1.6 and SciPy 0.9 * Fix tests when there was new dtype in NumPy that is not supported by Theano. * Fixed some tests when SciPy is not available. * Don't compile anything when Theano is imported. Compile support code when we compile the first C code. * Python 2.4 fix: * Fix the file theano/misc/check_blas.py * For python 2.4.4 on Windows, replaced float("inf") with numpy.inf. * Removes useless inputs to a scan node * Beautification mostly, making the graph more visible. Such inputs would appear as a consequence of other optimizations Core: * there is a new mechanism that lets an Op permit that one of its inputs to be aliased to another destroyed input. This will generally result in incorrect calculation, so it should be used with care! The right way to use it is when the caller can guarantee that even if these two inputs look aliased, they actually will never overlap. This mechanism can be used, for example, by a new alternative approach to implementing Scan. If an op has an attribute called "destroyhandler_tolerate_aliased" then this is what's going on. IncSubtensor is thus far the only Op to use this mechanism.Mechanism Download -------- You can download Theano from http://pypi.python.org/pypi/Theano. Description ----------- Theano is a Python library that allows you to define, optimize, and efficiently evaluate mathematical expressions involving multi-dimensional arrays. It is built on top of NumPy. Theano features: * tight integration with NumPy: a similar interface to NumPy's. numpy.ndarrays are also used internally in Theano-compiled functions. * transparent use of a GPU: perform data-intensive computations up to 140x faster than on a CPU (support for float32 only). * efficient symbolic differentiation: Theano can compute derivatives for functions of one or many inputs. * speed and stability optimizations: avoid nasty bugs when computing expressions such as log(1+ exp(x)) for large values of x. * dynamic C code generation: evaluate expressions faster. * extensive unit-testing and self-verification: includes tools for detecting and diagnosing bugs and/or potential problems. Theano has been powering large-scale computationally intensive scientific research since 2007, but it is also approachable enough to be used in the classroom (IFT6266 at the University of Montreal). Resources --------- About Theano: http://deeplearning.net/software/theano/ About NumPy: http://numpy.scipy.org/ About SciPy: http://www.scipy.org/ Machine Learning Tutorial with Theano on Deep Architectures: http://deeplearning.net/tutorial/ Acknowledgments --------------- I would like to thank all contributors of Theano. For this particular release, here is the people that contributed code and/or documentation: (in alphabetical order) Frederic Bastien, James Bergstra, Olivier Delalleau, Xavier Glorot, Ian Goodfellow, Pascal Lamblin, Gr?goire Mesnil, Razvan Pascanu, Ilya Sutskever and David Warde-Farley Also, thank you to all NumPy and Scipy developers as Theano builds on its strength. 
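A minimal usage sketch of the symbolic-differentiation workflow described above (plain core API of this Theano generation, nothing specific to the 0.4.1 additions):

    import theano
    import theano.tensor as T

    x = T.dvector('x')
    cost = (x ** 2).sum()                    # a symbolic expression
    grad = T.grad(cost, x)                   # symbolic gradient of cost w.r.t. x
    f = theano.function([x], [cost, grad])   # compiled, optimized callable

    print(f([1.0, 2.0, 3.0]))                # cost 14.0, gradient [2. 4. 6.]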
All questions/comments are always welcome on the Theano mailing-lists ( http://deeplearning.net/software/theano/ ) From stef.mientki at gmail.com Sat Aug 13 09:54:03 2011 From: stef.mientki at gmail.com (Stef Mientki) Date: Sat, 13 Aug 2011 15:54:03 +0200 Subject: [SciPy-User] [ANN] Bottleneck 0.5.0 released In-Reply-To: References: Message-ID: <4E4681FA.4020905@gmail.com> hello, is it possible to create a windows executable (a lot of windows users can't compile C-code). I tried the prebuild versions from: http://www.lfd.uci.edu/~gohlke/pythonlibs/#bottleneck but the fast routines are all missing there. thanks, Stef On 13-06-2011 23:35, Keith Goodman wrote: > Bottleneck is a collection of fast NumPy array functions written in > Cython. It contains functions like median, nanmedian, nanargmax, > move_max, rankdata. > > The fifth release of bottleneck adds four new functions, comes in a > single source distribution instead of separate 32 and 64 bit versions, > and contains bug fixes. > > J. David Lee wrote the C-code implementation of the double heap moving > window median. > > New functions: > - move_median(), moving window median > - partsort(), partial sort > - argpartsort() > - ss(), sum of squares, faster version of scipy.stats.ss > > Changes: > - Single source distribution instead of separate 32 and 64 bit versions > - nanmax and nanmin now follow Numpy 1.6 (not 1.5.1) when input is all NaN > > Bug fixes: > - #14 Support python 2.5 by importing `with` statement > - #22 nanmedian wrong for particular ordering of NaN and non-NaN elements > - #26 argpartsort, nanargmin, nanargmax returned wrong dtype on 64-bit Windows > - #29 rankdata and nanrankdata crashed on 64-bit Windows > > download > http://pypi.python.org/pypi/Bottleneck > docs > http://berkeleyanalytics.com/bottleneck > code > http://github.com/kwgoodman/bottleneck > mailing list > http://groups.google.com/group/bottle-neck > mailing list 2 > http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From cgohlke at uci.edu Sat Aug 13 11:09:55 2011 From: cgohlke at uci.edu (Christoph Gohlke) Date: Sat, 13 Aug 2011 08:09:55 -0700 Subject: [SciPy-User] [ANN] Bottleneck 0.5.0 released In-Reply-To: <4E4681FA.4020905@gmail.com> References: <4E4681FA.4020905@gmail.com> Message-ID: <4E4693C3.5010103@uci.edu> On 8/13/2011 6:54 AM, Stef Mientki wrote: > hello, > > is it possible to create a windows executable > (a lot of windows users can't compile C-code). > > I tried the prebuild versions from: > http://www.lfd.uci.edu/~gohlke/pythonlibs/#bottleneck > > but the fast routines are all missing there. I don't see anything missing. Tests and benchmarks yield expected results using numpy 1.6.1. What's the output of `import bottleneck as bn;bn.test()` (requires nose 1.x)? Christoph > > thanks, > Stef > > On 13-06-2011 23:35, Keith Goodman wrote: >> Bottleneck is a collection of fast NumPy array functions written in >> Cython. It contains functions like median, nanmedian, nanargmax, >> move_max, rankdata. >> >> The fifth release of bottleneck adds four new functions, comes in a >> single source distribution instead of separate 32 and 64 bit versions, >> and contains bug fixes. >> >> J. David Lee wrote the C-code implementation of the double heap moving >> window median. 
>> >> New functions: >> - move_median(), moving window median >> - partsort(), partial sort >> - argpartsort() >> - ss(), sum of squares, faster version of scipy.stats.ss >> >> Changes: >> - Single source distribution instead of separate 32 and 64 bit versions >> - nanmax and nanmin now follow Numpy 1.6 (not 1.5.1) when input is all NaN >> >> Bug fixes: >> - #14 Support python 2.5 by importing `with` statement >> - #22 nanmedian wrong for particular ordering of NaN and non-NaN elements >> - #26 argpartsort, nanargmin, nanargmax returned wrong dtype on 64-bit Windows >> - #29 rankdata and nanrankdata crashed on 64-bit Windows >> >> download >> http://pypi.python.org/pypi/Bottleneck >> docs >> http://berkeleyanalytics.com/bottleneck >> code >> http://github.com/kwgoodman/bottleneck >> mailing list >> http://groups.google.com/group/bottle-neck >> mailing list 2 >> http://mail.scipy.org/mailman/listinfo/scipy-user >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From ralf.gommers at googlemail.com Sat Aug 13 11:58:41 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sat, 13 Aug 2011 17:58:41 +0200 Subject: [SciPy-User] disabling SVN (was: Trouble installing scipy after upgrading to Mac OS X 10.7 aka Lion) Message-ID: On Thu, Aug 11, 2011 at 8:19 PM, Jonathan Guyer wrote: > > On Aug 10, 2011, at 5:16 PM, Ralf Gommers wrote: > > > Ah, with "svn" you actually meant svn:) I thought that was supposed to > not even work anymore. > > It does work and it's confusing. I had not been following the transition > closely and so was under the impression that the svn repository was being > mirrored from git. It's not. It's just old. > > Who can disable SVN access for numpy and scipy? There are still plenty of links to http://svn.scipy.org/svn/numpy/trunk/ floating around that can confuse users. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ognen at enthought.com Sat Aug 13 12:00:49 2011 From: ognen at enthought.com (Ognen Duzlevski) Date: Sat, 13 Aug 2011 12:00:49 -0400 Subject: [SciPy-User] disabling SVN (was: Trouble installing scipy after upgrading to Mac OS X 10.7 aka Lion) In-Reply-To: References: Message-ID: On Sat, Aug 13, 2011 at 11:58 AM, Ralf Gommers wrote: > > > On Thu, Aug 11, 2011 at 8:19 PM, Jonathan Guyer wrote: > >> >> On Aug 10, 2011, at 5:16 PM, Ralf Gommers wrote: >> >> > Ah, with "svn" you actually meant svn:) I thought that was supposed to >> not even work anymore. >> >> It does work and it's confusing. I had not been following the transition >> closely and so was under the impression that the svn repository was being >> mirrored from git. It's not. It's just old. >> >> Who can disable SVN access for numpy and scipy? There are still plenty of > links to http://svn.scipy.org/svn/numpy/trunk/ floating around that can > confuse users. > > Ralf > Ralf, I am the new Enthought sys admin. Is there anything I can do to help? Thanks, Ognen -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ralf.gommers at googlemail.com Sat Aug 13 12:14:11 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sat, 13 Aug 2011 18:14:11 +0200 Subject: [SciPy-User] disabling SVN (was: Trouble installing scipy after upgrading to Mac OS X 10.7 aka Lion) In-Reply-To: References: Message-ID: On Sat, Aug 13, 2011 at 6:00 PM, Ognen Duzlevski wrote: > On Sat, Aug 13, 2011 at 11:58 AM, Ralf Gommers < > ralf.gommers at googlemail.com> wrote: > >> >> >> On Thu, Aug 11, 2011 at 8:19 PM, Jonathan Guyer wrote: >> >>> >>> On Aug 10, 2011, at 5:16 PM, Ralf Gommers wrote: >>> >>> > Ah, with "svn" you actually meant svn:) I thought that was supposed to >>> not even work anymore. >>> >>> It does work and it's confusing. I had not been following the transition >>> closely and so was under the impression that the svn repository was being >>> mirrored from git. It's not. It's just old. >>> >>> Who can disable SVN access for numpy and scipy? There are still plenty of >> links to http://svn.scipy.org/svn/numpy/trunk/ floating around that can >> confuse users. >> >> Ralf >> > > Hi Ognen, > Ralf, > > I am the new Enthought sys admin. Is there anything I can do to help? > > We should check if there's still any code in SVN branches that is useful. If so the people who are interested in it should move it somewhere else. Anyone? After that I think you can pull the plug on http://svn.scipy.org/svn/numpy/and http://svn.scipy.org/svn/scipy/. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ognen at enthought.com Sat Aug 13 12:57:55 2011 From: ognen at enthought.com (Ognen Duzlevski) Date: Sat, 13 Aug 2011 12:57:55 -0400 Subject: [SciPy-User] disabling SVN (was: Trouble installing scipy after upgrading to Mac OS X 10.7 aka Lion) In-Reply-To: References: Message-ID: On Sat, Aug 13, 2011 at 12:14 PM, Ralf Gommers wrote: > On Sat, Aug 13, 2011 at 6:00 PM, Ognen Duzlevski wrote: > >> On Sat, Aug 13, 2011 at 11:58 AM, Ralf Gommers < >> ralf.gommers at googlemail.com> wrote: >> >>> >>> >>> On Thu, Aug 11, 2011 at 8:19 PM, Jonathan Guyer wrote: >>> >>>> >>>> On Aug 10, 2011, at 5:16 PM, Ralf Gommers wrote: >>>> >>>> > Ah, with "svn" you actually meant svn:) I thought that was supposed to >>>> not even work anymore. >>>> >>>> It does work and it's confusing. I had not been following the transition >>>> closely and so was under the impression that the svn repository was being >>>> mirrored from git. It's not. It's just old. >>>> >>>> Who can disable SVN access for numpy and scipy? There are still plenty >>> of links to http://svn.scipy.org/svn/numpy/trunk/ floating around that >>> can confuse users. >>> >>> Ralf >>> >> >> Hi Ognen, > > >> Ralf, >> >> I am the new Enthought sys admin. Is there anything I can do to help? >> >> We should check if there's still any code in SVN branches that is useful. > If so the people who are interested in it should move it somewhere else. > Anyone? > > After that I think you can pull the plug on > http://svn.scipy.org/svn/numpy/ and http://svn.scipy.org/svn/scipy/. > > Ralf > OK - let me know. Ognen -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at enthought.com Sat Aug 13 14:12:00 2011 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Sat, 13 Aug 2011 13:12:00 -0500 Subject: [SciPy-User] firwin behavior In-Reply-To: References: Message-ID: On Thu, Aug 11, 2011 at 6:36 AM, Jeff Alstott wrote: > Wow. 
The passing of the DC frequency is exactly the issue, and that default > behavior is clearly shown in the documentation. I see now that given a band, > the default behavior is band-stop, whereas I would expect it to be > band-pass. So, that fixed it. > > What I don't understand, however, is *why* that would be default behavior. > More importantly, even if that is the default behavior, the name of the > pass_zero flag does not readily help a dumb user like me grok the > functionality. Has there been any thought to renaming it? > See here for the evolution of the firwin API: http://projects.scipy.org/scipy/ticket/902 Warren > > Thanks! > > > On Wed, Aug 10, 2011 at 4:39 PM, Warren Weckesser < > warren.weckesser at enthought.com> wrote: > >> On Wed, Aug 10, 2011 at 8:14 AM, Jeff Alstott >> wrote: >> > firwin is producing unreasonable filters for me, and I'm not sure if I'm >> > misusing the code or if there is a bug. Like so: >> > >> > In [5]: from scipy.signal import firwin >> > >> > In [6]: ny = 500 >> > >> > In [7]: f21f80= firwin(21, [1/ny, 80/ny]); plot(f21f80); >> > savefig('FIR21_filter80.png') >> > >> > Produces the attached file. >> > >> > In contrast, Matlab: >> > >> > Trial>> ny = 500 >> > >> > ny = >> > >> > 500 >> > >> > Trial>> [f20f80] = fir1(20, [1/ny, 80/ny]); figure; plot(f20f80) >> > >> > Produces the other attached file. Quite different! The filter produced >> by >> > the scipy function, if used with lfilter (or if taken to Matlab to use >> as a >> > filter), produces a nonsense filtering, with many high frequency >> artifacts. >> > >> > Any thoughts? This is in python3, if that matters. >> >> >> By default, firwin creates a filter that passes DC (i.e. the zero >> frequency). To get a filter like the one produced by matlab, add the >> keyword argument pass_zero=False. >> >> Warren >> >> >> > >> > Thanks! >> > >> > >> > >> > _______________________________________________ >> > SciPy-User mailing list >> > SciPy-User at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> > >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stef.mientki at gmail.com Sat Aug 13 15:05:10 2011 From: stef.mientki at gmail.com (Stef Mientki) Date: Sat, 13 Aug 2011 21:05:10 +0200 Subject: [SciPy-User] [ANN] Bottleneck 0.5.0 released In-Reply-To: <4E4693C3.5010103@uci.edu> References: <4E4681FA.4020905@gmail.com> <4E4693C3.5010103@uci.edu> Message-ID: <4E46CAE6.8000403@gmail.com> thanks Cristoph, found the problem, I used a too old version of numpy, ( wouldn't it be an idea to replace line 17 in __init__.py with "print 'requires at least numpy 1.5.1" now the fast routines are working ... ... at least sometimes ... at least on some computers What I've seen until now: Computer 1: numpy 1.4, so it uses slow routines: functional ok Computer 2: exactly the same python + libs: screen starts to "blink" to black a few times (for about half a second, with an interval about 2 seconds), after 10 times, the screen is filled with a repeating part of the screen and computer hangs totally. Computer 2: numpy 1.6.1 : first program run, screen "blinks" black once, the fast bottleneck routines are use, and they function ok. 
Second run of the same program: screen blinks blank once, after a few seconds, the screen is again filled with a smaal repating part of the screen and the computer hangs totally. Any ideas ? Is the GPU used with these routines ? cheers, Stef On 13-08-2011 17:09, Christoph Gohlke wrote: > > On 8/13/2011 6:54 AM, Stef Mientki wrote: >> hello, >> >> is it possible to create a windows executable >> (a lot of windows users can't compile C-code). >> >> I tried the prebuild versions from: >> http://www.lfd.uci.edu/~gohlke/pythonlibs/#bottleneck >> >> but the fast routines are all missing there. > I don't see anything missing. Tests and benchmarks yield expected > results using numpy 1.6.1. > > What's the output of `import bottleneck as bn;bn.test()` (requires nose > 1.x)? > > Christoph > >> thanks, >> Stef >> >> On 13-06-2011 23:35, Keith Goodman wrote: >>> Bottleneck is a collection of fast NumPy array functions written in >>> Cython. It contains functions like median, nanmedian, nanargmax, >>> move_max, rankdata. >>> >>> The fifth release of bottleneck adds four new functions, comes in a >>> single source distribution instead of separate 32 and 64 bit versions, >>> and contains bug fixes. >>> >>> J. David Lee wrote the C-code implementation of the double heap moving >>> window median. >>> >>> New functions: >>> - move_median(), moving window median >>> - partsort(), partial sort >>> - argpartsort() >>> - ss(), sum of squares, faster version of scipy.stats.ss >>> >>> Changes: >>> - Single source distribution instead of separate 32 and 64 bit versions >>> - nanmax and nanmin now follow Numpy 1.6 (not 1.5.1) when input is all NaN >>> >>> Bug fixes: >>> - #14 Support python 2.5 by importing `with` statement >>> - #22 nanmedian wrong for particular ordering of NaN and non-NaN elements >>> - #26 argpartsort, nanargmin, nanargmax returned wrong dtype on 64-bit Windows >>> - #29 rankdata and nanrankdata crashed on 64-bit Windows >>> >>> download >>> http://pypi.python.org/pypi/Bottleneck >>> docs >>> http://berkeleyanalytics.com/bottleneck >>> code >>> http://github.com/kwgoodman/bottleneck >>> mailing list >>> http://groups.google.com/group/bottle-neck >>> mailing list 2 >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From kwgoodman at gmail.com Sat Aug 13 20:39:57 2011 From: kwgoodman at gmail.com (Keith Goodman) Date: Sat, 13 Aug 2011 17:39:57 -0700 Subject: [SciPy-User] [ANN] Bottleneck 0.5.0 released In-Reply-To: <4E46CAE6.8000403@gmail.com> References: <4E4681FA.4020905@gmail.com> <4E4693C3.5010103@uci.edu> <4E46CAE6.8000403@gmail.com> Message-ID: On Sat, Aug 13, 2011 at 12:05 PM, Stef Mientki wrote: > thanks Cristoph, > > found the problem, > I used a too old version of numpy, > ( wouldn't it be an idea to replace line 17 in __init__.py with "print 'requires at least numpy 1.5.1" Sounds like a good idea but unfortunately there are other reasons, besides an old version of numpy, why the cython functions might fail to load. For example, the compilation may have failed. > now the fast routines are working ... Yay! > ... 
at least sometimes Oh :( > ... at least on some computers > > What I've seen until now: > Computer 1: numpy 1.4, so it uses slow routines: functional ok > Computer 2: exactly the same python + libs: screen starts to "blink" to black a few times (for about > half a second, with an interval about 2 seconds), > after 10 times, the screen is filled with a repeating part of the screen and computer hangs totally. Oh, my goodness! That is odd. > Computer 2: numpy 1.6.1 : first program run, screen "blinks" black once, the fast bottleneck > routines are use, and they function ok. > Second run of the same program: screen blinks blank once, after a few seconds, the screen is again > filled with a smaal repating part of the screen and the computer hangs totally. > Any ideas ? That is terrible. I have no clue as to the cause. > Is the GPU used with these routines ? No. > cheers, > Stef > > On 13-08-2011 17:09, Christoph Gohlke wrote: >> >> On 8/13/2011 6:54 AM, Stef Mientki wrote: >>> hello, >>> >>> is it possible to create a windows executable >>> (a lot of windows users can't compile C-code). >>> >>> I tried the prebuild versions from: >>> http://www.lfd.uci.edu/~gohlke/pythonlibs/#bottleneck >>> >>> but the fast routines are all missing there. >> I don't see anything missing. Tests and benchmarks yield expected >> results using numpy 1.6.1. >> >> What's the output of `import bottleneck as bn;bn.test()` (requires nose >> 1.x)? >> >> Christoph >> >>> thanks, >>> Stef >>> >>> On 13-06-2011 23:35, Keith Goodman wrote: >>>> Bottleneck is a collection of fast NumPy array functions written in >>>> Cython. It contains functions like median, nanmedian, nanargmax, >>>> move_max, rankdata. >>>> >>>> The fifth release of bottleneck adds four new functions, comes in a >>>> single source distribution instead of separate 32 and 64 bit versions, >>>> and contains bug fixes. >>>> >>>> J. David Lee wrote the C-code implementation of the double heap moving >>>> window median. >>>> >>>> New functions: >>>> - move_median(), moving window median >>>> - partsort(), partial sort >>>> - argpartsort() >>>> - ss(), sum of squares, faster version of scipy.stats.ss >>>> >>>> Changes: >>>> - Single source distribution instead of separate 32 and 64 bit versions >>>> - nanmax and nanmin now follow Numpy 1.6 (not 1.5.1) when input is all NaN >>>> >>>> Bug fixes: >>>> - #14 Support python 2.5 by importing `with` statement >>>> - #22 nanmedian wrong for particular ordering of NaN and non-NaN elements >>>> - #26 argpartsort, nanargmin, nanargmax returned wrong dtype on 64-bit Windows >>>> - #29 rankdata and nanrankdata crashed on 64-bit Windows >>>> >>>> download >>>> ? ? ?http://pypi.python.org/pypi/Bottleneck >>>> docs >>>> ? ? ?http://berkeleyanalytics.com/bottleneck >>>> code >>>> ? ? ?http://github.com/kwgoodman/bottleneck >>>> mailing list >>>> ? ? ?http://groups.google.com/group/bottle-neck >>>> mailing list 2 >>>> ? ? 
?http://mail.scipy.org/mailman/listinfo/scipy-user >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From paul.anton.letnes at gmail.com Sun Aug 14 16:45:30 2011 From: paul.anton.letnes at gmail.com (Paul Anton Letnes) Date: Sun, 14 Aug 2011 21:45:30 +0100 Subject: [SciPy-User] segfault Message-ID: Hi! (I am cross posting this to the This code (see bottom of email) crashed with a segfault at the scipy.linalg.eigvals line: % time python iterative-test.py File read Eigvals: zsh: segmentation fault python iterative-test.py python iterative-test.py 536.82s user 2.90s system 96% cpu 9:18.75 total Numpy version: 1.6.1 Scipy version: 0.9.0 Python version: 2.7.2 (default, Jun 25 2011, 09:29:54) [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] Mac OS X 10.6.8 I have 4 GB of memory (and I believe Mac OS X allows you to use as much hard drive space as theoretically available as swap space), and the matrix A is of dtype numpy.complex64 and has shape (4608, 4608). In 'activity monitor' the process claims to use just shy of 370 MB of memory, and does not increase with time. By my calculations the A matrix should be about 162 MB (not worrying about 'object overhead', which should be small). Anything I can do to help? I'd be happy to upload my matrix on my webpage, if someone wants to use it as test data. The hdf5 file is 162 MB so too big for the mailing list I suppose. Cheers, Paul *************** def main(): f = h5py.File('A2.h5', 'r') A = f['A'][:] b = numpy.loadtxt('RHS_p_formatted_copy', dtype=numpy.float32) b = b[:, 0] + 1.0j * b[:, 1] print 'File read' t0 = time.time() print 'Eigvals:' -> w = scipy.linalg.eigvals(A, overwrite_a=True) w = numpy.sort(w) t_eig = time.time() print 'eig time:', t_eig - t0 print w From paul.anton.letnes at gmail.com Sun Aug 14 16:56:34 2011 From: paul.anton.letnes at gmail.com (Paul Anton Letnes) Date: Sun, 14 Aug 2011 21:56:34 +0100 Subject: [SciPy-User] Fwd: segfault References: Message-ID: Replying to myself with a bit more information. I tried installing the most recent scipy from the git repository into a virtualenv, but I ran into problems with umfpack. 
% python setup.py build blas_opt_info: FOUND: extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] define_macros = [('NO_ATLAS_INFO', 3)] extra_compile_args = ['-msse3', '-I/System/Library/Frameworks/vecLib.framework/Headers'] non-existing path in 'scipy/io': 'docs' lapack_opt_info: FOUND: extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] define_macros = [('NO_ATLAS_INFO', 3)] extra_compile_args = ['-msse3'] umfpack_info: libraries umfpack not found in /Users/paulanto/Desktop/Rayleigh2D-debug/dev-scipy/bin/../lib libraries umfpack not found in /usr/local/lib libraries umfpack not found in /usr/lib amd_info: libraries amd not found in /Users/paulanto/Desktop/Rayleigh2D-debug/dev-scipy/bin/../lib libraries amd not found in /usr/local/lib libraries amd not found in /usr/lib FOUND: libraries = ['amd'] library_dirs = ['/opt/local/lib'] FOUND: libraries = ['umfpack', 'amd'] library_dirs = ['/opt/local/lib'] running build running config_cc unifing config_cc, config, build_clib, build_ext, build commands --compiler options running config_fc unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options running build_src build_src building py_modules sources building library "dfftpack" sources building library "fftpack" sources building library "linpack_lite" sources building library "mach" sources building library "quadpack" sources building library "odepack" sources building library "dop" sources building library "fitpack" sources building library "odrpack" sources building library "minpack" sources building library "rootfind" sources building library "superlu_src" sources building library "arpack_scipy" sources building library "qhull" sources building library "sc_c_misc" sources building library "sc_cephes" sources building library "sc_mach" sources building library "sc_toms" sources building library "sc_amos" sources building library "sc_cdf" sources building library "sc_specfun" sources building library "statlib" sources building extension "scipy.cluster._vq" sources building extension "scipy.cluster._hierarchy_wrap" sources building extension "scipy.fftpack._fftpack" sources f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.fftpack.convolve" sources f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.integrate._quadpack" sources building extension "scipy.integrate._odepack" sources building extension "scipy.integrate.vode" sources f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.integrate._dop" sources f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.interpolate.interpnd" sources building extension "scipy.interpolate._fitpack" sources building extension "scipy.interpolate.dfitpack" sources f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. adding 'build/src.macosx-10.6-x86_64-2.7/scipy/interpolate/src/dfitpack-f2pywrappers.f' to sources. 
building extension "scipy.interpolate._interpolate" sources building extension "scipy.io.matlab.streams" sources building extension "scipy.io.matlab.mio_utils" sources building extension "scipy.io.matlab.mio5_utils" sources building extension "scipy.lib.blas.fblas" sources f2py options: ['skip:', ':'] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. adding 'build/src.macosx-10.6-x86_64-2.7/build/src.macosx-10.6-x86_64-2.7/scipy/lib/blas/fblas-f2pywrappers.f' to sources. building extension "scipy.lib.blas.cblas" sources adding 'build/src.macosx-10.6-x86_64-2.7/scipy/lib/blas/cblas.pyf' to sources. f2py options: ['skip:', ':'] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.lib.lapack.flapack" sources f2py options: ['skip:', ':'] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.lib.lapack.clapack" sources adding 'build/src.macosx-10.6-x86_64-2.7/scipy/lib/lapack/clapack.pyf' to sources. f2py options: ['skip:', ':'] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.lib.lapack.calc_lwork" sources f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.lib.lapack.atlas_version" sources building extension "scipy.linalg.fblas" sources f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. adding 'build/src.macosx-10.6-x86_64-2.7/build/src.macosx-10.6-x86_64-2.7/scipy/linalg/fblas-f2pywrappers.f' to sources. building extension "scipy.linalg.cblas" sources adding 'build/src.macosx-10.6-x86_64-2.7/scipy/linalg/cblas.pyf' to sources. f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.linalg.flapack" sources adding 'build/src.macosx-10.6-x86_64-2.7/scipy/linalg/flapack.pyf' to sources. f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. adding 'build/src.macosx-10.6-x86_64-2.7/build/src.macosx-10.6-x86_64-2.7/scipy/linalg/flapack-f2pywrappers.f' to sources. building extension "scipy.linalg.clapack" sources adding 'build/src.macosx-10.6-x86_64-2.7/scipy/linalg/clapack.pyf' to sources. f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.linalg._flinalg" sources f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.linalg.calc_lwork" sources f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. 
building extension "scipy.linalg.atlas_version" sources building extension "scipy.odr.__odrpack" sources building extension "scipy.optimize._minpack" sources building extension "scipy.optimize._zeros" sources building extension "scipy.optimize._lbfgsb" sources f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.optimize.moduleTNC" sources building extension "scipy.optimize._cobyla" sources f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.optimize.minpack2" sources f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.optimize._slsqp" sources f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.optimize._nnls" sources f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.signal.sigtools" sources building extension "scipy.signal.spectral" sources building extension "scipy.signal.spline" sources building extension "scipy.sparse.linalg.isolve._iterative" sources f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.sparse.linalg.dsolve._superlu" sources building extension "scipy.sparse.linalg.dsolve.umfpack.__umfpack" sources adding 'scipy/sparse/linalg/dsolve/umfpack/umfpack.i' to sources. 
swig: scipy/sparse/linalg/dsolve/umfpack/umfpack.i swig -python -o build/src.macosx-10.6-x86_64-2.7/scipy/sparse/linalg/dsolve/umfpack/_umfpack_wrap.c -outdir build/src.macosx-10.6-x86_64-2.7/scipy/sparse/linalg/dsolve/umfpack scipy/sparse/linalg/dsolve/umfpack/umfpack.i scipy/sparse/linalg/dsolve/umfpack/umfpack.i:192: Error: Unable to find 'umfpack.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:193: Error: Unable to find 'umfpack_solve.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:194: Error: Unable to find 'umfpack_defaults.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:195: Error: Unable to find 'umfpack_triplet_to_col.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:196: Error: Unable to find 'umfpack_col_to_triplet.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:197: Error: Unable to find 'umfpack_transpose.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:198: Error: Unable to find 'umfpack_scale.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:200: Error: Unable to find 'umfpack_report_symbolic.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:201: Error: Unable to find 'umfpack_report_numeric.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:202: Error: Unable to find 'umfpack_report_info.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:203: Error: Unable to find 'umfpack_report_control.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:215: Error: Unable to find 'umfpack_symbolic.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:216: Error: Unable to find 'umfpack_numeric.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:225: Error: Unable to find 'umfpack_free_symbolic.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:226: Error: Unable to find 'umfpack_free_numeric.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:248: Error: Unable to find 'umfpack_get_lunz.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:272: Error: Unable to find 'umfpack_get_numeric.h' error: command 'swig' failed with exit status 1 From paul.anton.letnes at gmail.com Sun Aug 14 17:11:00 2011 From: paul.anton.letnes at gmail.com (Paul Anton Letnes) Date: Sun, 14 Aug 2011 22:11:00 +0100 Subject: [SciPy-User] Fwd: segfault References: Message-ID: <52DAC708-8150-4FFA-9D31-00D6B722CCD8@gmail.com> Replying to myself with a bit more information - again. % time python iterative-test.py File read Eigvals: zsh: segmentation fault python iterative-test.py python iterative-test.py 537.58s user 2.66s system 97% cpu 9:13.79 total Memory use this time was approx. 550 MB (don't recall the exact number). The input matrix was the same. The code was modified to: def main(): f = h5py.File('A2.h5', 'r') A = f['A'][:] b = numpy.loadtxt('RHS_p_formatted_copy', dtype=numpy.float32) b = b[:, 0] + 1.0j * b[:, 1] print 'File read' t0 = time.time() print 'Eigvals:' -> w, vl, vr = scipy.linalg.eig(A, overwrite_a=True) From scipydevwikiaccount Sat Aug 13 19:20:36 2011 From: scipydevwikiaccount (scipydevwikiaccount) Date: Sun, 14 Aug 2011 03:20:36 +0400 Subject: [SciPy-User] fmin_bfgs stuck in infinite loop (but only with new version) Message-ID: ------------------------------------------------------------------------ -------------------------------- This email was sent via Anonymous email service for free. YOU CAN REMOVE THIS TEXT MESSAGE BY BEING A PAID MEMBER FOR $19/year. Message ID= 111663 ------------------------------------------------------------------------ -------------------------------- I have run into a bug where scipy's fmin_bfgs will get stuck in an infinite loop. 
I have submitted a bug report, but I would also like to bring this issue up in the mailing list to see if anybody has any suggestions. http://projects.scipy.org/scipy/ticket/1494 Thanks, Joshua -------------- next part -------------- An HTML attachment was scrubbed... URL: From st4s3a1l at gmail.com Sat Aug 13 19:38:46 2011 From: st4s3a1l at gmail.com (b9o2jnbm tsd71eam) Date: Sat, 13 Aug 2011 16:38:46 -0700 Subject: [SciPy-User] fmin_bfgs stuck in infinite loop Message-ID: I have run into a frustrating problem where scipy.optimize.fmin_bfgs will get stuck in an infinite loop. I have submitted a bug report: http://projects.scipy.org/scipy/ticket/1494 but would also like to see if anybody on this list has any suggestions or feedback. Thanks, -------------- next part -------------- An HTML attachment was scrubbed... URL: From newville at cars.uchicago.edu Mon Aug 15 09:05:50 2011 From: newville at cars.uchicago.edu (Matt Newville) Date: Mon, 15 Aug 2011 08:05:50 -0500 Subject: [SciPy-User] lmfit-py -- simple least squares minimization Message-ID: Hi, Having used on numpy and scipy for many years and being very pleased with them, I've found an area which I think might benefit from a modest improvement, and have tried to implement this. The scipy.optimize routines are robust, but seem a little unfriendly to people coming from proprietary environments or Numerical Recipes-level tools. Specifically, the Levenberg-Marquardt algorithm is used heavily in many domains (including the x-ray spectroscopy fields I am most familiar with), but the MINPACK and scipy.optimize.leastsq implementation lack convenient ways to: - turn on/off parameters for fitting, that is, to "fix" certain parameters. - place simple min/max bounds on parameters - place simple mathematical constraints on parameters. While these limitations can be worked around, doing so requires putting many options into the function to be minimized, which is somewhat inconvenient. On the other hand, these features do exist in less robust fitting code that is not based on directly on MINPACK or as well-supported as scipy. I've written a module to do this so that the least-squares minimization from scipy.optimize.leastsq can take bounded and constrained parameters, and tried to make it of general use. This code (BSD-licensed, somewhat documented) is at http://github.com/newville/lmfit-py The constraint mechanism is a bit involved (using the ast module instead of 'eval'), but the rest of the code is quite straightforward and simple. Currently, this supports minimization with scipy.optimize.leastsq, scipy.optimize.fmin_l_bfgs_b, and scipy.optimize.anneal. Supporting other algorithms could be possible. If you find this interesting or useful, I'd appreciate any feedback you might have. For example, this is not currently organized as a scikit -- would that be preferable? Cheers, --Matt Newville From tmp50 at ukr.net Mon Aug 15 15:21:30 2011 From: tmp50 at ukr.net (Dmitrey) Date: Mon, 15 Aug 2011 22:21:30 +0300 Subject: [SciPy-User] [ANN] Constrained optimization solver with guaranteed precision Message-ID: Hi all, I'm glad to inform you that general constraints handling for interalg (free solver with guaranteed user-defined precision) now is available. Despite it is very premature and requires lots of improvements, it is already capable of outperforming commercial BARON (example: http://openopt.org/interalg_bench#Test_4) and thus you could be interested in trying it right now (next OpenOpt release will be no sooner than 1 month). 
interalg can be especially effective, compared with BARON (and some other competitors), on problems with a huge or absent Lipschitz constant, for example on functions like sqrt(x), log(x), 1/x, and x**alpha with alpha < 1, when the domain of x is something like [small_positive_value, another_value].

Let me also remind you that interalg can search for all solutions of nonlinear equations / systems of equations where local solvers like scipy.optimize.fsolve cannot find any, and can compute single/multiple integrals with guaranteed user-defined precision (the speed of integration is intended to be improved in the future). However, only FuncDesigner models are handled (read the interalg webpage for more details).

Regards, D.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From bdeb at willmore.eu Tue Aug 16 08:06:14 2011
From: bdeb at willmore.eu (Ben Willmore)
Date: Tue, 16 Aug 2011 13:06:14 +0100
Subject: [SciPy-User] Trouble installing scipy after upgrading to Mac OS X 10.7 aka Lion
Message-ID:

Hi Bill,

As Ralf mentioned, the veclib_cabi_c.c (etc) problem you have has been fixed. But, even with new checkouts of scipy, I found that simply running 'python setup.py install' did not result in a version of scipy that worked correctly on Mac OSX 10.7 (for example, using scipy.fftpack, ifft(fft(signal)) != signal, problems with single precision vector arithmetic, failures and a crash in ARPACK tests). I was able to fix these by setting:
>>> from numpy import loadtxt, uint8 >>> from StringIO import StringIO >>> from numpy.version import version >>> print version 2.0.0.dev-5cf0a07 >>> loadtxt(StringIO("0 1 2 3"), dtype=[("a", uint8, 2), ("b", uint8, 2)]) Traceback (most recent call last): File "", line 1, in File "/storage4/home/gerrit/.local/lib/python2.6/site-packages/numpy/lib/npyio.py", line 806, in loadtxt X = np.array(X, dtype) ValueError: setting an array element with a sequence. >>> loadtxt(StringIO("0 1 2 3"), dtype=[("a", uint8, 4)]) Traceback (most recent call last): File "", line 1, in File "/storage4/home/gerrit/.local/lib/python2.6/site-packages/numpy/lib/npyio.py", line 806, in loadtxt X = np.array(X, dtype) ValueError: setting an array element with a sequence. Why does this not work? I have filed a bug-report. http://projects.scipy.org/numpy/ticket/1936 Alright then, so I can try it in a different way. In my real case I have a 2-D array M with shape (5000, 839). I have my complicated dtype: [('temp', , 91), ('hum', , 91), ..., ('gpoint', , 1), ('ind', , 1)] ] whose numbers add up to 839. How do I turn this into an array of size (5000,) with my requested dtype? - .view(dtype) does not do what I mean, because this interprets the actual bytes, and my new array will have a different number of bytes compared to the old one - array(M, dtype) does not do what I mean, because this will try to expand every element of M according to the requested dtype, does making the array much larger (and throwing a MemoryError). I want this, because it's a very convenient way to access fields of my data. It's more convenient to say M["ciw"] than to say M[:, 455:546]. If someone can suggest another way to achieve this convenience, I'm open for suggestions. kind regards, Gerrit Holl. -- Gerrit Holl PhD student at Division of Space Technology, Lule? University of Technology, Kiruna, Sweden http://www.sat.ltu.se/members/gerrit/ From wccarithers at lbl.gov Tue Aug 16 13:18:03 2011 From: wccarithers at lbl.gov (Bill Carithers) Date: Tue, 16 Aug 2011 10:18:03 -0700 Subject: [SciPy-User] Trouble installing scipy after upgrading to Mac OS X 10.7 aka Lion In-Reply-To: References: Message-ID: <57C07E12-F6D4-45E8-8B7E-72960963E378@lbl.gov> Hi Ben, thanks for the info. I was happy just to get it to compile and I didn't run a full slate of tests. Cheers, Bill On Aug 16, 2011, at 5:06 AM, Ben Willmore wrote: > Hi Bill, > > As Ralf mentioned, the veclib_cabi_c.c (etc) problem you have has been fixed. But, even with new checkouts of scipy, I found that simply running 'python setup.py install' did not result in a version of scipy that worked correctly on Mac OSX 10.7 (for example, using scipy.fftpack, ifft(fft(signal)) != signal, problems with single precision vector arithmetic, failures and a crash in ARPACK tests). I was able to fix these by setting: > > export CC=gcc-4.2 > export CXX=g++-4.2 > export FFLAGS=-ff2c > > before running setup.py. Complete details at the links below. > > Ben > > > [1] http://article.gmane.org/gmane.comp.python.scientific.devel/15349 > [2] http://willmore.eu/blog/?p=5 > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From ciampagg at usi.ch Tue Aug 16 13:26:47 2011 From: ciampagg at usi.ch (Giovanni Luca Ciampaglia) Date: Tue, 16 Aug 2011 10:26:47 -0700 Subject: [SciPy-User] loadtxt and complicated dtype In-Reply-To: References: Message-ID: <4E4AA857.1070508@usi.ch> Il 16. 08. 
11 08:50, Gerrit Holl ha scritto: > In my real case I > have a 2-D array M with shape (5000, 839). I have my complicated > dtype: > > [('temp',, 91), > ('hum',, 91), > ..., > ('gpoint',, 1), > ('ind',, 1)] > ] > > whose numbers add up to 839. How do I turn this into an array of size > (5000,) with my requested dtype? > - .view(dtype) does not do what I mean, because this interprets the > actual bytes, and my new array will have a different number of bytes > compared to the old one > - array(M, dtype) does not do what I mean, because this will try to > expand every element of M according to the requested dtype, does > making the array much larger (and throwing a MemoryError). > > I want this, because it's a very convenient way to access fields of my > data. It's more convenient to say M["ciw"] than to say M[:, 455:546]. > If someone can suggest another way to achieve this convenience, I'm > open for suggestions. Hi Gerrit, you could use numpy.empty: # x has shape (5000, 184) ty = dtype([('temp', float64, 91), ('hum', float64, 91), ('gpoint', int32, 1), ('ind', int16, 1)]) data = empty((5000,), ty) # copy the individual columns data['temp'] = x[:,:91] data['hum'] = x[:,91:182] data['gpoint'] = x[:,182] data['ind'] = x[:,183] you can probably do the assignments in a for loop using the shape information from the individual fields cheers -- Giovanni Luca Ciampaglia Ph.D. Candidate Faculty of Informatics University of Lugano Web: http://www.inf.usi.ch/phd/ciampaglia/ Bertastra?e 36 ? 8003 Z?rich ? Switzerland From questions.anon at gmail.com Tue Aug 16 18:50:35 2011 From: questions.anon at gmail.com (questions anon) Date: Wed, 17 Aug 2011 08:50:35 +1000 Subject: [SciPy-User] numpy array append Message-ID: I would like to loop through a bunch of netcdf files in separate folders and select a particular time and then calculate the mean and plot this. I have been told to use append and make the selected times into a big array and then use numpy.mean but I can't seem to get the numpy array to work. The loop keeps calculating over the top of the last entry, if that makes sense? from netCDF4 import Dataset import matplotlib.pyplot as plt import numpy as N from mpl_toolkits.basemap import Basemap import os MainFolder=r"E:/temp_samples/" for (path, dirs, files) in os.walk(MainFolder): for dir in dirs: print dir for ncfile in files: if ncfile[-3:]=='.nc': ncfile=os.path.join(path,ncfile) ncfile=Dataset(ncfile, 'r+', 'NETCDF4') TSFC=ncfile.variables['T_SFC'][4::24] LAT=ncfile.variables['latitude'][:] LON=ncfile.variables['longitude'][:] TIME=ncfile.variables['time'][:] fillvalue=ncfile.variables['T_SFC']._FillValue ncfile.close() #calculate summary stats big_array=[] for i in TSFC: big_array.append(i) big_array=N.array(big_array) Mean=N.mean(big_array, axis=0) #plot output summary stats map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33, llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i') map.drawcoastlines() map.drawstates() x,y=map(*N.meshgrid(LON,LAT)) plt.title('Total Mean at 3pm') ticks=[-5,0,5,10,15,20,25,30,35,40,45,50] CS = map.contourf(x,y,Mean,ticks, cmap=plt.cm.jet) l,b,w,h =0.1,0.1,0.8,0.8 cax = plt.axes([l+w+0.025, b, 0.025, h]) plt.colorbar(CS,cax=cax, drawedges=True) plt.show() -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tsupinie at gmail.com Tue Aug 16 19:13:49 2011 From: tsupinie at gmail.com (Tim Supinie) Date: Tue, 16 Aug 2011 18:13:49 -0500 Subject: [SciPy-User] numpy array append In-Reply-To: References: Message-ID: Ah, TSFC is being overwritten every time you go through the "for ncfile in files" loop. So if you want to keep around all of them, you should declare a list before that loop called all_TSFC or something. Then instead of saying TSFC = ncfile... you would say all_TSFC.append(ncfile...) After that you could remove the second loop (for i in TSFC) and simply say # Convert all_TSFC to a numpy array. Not sure what happens # if some lists in all_TSFC are of different sizes, so you # should probably make sure they're all the same size. big_array = N.array(all_TSFC) # Take the mean of big_array along axis 0 (returns a # 1-dimensional numpy array the size of one of the lists in # all_TSFC). Mean = N.mean(big_array, axis=0) Hope that helps. Tim On Tue, Aug 16, 2011 at 5:50 PM, questions anon wrote: > I would like to loop through a bunch of netcdf files in separate folders > and select a particular time and then calculate the mean and plot this. I > have been told to use append and make the selected times into a big array > and then use numpy.mean but I can't seem to get the numpy array to work. The > loop keeps calculating over the top of the last entry, if that makes sense? > > from netCDF4 import Dataset > import matplotlib.pyplot as plt > import numpy as N > from mpl_toolkits.basemap import Basemap > import os > > MainFolder=r"E:/temp_samples/" > for (path, dirs, files) in os.walk(MainFolder): > for dir in dirs: > print dir > for ncfile in files: > if ncfile[-3:]=='.nc': > ncfile=os.path.join(path,ncfile) > ncfile=Dataset(ncfile, 'r+', 'NETCDF4') > TSFC=ncfile.variables['T_SFC'][4::24] > LAT=ncfile.variables['latitude'][:] > LON=ncfile.variables['longitude'][:] > TIME=ncfile.variables['time'][:] > fillvalue=ncfile.variables['T_SFC']._FillValue > ncfile.close() > > #calculate summary stats > big_array=[] > for i in TSFC: > big_array.append(i) > big_array=N.array(big_array) > Mean=N.mean(big_array, axis=0) > > #plot output summary stats > map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33, > > llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i') > map.drawcoastlines() > map.drawstates() > x,y=map(*N.meshgrid(LON,LAT)) > plt.title('Total Mean at 3pm') > ticks=[-5,0,5,10,15,20,25,30,35,40,45,50] > CS = map.contourf(x,y,Mean,ticks, cmap=plt.cm.jet) > l,b,w,h =0.1,0.1,0.8,0.8 > cax = plt.axes([l+w+0.025, b, 0.025, h]) > plt.colorbar(CS,cax=cax, drawedges=True) > plt.show() > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From questions.anon at gmail.com Tue Aug 16 20:42:59 2011 From: questions.anon at gmail.com (questions anon) Date: Wed, 17 Aug 2011 10:42:59 +1000 Subject: [SciPy-User] numpy array append In-Reply-To: References: Message-ID: Thanks Tim, that worked although I did run into a problem with different the sizes of each file. Each netcdf file contains a month of hourly data and some of those months are 31 days and some are 28 or 30. Is there a way to get this to work? 
here is the error I receive: *Traceback (most recent call last):* * File "d:\documents and settings\SLBurns\Work\My Dropbox\Python_code\calculate_the_mean_across_multiple_netcdf_files_in_multiple_folders.py", line 39, in * * Mean=N.mean(big_array, axis=0)* * File "C:\Python27\lib\site-packages\numpy\core\fromnumeric.py", line 2374, in mean* * return mean(axis, dtype, out)* *ValueError: operands could not be broadcast together with shapes (31,106,193) (28,106,193)* from netCDF4 import Dataset import numpy as N import os MainFolder=r"E:/temp_samples/" all_TSFC=[] for (path, dirs, files) in os.walk(MainFolder): for dir in dirs: print dir path=path+'/' for ncfile in files: if ncfile[-3:]=='.nc': ncfile=os.path.join(path,ncfile) ncfile=Dataset(ncfile, 'r+', 'NETCDF4') TSFC=ncfile.variables['T_SFC'][4::24,:,:] all_TSFC.append(TSFC) big_array=N.array(all_TSFC) Mean=N.mean(big_array, axis=0) print "the mean for three months at 3pm is", Mean On Wed, Aug 17, 2011 at 9:13 AM, Tim Supinie wrote: > Ah, TSFC is being overwritten every time you go through the "for ncfile in > files" loop. So if you want to keep around all of them, you should declare > a list before that loop called all_TSFC or something. Then instead of > saying > > TSFC = ncfile... > > you would say > > all_TSFC.append(ncfile...) > > After that you could remove the second loop (for i in TSFC) and simply say > > # Convert all_TSFC to a numpy array. Not sure what happens > # if some lists in all_TSFC are of different sizes, so you > # should probably make sure they're all the same size. > big_array = N.array(all_TSFC) > > # Take the mean of big_array along axis 0 (returns a > # 1-dimensional numpy array the size of one of the lists in > # all_TSFC). > > Mean = N.mean(big_array, axis=0) > > Hope that helps. > > Tim > > On Tue, Aug 16, 2011 at 5:50 PM, questions anon wrote: > >> I would like to loop through a bunch of netcdf files in separate folders >> and select a particular time and then calculate the mean and plot this. I >> have been told to use append and make the selected times into a big array >> and then use numpy.mean but I can't seem to get the numpy array to work. The >> loop keeps calculating over the top of the last entry, if that makes sense? 
>> >> from netCDF4 import Dataset >> import matplotlib.pyplot as plt >> import numpy as N >> from mpl_toolkits.basemap import Basemap >> import os >> >> MainFolder=r"E:/temp_samples/" >> for (path, dirs, files) in os.walk(MainFolder): >> for dir in dirs: >> print dir >> for ncfile in files: >> if ncfile[-3:]=='.nc': >> ncfile=os.path.join(path,ncfile) >> ncfile=Dataset(ncfile, 'r+', 'NETCDF4') >> TSFC=ncfile.variables['T_SFC'][4::24] >> LAT=ncfile.variables['latitude'][:] >> LON=ncfile.variables['longitude'][:] >> TIME=ncfile.variables['time'][:] >> fillvalue=ncfile.variables['T_SFC']._FillValue >> ncfile.close() >> >> #calculate summary stats >> big_array=[] >> for i in TSFC: >> big_array.append(i) >> big_array=N.array(big_array) >> Mean=N.mean(big_array, axis=0) >> >> #plot output summary stats >> map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33, >> >> llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i') >> map.drawcoastlines() >> map.drawstates() >> x,y=map(*N.meshgrid(LON,LAT)) >> plt.title('Total Mean at 3pm') >> ticks=[-5,0,5,10,15,20,25,30,35,40,45,50] >> CS = map.contourf(x,y,Mean,ticks, cmap=plt.cm.jet) >> l,b,w,h =0.1,0.1,0.8,0.8 >> cax = plt.axes([l+w+0.025, b, 0.025, h]) >> plt.colorbar(CS,cax=cax, drawedges=True) >> plt.show() >> >> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Aug 16 22:27:37 2011 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 16 Aug 2011 21:27:37 -0500 Subject: [SciPy-User] numpy array append In-Reply-To: References: Message-ID: On Tue, Aug 16, 2011 at 19:42, questions anon wrote: > Thanks Tim, that worked although I did run into a problem with different the > sizes of each file. > Each netcdf file contains a month of hourly data and some of those months > are 31 days and some are 28 or 30. Is there a way to get this to work? You probably want the following: big_array = N.concatenate(all_TSFC) Mean = big_array.mean(axis=0) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From questions.anon at gmail.com Tue Aug 16 23:49:09 2011 From: questions.anon at gmail.com (questions anon) Date: Wed, 17 Aug 2011 13:49:09 +1000 Subject: [SciPy-User] numpy array append In-Reply-To: References: Message-ID: Excellent, thank you, that worked!! On Wed, Aug 17, 2011 at 12:27 PM, Robert Kern wrote: > On Tue, Aug 16, 2011 at 19:42, questions anon > wrote: > > Thanks Tim, that worked although I did run into a problem with different > the > > sizes of each file. > > Each netcdf file contains a month of hourly data and some of those months > > are 31 days and some are 28 or 30. Is there a way to get this to work? > > You probably want the following: > > big_array = N.concatenate(all_TSFC) > Mean = big_array.mean(axis=0) > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." 
> -- Umberto Eco > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From questions.anon at gmail.com Wed Aug 17 02:17:26 2011 From: questions.anon at gmail.com (questions anon) Date: Wed, 17 Aug 2011 16:17:26 +1000 Subject: [SciPy-User] How to ignore NaN values and -32767 in numpy array Message-ID: I am trying to run simple stats on a bunch of monthly netcdfs files with hourly temperature data. With help from this list I am able to loop through a calculate the mean, but in doing this I have discovered that there are a some hours that have no values or -32767. I am sure there are some cases where I could slice out the section (if I know where they are) but is there a way I could just ignore these hours and calculate the mean? I have found something called "numpy.isnan" but this does not seem to work. from netCDF4 import Dataset import matplotlib.pyplot as plt import numpy as N from mpl_toolkits.basemap import Basemap import os MainFolder=r"E:/temp_samples/" all_TSFC=[] for (path, dirs, files) in os.walk(MainFolder): for dir in dirs: print dir path=path+'/' for ncfile in files: if ncfile[-3:]=='.nc': ncfile=os.path.join(path,ncfile) ncfile=Dataset(ncfile, 'r+', 'NETCDF4') TSFC=ncfile.variables['T_SFC'][:] LAT=ncfile.variables['latitude'][:] LON=ncfile.variables['longitude'][:] TIME=ncfile.variables['time'][:] fillvalue=ncfile.variables['T_SFC']._FillValue ncfile.close() #combine all TSFC to make one array for analyses all_TSFC.append(TSFC) big_array=N.concatenate(all_TSFC) Mean=big_array.mean(axis=0) print "the mean is", Mean #plot output summary stats map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33, llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i') x,y=map(*N.meshgrid(LON,LAT)) CS = map.contourf(x,y,Mean, cmap=plt.cm.jet) l,b,w,h =0.1,0.1,0.8,0.8 cax = plt.axes([l+w+0.025, b, 0.025, h]) plt.colorbar(CS,cax=cax, drawedges=True) plt.show() -------------- next part -------------- An HTML attachment was scrubbed... URL: From jrocher at enthought.com Wed Aug 17 04:11:00 2011 From: jrocher at enthought.com (Jonathan Rocher) Date: Wed, 17 Aug 2011 10:11:00 +0200 Subject: [SciPy-User] How to ignore NaN values and -32767 in numpy array In-Reply-To: References: Message-ID: Hi, you can create a mask cutting out all the values you don't want to consider in your mean and compute the mean of the "masked array". To illustrate the concept, look at: In [1]: a = array([1,2,3,NaN,5]) In [4]: isnan(a) Out[4]: array([False, False, False, True, False], dtype=bool) In [5]: ~isnan(a) Out[5]: array([ True, True, True, False, True], dtype=bool) In [11]: mask = (~isnan(a)) & (a != 3) In [12]: mask Out[12]: array([ True, True, False, False, True], dtype=bool) In [13]: a[mask] Out[13]: array([ 1., 2., 5.]) In [14]: a[mask].mean() Out[14]: 2.6666666666666665 In you code, you need to use something similar before you compute the mean. Hope this helps, Jonathan On Wed, Aug 17, 2011 at 8:17 AM, questions anon wrote: > I am trying to run simple stats on a bunch of monthly netcdfs files with > hourly temperature data. With help from this list I am able to loop through > a calculate the mean, but in doing this I have discovered that there are a > some hours that have no values or -32767. 
I am sure there are some cases > where I could slice out the section (if I know where they are) but is there > a way I could just ignore these hours and calculate the mean? > I have found something called "numpy.isnan" but this does not seem to work. > > from netCDF4 import Dataset > import matplotlib.pyplot as plt > import numpy as N > from mpl_toolkits.basemap import Basemap > import os > > MainFolder=r"E:/temp_samples/" > > all_TSFC=[] > for (path, dirs, files) in os.walk(MainFolder): > for dir in dirs: > print dir > path=path+'/' > for ncfile in files: > if ncfile[-3:]=='.nc': > ncfile=os.path.join(path,ncfile) > ncfile=Dataset(ncfile, 'r+', 'NETCDF4') > TSFC=ncfile.variables['T_SFC'][:] > LAT=ncfile.variables['latitude'][:] > LON=ncfile.variables['longitude'][:] > TIME=ncfile.variables['time'][:] > fillvalue=ncfile.variables['T_SFC']._FillValue > ncfile.close() > > #combine all TSFC to make one array for analyses > all_TSFC.append(TSFC) > > big_array=N.concatenate(all_TSFC) > Mean=big_array.mean(axis=0) > print "the mean is", Mean > > #plot output summary stats > map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33, > > llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i') > x,y=map(*N.meshgrid(LON,LAT)) > CS = map.contourf(x,y,Mean, cmap=plt.cm.jet) > l,b,w,h =0.1,0.1,0.8,0.8 > cax = plt.axes([l+w+0.025, b, 0.025, h]) > plt.colorbar(CS,cax=cax, drawedges=True) > > plt.show() > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -- Jonathan Rocher, PhD Scientific software developer Enthought, Inc. jrocher at enthought.com 1-512-536-1057 http://www.enthought.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From klonuo at gmail.com Wed Aug 17 10:28:11 2011 From: klonuo at gmail.com (Klonuo Umom) Date: Wed, 17 Aug 2011 16:28:11 +0200 Subject: [SciPy-User] Building SciPy on Debian with Intel compilers In-Reply-To: References: Message-ID: I got new Ubuntu 11.04 PC, and actually I accidentally found older blog post which helped me pass this old sparsetools problems. 
Blog is here: http://marklodato.github.com/ In brief, solution was to first run custom: `python setup.py config` and then link sparsetools with icpc by hand: ======================================================================== for x in csr csc coo bsr dia; do icpc -xHost -O3 -fPIC -shared \ build/temp.linux-x86_64-2.6/scipy/sparse/sparsetools/${x}_wrap.o \ -o build/lib.linux-x86_64-2.6/scipy/sparse/sparsetools/_${x}.so done icpc -xHost -O3 -fPIC -openmp -shared \ build/temp.linux-x86_64-2.6/scipy/interpolate/src/_interpolate.o \ -o build/lib.linux-x86_64-2.6/scipy/interpolate/_interpolate.so ------------------------------------------------------------------------ Paths are dependant on Intel tools and Scipy version, and it's trivial to correct them So I build Numpy and Scipy with latest Intel Parallel Studio XE 2011 update 2 + SparseSuite (AMD and UMFPACK), but then running test I got this: ======================================================================== *** libmkl_p4p.so *** failed with error : /opt/intel/composerxe-2011.4.191/mkl/lib/ia32/libmkl_p4p.so: undefined symbol: i_malloc *** libmkl_def.so *** failed with error : /opt/intel/composerxe-2011.4.191/mkl/lib/ia32/libmkl_def.so: undefined symbol: i_malloc MKL FATAL ERROR: Cannot load neither libmkl_p4p.so nor libmkl_def.so ------------------------------------------------------------------------ Workaround is this: ======================================================================== export LD_PRELOAD=/opt/intel/mkl/lib/ia32/libmkl_core.so:/opt/intel/mkl/lib/ia32/libmkl_sequential.so ------------------------------------------------------------------------ Now I run tests again Numpy: FAILED (KNOWNFAIL=3, SKIP=4, failures=4) more info: http://pastebin.com/raw.php?i=m3sns5xU Scipy: FAILED (KNOWNFAIL=12, SKIP=35, errors=1, failures=3) more info: http://pastebin.com/raw.php?i=tvqg8PJ1 I wish someone reply about this 'export LD_PRELOAD' workaround, and also maybe correct online documentation about building Numpy/Scipy with Intel compilers - at least with earlier David's corrections in this thread about 'scipy/spatial/qhull/src/qhull_a.h', as if user does not understand what are C++ templates, he/she could hardly figure what to do. About sparsetools I'm happy I got it working, and this issue is open as of Numpy 1.3.0 at least it seems Cheers From gorkypl at gmail.com Wed Aug 17 17:13:52 2011 From: gorkypl at gmail.com (=?UTF-8?B?UGF3ZcWC?=) Date: Wed, 17 Aug 2011 23:13:52 +0200 Subject: [SciPy-User] My attempt to fix an issue with separate scales for left and right axis in scikits.timeseries - correct? Message-ID: Hello, I've tried to solve an issue with scikits.timeseries doesn't allowing to use separate scales for left and right axis with recent versions of matplotlib, like in this example: http://pytseries.sourceforge.net/lib.plotting.examples.html#separate-scales-for-left-and-right-axis The issue was raised twice: http://mail.scipy.org/pipermail/scipy-user/2011-April/029046.html http://permalink.gmane.org/gmane.comp.python.scientific.devel/14645 The example (and my code) works after changing single line 1196: - fsp_alt_args = (fsp._rows, fsp._cols, fsp._num + 1) + fsp_alt_args = fsp.get_geometry() I've done this after examining the file https://github.com/matplotlib/matplotlib/blob/master/lib/matplotlib/axes.py (method get_geometry in line 8369). Can anyone take a look at the code and say if it makes sense? I'm certainly not an expert in Python and I'm not sure if this can be so simple and yet correct. 
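[For anyone checking the suggestion: on matplotlib of this vintage, get_geometry() on a subplot axes returns the same 1-based (rows, cols, num) triple that add_subplot() accepts, which is why it can stand in for the private _rows/_cols/_num attributes. A quick throwaway sketch:]

import matplotlib.pyplot as plt

fig = plt.figure()
fsp = fig.add_subplot(2, 3, 4)               # 2 rows, 3 cols, 1-based slot 4
print(fsp.get_geometry())                    # -> (2, 3, 4)
alt = fig.add_subplot(*fsp.get_geometry())   # same grid position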
Side note: I know scikits.timeseries may be abandoned for a while now (I've traced the recent discussion of its status), but I use it heavily in climate analysis and need to keep my code alive for some time. greetings, Pawe? Rumian From rmorgan466 at gmail.com Thu Aug 18 06:20:51 2011 From: rmorgan466 at gmail.com (Rita) Date: Thu, 18 Aug 2011 06:20:51 -0400 Subject: [SciPy-User] scipy.stats Message-ID: I am trying to import scipy.stats but I keep getting an import Error, ...scipy/special/_cephes.so: undefined symbol: __libm_sse2_sincos I compiled Numpy with Intel C compiler and Scipy compiled ok but just cant get this working. Any advise? -- --- Get your facts first, then you can distort them as you please.-- -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris at simplistix.co.uk Thu Aug 18 10:26:09 2011 From: chris at simplistix.co.uk (Chris Withers) Date: Thu, 18 Aug 2011 07:26:09 -0700 Subject: [SciPy-User] Enthought Python Distribution questions Message-ID: <4E4D2101.7060505@simplistix.co.uk> Hi All, A couple of questions about EPD, if this is the wrong list, please point me at the right one: - How can I install EPD in such a way that it leaves my system python completely alone? I installed it on my Mac and suddenly I have Python 2.7 with all the libraries everywhere, which isn't what I want :-S I'm now too petrified to try and install EPD on any of my Debian, Red Hat or Ubuntu servers in case the same thing is done there, which would have much more catastrophic consequences. I'm looking for something akin to 'make altinstall' for CPython, I'd love to be able to get a python-epd-x.y.z in the same way that gives me just a pythonx.y. - Once I have EPD installed, where can I find all the documentation for the included packages? I spend a lot of time working on trains and planes, and having the docs available offline would be extremely useful :-) cheers, Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk From jrocher at enthought.com Thu Aug 18 10:45:14 2011 From: jrocher at enthought.com (Jonathan Rocher) Date: Thu, 18 Aug 2011 16:45:14 +0200 Subject: [SciPy-User] Enthought Python Distribution questions In-Reply-To: <4E4D2101.7060505@simplistix.co.uk> References: <4E4D2101.7060505@simplistix.co.uk> Message-ID: Hi Chris, Correct, this isn't the appropriate mailing list. To request information about EPD, you should contact info at enthought.com or epd-support at enthought.com once you are a subsciber. For your information, 1. EPD install its own python executable and libraries and doesn't interfere with any existing installed instances of python. The PATH environment variable allows you to select which one will be launched. 2. EPD comes with a large number of code samples/examples for most of the packages included, in particular the Enthought Tool Suite. The documentation is not included though, to save download time. It also comes with a DocLinks folder with links to each package home page for you to download the appropriate material. Best regards, Jonathan Rocher On Thu, Aug 18, 2011 at 4:26 PM, Chris Withers wrote: > Hi All, > > A couple of questions about EPD, if this is the wrong list, please point > me at the right one: > > - How can I install EPD in such a way that it leaves my system python > completely alone? 
I installed it on my Mac and suddenly I have Python > 2.7 with all the libraries everywhere, which isn't what I want :-S > I'm now too petrified to try and install EPD on any of my Debian, Red > Hat or Ubuntu servers in case the same thing is done there, which would > have much more catastrophic consequences. > > I'm looking for something akin to 'make altinstall' for CPython, I'd > love to be able to get a python-epd-x.y.z in the same way that gives me > just a pythonx.y. > > - Once I have EPD installed, where can I find all the documentation for > the included packages? I spend a lot of time working on trains and > planes, and having the docs available offline would be extremely useful :-) > > cheers, > > Chris > > -- > Simplistix - Content Management, Batch Processing & Python Consulting > - http://www.simplistix.co.uk > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Jonathan Rocher, PhD Scientific software developer Enthought, Inc. jrocher at enthought.com 1-512-536-1057 http://www.enthought.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From lists at hilboll.de Thu Aug 18 10:45:27 2011 From: lists at hilboll.de (Andreas) Date: Thu, 18 Aug 2011 16:45:27 +0200 Subject: [SciPy-User] Enthought Python Distribution questions In-Reply-To: <4E4D2101.7060505@simplistix.co.uk> References: <4E4D2101.7060505@simplistix.co.uk> Message-ID: <4E4D2587.3040403@hilboll.de> Hi Chris, I installed EPD 7.1 in my home directory, in ~/lib/epd-7.1-1-x86_64/. Then I installed virtualenv and virtualenvwrapper in the EPD directory, by using $ ~/lib/epd-7.1-1-x86_64/bin/pip install virtualenv $ ~/lib/epd-7.1-1-x86_64/bin/pip install virtualenvwrapper Now, I can just do something like $ source ~/lib/epd-7.1-1-x86_64/bin/virtualenvwrapper.sh $ mkvirtualenv myepd In the virtualenv, I need to make sure that PATH and PYTHONPATH are set correctly. For this, I create a postactivate script: $ cat ~/.virtualenvs/myepd/bin/postactivate #!/bin/bash # This hook is run after this virtualenv is activated. export PATH=~/.virtualenvs/myepd/bin:~/lib/epd-7.1-1-x86_64/bin:$PATH export PYTHONPATH=~/lib/epd-7.1-1-x86_64/lib/python2.7/site-packages export LD_LIBRARY_PATH=~/lib/epd-7.1-1-x86_64/lib Now, you can just switch to your EPD environment using the ``workon`` command: $ python -V && which python Python 2.6.5 /usr/bin/python $ source ~/lib/epd-7.1-1-x86_64/bin/virtualenvwrapper.sh $ workon myepd (myepd)$ python -V && which python Python 2.7.2 -- CUSTOM /home/USERNAME/.virtualenvs/myepd/bin/python Hope this helps! Cheers, Andreas. On 2011-08-18 16:26, Chris Withers wrote: > Hi All, > > A couple of questions about EPD, if this is the wrong list, please point > me at the right one: > > - How can I install EPD in such a way that it leaves my system python > completely alone? I installed it on my Mac and suddenly I have Python > 2.7 with all the libraries everywhere, which isn't what I want :-S > I'm now too petrified to try and install EPD on any of my Debian, Red > Hat or Ubuntu servers in case the same thing is done there, which would > have much more catastrophic consequences. > > I'm looking for something akin to 'make altinstall' for CPython, I'd > love to be able to get a python-epd-x.y.z in the same way that gives me > just a pythonx.y. > > - Once I have EPD installed, where can I find all the documentation for > the included packages? 
I spend a lot of time working on trains and > planes, and having the docs available offline would be extremely useful :-) > > cheers, > > Chris > From kwatford at gmail.com Thu Aug 18 10:45:26 2011 From: kwatford at gmail.com (Ken Watford) Date: Thu, 18 Aug 2011 10:45:26 -0400 Subject: [SciPy-User] Enthought Python Distribution questions In-Reply-To: <4E4D2101.7060505@simplistix.co.uk> References: <4E4D2101.7060505@simplistix.co.uk> Message-ID: On Thu, Aug 18, 2011 at 10:26 AM, Chris Withers wrote: > - How can I install EPD in such a way that it leaves my system python > completely alone? I installed it on my Mac and suddenly I have Python > 2.7 with all the libraries everywhere, which isn't what I want :-S > I'm now too petrified to try and install EPD on any of my Debian, Red > Hat or Ubuntu servers in case the same thing is done there, which would > have much more catastrophic consequences. The situation is better on Linux - the installer asks where you want it, and it only touches that directory. You can install it as a normal user in a private directory. You can then add its bin directory to your path or not. From chris at simplistix.co.uk Thu Aug 18 10:51:57 2011 From: chris at simplistix.co.uk (Chris Withers) Date: Thu, 18 Aug 2011 07:51:57 -0700 Subject: [SciPy-User] Enthought Python Distribution questions In-Reply-To: References: <4E4D2101.7060505@simplistix.co.uk> Message-ID: <4E4D270D.9090100@simplistix.co.uk> On 18/08/2011 07:45, Jonathan Rocher wrote: > Correct, this isn't the appropriate mailing list. To request information > about EPD, you should contact info at enthought.com > or epd-support at enthought.com > once you are a subsciber. I am a subscriber, so I just forwarded these questions there too :-) I have EPD 7.0-2 (32-bit) installed. > 1. EPD install its own python executable and libraries and doesn't > interfere with any existing installed instances of python. The PATH > environment variable allows you to select which one will be launched. Please can you verify that this is the case on MacOS X? > 2. EPD comes with a large number of code samples/examples for most of > the packages included, in particular the Enthought Tool Suite. Where would I find these on MacOS X? > The > documentation is not included though, to save download time. Where can I find a bulk download of all the docs for offline use? > It also > comes with a DocLinks folder with links to each package home page for > you to download the appropriate material. Again on Mac OS X, where would I find this DocLinks folder? cheers, Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk From aronne.merrelli at gmail.com Thu Aug 18 12:57:37 2011 From: aronne.merrelli at gmail.com (Aronne Merrelli) Date: Thu, 18 Aug 2011 11:57:37 -0500 Subject: [SciPy-User] Enthought Python Distribution questions In-Reply-To: <4E4D270D.9090100@simplistix.co.uk> References: <4E4D2101.7060505@simplistix.co.uk> <4E4D270D.9090100@simplistix.co.uk> Message-ID: On Thu, Aug 18, 2011 at 9:51 AM, Chris Withers wrote: > On 18/08/2011 07:45, Jonathan Rocher wrote: > > Correct, this isn't the appropriate mailing list. To request information > > about EPD, you should contact info at enthought.com > > or epd-support at enthought.com > > once you are a subsciber. > > I am a subscriber, so I just forwarded these questions there too :-) > I have EPD 7.0-2 (32-bit) installed. > > > 1. 
EPD install its own python executable and libraries and doesn't > > interfere with any existing installed instances of python. The PATH > > environment variable allows you to select which one will be launched. > > Please can you verify that this is the case on MacOS X? > > > 2. EPD comes with a large number of code samples/examples for most of > > the packages included, in particular the Enthought Tool Suite. > > Where would I find these on MacOS X? > > > The > > documentation is not included though, to save download time. > > Where can I find a bulk download of all the docs for offline use? > > > It also > > comes with a DocLinks folder with links to each package home page for > > you to download the appropriate material. > > Again on Mac OS X, where would I find this DocLinks folder? > > On my Mac (OS X 10.6.8), the EPD stuff is installed here: (there are DocLinks and Example subdirectories): /Library/Frameworks/Python.framework/Versions/Current/ It looks like the "standard" python installations that come with MacOS are here: /System/Library/Frameworks/Python.framework/Versions/ I also have several python versions installed by macports into /opt/local/. I have not yet had any problems with different installations conflicting with each other, they each seem to have their own files (so I'm wasting a lot of disk space). My path is set to the EPD version which I almost always use, but it is easy to change if needed; right now the few times I need another version I just run it directly. (e.g. type /usr/bin/python) Aronne -------------- next part -------------- An HTML attachment was scrubbed... URL: From dasneutron at gmail.com Thu Aug 18 14:21:30 2011 From: dasneutron at gmail.com (Piotr Zolnierczuk) Date: Thu, 18 Aug 2011 14:21:30 -0400 Subject: [SciPy-User] rotation matrices, Euler angles and all that jazz Message-ID: Hi, The question has probably been asked here before ... Is there a scipy/numpy module which facilitates computation of rotation matrices, Euler angles, etc.? They are ever present in many branches of physics and re-inventing them again seems like a waste of time. I've been using a module written by Chrisoph Gohlke (UCI) http://www.lfd.uci.edu/~gohlke/code/transformations.py.html, but it would be nice if I could use something that is already included in SciPy. Piotr From cgohlke at uci.edu Thu Aug 18 15:05:43 2011 From: cgohlke at uci.edu (Christoph Gohlke) Date: Thu, 18 Aug 2011 12:05:43 -0700 Subject: [SciPy-User] rotation matrices, Euler angles and all that jazz In-Reply-To: References: Message-ID: <4E4D6287.6010101@uci.edu> On 8/18/2011 11:21 AM, Piotr Zolnierczuk wrote: > Hi, > The question has probably been asked here before ... > > Is there a scipy/numpy module which facilitates computation of > rotation matrices, Euler angles, etc.? > They are ever present in many branches of physics and re-inventing > them again seems like a waste of time. > > I've been using a module written by Chrisoph Gohlke (UCI) > http://www.lfd.uci.edu/~gohlke/code/transformations.py.html, but it > would be nice if I could use something that is already included in > SciPy. > > Piotr A quaternion dtype will probably make into the next version of numpy . That will be able to replace ~1/3 of the transformations.py module. The transformations.py module was shortly discussed on the numpy list in 2009 . I had an off list discussion on how to integrate some of the functions in numpy/scipy. 
Consent was that if such a module makes it into numpy/scipy it should : 1) support any float dtypes 2) support 2D, 3D, and 3D homogeneous coordinates. 3) support both "column vectors on the right" and "row vectors on the left" conventions 4) Christoph From dperlman at wisc.edu Thu Aug 18 23:16:28 2011 From: dperlman at wisc.edu (David Perlman) Date: Thu, 18 Aug 2011 22:16:28 -0500 Subject: [SciPy-User] interpolate.interp1d: can't figure out what I'm doing wrong Message-ID: <3350F2F7-B71F-4055-8F3D-7B80EEA8B1D8@wisc.edu> I am absolutely sure that my x_new range doesn't go outside my original x, and yet it is giving me an error saying that it is: old time range: [0.0, 200.00000298023224, 400.00000596046448, 600.00000894069672] new time range: [0, 100.0, 200.0, 300.0, 400.0, 500.0, 600.0] Traceback (most recent call last): File "/home/perlman/bin/pretty_fmri.py", line 840, in main() File "/home/perlman/bin/pretty_fmri.py", line 130, in main dataProcessor.interpolate() File "/home/perlman/bin/pretty_fmri.py", line 249, in interpolate self.interpolateddata=f(xnew) File "/usr/local/Python/Versions/2.6.5/lib/python2.6/site-packages/scipy/interpolate/interpolate.py", line 333, in __call__ out_of_bounds = self._check_bounds(x_new) File "/usr/local/Python/Versions/2.6.5/lib/python2.6/site-packages/scipy/interpolate/interpolate.py", line 391, in _check_bounds raise ValueError("A value in x_new is above the interpolation " ValueError: A value in x_new is above the interpolation range. Here is the snippet of code where this is going wrong: oldNum=numpy.shape(self.data)[0] endTime=(oldNum-1)*oldTR x=numpy.linspace(0, endTime, oldNum) if self.opts.verbose: print "old time range:", list(x) f=scipy.interpolate.interp1d(x, self.data, self.opts.interp, 0) # make the new time points xnew=self.crange(0, endTime, newTR) if self.opts.verbose: print "new time range:", list(xnew) self.interpolateddata=f(xnew) You can see from that, that there is no code between the displayed ranges and the calling of the interpolator. So I am at a loss for how to figure out what's going on here! Any help would be greatly appreciated. I have been looking into this for a while, even to the point of looking at the source code for the interp1d function. :-/ -- -dave---------------------------------------------------------------- "Let us work without theorizing... 'tis the only way to make life endurable." - Voltaire, Candide, Chapter 30 From dperlman at wisc.edu Thu Aug 18 23:32:08 2011 From: dperlman at wisc.edu (David Perlman) Date: Thu, 18 Aug 2011 22:32:08 -0500 Subject: [SciPy-User] interpolate.interp1d: can't figure out what I'm doing wrong In-Reply-To: <3350F2F7-B71F-4055-8F3D-7B80EEA8B1D8@wisc.edu> References: <3350F2F7-B71F-4055-8F3D-7B80EEA8B1D8@wisc.edu> Message-ID: Well at least I got it to give me a different error message. I thought it might not like the generator instead of list, so I converted to list first. 
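[For reference, a self-contained sketch of the interp1d call pattern under discussion: 2-D data interpolated along axis 0, with toy shapes and invented values. With the default bounds_error=True, any xnew value even marginally outside [x[0], x[-1]] raises the "above the interpolation range" ValueError; passing bounds_error=False together with a fill_value turns that into filling instead.]

import numpy as np
from scipy import interpolate

x = np.linspace(0.0, 600.0, 4)       # 4 original time points
data = np.random.rand(4, 5)          # (time, series) toy data
f = interpolate.interp1d(x, data, kind='linear', axis=0)

xnew = np.linspace(0.0, 600.0, 7)    # stays inside [x[0], x[-1]]
print(f(xnew).shape)                 # (7, 5)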
Now I get this error: old time range: [0.0, 200.00000298023224, 400.00000596046448, 600.00000894069672] new time range: [0, 100.0, 200.0, 300.0, 400.0, 500.0, 600.0] Traceback (most recent call last): File "/home/perlman/bin/pretty_fmri.py", line 840, in main() File "/home/perlman/bin/pretty_fmri.py", line 130, in main dataProcessor.interpolate() File "/home/perlman/bin/pretty_fmri.py", line 249, in interpolate self.interpolateddata=f(list(xnew)) File "/usr/local/Python/Versions/2.6.5/lib/python2.6/site-packages/scipy/interpolate/interpolate.py", line 362, in __call__ y_new[out_of_bounds] = self.fill_value IndexError: invalid index On Aug 18, 2011, at 10:16 PM, David Perlman wrote: > I am absolutely sure that my x_new range doesn't go outside my original x, and yet it is giving me an error saying that it is: > old time range: [0.0, 200.00000298023224, 400.00000596046448, 600.00000894069672] > new time range: [0, 100.0, 200.0, 300.0, 400.0, 500.0, 600.0] > Traceback (most recent call last): > File "/home/perlman/bin/pretty_fmri.py", line 840, in > main() > File "/home/perlman/bin/pretty_fmri.py", line 130, in main > dataProcessor.interpolate() > File "/home/perlman/bin/pretty_fmri.py", line 249, in interpolate > self.interpolateddata=f(xnew) > File "/usr/local/Python/Versions/2.6.5/lib/python2.6/site-packages/scipy/interpolate/interpolate.py", line 333, in __call__ > out_of_bounds = self._check_bounds(x_new) > File "/usr/local/Python/Versions/2.6.5/lib/python2.6/site-packages/scipy/interpolate/interpolate.py", line 391, in _check_bounds > raise ValueError("A value in x_new is above the interpolation " > ValueError: A value in x_new is above the interpolation range. > > > Here is the snippet of code where this is going wrong: > oldNum=numpy.shape(self.data)[0] > endTime=(oldNum-1)*oldTR > x=numpy.linspace(0, endTime, oldNum) > if self.opts.verbose: print "old time range:", list(x) > f=scipy.interpolate.interp1d(x, self.data, self.opts.interp, 0) > # make the new time points > xnew=self.crange(0, endTime, newTR) > if self.opts.verbose: print "new time range:", list(xnew) > self.interpolateddata=f(xnew) > > > You can see from that, that there is no code between the displayed ranges and the calling of the interpolator. So I am at a loss for how to figure out what's going on here! > > Any help would be greatly appreciated. I have been looking into this for a while, even to the point of looking at the source code for the interp1d function. :-/ > > -- > -dave---------------------------------------------------------------- > "Let us work without theorizing... 'tis the only way to make life endurable." > - Voltaire, Candide, Chapter 30 > -- -dave---------------------------------------------------------------- "Let us work without theorizing... 'tis the only way to make life endurable." - Voltaire, Candide, Chapter 30 From questions.anon at gmail.com Fri Aug 19 01:01:57 2011 From: questions.anon at gmail.com (questions anon) Date: Fri, 19 Aug 2011 15:01:57 +1000 Subject: [SciPy-User] How to ignore NaN values and -32767 in numpy array In-Reply-To: References: Message-ID: Thank you, what you suggested worked but now I don't think that is my problem. 
Within the dataset I am trying to calculate the mean from it appears there are some hours with no data, the output is: [[[-- -- -- ..., -- -- --] [-- -- -- ..., -- -- --] [-- -- -- ..., -- -- --] ..., [-- -- -- ..., -- -- --] [-- -- -- ..., -- -- --] [-- -- -- ..., -- -- --]]] So I would assume these would be ignored when I calculate the mean but when I make all my files/times into one big array these blanks turn into -32767. Is there some way to avoid this? Thanks On Wed, Aug 17, 2011 at 4:17 PM, questions anon wrote: > I am trying to run simple stats on a bunch of monthly netcdfs files with > hourly temperature data. With help from this list I am able to loop through > a calculate the mean, but in doing this I have discovered that there are a > some hours that have no values or -32767. I am sure there are some cases > where I could slice out the section (if I know where they are) but is there > a way I could just ignore these hours and calculate the mean? > I have found something called "numpy.isnan" but this does not seem to work. > > from netCDF4 import Dataset > import matplotlib.pyplot as plt > import numpy as N > from mpl_toolkits.basemap import Basemap > import os > > MainFolder=r"E:/temp_samples/" > > all_TSFC=[] > for (path, dirs, files) in os.walk(MainFolder): > for dir in dirs: > print dir > path=path+'/' > for ncfile in files: > if ncfile[-3:]=='.nc': > ncfile=os.path.join(path,ncfile) > ncfile=Dataset(ncfile, 'r+', 'NETCDF4') > TSFC=ncfile.variables['T_SFC'][:] > LAT=ncfile.variables['latitude'][:] > LON=ncfile.variables['longitude'][:] > TIME=ncfile.variables['time'][:] > fillvalue=ncfile.variables['T_SFC']._FillValue > ncfile.close() > > #combine all TSFC to make one array for analyses > all_TSFC.append(TSFC) > > big_array=N.concatenate(all_TSFC) > Mean=big_array.mean(axis=0) > print "the mean is", Mean > > #plot output summary stats > map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33, > > llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i') > x,y=map(*N.meshgrid(LON,LAT)) > CS = map.contourf(x,y,Mean, cmap=plt.cm.jet) > l,b,w,h =0.1,0.1,0.8,0.8 > cax = plt.axes([l+w+0.025, b, 0.025, h]) > plt.colorbar(CS,cax=cax, drawedges=True) > > plt.show() > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Fri Aug 19 04:47:20 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Fri, 19 Aug 2011 10:47:20 +0200 Subject: [SciPy-User] interpolate.interp1d: can't figure out what I'm doing wrong In-Reply-To: References: <3350F2F7-B71F-4055-8F3D-7B80EEA8B1D8@wisc.edu> Message-ID: On Fri, Aug 19, 2011 at 5:32 AM, David Perlman wrote: > Well at least I got it to give me a different error message. I thought it > might not like the generator instead of list, so I converted to list first. 
> Now I get this error: > > old time range: [0.0, 200.00000298023224, 400.00000596046448, > 600.00000894069672] > new time range: [0, 100.0, 200.0, 300.0, 400.0, 500.0, 600.0] > Traceback (most recent call last): > File "/home/perlman/bin/pretty_fmri.py", line 840, in > main() > File "/home/perlman/bin/pretty_fmri.py", line 130, in main > dataProcessor.interpolate() > File "/home/perlman/bin/pretty_fmri.py", line 249, in interpolate > self.interpolateddata=f(list(xnew)) > File > "/usr/local/Python/Versions/2.6.5/lib/python2.6/site-packages/scipy/interpolate/interpolate.py", > line 362, in __call__ > y_new[out_of_bounds] = self.fill_value > IndexError: invalid index > > > > On Aug 18, 2011, at 10:16 PM, David Perlman wrote: > > > I am absolutely sure that my x_new range doesn't go outside my original > x, and yet it is giving me an error saying that it is: > > old time range: [0.0, 200.00000298023224, 400.00000596046448, > 600.00000894069672] > > new time range: [0, 100.0, 200.0, 300.0, 400.0, 500.0, 600.0] > > Traceback (most recent call last): > > File "/home/perlman/bin/pretty_fmri.py", line 840, in > > main() > > File "/home/perlman/bin/pretty_fmri.py", line 130, in main > > dataProcessor.interpolate() > > File "/home/perlman/bin/pretty_fmri.py", line 249, in interpolate > > self.interpolateddata=f(xnew) > > File > "/usr/local/Python/Versions/2.6.5/lib/python2.6/site-packages/scipy/interpolate/interpolate.py", > line 333, in __call__ > > out_of_bounds = self._check_bounds(x_new) > > File > "/usr/local/Python/Versions/2.6.5/lib/python2.6/site-packages/scipy/interpolate/interpolate.py", > line 391, in _check_bounds > > raise ValueError("A value in x_new is above the interpolation " > > ValueError: A value in x_new is above the interpolation range. > > > > > > Here is the snippet of code where this is going wrong: > > oldNum=numpy.shape(self.data)[0] > > endTime=(oldNum-1)*oldTR > > x=numpy.linspace(0, endTime, oldNum) > > if self.opts.verbose: print "old time range:", list(x) > > f=scipy.interpolate.interp1d(x, self.data, self.opts.interp, 0) > > # make the new time points > > xnew=self.crange(0, endTime, newTR) > > if self.opts.verbose: print "new time range:", list(xnew) > > self.interpolateddata=f(xnew) > > > Can you create a self-contained example that illustrates the problem? Ralf > > > You can see from that, that there is no code between the displayed ranges > and the calling of the interpolator. So I am at a loss for how to figure > out what's going on here! > > > > Any help would be greatly appreciated. I have been looking into this for > a while, even to the point of looking at the source code for the interp1d > function. :-/ > > > > -- > > -dave---------------------------------------------------------------- > > "Let us work without theorizing... 'tis the only way to make life > endurable." > > - Voltaire, Candide, Chapter 30 > > > > -- > -dave---------------------------------------------------------------- > "Let us work without theorizing... 'tis the only way to make life > endurable." > - Voltaire, Candide, Chapter 30 > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dave.hirschfeld at gmail.com Fri Aug 19 06:35:19 2011 From: dave.hirschfeld at gmail.com (Dave Hirschfeld) Date: Fri, 19 Aug 2011 10:35:19 +0000 (UTC) Subject: [SciPy-User] My attempt to fix an issue with separate scales for left and right axis in scikits.timeseries - correct? References: Message-ID: Pawe? gmail.com> writes: > > The example (and my code) works after changing single line 1196: > - fsp_alt_args = (fsp._rows, fsp._cols, fsp._num + 1) > + fsp_alt_args = fsp.get_geometry() > > > Can anyone take a look at the code and say if it makes sense? I'm > certainly not an expert in Python and I'm not sure if this can be so > simple and yet correct. > > greetings, > Pawe? Rumian FWIW I can confirm that the fix works for me - thanks! Unfortuantely I'm not an expert in the internals of either matplotlib or scikits.timeseries so I don't feel qualified to say whether it's the right fix :/ I'm running 32bit Python 2.6.6 (r266:84297, Aug 24 2010, 18:46:32) [MSC v.1500 32 bit (Intel)] on a Win7 x64 box. -Dave From gorkypl at gmail.com Fri Aug 19 08:13:04 2011 From: gorkypl at gmail.com (=?UTF-8?B?UGF3ZcWC?=) Date: Fri, 19 Aug 2011 14:13:04 +0200 Subject: [SciPy-User] My attempt to fix an issue with separate scales for left and right axis in scikits.timeseries - correct? In-Reply-To: References: Message-ID: 2011/8/19 Dave Hirschfeld : > Pawe? gmail.com> writes: >> >> The example (and my code) works after changing single line 1196: >> - ? ?fsp_alt_args = (fsp._rows, fsp._cols, fsp._num + 1) >> + ? ?fsp_alt_args = fsp.get_geometry() > > FWIW I can confirm that the fix works for me - thanks! Thanks for confirming :) I've just noticed I havent stated it clearly - the change has to be done in scikits/timeseries/lib/plotlib.py of course. greetings, Pawe? Rumian From chris at simplistix.co.uk Fri Aug 19 10:48:26 2011 From: chris at simplistix.co.uk (Chris Withers) Date: Fri, 19 Aug 2011 07:48:26 -0700 Subject: [SciPy-User] Enthought Python Distribution questions In-Reply-To: References: <4E4D2101.7060505@simplistix.co.uk> <4E4D270D.9090100@simplistix.co.uk> Message-ID: <4E4E77BA.1000508@simplistix.co.uk> On 18/08/2011 09:57, Aronne Merrelli wrote: > > Again on Mac OS X, where would I find this DocLinks folder? > > > On my Mac (OS X 10.6.8), the EPD stuff is installed here: (there are > DocLinks and Example subdirectories): > > /Library/Frameworks/Python.framework/Versions/Current/ This is a symlink, which I really don't want EPD to touch. I guess I'm OK with using 7.0 in Python.framework, on the assumption that Python will never make it to 7.0, but really, it should be in EPD.framework, no? > It looks like the "standard" python installations that come with MacOS > are here: > > /System/Library/Frameworks/Python.framework/Versions/ Yes, but what about installs of "normal python"? Why has EPD stomped on my /Current without even asking me?! > own files (so I'm wasting a lot of disk space). My path is set to the > EPD version Which path? Aside from stomping on the /Current symlink, I'm curious about why EPD now appears to be the default python on my system... 
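[A quick way to see which interpreter is actually being picked up first on $PATH, independent of the Mac-specific framework details, is to ask the running Python itself:]

import sys
print(sys.executable)   # full path of the interpreter currently running
print(sys.prefix)       # its install prefix (an EPD path, /usr, /System/..., etc.)
print(sys.version)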
cheers, Chris PS: Thanks for pointing me at the DocLinks and Examples folders :-) -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk From ralf.gommers at googlemail.com Fri Aug 19 11:00:40 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Fri, 19 Aug 2011 17:00:40 +0200 Subject: [SciPy-User] scipy.stats In-Reply-To: References: Message-ID: On Thu, Aug 18, 2011 at 12:20 PM, Rita wrote: > I am trying to import scipy.stats but I keep getting an import Error, > ...scipy/special/_cephes.so: undefined symbol: __libm_sse2_sincos > > I compiled Numpy with Intel C compiler and Scipy compiled ok but just cant > get this working. > > Any advise? > > The symbol is defined in an Intel math library. You'll need to give us more details in order to say more than that. What exact compilers and MKL did you use, what OS? Build command and build log? Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From arokem at gmail.com Thu Aug 18 12:58:19 2011 From: arokem at gmail.com (Ariel Rokem) Date: Thu, 18 Aug 2011 09:58:19 -0700 Subject: [SciPy-User] [ANN] Nitime version 0.3 In-Reply-To: References: Message-ID: I am happy to announce the release of version 0.3 of nitime. Nitime, a member of the nipy family (http://nipy.org), is a software library for the analysis of time-series from neuroscience experiments. To read the online documentation, visit: http://nipy.org/nitime/ To download the source code, visit: http://pypi.python.org/pypi/nitime Version 0.3 of nitime includes several additions and improvements, including new analysis methods (MAR process estimation, Granger 'causality', seed correlation analysis, filtering), improvements to the API (slicing with epochs), many bug fixes, a dramatic increase in test coverage and many new examples (http://nipy.org/nitime/examples/index.html) To read the full release notes and see the list of contributors to this release, visit: http://nipy.org/nitime/whatsnew/version0.3.html On behalf of the nitime developers, Ariel Rokem From robert.kern at gmail.com Fri Aug 19 13:18:50 2011 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 19 Aug 2011 12:18:50 -0500 Subject: [SciPy-User] How to ignore NaN values and -32767 in numpy array In-Reply-To: References: Message-ID: On Fri, Aug 19, 2011 at 00:01, questions anon wrote: > Thank you, what you suggested worked but now I don't think that is my > problem. > Within the dataset I am trying to calculate the mean from it appears there > are some hours with no data, the output is: > [[[-- -- -- ..., -- -- --] > ? [-- -- -- ..., -- -- --] > ? [-- -- -- ..., -- -- --] > ? ..., > ? [-- -- -- ..., -- -- --] > ? [-- -- -- ..., -- -- --] > ? [-- -- -- ..., -- -- --]]] > So I would assume these would be ignored when I calculate the mean but when > I make all my files/times into one big array these blanks turn into -32767. > Is there some way to avoid this? This is just how large arrays get summarized when printed. The data is all there. You can control how this summarization happens using the threshold parameter to numpy.set_printoptions(): http://docs.scipy.org/doc/numpy/reference/generated/numpy.set_printoptions.html -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? 
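[A tiny illustration of the summarisation being described; the array size is arbitrary, anything longer than the threshold gets elided when printed.]

import numpy as N

a = N.arange(2000)
print(a)                              # summarised with '...' in the middle
N.set_printoptions(threshold=5000)
print(a)                              # now printed in full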
-- Umberto Eco From dasneutron at gmail.com Fri Aug 19 13:31:53 2011 From: dasneutron at gmail.com (Piotr Zolnierczuk) Date: Fri, 19 Aug 2011 13:31:53 -0400 Subject: [SciPy-User] rotation matrices, Euler angles and all that jazz Message-ID: Christoph, thanks for the answers and references. I will keep using your very useful module. What I would like to add is that one often needs only the 3x3 (pure rotation) part of it so it would be nice to provide versions for this case too. One obviously can "wrap" (that's what I do) the routines, for example: def rotation_matrix3(....): m = rotation_matrix(...) return m[:3,:3] Cheers Piotr From robert.kern at gmail.com Fri Aug 19 13:34:45 2011 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 19 Aug 2011 12:34:45 -0500 Subject: [SciPy-User] Enthought Python Distribution questions In-Reply-To: <4E4E77BA.1000508@simplistix.co.uk> References: <4E4D2101.7060505@simplistix.co.uk> <4E4D270D.9090100@simplistix.co.uk> <4E4E77BA.1000508@simplistix.co.uk> Message-ID: On Fri, Aug 19, 2011 at 09:48, Chris Withers wrote: > On 18/08/2011 09:57, Aronne Merrelli wrote: >> >> ? ? Again on Mac OS X, where would I find this DocLinks folder? >> >> >> On my Mac (OS X 10.6.8), the EPD stuff is installed here: (there are >> DocLinks and Example subdirectories): >> >> /Library/Frameworks/Python.framework/Versions/Current/ > > This is a symlink, which I really don't want EPD to touch. Well, the installer will adjust it to point to the version that it installs, which is the usual thing to do. You can adjust it to whichever version you want to be considered "Current" afterwards. > I guess I'm OK with using 7.0 in Python.framework, on the assumption > that Python will never make it to 7.0, but really, it should be in > EPD.framework, no? Ideally, yes. Unfortunately, a number of tools rely on the framework being named "Python.framework" and are difficult to configure to deal with a different framework name. So we use Python.framework and a version number equal to EPD's version. That still gets us in trouble with a few tools that try to infer the Python version number from the framework version number, but I've only encountered one or two, and those were easy to patch to look up the version number robustly. >> It looks like the "standard" python installations that come with MacOS >> are here: >> >> /System/Library/Frameworks/Python.framework/Versions/ > > Yes, but what about installs of "normal python"? Why has EPD stomped on > my /Current without even asking me?! Because it's what the "normal python" installer would do too. If I had a www.python.org installation of Python 2.6 that Current pointed to, and then I installed the www.python.org Python 2.7, Current would get updated to 2.7 without asking me. EPD's installer just does the same thing. It's what every installer of frameworks does that I've ever seen. They expect that if you explicitly install a framework, that you want it to be Current. >> own files (so I'm wasting a lot of disk space). My path is set to the >> EPD version > > Which path? Aside from stomping on the /Current symlink, I'm curious > about why EPD now appears to be the default python on my system... $PATH -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? 
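[On the 3x3-only point: for simple cases a pure-rotation helper can also be written directly against numpy, independent of transformations.py. A sketch for rotation about the z axis (angle in radians, right-handed convention):]

import numpy as np

def rotation_matrix3_z(theta):
    # 3x3 rotation matrix about the z axis
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

v = np.array([1.0, 0.0, 0.0])
print(np.dot(rotation_matrix3_z(np.pi / 2.0), v))   # ~ [0., 1., 0.]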
-- Umberto Eco From rmorgan466 at gmail.com Fri Aug 19 18:53:40 2011 From: rmorgan466 at gmail.com (Rita) Date: Fri, 19 Aug 2011 18:53:40 -0400 Subject: [SciPy-User] scipy.stats In-Reply-To: References: Message-ID: I apologize for the vague question. OS: Linux Compiler: Intel compiler suite. Version 11 (this also includes fortran compiler) MKL: 10.3 Numpy version: 1.6.1 When I do numpy.config() I see it properly compiled against Intel's BLAS and LAPACK Where are the build logs located? Do you need to build log for Numpy also? On Fri, Aug 19, 2011 at 11:00 AM, Ralf Gommers wrote: > > > On Thu, Aug 18, 2011 at 12:20 PM, Rita wrote: > >> I am trying to import scipy.stats but I keep getting an import Error, >> ...scipy/special/_cephes.so: undefined symbol: __libm_sse2_sincos >> >> I compiled Numpy with Intel C compiler and Scipy compiled ok but just cant >> get this working. >> >> Any advise? >> >> The symbol is defined in an Intel math library. You'll need to give us > more details in order to say more than that. What exact compilers and MKL > did you use, what OS? Build command and build log? > > Cheers, > Ralf > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -- --- Get your facts first, then you can distort them as you please.-- -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsouthey at gmail.com Fri Aug 19 20:00:39 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 19 Aug 2011 19:00:39 -0500 Subject: [SciPy-User] scipy.stats In-Reply-To: References: Message-ID: On Fri, Aug 19, 2011 at 5:53 PM, Rita wrote: > I apologize for the vague question. > OS: Linux > Compiler: Intel compiler suite. Version 11 (this also includes fortran > compiler) > MKL: 10.3 > Numpy version: 1.6.1 > When I do numpy.config() I see it properly compiled against Intel's BLAS and > LAPACK > > Where are the build logs located? Do you need to build log for Numpy also? > > On Fri, Aug 19, 2011 at 11:00 AM, Ralf Gommers > wrote: >> >> >> On Thu, Aug 18, 2011 at 12:20 PM, Rita wrote: >>> >>> I am trying to import scipy.stats but I keep getting an import Error, >>> ...scipy/special/_cephes.so: undefined symbol: __libm_sse2_sincos >>> I compiled Numpy with Intel C compiler and Scipy compiled ok but just >>> cant get this working. >>> Any advise? >> >> The symbol is defined in an Intel math library. You'll need to give us >> more details in order to say more than that. What exact compilers and MKL >> did you use, what OS? Build command and build log? >> >> Cheers, >> Ralf >> >> >> A quick google indicates that you need to ensure that you link to the appropriate Intel Math library: http://software.intel.com/en-us/articles/unresolved-external-symbol-libm-sse2/ Also what is the cpu type? Bruce From rmorgan466 at gmail.com Sat Aug 20 07:38:57 2011 From: rmorgan466 at gmail.com (Rita) Date: Sat, 20 Aug 2011 06:38:57 -0500 Subject: [SciPy-User] scipy.stats In-Reply-To: References: Message-ID: Thanks Bruce. I have already seen this Here are more details of my build. 
My Intel compiler exists here, /opt/intel/ self.cc_exe = 'icc -L /opt/intel/lib/intel64 -L /opt/intel/ipp/em64t/lib -L /opt/intel/mkl/lib/em64t -L /usr/lib64 -L /usr/lib -I /opt/intel/ipp/em64t/in clude -I /etg/source/Linux/include -I /opt/intel/mkl/include -I /opt/intel/include -fPIC -O3 -openmp -limf -lmkl_core -lmkl_intel_lp64 -lmkl_intel_thread -lstdc++ -DMKL_ILP64' Here is how I am doing the compilation CC=icc CXX=icpc AR=xiar /opt/python-2.7.2/bin/python setup.py config --compiler=intel --fcompiler=intelem build_clib --compiler=intel --fcompiler=intelem build_ext --compiler=intel install /opt/intel/ipp is what I was using for the math library. This compiles but I keep getting that problem I use the same compile statement to compile scipy On Fri, Aug 19, 2011 at 8:00 PM, Bruce Southey wrote: > On Fri, Aug 19, 2011 at 5:53 PM, Rita wrote: > > I apologize for the vague question. > > OS: Linux > > Compiler: Intel compiler suite. Version 11 (this also includes fortran > > compiler) > > MKL: 10.3 > > Numpy version: 1.6.1 > > When I do numpy.config() I see it properly compiled against Intel's BLAS > and > > LAPACK > > > > Where are the build logs located? Do you need to build log for Numpy > also? > > > > On Fri, Aug 19, 2011 at 11:00 AM, Ralf Gommers < > ralf.gommers at googlemail.com> > > wrote: > >> > >> > >> On Thu, Aug 18, 2011 at 12:20 PM, Rita wrote: > >>> > >>> I am trying to import scipy.stats but I keep getting an import Error, > >>> ...scipy/special/_cephes.so: undefined symbol: __libm_sse2_sincos > >>> I compiled Numpy with Intel C compiler and Scipy compiled ok but just > >>> cant get this working. > >>> Any advise? > >> > >> The symbol is defined in an Intel math library. You'll need to give us > >> more details in order to say more than that. What exact compilers and > MKL > >> did you use, what OS? Build command and build log? > >> > >> Cheers, > >> Ralf > >> > >> > >> > > A quick google indicates that you need to ensure that you link to the > appropriate Intel Math library: > > http://software.intel.com/en-us/articles/unresolved-external-symbol-libm-sse2/ > > Also what is the cpu type? > > Bruce > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- --- Get your facts first, then you can distort them as you please.-- -------------- next part -------------- An HTML attachment was scrubbed... URL: From rmorgan466 at gmail.com Sat Aug 20 07:40:42 2011 From: rmorgan466 at gmail.com (Rita) Date: Sat, 20 Aug 2011 06:40:42 -0500 Subject: [SciPy-User] scipy.stats In-Reply-To: References: Message-ID: It should be 'icc -L /opt/intel/lib/intel64 -L /opt/intel/ipp/em64t/lib -L /opt/intel/mkl/lib/em64t -L /usr/lib64 -L /usr/lib -I/opt/intel/mkl/include -I /opt/intel/include -fPIC -O3 -openmp -limf -lmkl_core -lmkl_intel_lp64 -lmkl_intel_thread -lstdc++ -DMKL_ILP64' Here is how I am doing the compilation On Sat, Aug 20, 2011 at 6:38 AM, Rita wrote: > Thanks Bruce. I have already seen this > > Here are more details of my build. 
> > My Intel compiler exists here, /opt/intel/ > > self.cc_exe = 'icc -L /opt/intel/lib/intel64 -L /opt/intel/ipp/em64t/lib -L > /opt/intel/mkl/lib/em64t -L /usr/lib64 -L /usr/lib -I > /opt/intel/ipp/em64t/in clude -I /etg/source/Linux/include -I > /opt/intel/mkl/include -I /opt/intel/include -fPIC -O3 -openmp -limf > -lmkl_core -lmkl_intel_lp64 -lmkl_intel_thread -lstdc++ -DMKL_ILP64' > Here is how I am doing the compilation > > CC=icc CXX=icpc AR=xiar /opt/python-2.7.2/bin/python setup.py config > --compiler=intel --fcompiler=intelem build_clib --compiler=intel > --fcompiler=intelem build_ext --compiler=intel install > > /opt/intel/ipp is what I was using for the math library. This compiles but > I keep getting that problem > > I use the same compile statement to compile scipy > > > > On Fri, Aug 19, 2011 at 8:00 PM, Bruce Southey wrote: > >> On Fri, Aug 19, 2011 at 5:53 PM, Rita wrote: >> > I apologize for the vague question. >> > OS: Linux >> > Compiler: Intel compiler suite. Version 11 (this also includes fortran >> > compiler) >> > MKL: 10.3 >> > Numpy version: 1.6.1 >> > When I do numpy.config() I see it properly compiled against Intel's BLAS >> and >> > LAPACK >> > >> > Where are the build logs located? Do you need to build log for Numpy >> also? >> > >> > On Fri, Aug 19, 2011 at 11:00 AM, Ralf Gommers < >> ralf.gommers at googlemail.com> >> > wrote: >> >> >> >> >> >> On Thu, Aug 18, 2011 at 12:20 PM, Rita wrote: >> >>> >> >>> I am trying to import scipy.stats but I keep getting an import Error, >> >>> ...scipy/special/_cephes.so: undefined symbol: __libm_sse2_sincos >> >>> I compiled Numpy with Intel C compiler and Scipy compiled ok but just >> >>> cant get this working. >> >>> Any advise? >> >> >> >> The symbol is defined in an Intel math library. You'll need to give us >> >> more details in order to say more than that. What exact compilers and >> MKL >> >> did you use, what OS? Build command and build log? >> >> >> >> Cheers, >> >> Ralf >> >> >> >> >> >> >> >> A quick google indicates that you need to ensure that you link to the >> appropriate Intel Math library: >> >> http://software.intel.com/en-us/articles/unresolved-external-symbol-libm-sse2/ >> >> Also what is the cpu type? >> >> Bruce >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > > > -- > --- Get your facts first, then you can distort them as you please.-- > -- --- Get your facts first, then you can distort them as you please.-- -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris at simplistix.co.uk Sat Aug 20 19:12:22 2011 From: chris at simplistix.co.uk (Chris Withers) Date: Sat, 20 Aug 2011 16:12:22 -0700 Subject: [SciPy-User] IPython inline plots of stacked bars graphs Message-ID: <4E503F56.1090908@simplistix.co.uk> Hi All, If I do the following in an IPython 0.11 Qt shell: import matplotlib.pyplot as plt menMeans = (20, 35, 30, 35, 27) womenMeans = (25, 32, 34, 20, 25) plt.bar(ind, menMeans, color='r') plt.bar(ind, womenMeans, color='y', bottom=menMeans) I get, as I'd expect, a stacked bar graph. However, if I do: plt.bar(ind, menMeans, color='r') ...hit enter, and then do: plt.bar(ind, womenMeans, color='y', bottom=menMeans) ...I get two separate plots. How can I add to an existing inline plot? Also, and I guess this might be more of a matplotlib question, how do I "reach inside" an existing plot to, for example, adjust the width of the bars used? 
cheers, Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk From charlesr.harris at gmail.com Sat Aug 20 19:46:37 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 20 Aug 2011 17:46:37 -0600 Subject: [SciPy-User] IPython inline plots of stacked bars graphs In-Reply-To: <4E503F56.1090908@simplistix.co.uk> References: <4E503F56.1090908@simplistix.co.uk> Message-ID: On Sat, Aug 20, 2011 at 5:12 PM, Chris Withers wrote: > Hi All, > > If I do the following in an IPython 0.11 Qt shell: > > import matplotlib.pyplot as plt > menMeans = (20, 35, 30, 35, 27) > womenMeans = (25, 32, 34, 20, 25) > plt.bar(ind, menMeans, color='r') > plt.bar(ind, womenMeans, color='y', bottom=menMeans) > > I get, as I'd expect, a stacked bar graph. > > However, if I do: > > plt.bar(ind, menMeans, color='r') > > ...hit enter, and then do: > > plt.bar(ind, womenMeans, color='y', bottom=menMeans) > > ...I get two separate plots. > > How can I add to an existing inline plot? > > Also, and I guess this might be more of a matplotlib question, how do I > "reach inside" an existing plot to, for example, adjust the width of the > bars used? > > cheers, > > I think it is more of an ipython question, possibly a matplotlib question ;) You might try the hold(True) command. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris at simplistix.co.uk Sat Aug 20 19:52:01 2011 From: chris at simplistix.co.uk (Chris Withers) Date: Sat, 20 Aug 2011 16:52:01 -0700 Subject: [SciPy-User] getting started with arrays and matplotlib In-Reply-To: References: <4E3F020B.1000500@simplistix.co.uk> Message-ID: <4E5048A1.5090105@simplistix.co.uk> On 07/08/2011 22:29, David Warde-Farley wrote: >> Secondly, once I've populated this, any good examples of how to turn it >> into a bar chart? (the simple bar chart would be number of sales on the >> y-axis, weeks before the event on the x-axis, however, what I'd then >> like to do is split each bar into chunks for each venue's sales, if that >> makes sense?) > > This might give you an example of what you need: > > http://matplotlib.sourceforge.net/examples/pylab_examples/bar_stacked.html > > but you'd be better off asking on matplotlib-users. Thanks, that was a good start. One question: How can I automatically get a list of colours for each bar? I don't know how many bars I'm going to have so I can't manually pick them... This feels like a common enough problem that I'm guessing there's a solution somewhere in matplotlib? cheers, Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk From dragonmagi at gmail.com Sun Aug 21 04:49:04 2011 From: dragonmagi at gmail.com (Chris Thorne) Date: Sun, 21 Aug 2011 16:49:04 +0800 Subject: [SciPy-User] Scipy 0.9.0 can't be installed on this disk ..... Message-ID: When I run the installer for scipy (or numpy) on OSX 10.6.7 it will refuse to do the install saying: "Scipy 0.9.0 can't be installed on this disk. scipy requires python.orgPython 2.6 to install." version of python installed with the OS is 2.6.1. Installing the latest version does not help. I'm guessing the error message is misleading and he issue is something else?? Note: I recently installed this on OSX 10.6.8 on another machine without problems. One difference that perhaps matters is that one had macports on it. 
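[On the question above about getting a list of colours for each bar automatically: one common approach is to sample a colormap, which works for any number of bars. A sketch, with values taken from the earlier example and an arbitrary colormap choice:]

import numpy as np
import matplotlib.pyplot as plt

values = [20, 35, 30, 35, 27]
ind = np.arange(len(values))
colors = plt.cm.jet(np.linspace(0.0, 1.0, len(values)))   # one RGBA row per bar

plt.bar(ind, values, color=colors)
plt.show()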
thanks, chris -- http://www.vrshed.com http://www.floatingorigin.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sun Aug 21 12:30:42 2011 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 21 Aug 2011 11:30:42 -0500 Subject: [SciPy-User] Scipy 0.9.0 can't be installed on this disk ..... In-Reply-To: References: Message-ID: On Sun, Aug 21, 2011 at 03:49, Chris Thorne wrote: > When I run the installer for scipy (or numpy) on OSX 10.6.7 > it will refuse to do the install saying: > > "Scipy 0.9.0 can't be installed on this disk. scipy requires python.org > Python 2.6 to install." > > version of python installed with the OS is 2.6.1. > Installing the latest version does not help. > > I'm guessing the error message is misleading and he issue is something > else?? Note this part: "python.org Python 2.6" It means that you need to install Python 2.6 from the installers on www.python.org, *not* the Python 2.6.1 that is included with the OS. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From jeremy at jeremysanders.net Mon Aug 22 04:45:11 2011 From: jeremy at jeremysanders.net (Jeremy Sanders) Date: Mon, 22 Aug 2011 09:45:11 +0100 Subject: [SciPy-User] ANN: Veusz 1.13 Message-ID: Veusz 1.13 ---------- Velvet Ember Under Sky Zenith ----------------------------- http://home.gna.org/veusz/ Copyright (C) 2003-2011 Jeremy Sanders and contributors. Licenced under the GPL (version 2 or greater). Veusz is a Qt4 based scientific plotting package. It is written in Python, using PyQt4 for display and user-interfaces, and numpy for handling the numeric data. Veusz is designed to produce publication-ready Postscript/PDF/SVG output. The user interface aims to be simple, consistent and powerful. Veusz provides a GUI, command line, embedding and scripting interface (based on Python) to its plotting facilities. It also allows for manipulation and editing of datasets. Data can be captured from external sources such as internet sockets or other programs. 
Changes in 1.13: * Graphs are rendered in separate threads for speed and a responsive user interface * A changed Graph is rendered immediately on document modification, improving latency * A new ternary plot widget is included * Size of pages can be modified individually in a document * Binary data import added * NPY/NPZ numpy data import added * Axis and tick labels on axes can be rotated at 45 deg intervals * Labels can be plotted next to points on non-orthogonal plots * Add an option for DPI of output EPS and PDF files Minor improvements: * Import dialog detects filename extension to show correct tab * Polygon fill mode for non orthogonal plotting * --plugin command line option added, for loading and testing plugins * Plugin for swapping two colors in a plot * Dataset navigator is moved to right of window by default * Mac OS X binary release updated to Python 2.7.2 * Import plugins can say which file extensions they support * Import plugins can be "promoted" to their own tab on the import dialog * ForceUpdate command added to embedding API, to force an update of the displayed plot (useful if SetUpdateInterval is set to 0) * X or Y dataset can be left blank in plotter to plot by row number Bugs fixed: * Images plotted when axes are inverted are inverted too * Fixed crash when selecting datasets for plotting in the popup menu * Picker crashes with a constant function * 2D dataset creation using expressions fixed * CSV reader treated dataset names ending in + or - incorrectly * unique1d function no longer available in numpy Features of package: * X-Y plots (with errorbars) * Line and function plots * Contour plots * Images (with colour mappings and colorbars) * Stepped plots (for histograms) * Bar graphs * Vector field plots * Box plots * Polar plots * Ternary plots * Plotting dates * Fitting functions to data * Stacked plots and arrays of plots * Plot keys * Plot labels * Shapes and arrows on plots * LaTeX-like formatting for text * EPS/PDF/PNG/SVG/EMF export * Scripting interface * Dataset creation/manipulation * Embed Veusz within other programs * Text, CSV, FITS, NPY/NPZ, QDP, binary and user-plugin importing * Data can be captured from external sources * User defined functions, constants and can import external Python functions * Plugin interface to allow user to write or load code to - import data using new formats - make new datasets, optionally linked to existing datasets - arbitrarily manipulate the document * Data picker * Multithreaded rendering Requirements for source install: Python (2.4 or greater required) http://www.python.org/ Qt >= 4.3 (free edition) http://www.trolltech.com/products/qt/ PyQt >= 4.3 (SIP is required to be installed first) http://www.riverbankcomputing.co.uk/pyqt/ http://www.riverbankcomputing.co.uk/sip/ numpy >= 1.0 http://numpy.scipy.org/ Optional: Microsoft Core Fonts (recommended for nice output) http://corefonts.sourceforge.net/ PyFITS >= 1.1 (optional for FITS import) http://www.stsci.edu/resources/software_hardware/pyfits pyemf >= 2.0.0 (optional for EMF export) http://pyemf.sourceforge.net/ PyMinuit >= 1.1.2 (optional improved fitting) http://code.google.com/p/pyminuit/ For EMF and better SVG export, PyQt >= 4.6 or better is required, to fix a bug in the C++ wrapping For documentation on using Veusz, see the "Documents" directory. The manual is in PDF, HTML and text format (generated from docbook). The examples are also useful documentation. 
Please also see and contribute to the Veusz wiki: http://barmag.net/veusz-wiki/ Issues with the current version: * Some recent versions of PyQt/SIP will causes crashes when exporting SVG files. Update to 4.7.4 (if released) or a recent snapshot to solve this problem. If you enjoy using Veusz, we would love to hear from you. Please join the mailing lists at https://gna.org/mail/?group=veusz to discuss new features or if you'd like to contribute code. The latest code can always be found in the Git repository at https://github.com/jeremysanders/veusz.git. From WDyk at nobleenergyinc.com Mon Aug 22 15:23:03 2011 From: WDyk at nobleenergyinc.com (WDyk at nobleenergyinc.com) Date: Mon, 22 Aug 2011 13:23:03 -0600 Subject: [SciPy-User] IPython inline plots of stacked bars graphs In-Reply-To: References: Message-ID: In ipython 0.11, use Ctrl-Enter to enter multi-line edit mode. You can then send multiple commands to change your plot. Hit Enter on a blank line to send all commands at once. Wes Dyk, Production Systems Admin Noble Energy, Inc. From: scipy-user-request at scipy.org To: scipy-user at scipy.org Date: 08/21/2011 11:00 AM Subject: SciPy-User Digest, Vol 96, Issue 31 Sent by: scipy-user-bounces at scipy.org Message: 1 Date: Sat, 20 Aug 2011 16:12:22 -0700 From: Chris Withers Subject: [SciPy-User] IPython inline plots of stacked bars graphs To: SciPy Users List Message-ID: <4E503F56.1090908 at simplistix.co.uk> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Hi All, If I do the following in an IPython 0.11 Qt shell: import matplotlib.pyplot as plt menMeans = (20, 35, 30, 35, 27) womenMeans = (25, 32, 34, 20, 25) plt.bar(ind, menMeans, color='r') plt.bar(ind, womenMeans, color='y', bottom=menMeans) I get, as I'd expect, a stacked bar graph. However, if I do: plt.bar(ind, menMeans, color='r') ...hit enter, and then do: plt.bar(ind, womenMeans, color='y', bottom=menMeans) ...I get two separate plots. How can I add to an existing inline plot? Also, and I guess this might be more of a matplotlib question, how do I "reach inside" an existing plot to, for example, adjust the width of the bars used? cheers, Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk ------------------------------ Message: 2 Date: Sat, 20 Aug 2011 17:46:37 -0600 From: Charles R Harris Subject: Re: [SciPy-User] IPython inline plots of stacked bars graphs To: SciPy Users List Message-ID: Content-Type: text/plain; charset="iso-8859-1" On Sat, Aug 20, 2011 at 5:12 PM, Chris Withers wrote: > Hi All, > > If I do the following in an IPython 0.11 Qt shell: > > import matplotlib.pyplot as plt > menMeans = (20, 35, 30, 35, 27) > womenMeans = (25, 32, 34, 20, 25) > plt.bar(ind, menMeans, color='r') > plt.bar(ind, womenMeans, color='y', bottom=menMeans) > > I get, as I'd expect, a stacked bar graph. > > However, if I do: > > plt.bar(ind, menMeans, color='r') > > ...hit enter, and then do: > > plt.bar(ind, womenMeans, color='y', bottom=menMeans) > > ...I get two separate plots. > > How can I add to an existing inline plot? > > Also, and I guess this might be more of a matplotlib question, how do I > "reach inside" an existing plot to, for example, adjust the width of the > bars used? > > cheers, > > I think it is more of an ipython question, possibly a matplotlib question ;) You might try the hold(True) command. 
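For completeness, a minimal sketch that combines the two suggestions in this thread: issuing the bar calls together (the Ctrl-Enter advice above) with hold switched on, plus one way of "reaching inside" the finished plot to change the bar widths. Note that ind is assumed to be np.arange(5); it was never defined in the original snippet.

import numpy as np
import matplotlib.pyplot as plt

ind = np.arange(5)                      # assumed; not defined in the original post
menMeans = (20, 35, 30, 35, 27)
womenMeans = (25, 32, 34, 20, 25)

plt.hold(True)                          # keep drawing into the same axes
plt.bar(ind, menMeans, color='r')
plt.bar(ind, womenMeans, color='y', bottom=menMeans)

# every bar is a Rectangle patch on the current axes, so widths can be
# adjusted after the fact
for rect in plt.gca().patches:
    rect.set_width(0.5)
plt.draw()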
Chuck ------------------------------ _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user End of SciPy-User Digest, Vol 96, Issue 31 ****************************************** The information contained in this e-mail and any attachments may be confidential. If you are not the intended recipient, please understand that dissemination, copying, or using such information is prohibited. If you have received this e-mail in error, please immediately advise the sender by reply e-mail and delete this e-mail and its attachments from your system. -------------- next part -------------- An HTML attachment was scrubbed... URL: From brockp at umich.edu Mon Aug 22 16:48:11 2011 From: brockp at umich.edu (Brock Palen) Date: Mon, 22 Aug 2011 16:48:11 -0400 Subject: [SciPy-User] Building Numpy/Scipy with MKL serial 10.3 Message-ID: <6D37C6AC-9941-4207-8E95-DE69DD124444@umich.edu> We need to force users to use serial MKL so using libmkl_rt.so is out of the questions, We built using the 'builder' wrapper that can make a custom MKL library with all the needed bits included: #copy makefile and function list from $MKLROOT/tools cp $MKLROOT/tools/builder/makefile /tmp/ cat $MKLROOT/tools/builder/blas_list >> /tmp/user_list cat $MKLROOT/tools/builder/lapack_list >> /tmp/user_list #build seqential version cd /tmp make libintel64 interface=lp64 name=libmkl_10.3_serial threading=sequential This creates a file libmkl_10.3_serial.so I then set: #create site.cfg with [mkl] library_dirs = /tmp mkl_libs = mkl_10.3_serial lapack_libs = If you ahve any questions let me know. Brock Palen www.umich.edu/~brockp Center for Advanced Computing brockp at umich.edu (734)936-1985 From ciampagg at usi.ch Mon Aug 22 17:48:29 2011 From: ciampagg at usi.ch (Giovanni Luca Ciampaglia) Date: Mon, 22 Aug 2011 14:48:29 -0700 Subject: [SciPy-User] Bootstrapping confidence interval of the maximum of a smoothing spline Message-ID: <4E52CEAD.4050702@usi.ch> Hi all, I have data on editing activity from an online community and I am trying to estimate the day of peak activity using smoothing splines. I determine the smoothing factor for scipy.interpolate.UnivariateSpline by leave-1-out crossvalidation, and then use scipy.optimize.fmin_tnc to evaluate the maximum from the resulting spline. This works pretty well and seems robust enough (e.g. http://tinypic.com/r/a3m739/7). Now I would like to compute the confidence intervals for this estimate, but I am not exactly sure on how to proceed, since I cannot sample data from my non-parametric model and generate a distribution for this estimator. I was thinking at applying some noise to the smoothing factor, but I am not sure whether this approach has any theoretical basis. Any idea? Cheers, -- Giovanni Luca Ciampaglia Ph.D. Candidate Faculty of Informatics University of Lugano Web: http://www.inf.usi.ch/phd/ciampaglia/ Bertastra?e 36 ? 8003 Z?rich ? Switzerland -------------- next part -------------- A non-text attachment was scrubbed... 
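One concrete way to act on the residual-resampling idea in the reply that follows: refit the spline on synthetic data built from the fitted values plus resampled residuals, record the peak location each time, and take percentiles of those peaks. This is only a rough sketch; x and y stand for the observed activity data, s_opt for the cross-validated smoothing factor (kept fixed across replications for simplicity), and plain fmin stands in for fmin_tnc.

import numpy as np
from scipy import interpolate, optimize

def spline_peak(x, y, s):
    spl = interpolate.UnivariateSpline(x, y, s=s)
    # maximise the spline by minimising its negative, starting near the data maximum
    xmax = optimize.fmin(lambda t: -spl(t)[0], x[np.argmax(y)], disp=0)
    return xmax[0]

def bootstrap_peak_ci(x, y, s_opt, n_boot=1000):
    spl = interpolate.UnivariateSpline(x, y, s=s_opt)
    fitted = spl(x)
    resid = y - fitted
    peaks = np.empty(n_boot)
    for i in range(n_boot):
        # resample residuals with replacement, refit, and find the new peak
        y_star = fitted + resid[np.random.randint(0, len(resid), len(resid))]
        peaks[i] = spline_peak(x, y_star, s_opt)
    return np.percentile(peaks, [2.5, 97.5])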
Name: image.png Type: image/png Size: 77261 bytes Desc: not available URL: From josef.pktd at gmail.com Mon Aug 22 18:06:26 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 22 Aug 2011 18:06:26 -0400 Subject: [SciPy-User] Bootstrapping confidence interval of the maximum of a smoothing spline In-Reply-To: <4E52CEAD.4050702@usi.ch> References: <4E52CEAD.4050702@usi.ch> Message-ID: On Mon, Aug 22, 2011 at 5:48 PM, Giovanni Luca Ciampaglia wrote: > Hi all, I have data on editing activity from an online community and I am > trying to estimate the day of peak activity using smoothing splines. > > I determine the smoothing factor for scipy.interpolate.UnivariateSpline by > leave-1-out crossvalidation, and then use scipy.optimize.fmin_tnc to > evaluate the maximum from the resulting spline. This works pretty well and > seems robust enough (e.g. http://tinypic.com/r/a3m739/7). Now I would like > to compute the confidence intervals for this estimate, but I am not exactly > sure on how to proceed, since I cannot sample data from my non-parametric > model and generate a distribution for this estimator. My first idea would be to sample the residuals, the deviation from the actual observations and the spline, add them to the spline, and estimate the new spline on the generated data. And repeat for a number of bootstrap samples. Josef > > I was thinking at applying some noise to the smoothing factor, but I am not > sure whether this approach has any theoretical basis. Any idea? > > Cheers, > > -- > Giovanni Luca Ciampaglia > > Ph.D. Candidate > Faculty of Informatics > University of Lugano > Web: http://www.inf.usi.ch/phd/ciampaglia/ > > Bertastra?e 36 * 8003 Z?rich * Switzerland > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From questions.anon at gmail.com Mon Aug 22 19:00:31 2011 From: questions.anon at gmail.com (questions anon) Date: Tue, 23 Aug 2011 09:00:31 +1000 Subject: [SciPy-User] How to ignore NaN values and -32767 in numpy array In-Reply-To: References: Message-ID: Thank you, that is good to know, but that is not the case for this. I know I have blank data or something in a couple of sections and when I choose to print around those figures I still end up with what happens below (shown again). [[-- -- -- ..., -- -- --] [-- -- -- ..., -- -- --] [-- -- -- ..., -- -- --] ..., And then when I make this into one big array these turn into [ -3.27670000e+04 -3.27670000e+04 -3.27670000e+04 ..., -3.27670000e+04 -3.27670000e+04 -3.27670000e+04] Is there a way to identify these blanks and ignore them from the analyses? On Sat, Aug 20, 2011 at 3:18 AM, Robert Kern wrote: > On Fri, Aug 19, 2011 at 00:01, questions anon > wrote: > > Thank you, what you suggested worked but now I don't think that is my > > problem. > > Within the dataset I am trying to calculate the mean from it appears > there > > are some hours with no data, the output is: > > [[[-- -- -- ..., -- -- --] > > [-- -- -- ..., -- -- --] > > [-- -- -- ..., -- -- --] > > ..., > > [-- -- -- ..., -- -- --] > > [-- -- -- ..., -- -- --] > > [-- -- -- ..., -- -- --]]] > > So I would assume these would be ignored when I calculate the mean but > when > > I make all my files/times into one big array these blanks turn into > -32767. > > Is there some way to avoid this? > > This is just how large arrays get summarized when printed. The data is > all there. 
You can control how this summarization happens using the > threshold parameter to numpy.set_printoptions(): > > > http://docs.scipy.org/doc/numpy/reference/generated/numpy.set_printoptions.html > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon Aug 22 19:12:33 2011 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 22 Aug 2011 18:12:33 -0500 Subject: [SciPy-User] How to ignore NaN values and -32767 in numpy array In-Reply-To: References: Message-ID: On Mon, Aug 22, 2011 at 18:00, questions anon wrote: > Thank you, that is good to know, but that is not the case for this. I know I > have blank data or something in a couple of sections and when I choose to > print around those figures I still end up with what happens below (shown > again). > ?[[-- -- -- ..., -- -- --] > ? [-- -- -- ..., -- -- --] > ? [-- -- -- ..., -- -- --] > ? ..., > And then when I make this into one big array these turn into > ? [ -3.27670000e+04 ?-3.27670000e+04 ?-3.27670000e+04 ..., ?-3.27670000e+04 > ? ? -3.27670000e+04 ?-3.27670000e+04] > Is there a way to identify these blanks and ignore them from the analyses? Or, right, sorry. The -- indeed are masked values. Somehow, you are using masked_arrays. I don't know if the netCDF4 module is doing that for you automatically or if you are using different code than what you showed. numpy.concatenate() will ignore that the array is a masked_array and just treat it as if it were a regular numpy ndarray, and lose the mask information. You will need to use numpy.ma.concatenate() instead. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From questions.anon at gmail.com Mon Aug 22 19:25:54 2011 From: questions.anon at gmail.com (questions anon) Date: Tue, 23 Aug 2011 09:25:54 +1000 Subject: [SciPy-User] How to ignore NaN values and -32767 in numpy array In-Reply-To: References: Message-ID: yahhh that worked! thank you. I was showing all the code so maybe it is something with NETCDF4. Thanks again!! On Tue, Aug 23, 2011 at 9:12 AM, Robert Kern wrote: > On Mon, Aug 22, 2011 at 18:00, questions anon > wrote: > > Thank you, that is good to know, but that is not the case for this. I > know I > > have blank data or something in a couple of sections and when I choose to > > print around those figures I still end up with what happens below (shown > > again). > > [[-- -- -- ..., -- -- --] > > [-- -- -- ..., -- -- --] > > [-- -- -- ..., -- -- --] > > ..., > > And then when I make this into one big array these turn into > > [ -3.27670000e+04 -3.27670000e+04 -3.27670000e+04 ..., > -3.27670000e+04 > > -3.27670000e+04 -3.27670000e+04] > > Is there a way to identify these blanks and ignore them from the > analyses? > > Or, right, sorry. The -- indeed are masked values. Somehow, you are > using masked_arrays. I don't know if the netCDF4 module is doing that > for you automatically or if you are using different code than what you > showed. 
> > numpy.concatenate() will ignore that the array is a masked_array and > just treat it as if it were a regular numpy ndarray, and lose the mask > information. You will need to use numpy.ma.concatenate() instead. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rajs2010 at gmail.com Tue Aug 23 05:11:41 2011 From: rajs2010 at gmail.com (Rajeev Singh) Date: Tue, 23 Aug 2011 14:41:41 +0530 Subject: [SciPy-User] Speeding up Python Again In-Reply-To: References: Message-ID: On Wed, Aug 10, 2011 at 6:48 PM, Rajeev Singh wrote: > Hi, > I was trying out the codes discussed > at http://technicaldiscovery.blogspot.com/2011/07/speeding-up-python-again.html > Here is a summary of my results - > Computer: Desktop imsc9 aravali annapurna > NumPy: 7.651419 4.219105 5.576453 4.858640 > Cython: 4.259419 3.477259 3.204909 2.357819 > Weave: 4.302778 * 3.298551 2.400000 > Looped Fortran: 4.199148 3.414484 3.202963 2.315644 > Vectorized Fortran: 3.118410 2.131966 1.512303 1.460251 > pure fortran update1: 1.205727 1.964857 2.034688 1.336086 > pure fortran update2: 0.600848 0.604649 0.573593 0.721339 > imsc9, aravali and annapurna are HPC machines at my institute > * for some reason Weave didn't compile on imsc9 > > Indeed there is about a factor of 7 to 12 difference between pure fortran > with update2 (vectorized) and the numpy version. > I should mention that I changed N to 150 in laplace_for.f90 > Rajeev Hi, Continuing the comparison of various ways of implementing solving laplace equation, following result might interest you - Desktop imsc9 aravali annapurna Octave (0): 20.7866 * 21.6179 * Vectorized Fortran (pure) (1): 0.7487 0.6501 0.7507 1.1619 Vectorized Fortran (f2py) (2): 0.7190 0.6089 0.6243 1.0312 NumPy (3): 4.1343 2.5844 2.6565 3.7445 Cython (4): 1.7273 1.9927 2.0471 1.3525 Cython with C (5): 1.7248 1.9665 2.0354 1.3367 Weave (6): 1.9818 * 2.1326 1.4003 Looped Fortran (f2py) (7): 1.6996 1.9657 2.0429 1.3354 Looped Fortran (pure) (8): 1.7189 2.0145 2.0917 1.5086 C (pure) (9): 1.2820 1.9948 2.0527 1.4259 imsc9, aravali and annapurna are HPC machines at my institute * for some reason Weave didn't compile on imsc9 * octave isn't installed on imsc9 and annapurna The difference between numpy and fortran performance seems significant. However f2py does as well as pure fortran now. The difference from earlier case is that earlier there was a division inside the loop which I have replaced by multiplication by reciprocal. This does not affect the result but makes the execution faster in all cases except pure fortran (I guess fortran compiler was already doing it). I would be happy to give all the codes if someone is interested. Should we update the performance python page at scipy with these codes? Rajeev -------------- next part -------------- An HTML attachment was scrubbed... 
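For readers who do not have the benchmark scripts at hand, the NumPy entry in Rajeev's table is presumably close to the vectorized Jacobi update from the PerformancePython page; a sketch of that kernel, with the division replaced by multiplication by a precomputed reciprocal as described above. The grid size follows the N=150 mentioned in the thread, but the boundary condition and iteration count below are made up.

import numpy as np

def numpy_step(u, dx2, dy2):
    # one Jacobi-style sweep over the interior points of the grid
    inv = 1.0 / (2.0 * (dx2 + dy2))
    u[1:-1, 1:-1] = ((u[:-2, 1:-1] + u[2:, 1:-1]) * dy2 +
                     (u[1:-1, :-2] + u[1:-1, 2:]) * dx2) * inv
    return u

n = 150                        # grid size mentioned in the thread
dx = dy = 1.0 / (n - 1)
u = np.zeros((n, n))
u[0, :] = 1.0                  # hypothetical boundary condition
for _ in range(100):
    u = numpy_step(u, dx * dx, dy * dy)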
URL: From senthipa at in.ibm.com Thu Aug 18 02:03:34 2011 From: senthipa at in.ibm.com (Senthil Palanisamy) Date: Thu, 18 Aug 2011 11:33:34 +0530 Subject: [SciPy-User] scipy install issue Message-ID: Hi, i am trying to install scipy on my aix 5.3 machine, i am getting following error., compile options: '-DNO_ATLAS_INFO=1 -I/gpfs1/utils/python/Python-2.7.2/lib/python2.7/site-packages/numpy/core/include -I/gpfs1/utils/python/Python-2.7.2/include/python2.7 -c' xlc_r: scipy/integrate/_odepackmodule.c /gpfs1/utils/python/Python-2.7.2/lib/python2.7/config/ld_so_aix /usr/bin/xlf95 -q64 -bI:/gpfs1/utils/python/Python-2.7.2/lib/python2.7/config/python.exp -bshared -F/tmp/tmpzC87nY/Sm7AEs_xlf.cfg build/temp.aix-5.3-2.7/scipy/integrate/_odepackmodule.o -L/usr/lib -Lbuild/temp.aix-5.3-2.7 -lodepack -llinpack_lite -lmach -lblas -o build/lib.aix-5.3-2.7/scipy/integrate/_odepack.so ld: 0711-317 ERROR: Undefined symbol: .idamax_ ld: 0711-317 ERROR: Undefined symbol: .dscal_ ld: 0711-317 ERROR: Undefined symbol: .daxpy_ ld: 0711-317 ERROR: Undefined symbol: .ddot_ ld: 0711-345 Use the -bloadmap or -bnoquiet option to obtain more information. ld: 0711-317 ERROR: Undefined symbol: .idamax_ ld: 0711-317 ERROR: Undefined symbol: .dscal_ ld: 0711-317 ERROR: Undefined symbol: .daxpy_ ld: 0711-317 ERROR: Undefined symbol: .ddot_ ld: 0711-345 Use the -bloadmap or -bnoquiet option to obtain more information. error: Command "/gpfs1/utils/python/Python-2.7.2/lib/python2.7/config/ld_so_aix /usr/bin/xlf95 -q64 -bI:/gpfs1/utils/python/Python-2.7.2/lib/python2.7/config/python.exp -bshared -F/tmp/tmpzC87nY/Sm7AEs_xlf.cfg build/temp.aix-5.3-2.7/scipy/integrate/_odepackmodule.o -L/usr/lib -Lbuild/temp.aix-5.3-2.7 -lodepack -llinpack_lite -lmach -lblas -o build/lib.aix-5.3-2.7/scipy/integrate/_odepack.so" failed with exit status 8 details -scipy-0.9.0 -numpy, BLAS , LAPACK are intalled already, - xlf and xlc compilersare using. please get back to me with solution, how can i edit the install script? --- --- SenthilRaja Palanisamy | HPC Team India Systems & Technology Lab, EGL-D, Bangalore, KA-560071 India Email : senthipa at in.ibm.com IBM India Pvt Ltd. From rmorgan466 at gmail.com Tue Aug 23 08:14:34 2011 From: rmorgan466 at gmail.com (Rita) Date: Tue, 23 Aug 2011 08:14:34 -0400 Subject: [SciPy-User] scipy.stats In-Reply-To: References: Message-ID: Any ideas? On Sat, Aug 20, 2011 at 7:40 AM, Rita wrote: > It should be > > > 'icc -L /opt/intel/lib/intel64 -L /opt/intel/ipp/em64t/lib -L > /opt/intel/mkl/lib/em64t -L /usr/lib64 -L /usr/lib -I/opt/intel/mkl/include > -I /opt/intel/include -fPIC -O3 -openmp -limf -lmkl_core -lmkl_intel_lp64 > -lmkl_intel_thread -lstdc++ -DMKL_ILP64' > > Here is how I am doing the compilation > > > > > > On Sat, Aug 20, 2011 at 6:38 AM, Rita wrote: > >> Thanks Bruce. I have already seen this >> >> Here are more details of my build. 
>> >> My Intel compiler exists here, /opt/intel/ >> >> self.cc_exe = 'icc -L /opt/intel/lib/intel64 -L /opt/intel/ipp/em64t/lib >> -L /opt/intel/mkl/lib/em64t -L /usr/lib64 -L /usr/lib -I >> /opt/intel/ipp/em64t/in clude -I /etg/source/Linux/include -I >> /opt/intel/mkl/include -I /opt/intel/include -fPIC -O3 -openmp -limf >> -lmkl_core -lmkl_intel_lp64 -lmkl_intel_thread -lstdc++ -DMKL_ILP64' >> Here is how I am doing the compilation >> >> CC=icc CXX=icpc AR=xiar /opt/python-2.7.2/bin/python setup.py config >> --compiler=intel --fcompiler=intelem build_clib --compiler=intel >> --fcompiler=intelem build_ext --compiler=intel install >> >> /opt/intel/ipp is what I was using for the math library. This compiles but >> I keep getting that problem >> >> I use the same compile statement to compile scipy >> >> >> >> On Fri, Aug 19, 2011 at 8:00 PM, Bruce Southey wrote: >> >>> On Fri, Aug 19, 2011 at 5:53 PM, Rita wrote: >>> > I apologize for the vague question. >>> > OS: Linux >>> > Compiler: Intel compiler suite. Version 11 (this also includes fortran >>> > compiler) >>> > MKL: 10.3 >>> > Numpy version: 1.6.1 >>> > When I do numpy.config() I see it properly compiled against Intel's >>> BLAS and >>> > LAPACK >>> > >>> > Where are the build logs located? Do you need to build log for Numpy >>> also? >>> > >>> > On Fri, Aug 19, 2011 at 11:00 AM, Ralf Gommers < >>> ralf.gommers at googlemail.com> >>> > wrote: >>> >> >>> >> >>> >> On Thu, Aug 18, 2011 at 12:20 PM, Rita wrote: >>> >>> >>> >>> I am trying to import scipy.stats but I keep getting an import Error, >>> >>> ...scipy/special/_cephes.so: undefined symbol: __libm_sse2_sincos >>> >>> I compiled Numpy with Intel C compiler and Scipy compiled ok but just >>> >>> cant get this working. >>> >>> Any advise? >>> >> >>> >> The symbol is defined in an Intel math library. You'll need to give us >>> >> more details in order to say more than that. What exact compilers and >>> MKL >>> >> did you use, what OS? Build command and build log? >>> >> >>> >> Cheers, >>> >> Ralf >>> >> >>> >> >>> >> >>> >>> A quick google indicates that you need to ensure that you link to the >>> appropriate Intel Math library: >>> >>> http://software.intel.com/en-us/articles/unresolved-external-symbol-libm-sse2/ >>> >>> Also what is the cpu type? >>> >>> Bruce >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> >> >> >> -- >> --- Get your facts first, then you can distort them as you please.-- >> > > > > -- > --- Get your facts first, then you can distort them as you please.-- > -- --- Get your facts first, then you can distort them as you please.-- -------------- next part -------------- An HTML attachment was scrubbed... URL: From Jerome.Kieffer at esrf.fr Tue Aug 23 10:25:13 2011 From: Jerome.Kieffer at esrf.fr (Jerome Kieffer) Date: Tue, 23 Aug 2011 16:25:13 +0200 Subject: [SciPy-User] Is this a bug in scipy.ndimage.interpolation.shift ??? Message-ID: <20110823162513.5dc792a1.Jerome.Kieffer@esrf.fr> Hello, I was using scipy.ndimage.interpolation.shift with order=0 and "wrap" mode because I did not want to swap the 4 blocs of memory myself ... but I got strange results. 
for shifting by hand one can do: def shift(input, shift): """ Shift an array like scipy.ndimage.interpolation.shift(input, shift, mode="wrap", order=0) but faster @param in: 2d numpy array @param d: 2-tuple of integers @return: shifted image """ re = numpy.zeros_like(input) s0, s1 = input.shape d0 = shift[0] % s0 d1 = shift[0] % s1 r0 = (-d0) % s0 r1 = (-d1) % s1 re[d0:, d1:] = input[:r0, :r1] re[:d0, d1:] = input[r0:, :r1] re[d0:, :d1] = input[:r0, r1:] re[:d0, :d1] = input[r0:, r1:] return re In [327]: a=np.random.random((5,5)) In [328]: scipy.ndimage.interpolation.shift(a,(2,3),order=0,mode="wrap")-shift(a,(2,3)) Out[328]: array([[-0.13484701, 0.43450823, 0.4920127 , -0.04826882, -0.40258904], [ 0.48403199, 0.02161651, -0.35774838, 0.73954376, 0.42218297], [-0.23808862, 0.4799521 , -0.39548832, 0. , 0. ], [-0.04105354, 0.06934301, -0.18976602, 0. , 0. ], [-0.38430434, 0.04591371, -0.33502248, 0. , 0. ]]) SHOULD BE 0 everywhere and it is only in the lower right corner ... Do you agree this is an error (or did I misinterpret scipy.ndimage.interpolation.shift since the begining ?) Shall I open a bug ? I am using an ubuntu 10.04 (LTS) Cheers, -- J?r?me Kieffer On-Line Data analysis / Software Group ISDD / ESRF tel +33 476 882 445 From bsouthey at gmail.com Tue Aug 23 14:04:30 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Tue, 23 Aug 2011 13:04:30 -0500 Subject: [SciPy-User] scipy.stats In-Reply-To: References: Message-ID: On Tue, Aug 23, 2011 at 7:14 AM, Rita wrote: > Any ideas? > > On Sat, Aug 20, 2011 at 7:40 AM, Rita wrote: >> >> It should be >> >> >> 'icc -L /opt/intel/lib/intel64 -L /opt/intel/ipp/em64t/lib -L >> /opt/intel/mkl/lib/em64t -L /usr/lib64 -L /usr/lib -I/opt/intel/mkl/include >> -I /opt/intel/include -fPIC -O3 -openmp -limf -lmkl_core -lmkl_intel_lp64 >> -lmkl_intel_thread???? -lstdc++ -DMKL_ILP64' >> >> Here is how I am doing the compilation >> >> >> >> >> >> On Sat, Aug 20, 2011 at 6:38 AM, Rita wrote: >>> >>> Thanks Bruce. I have already seen this >>> Here are more details of my build. >>> >>> My Intel compiler exists here, /opt/intel/ >>> >>> self.cc_exe = 'icc -L /opt/intel/lib/intel64 -L /opt/intel/ipp/em64t/lib >>> -L /opt/intel/mkl/lib/em64t -L /usr/lib64 -L /usr/lib -I >>> /opt/intel/ipp/em64t/in??? clude -I /etg/source/Linux/include -I >>> /opt/intel/mkl/include -I /opt/intel/include -fPIC -O3 -openmp -limf >>> -lmkl_core -lmkl_intel_lp64 -lmkl_intel_thread???? -lstdc++ -DMKL_ILP64' >>> Here is how I am doing the compilation >>> >>> CC=icc CXX=icpc AR=xiar /opt/python-2.7.2/bin/python setup.py config >>> --compiler=intel? --fcompiler=intelem build_clib --compiler=intel >>> --fcompiler=intelem build_ext --compiler=intel install >>> >>> /opt/intel/ipp is what I was using for the math library. This compiles >>> but I keep getting that problem >>> >>> I use the same compile statement to compile scipy >>> >>> >>> On Fri, Aug 19, 2011 at 8:00 PM, Bruce Southey >>> wrote: >>>> >>>> On Fri, Aug 19, 2011 at 5:53 PM, Rita wrote: >>>> > I apologize for the vague question. >>>> > OS: Linux >>>> > Compiler: Intel compiler suite. Version 11 (this also includes fortran >>>> > compiler) >>>> > MKL: 10.3 >>>> > Numpy version: 1.6.1 >>>> > When I do numpy.config() I see it properly compiled against Intel's >>>> > BLAS and >>>> > LAPACK >>>> > >>>> > Where are the build logs located? Do you need to build log for Numpy >>>> > also? 
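A side note on the hand-written shift() a little further up: as posted it computes d1 from shift[0] (d1 = shift[0] % s1), where shift[1] was presumably intended, so the column shift it applies differs from the one requested from ndimage, and that alone would produce non-zero differences in the comparison. With that fixed, the same wrap-around behaviour can also be obtained from numpy.roll. A sketch for comparison only; it says nothing about what ndimage.interpolation.shift itself does in "wrap" mode.

import numpy as np

def shift_wrap(a, d):
    # wrap-around shift of a 2-D array by d = (rows, cols); matches the
    # hand-written version above once d1 is computed from shift[1]
    return np.roll(np.roll(a, d[0], axis=0), d[1], axis=1)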
>>>> > >>>> > On Fri, Aug 19, 2011 at 11:00 AM, Ralf Gommers >>>> > >>>> > wrote: >>>> >> >>>> >> >>>> >> On Thu, Aug 18, 2011 at 12:20 PM, Rita wrote: >>>> >>> >>>> >>> I am trying to import scipy.stats but I keep getting an import >>>> >>> Error, >>>> >>> ...scipy/special/_cephes.so: undefined symbol: __libm_sse2_sincos >>>> >>> I compiled Numpy with Intel C compiler and Scipy compiled ok but >>>> >>> just >>>> >>> cant get this working. >>>> >>> Any advise? >>>> >> >>>> >> The symbol is defined in an Intel math library. You'll need to give >>>> >> us >>>> >> more details in order to say more than that. What exact compilers and >>>> >> MKL >>>> >> did you use, what OS? Build command and build log? >>>> >> >>>> >> Cheers, >>>> >> Ralf >>>> >> >>>> >> >>>> >> >>>> >>>> A quick google indicates that you need to ensure that you link to the >>>> appropriate Intel Math library: >>>> >>>> http://software.intel.com/en-us/articles/unresolved-external-symbol-libm-sse2/ >>>> >>>> Also what is the cpu type? >>>> >>>> Bruce >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >>> >>> -- >>> --- Get your facts first, then you can distort them as you please.-- >> >> >> >> -- >> --- Get your facts first, then you can distort them as you please.-- > > > > -- > --- Get your facts first, then you can distort them as you please.-- > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > Just use Enthought's version: http://www.enthought.com/products/epd.php I do not use Intel's compiler so without more details it just appears that you have not given icc the correct paths to the libraries it needs when linking. Bruce From ralf.gommers at googlemail.com Tue Aug 23 15:36:52 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 23 Aug 2011 21:36:52 +0200 Subject: [SciPy-User] scipy install issue In-Reply-To: References: Message-ID: On Thu, Aug 18, 2011 at 8:03 AM, Senthil Palanisamy wrote: > > Hi, i am trying to install scipy on my aix 5.3 machine, > > i am getting following error., > This seems to be a common problem with BLAS on AIX. This was the most helpful message I found: https://stat.ethz.ch/pipermail/r-help/2000-July/007486.html. Ralf > > > > compile options: '-DNO_ATLAS_INFO=1 > > -I/gpfs1/utils/python/Python-2.7.2/lib/python2.7/site-packages/numpy/core/include > -I/gpfs1/utils/python/Python-2.7.2/include/python2.7 -c' > xlc_r: scipy/integrate/_odepackmodule.c > /gpfs1/utils/python/Python-2.7.2/lib/python2.7/config/ld_so_aix > /usr/bin/xlf95 > -q64 -bI:/gpfs1/utils/python/Python-2.7.2/lib/python2.7/config/python.exp > -bshared -F/tmp/tmpzC87nY/Sm7AEs_xlf.cfg > build/temp.aix-5.3-2.7/scipy/integrate/_odepackmodule.o -L/usr/lib > -Lbuild/temp.aix-5.3-2.7 -lodepack -llinpack_lite -lmach -lblas -o > build/lib.aix-5.3-2.7/scipy/integrate/_odepack.so > ld: 0711-317 ERROR: Undefined symbol: .idamax_ > ld: 0711-317 ERROR: Undefined symbol: .dscal_ > ld: 0711-317 ERROR: Undefined symbol: .daxpy_ > ld: 0711-317 ERROR: Undefined symbol: .ddot_ > ld: 0711-345 Use the -bloadmap or -bnoquiet option to obtain more > information. 
> ld: 0711-317 ERROR: Undefined symbol: .idamax_ > ld: 0711-317 ERROR: Undefined symbol: .dscal_ > ld: 0711-317 ERROR: Undefined symbol: .daxpy_ > ld: 0711-317 ERROR: Undefined symbol: .ddot_ > ld: 0711-345 Use the -bloadmap or -bnoquiet option to obtain more > information. > error: Command > "/gpfs1/utils/python/Python-2.7.2/lib/python2.7/config/ld_so_aix > /usr/bin/xlf95 > -q64 -bI:/gpfs1/utils/python/Python-2.7.2/lib/python2.7/config/python.exp > -bshared -F/tmp/tmpzC87nY/Sm7AEs_xlf.cfg > build/temp.aix-5.3-2.7/scipy/integrate/_odepackmodule.o -L/usr/lib > -Lbuild/temp.aix-5.3-2.7 -lodepack -llinpack_lite -lmach -lblas -o > build/lib.aix-5.3-2.7/scipy/integrate/_odepack.so" failed with exit status > 8 > > > details > > -scipy-0.9.0 > -numpy, BLAS , LAPACK are intalled already, > - xlf and xlc compilersare using. > > > please get back to me with solution, how can i edit the install script? > > > --- --- > SenthilRaja Palanisamy | HPC Team > India Systems & Technology Lab, > EGL-D, Bangalore, KA-560071 India > Email : senthipa at in.ibm.com > IBM India Pvt Ltd. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From questions.anon at gmail.com Tue Aug 23 19:00:36 2011 From: questions.anon at gmail.com (questions anon) Date: Wed, 24 Aug 2011 09:00:36 +1000 Subject: [SciPy-User] memory error - numpy mean - netcdf4 Message-ID: Hi All, I am receiving a memory error when I try to calculate the Numpy mean across many NetCDF files. Is there a way to fix this? The code I am using is below. Any feedback will be greatly appreciated. from netCDF4 import Dataset import matplotlib.pyplot as plt import numpy as N from mpl_toolkits.basemap import Basemap from netcdftime import utime from datetime import datetime import os MainFolder=r"E:/GriddedData/T_SFC/" all_TSFC=[] for (path, dirs, files) in os.walk(MainFolder): for dir in dirs: print dir path=path+'/' for ncfile in files: if ncfile[-3:]=='.nc': #print "dealing with ncfiles:", ncfile ncfile=os.path.join(path,ncfile) ncfile=Dataset(ncfile, 'r+', 'NETCDF4') TSFC=ncfile.variables['T_SFC'][4::24,:,:] LAT=ncfile.variables['latitude'][:] LON=ncfile.variables['longitude'][:] TIME=ncfile.variables['time'][:] fillvalue=ncfile.variables['T_SFC']._FillValue ncfile.close() #combine all TSFC to make one array for analyses all_TSFC.append(TSFC) big_array=N.ma.concatenate(all_TSFC) #calculate the mean of the combined array Mean=big_array.mean(axis=0) print "the mean is", Mean #plot output summary stats map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33, llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i') map.drawcoastlines() map.drawstates() x,y=map(*N.meshgrid(LON,LAT)) plt.title('TSFC Mean at 3pm') ticks=[-5,0,5,10,15,20,25,30,35,40,45,50] CS = map.contourf(x,y,Mean, cmap=plt.cm.jet) l,b,w,h =0.1,0.1,0.8,0.8 cax = plt.axes([l+w+0.025, b, 0.025, h]) plt.colorbar(CS,cax=cax, drawedges=True) plt.savefig((os.path.join(MainFolder, 'Mean.png'))) plt.show() plt.close() print "end processing" -------------- next part -------------- An HTML attachment was scrubbed... URL: From tsupinie at gmail.com Tue Aug 23 22:54:22 2011 From: tsupinie at gmail.com (Tim Supinie) Date: Tue, 23 Aug 2011 21:54:22 -0500 Subject: [SciPy-User] memory error - numpy mean - netcdf4 In-Reply-To: References: Message-ID: At what point in the program are you getting the error? Is there a stack trace? 
Pending the answers to those to questions, my first thought is to ask how much data you're loading into memory? How many files are there? It's possible that you're loading a whole bunch of data that you don't need, and it's not getting cleared out by the garbage collector, which can generate memory errors when you run out of memory. Try removing as much data loading as you can. (Are you using TIME? How big is each array you load in?) Also, if the lats and lons in all the different files are the same, only load the lats and lons from one file. All these will not only help your program use less memory, but help it run faster. Finally, if that doesn't work, use the gc module and run gc.collect() after every loop iteration to make sure Python's cleaning up after itself like it should. I think the garbage collector might not always run during loops, which can create problems when you're loading a whole bunch of unused data. Tim On Tue, Aug 23, 2011 at 6:00 PM, questions anon wrote: > Hi All, > I am receiving a memory error when I try to calculate the Numpy mean across > many NetCDF files. > Is there a way to fix this? The code I am using is below. > Any feedback will be greatly appreciated. > > > from netCDF4 import Dataset > import matplotlib.pyplot as plt > import numpy as N > from mpl_toolkits.basemap import Basemap > from netcdftime import utime > from datetime import datetime > import os > > MainFolder=r"E:/GriddedData/T_SFC/" > > all_TSFC=[] > for (path, dirs, files) in os.walk(MainFolder): > for dir in dirs: > print dir > path=path+'/' > for ncfile in files: > if ncfile[-3:]=='.nc': > #print "dealing with ncfiles:", ncfile > ncfile=os.path.join(path,ncfile) > ncfile=Dataset(ncfile, 'r+', 'NETCDF4') > TSFC=ncfile.variables['T_SFC'][4::24,:,:] > LAT=ncfile.variables['latitude'][:] > LON=ncfile.variables['longitude'][:] > TIME=ncfile.variables['time'][:] > fillvalue=ncfile.variables['T_SFC']._FillValue > ncfile.close() > > #combine all TSFC to make one array for analyses > all_TSFC.append(TSFC) > > big_array=N.ma.concatenate(all_TSFC) > #calculate the mean of the combined array > Mean=big_array.mean(axis=0) > print "the mean is", Mean > > > #plot output summary stats > map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33, > llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i') > map.drawcoastlines() > map.drawstates() > x,y=map(*N.meshgrid(LON,LAT)) > plt.title('TSFC Mean at 3pm') > ticks=[-5,0,5,10,15,20,25,30,35,40,45,50] > CS = map.contourf(x,y,Mean, cmap=plt.cm.jet) > l,b,w,h =0.1,0.1,0.8,0.8 > cax = plt.axes([l+w+0.025, b, 0.025, h]) > plt.colorbar(CS,cax=cax, drawedges=True) > > plt.savefig((os.path.join(MainFolder, 'Mean.png'))) > plt.show() > plt.close() > > print "end processing" > > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexandre.fayolle at logilab.fr Wed Aug 24 07:38:50 2011 From: alexandre.fayolle at logilab.fr (Alexandre Fayolle) Date: Wed, 24 Aug 2011 13:38:50 +0200 Subject: [SciPy-User] 3d convex hull Message-ID: <201108241338.50940.alexandre.fayolle@logilab.fr> Hello, Is there any implementation of 3d convex hull computation algorithms in scipy? 
Thanks -- Alexandre Fayolle LOGILAB, Paris (France) Formations Python, CubicWeb, Debian : http://www.logilab.fr/formations D?veloppement logiciel sur mesure : http://www.logilab.fr/services Informatique scientifique: http://www.logilab.fr/science From keith.hughitt at gmail.com Wed Aug 24 09:00:59 2011 From: keith.hughitt at gmail.com (Keith Hughitt) Date: Wed, 24 Aug 2011 09:00:59 -0400 Subject: [SciPy-User] 3d convex hull In-Reply-To: <201108241338.50940.alexandre.fayolle@logilab.fr> References: <201108241338.50940.alexandre.fayolle@logilab.fr> Message-ID: How about http://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.Delaunay.convex_hull.html ? On Wed, Aug 24, 2011 at 7:38 AM, Alexandre Fayolle < alexandre.fayolle at logilab.fr> wrote: > Hello, > > Is there any implementation of 3d convex hull computation algorithms in > scipy? > > Thanks > > -- > Alexandre Fayolle LOGILAB, Paris (France) > Formations Python, CubicWeb, Debian : http://www.logilab.fr/formations > D?veloppement logiciel sur mesure : http://www.logilab.fr/services > Informatique scientifique: http://www.logilab.fr/science > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abraham.zamudio at gmail.com Tue Aug 23 11:57:05 2011 From: abraham.zamudio at gmail.com (Abraham Zamudio) Date: Tue, 23 Aug 2011 08:57:05 -0700 (PDT) Subject: [SciPy-User] How to print the jacobian in the output of leastsq function Message-ID: <3c621117-68fe-4be3-8fd4-f8d398b240a3@t29g2000vby.googlegroups.com> Hi All, i use the leastsq function from module scipy ... but what I want now is the jacobian matrix of the algorithm ... How should I use this function to print the Jacobian ??? . Thx . From josef.pktd at gmail.com Wed Aug 24 10:23:56 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 24 Aug 2011 10:23:56 -0400 Subject: [SciPy-User] multivariate empirical distribution function, avoid double loop ? Message-ID: Does anyone know whether there is an algorithm that avoids the double loop to get a multivariate empirical distribution function? for point in data: count how many points in data are smaller or equal to point with 1d data it's just argsort(argsort(data)) double loop version with some test cases is attached. I didn't see a way that sorting would help. Thanks, Josef -------------- next part -------------- A non-text attachment was scrubbed... Name: try_mvecdf.py Type: text/x-python Size: 1022 bytes Desc: not available URL: From hhh.guo at gmail.com Wed Aug 24 11:56:36 2011 From: hhh.guo at gmail.com (Ning Guo) Date: Wed, 24 Aug 2011 23:56:36 +0800 Subject: [SciPy-User] 3d convex hull In-Reply-To: References: <201108241338.50940.alexandre.fayolle@logilab.fr> Message-ID: <4E551F34.4@gmail.com> On Wednesday, August 24, 2011 09:00 PM, Keith Hughitt wrote: Hi, I also want to try Delaunay function. But I cannot get enough info from the documentation. I want to output the Delaunay tetrahedral in 3D and need the vertex indices, facet areas and normals. How can I use the function in scipy.spatial? Now I have all the points with id and position. Thanks! > How about > http://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.Delaunay.convex_hull.html ? > > On Wed, Aug 24, 2011 at 7:38 AM, Alexandre Fayolle > > > wrote: > > Hello, > > Is there any implementation of 3d convex hull computation > algorithms in scipy? 
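On Ning's question above about vertex indices, facet areas and normals: for 3-D input, Delaunay.convex_hull (the attribute Keith's link points to) is an (nfacets, 3) array of indices into the input points, and areas and normals then follow from a cross product. A small sketch with a random point cloud; note that the resulting unit normals are not guaranteed to point outward.

import numpy as np
from scipy.spatial import Delaunay

pts = np.random.rand(30, 3)            # hypothetical point cloud
tri = Delaunay(pts)
hull = tri.convex_hull                 # (nfacets, 3) indices into pts
# the tetrahedra themselves are in tri.vertices, an (ntetra, 4) index array

v0, v1, v2 = pts[hull[:, 0]], pts[hull[:, 1]], pts[hull[:, 2]]
cross = np.cross(v1 - v0, v2 - v0)
areas = 0.5 * np.sqrt((cross ** 2).sum(axis=1))
normals = cross / (2.0 * areas[:, np.newaxis])   # unit normals, orientation not fixed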
> > Thanks > > -- > Alexandre Fayolle LOGILAB, Paris (France) > Formations Python, CubicWeb, Debian : http://www.logilab.fr/formations > D?veloppement logiciel sur mesure : http://www.logilab.fr/services > Informatique scientifique: http://www.logilab.fr/science > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- Geotechnical Group Department of Civil and Environmental Engineering Hong Kong University of Science and Technology Clear Water Bay, Kowloon, Hong Kong -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.isaac at gmail.com Wed Aug 24 14:27:15 2011 From: alan.isaac at gmail.com (Alan G Isaac) Date: Wed, 24 Aug 2011 14:27:15 -0400 Subject: [SciPy-User] multivariate empirical distribution function, avoid double loop ? In-Reply-To: References: Message-ID: <4E554283.9040606@gmail.com> On 8/24/2011 10:23 AM, josef.pktd at gmail.com wrote: > Does anyone know whether there is an algorithm that avoids the double > loop to get a multivariate empirical distribution function? I think that is pretty standard. I'll attach something posted awhile ago. It seemed right at the time, but I did not test it. Once upon a time it was at http://svn.scipy.org/svn/scipy/trunk/scipy/sandbox/dhuard/stats.py Cheers, Alan def empiricalcdf(data, method='Hazen'): """Return the empirical cdf. Methods available (here i goes from 1 to N) Hazen: (i-0.5)/N Weibull: i/(N+1) Chegodayev: (i-.3)/(N+.4) Cunnane: (i-.4)/(N+.2) Gringorten: (i-.44)/(N+.12) California: (i-1)/N :author: David Huard """ i = np.argsort(np.argsort(data)) + 1. nobs = len(data) method = method.lower() if method == 'hazen': cdf = (i-0.5)/nobs elif method == 'weibull': cdf = i/(nobs+1.) elif method == 'california': cdf = (i-1.)/nobs elif method == 'chegodayev': cdf = (i-.3)/(nobs+.4) elif method == 'cunnane': cdf = (i-.4)/(nobs+.2) elif method == 'gringorten': cdf = (i-.44)/(nobs+.12) else: raise 'Unknown method. Choose among Weibull, Hazen, Chegodayev, Cunnane, Gringorten and California.' return cdf From robert.kern at gmail.com Wed Aug 24 14:34:03 2011 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 24 Aug 2011 13:34:03 -0500 Subject: [SciPy-User] multivariate empirical distribution function, avoid double loop ? In-Reply-To: <4E554283.9040606@gmail.com> References: <4E554283.9040606@gmail.com> Message-ID: On Wed, Aug 24, 2011 at 13:27, Alan G Isaac wrote: > On 8/24/2011 10:23 AM, josef.pktd at gmail.com wrote: >> Does anyone know whether there is an algorithm that avoids the double >> loop to get a multivariate empirical distribution function? > > I think that is pretty standard. > I'll attach something posted awhile ago. > It seemed right at the time, but I did > not test it. ?Once upon a time it was at > http://svn.scipy.org/svn/scipy/trunk/scipy/sandbox/dhuard/stats.py That's *univariate*. He's asking for the multivariate case. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? 
-- Umberto Eco From josef.pktd at gmail.com Wed Aug 24 14:59:09 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 24 Aug 2011 14:59:09 -0400 Subject: [SciPy-User] multivariate empirical distribution function, avoid double loop ? In-Reply-To: <4E554283.9040606@gmail.com> References: <4E554283.9040606@gmail.com> Message-ID: On Wed, Aug 24, 2011 at 2:27 PM, Alan G Isaac wrote: > On 8/24/2011 10:23 AM, josef.pktd at gmail.com wrote: >> Does anyone know whether there is an algorithm that avoids the double >> loop to get a multivariate empirical distribution function? > > I think that is pretty standard. > I'll attach something posted awhile ago. > It seemed right at the time, but I did > not test it. ?Once upon a time it was at > http://svn.scipy.org/svn/scipy/trunk/scipy/sandbox/dhuard/stats.py > > Cheers, > Alan > > > def empiricalcdf(data, method='Hazen'): > ? ? """Return the empirical cdf. > > ? ? Methods available (here i goes from 1 to N) > ? ? ? ? Hazen: ? ? ? (i-0.5)/N > ? ? ? ? Weibull: ? ? i/(N+1) > ? ? ? ? Chegodayev: ?(i-.3)/(N+.4) > ? ? ? ? Cunnane: ? ? (i-.4)/(N+.2) > ? ? ? ? Gringorten: ?(i-.44)/(N+.12) > ? ? ? ? California: ?(i-1)/N > > ? ? :author: David Huard > ? ? """ > ? ? i = np.argsort(np.argsort(data)) + 1. > ? ? nobs = len(data) > ? ? method = method.lower() > ? ? if method == 'hazen': > ? ? ? ? cdf = (i-0.5)/nobs > ? ? elif method == 'weibull': > ? ? ? ? cdf = i/(nobs+1.) > ? ? elif method == 'california': > ? ? ? ? cdf = (i-1.)/nobs > ? ? elif method == 'chegodayev': > ? ? ? ? cdf = (i-.3)/(nobs+.4) > ? ? elif method == 'cunnane': > ? ? ? ? cdf = (i-.4)/(nobs+.2) > ? ? elif method == 'gringorten': > ? ? ? ? cdf = (i-.44)/(nobs+.12) > ? ? else: > ? ? ? ? raise 'Unknown method. Choose among Weibull, Hazen, Chegodayev, Cunnane, Gringorten and California.' > ? ? return cdf Unfortunately it's 1d only, and I am working on multivariate, at least bivariate. Pierre has a 1d version similar to this in scipy.stats.mstats and a, so far unused, copy is in statsmodels. Thanks, Josef > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From david_baddeley at yahoo.com.au Wed Aug 24 16:02:12 2011 From: david_baddeley at yahoo.com.au (David Baddeley) Date: Wed, 24 Aug 2011 13:02:12 -0700 (PDT) Subject: [SciPy-User] multivariate empirical distribution function, avoid double loop ? Message-ID: <1314216132.86532.yint-ygo-j2me@web113416.mail.gq1.yahoo.com> Sounds like it could be a case for scipy.spatial.kdtree. Cheers, David On Thu, 25 Aug 2011 06:59 NZST josef.pktd at gmail.com wrote: >On Wed, Aug 24, 2011 at 2:27 PM, Alan G Isaac wrote: >> On 8/24/2011 10:23 AM, josef.pktd at gmail.com wrote: >>> Does anyone know whether there is an algorithm that avoids the double >>> loop to get a multivariate empirical distribution function? >> >> I think that is pretty standard. >> I'll attach something posted awhile ago. >> It seemed right at the time, but I did >> not test it. ?Once upon a time it was at >> http://svn.scipy.org/svn/scipy/trunk/scipy/sandbox/dhuard/stats.py >> >> Cheers, >> Alan >> >> >> def empiricalcdf(data, method='Hazen'): >> ? ? """Return the empirical cdf. >> >> ? ? Methods available (here i goes from 1 to N) >> ? ? ? ? Hazen: ? ? ? (i-0.5)/N >> ? ? ? ? Weibull: ? ? i/(N+1) >> ? ? ? ? Chegodayev: ?(i-.3)/(N+.4) >> ? ? ? ? Cunnane: ? ? (i-.4)/(N+.2) >> ? ? ? ? Gringorten: ?(i-.44)/(N+.12) >> ? ? ? ? California: ?(i-1)/N >> >> ? ? 
:author: David Huard >> ? ? """ >> ? ? i = np.argsort(np.argsort(data)) + 1. >> ? ? nobs = len(data) >> ? ? method = method.lower() >> ? ? if method == 'hazen': >> ? ? ? ? cdf = (i-0.5)/nobs >> ? ? elif method == 'weibull': >> ? ? ? ? cdf = i/(nobs+1.) >> ? ? elif method == 'california': >> ? ? ? ? cdf = (i-1.)/nobs >> ? ? elif method == 'chegodayev': >> ? ? ? ? cdf = (i-.3)/(nobs+.4) >> ? ? elif method == 'cunnane': >> ? ? ? ? cdf = (i-.4)/(nobs+.2) >> ? ? elif method == 'gringorten': >> ? ? ? ? cdf = (i-.44)/(nobs+.12) >> ? ? else: >> ? ? ? ? raise 'Unknown method. Choose among Weibull, Hazen, Chegodayev, Cunnane, Gringorten and California.' >> ? ? return cdf > > >Unfortunately it's 1d only, and I am working on multivariate, at least >bivariate. > >Pierre has a 1d version similar to this in scipy.stats.mstats and a, >so far unused, copy is in statsmodels. > >Thanks, >Josef > > >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >_______________________________________________ >SciPy-User mailing list >SciPy-User at scipy.org >http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Wed Aug 24 17:52:48 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 24 Aug 2011 17:52:48 -0400 Subject: [SciPy-User] multivariate empirical distribution function, avoid double loop ? In-Reply-To: <1314216132.86532.yint-ygo-j2me@web113416.mail.gq1.yahoo.com> References: <1314216132.86532.yint-ygo-j2me@web113416.mail.gq1.yahoo.com> Message-ID: On Wed, Aug 24, 2011 at 4:02 PM, David Baddeley wrote: > Sounds like it could be a case for scipy.spatial.kdtree. I don't see a way. Any suggestions? "find all smaller points" doesn't induce a complete order, so I don't see a way to define a distance. The only way to reduce comparisons is graphical (?) rule out cases that we know cannot be smaller. I got a little bit further def mvecdfvalues_noties(data): #use sort on first column ev = np.empty(data.shape[0], int) sortind0 = np.argsort(data[:,0]) datas = data[sortind0, ...] for i,x in enumerate(datas): ev[i] = mvecdf(datas[:i, 1:], x[1:]) + 1 #it should be possible to make this recursive ev2 = np.empty(data.shape[0], int) ev2[sortind0] = ev return ev2 poor man's timing import time t0 = time.time() x4 = np.random.randn(5000,2) mvecdfvalues(x4), t1 = time.time() mvecdfvalues_noties(x4) t2 = time.time() print t1-t0, t2-t1 >>> 5.492000103 1.59099984169 >>> 5.492000103 / 1.59099984169 3.4519174415292584 much better, but still pretty expensive in a Monte Carlo or Bootstrap. Cheers, Josef > > Cheers, David > > On Thu, 25 Aug 2011 06:59 NZST josef.pktd at gmail.com wrote: > >>On Wed, Aug 24, 2011 at 2:27 PM, Alan G Isaac wrote: >>> On 8/24/2011 10:23 AM, josef.pktd at gmail.com wrote: >>>> Does anyone know whether there is an algorithm that avoids the double >>>> loop to get a multivariate empirical distribution function? >>> >>> I think that is pretty standard. >>> I'll attach something posted awhile ago. >>> It seemed right at the time, but I did >>> not test it. ?Once upon a time it was at >>> http://svn.scipy.org/svn/scipy/trunk/scipy/sandbox/dhuard/stats.py >>> >>> Cheers, >>> Alan >>> >>> >>> def empiricalcdf(data, method='Hazen'): >>> ? ? """Return the empirical cdf. >>> >>> ? ? Methods available (here i goes from 1 to N) >>> ? ? ? ? Hazen: ? ? ? (i-0.5)/N >>> ? ? ? ? Weibull: ? ? i/(N+1) >>> ? ? ? ? Chegodayev: ?(i-.3)/(N+.4) >>> ? ? ? ? Cunnane: ? ? 
(i-.4)/(N+.2) >>> ? ? ? ? Gringorten: ?(i-.44)/(N+.12) >>> ? ? ? ? California: ?(i-1)/N >>> >>> ? ? :author: David Huard >>> ? ? """ >>> ? ? i = np.argsort(np.argsort(data)) + 1. >>> ? ? nobs = len(data) >>> ? ? method = method.lower() >>> ? ? if method == 'hazen': >>> ? ? ? ? cdf = (i-0.5)/nobs >>> ? ? elif method == 'weibull': >>> ? ? ? ? cdf = i/(nobs+1.) >>> ? ? elif method == 'california': >>> ? ? ? ? cdf = (i-1.)/nobs >>> ? ? elif method == 'chegodayev': >>> ? ? ? ? cdf = (i-.3)/(nobs+.4) >>> ? ? elif method == 'cunnane': >>> ? ? ? ? cdf = (i-.4)/(nobs+.2) >>> ? ? elif method == 'gringorten': >>> ? ? ? ? cdf = (i-.44)/(nobs+.12) >>> ? ? else: >>> ? ? ? ? raise 'Unknown method. Choose among Weibull, Hazen, Chegodayev, Cunnane, Gringorten and California.' >>> ? ? return cdf >> >> >>Unfortunately it's 1d only, and I am working on multivariate, at least >>bivariate. >> >>Pierre has a 1d version similar to this in scipy.stats.mstats and a, >>so far unused, copy is in statsmodels. >> >>Thanks, >>Josef >> >> >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>_______________________________________________ >>SciPy-User mailing list >>SciPy-User at scipy.org >>http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From robert.kern at gmail.com Wed Aug 24 19:25:12 2011 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 24 Aug 2011 18:25:12 -0500 Subject: [SciPy-User] multivariate empirical distribution function, avoid double loop ? In-Reply-To: References: Message-ID: On Wed, Aug 24, 2011 at 09:23, wrote: > Does anyone know whether there is an algorithm that avoids the double > loop to get a multivariate empirical distribution function? > > for point in data: > ? ? count how many points in data are smaller or equal to point > > with 1d data it's just argsort(argsort(data)) > > double loop version with some test cases is attached. > > I didn't see a way that sorting would help. If you can bear to make a few (nobs, nobs) bool arrays, you can do just a kvars-sized loop in Python: dominates = np.ones((len(data), len(data)), dtype=bool) for x in data.T: dominates &= x[:,np.newaxis] > x sorta_ranks = dominates.sum(axis=1) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From josef.pktd at gmail.com Wed Aug 24 21:23:09 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 24 Aug 2011 21:23:09 -0400 Subject: [SciPy-User] multivariate empirical distribution function, avoid double loop ? In-Reply-To: References: Message-ID: On Wed, Aug 24, 2011 at 7:25 PM, Robert Kern wrote: > On Wed, Aug 24, 2011 at 09:23, ? wrote: >> Does anyone know whether there is an algorithm that avoids the double >> loop to get a multivariate empirical distribution function? >> >> for point in data: >> ? ? count how many points in data are smaller or equal to point >> >> with 1d data it's just argsort(argsort(data)) >> >> double loop version with some test cases is attached. >> >> I didn't see a way that sorting would help. 
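For anyone following without the attachment, here is a small self-contained illustration (not the attached script) of the double loop next to the boolean-matrix idea from Robert's reply above, written with <= so that it matches the "smaller or equal" definition:

import numpy as np

data = np.random.randn(6, 2)

# plain double loop: for each point, count points that are <= in every column
ecdf_loop = np.array([np.all(data <= p, axis=1).sum() for p in data])

# boolean-matrix version: one pass per column instead of one per observation
dominated = np.ones((len(data), len(data)), dtype=bool)
for col in data.T:
    dominated &= (col <= col[:, np.newaxis])
ecdf_vec = dominated.sum(axis=1)

assert (ecdf_loop == ecdf_vec).all()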
> > If you can bear to make a few (nobs, nobs) bool arrays, you can do > just a kvars-sized loop in Python: > > dominates = np.ones((len(data), len(data)), dtype=bool) > for x in data.T: > ? ?dominates &= x[:,np.newaxis] > x > sorta_ranks = dominates.sum(axis=1) Thanks, quite a bit better, 14 times faster for (5000,2) and still 2.5 times faster for (5000,20), 12 times for (10000,3) compared to my original. Josef > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > ? -- Umberto Eco > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From charlesr.harris at gmail.com Thu Aug 25 00:11:17 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 24 Aug 2011 22:11:17 -0600 Subject: [SciPy-User] How to print the jacobian in the output of leastsq function In-Reply-To: <3c621117-68fe-4be3-8fd4-f8d398b240a3@t29g2000vby.googlegroups.com> References: <3c621117-68fe-4be3-8fd4-f8d398b240a3@t29g2000vby.googlegroups.com> Message-ID: On Tue, Aug 23, 2011 at 9:57 AM, Abraham Zamudio wrote: > Hi All, > > i use the leastsq function from module scipy ... but what I want now > is the jacobian matrix of the algorithm ... How should I use this > function to print the Jacobian ??? . > > That's tricky, as what is returned is the qr factorization of the Jacobian stored in condensed form containing the r part and what I think are the vectors of the Householder reflections that can be used to generate q. In addition, the columns are pivoted. So I think it would take a bit of research and work to recover the Jacobian. Perhaps someone here knows a bit more about the specific function used in this function. If you just want the covariance of the coefficients, that is easier to get, but note the documentation is incorrect, you need to multiply by the variance, not the standard deviation. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From questions.anon at gmail.com Thu Aug 25 00:39:47 2011 From: questions.anon at gmail.com (questions anon) Date: Thu, 25 Aug 2011 14:39:47 +1000 Subject: [SciPy-User] memory error - numpy mean - netcdf4 In-Reply-To: References: Message-ID: Thanks for your response. The error I am receiving is: * * *Traceback (most recent call last):* * File "d:\documents and settings\SLBurns\Work\My Dropbox\Python_code\calculate_the_mean_across_multiple_netcdf_files_in_multiple_folders_add_shp_select_dirs.py", line 50, in * * big_array=N.ma.concatenate(all_TSFC)* * File "C:\Python27\lib\site-packages\numpy\ma\core.py", line 6155, in concatenate* * d = np.concatenate([getdata(a) for a in arrays], axis)* *MemoryError* I have tried ignoring TIME and only using one slice of lat and long (because they are the same for every file). I also tried entering the gc.collect() in the loop but nothing seemed to help. Anything else I could try? I am dealing with hundreds of files so maybe I need a whole different method to calculate the mean? On Wed, Aug 24, 2011 at 12:54 PM, Tim Supinie wrote: > At what point in the program are you getting the error? Is there a stack > trace? > > Pending the answers to those to questions, my first thought is to ask how > much data you're loading into memory? How many files are there? 
It's > possible that you're loading a whole bunch of data that you don't need, and > it's not getting cleared out by the garbage collector, which can generate > memory errors when you run out of memory. Try removing as much data loading > as you can. (Are you using TIME? How big is each array you load in?) > Also, if the lats and lons in all the different files are the same, only > load the lats and lons from one file. All these will not only help your > program use less memory, but help it run faster. > > Finally, if that doesn't work, use the gc module and run gc.collect() after > every loop iteration to make sure Python's cleaning up after itself like it > should. I think the garbage collector might not always run during loops, > which can create problems when you're loading a whole bunch of unused data. > > Tim > > On Tue, Aug 23, 2011 at 6:00 PM, questions anon wrote: > >> Hi All, >> I am receiving a memory error when I try to calculate the Numpy mean >> across many NetCDF files. >> Is there a way to fix this? The code I am using is below. >> Any feedback will be greatly appreciated. >> >> >> from netCDF4 import Dataset >> import matplotlib.pyplot as plt >> import numpy as N >> from mpl_toolkits.basemap import Basemap >> from netcdftime import utime >> from datetime import datetime >> import os >> >> MainFolder=r"E:/GriddedData/T_SFC/" >> >> all_TSFC=[] >> for (path, dirs, files) in os.walk(MainFolder): >> for dir in dirs: >> print dir >> path=path+'/' >> for ncfile in files: >> if ncfile[-3:]=='.nc': >> #print "dealing with ncfiles:", ncfile >> ncfile=os.path.join(path,ncfile) >> ncfile=Dataset(ncfile, 'r+', 'NETCDF4') >> TSFC=ncfile.variables['T_SFC'][4::24,:,:] >> LAT=ncfile.variables['latitude'][:] >> LON=ncfile.variables['longitude'][:] >> TIME=ncfile.variables['time'][:] >> fillvalue=ncfile.variables['T_SFC']._FillValue >> ncfile.close() >> >> #combine all TSFC to make one array for analyses >> all_TSFC.append(TSFC) >> >> big_array=N.ma.concatenate(all_TSFC) >> #calculate the mean of the combined array >> Mean=big_array.mean(axis=0) >> print "the mean is", Mean >> >> >> #plot output summary stats >> map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33, >> llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i') >> map.drawcoastlines() >> map.drawstates() >> x,y=map(*N.meshgrid(LON,LAT)) >> plt.title('TSFC Mean at 3pm') >> ticks=[-5,0,5,10,15,20,25,30,35,40,45,50] >> CS = map.contourf(x,y,Mean, cmap=plt.cm.jet) >> l,b,w,h =0.1,0.1,0.8,0.8 >> cax = plt.axes([l+w+0.025, b, 0.025, h]) >> plt.colorbar(CS,cax=cax, drawedges=True) >> >> plt.savefig((os.path.join(MainFolder, 'Mean.png'))) >> plt.show() >> plt.close() >> >> print "end processing" >> >> >> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... 
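One way to avoid building the big concatenated array at all is to keep a running sum and a counter while walking the files, as the replies in this thread suggest. The sketch below reuses MainFolder, the directory walk and the T_SFC variable from the script quoted above; the [4::24, :, :] slice is kept only as a placeholder and the loop is simplified, so treat it as an outline under those assumptions rather than the poster's actual code.

from netCDF4 import Dataset
import os

MainFolder = r"E:/GriddedData/T_SFC/"

total = None   # running sum of all time slices seen so far
count = 0      # number of time slices accumulated

for (path, dirs, files) in os.walk(MainFolder):
    for ncfile in files:
        if ncfile[-3:] != '.nc':
            continue
        f = Dataset(os.path.join(path, ncfile), 'r')
        TSFC = f.variables['T_SFC'][4::24, :, :]   # read only the slice that is needed
        f.close()
        # accumulate a sum and a counter instead of keeping every array around
        if total is None:
            total = TSFC.sum(axis=0)
        else:
            total = total + TSFC.sum(axis=0)
        count += TSFC.shape[0]

Mean = total / float(count)

If the files contain masked (fill) values, a single scalar counter is not enough, because the number of valid slices then varies per grid cell; in that case one would accumulate TSFC.count(axis=0) as a second array and divide by that instead.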
URL: From josef.pktd at gmail.com Thu Aug 25 00:50:04 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 25 Aug 2011 00:50:04 -0400 Subject: [SciPy-User] How to print the jacobian in the output of leastsq function In-Reply-To: References: <3c621117-68fe-4be3-8fd4-f8d398b240a3@t29g2000vby.googlegroups.com> Message-ID: On Thu, Aug 25, 2011 at 12:11 AM, Charles R Harris wrote: > > > On Tue, Aug 23, 2011 at 9:57 AM, Abraham Zamudio > wrote: >> >> Hi All, >> >> i use ?the leastsq function from module scipy ... but what I want now >> is the jacobian matrix of the algorithm ... How should I use this >> function to print the Jacobian ??? . >> > > That's tricky, as what is returned is the qr factorization of the Jacobian > stored in condensed form containing the r part and what I think are the > vectors of the Householder reflections that can be used to generate q. In > addition, the columns are pivoted. So I think it would take a bit of > research and work to recover the Jacobian. Perhaps someone here knows a bit > more about the specific function used in this function. If you just want the > covariance of the coefficients, that is easier to get, but note the > documentation is incorrect, you need to multiply by the variance, not the > standard deviation. Is it even possible to recover the Jacobian from this? I never found a way (but I'm not an expert). I gave up and just calculate a numerical derivative at the solution. Since this question shows up regularly it would also be good to have an answer if it is a definite NO (or at least not with what the underlying function returns). Josef > > Chuck > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From charlesr.harris at gmail.com Thu Aug 25 01:04:06 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 24 Aug 2011 23:04:06 -0600 Subject: [SciPy-User] How to print the jacobian in the output of leastsq function In-Reply-To: References: <3c621117-68fe-4be3-8fd4-f8d398b240a3@t29g2000vby.googlegroups.com> Message-ID: On Wed, Aug 24, 2011 at 10:50 PM, wrote: > On Thu, Aug 25, 2011 at 12:11 AM, Charles R Harris > wrote: > > > > > > On Tue, Aug 23, 2011 at 9:57 AM, Abraham Zamudio < > abraham.zamudio at gmail.com> > > wrote: > >> > >> Hi All, > >> > >> i use the leastsq function from module scipy ... but what I want now > >> is the jacobian matrix of the algorithm ... How should I use this > >> function to print the Jacobian ??? . > >> > > > > That's tricky, as what is returned is the qr factorization of the > Jacobian > > stored in condensed form containing the r part and what I think are the > > vectors of the Householder reflections that can be used to generate q. In > > addition, the columns are pivoted. So I think it would take a bit of > > research and work to recover the Jacobian. Perhaps someone here knows a > bit > > more about the specific function used in this function. If you just want > the > > covariance of the coefficients, that is easier to get, but note the > > documentation is incorrect, you need to multiply by the variance, not the > > standard deviation. > > Is it even possible to recover the Jacobian from this? > > I never found a way (but I'm not an expert). I gave up and just > calculate a numerical derivative at the solution. 
> > Since this question shows up regularly it would also be good to have > an answer if it is a definite NO (or at least not with what the > underlying function returns). > > I'm pretty sure the Jacobian can be recovered but I'd have to read through the code to see how. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From srean.list at gmail.com Thu Aug 25 01:53:59 2011 From: srean.list at gmail.com (srean) Date: Thu, 25 Aug 2011 00:53:59 -0500 Subject: [SciPy-User] memory error - numpy mean - netcdf4 In-Reply-To: References: Message-ID: Since you are processing so many files, wouldnt it be better to update the mean from every file and close/unload that netcdf, i.e. do it one at a time ? You would not need to load the entore data set into memory and neither will you need to maintain the sum (which might risk an overflow in some cases) On Wed, Aug 24, 2011 at 11:39 PM, questions anon wrote: > Thanks for your response. > The error I am receiving is: > * > * > *Traceback (most recent call last):* > * File "d:\documents and settings\SLBurns\Work\My > Dropbox\Python_code\calculate_the_mean_across_multiple_netcdf_files_in_multiple_folders_add_shp_select_dirs.py", > line 50, in * > * big_array=N.ma.concatenate(all_TSFC)* > * File "C:\Python27\lib\site-packages\numpy\ma\core.py", line 6155, in > concatenate* > * d = np.concatenate([getdata(a) for a in arrays], axis)* > *MemoryError* > > > I have tried ignoring TIME and only using one slice of lat and long > (because they are the same for every file). I also tried entering > the gc.collect() in the loop but nothing seemed to help. > Anything else I could try? I am dealing with hundreds of files so maybe I > need a whole different method to calculate the mean? > > > > On Wed, Aug 24, 2011 at 12:54 PM, Tim Supinie wrote: > >> At what point in the program are you getting the error? Is there a stack >> trace? >> >> Pending the answers to those to questions, my first thought is to ask how >> much data you're loading into memory? How many files are there? It's >> possible that you're loading a whole bunch of data that you don't need, and >> it's not getting cleared out by the garbage collector, which can generate >> memory errors when you run out of memory. Try removing as much data loading >> as you can. (Are you using TIME? How big is each array you load in?) >> Also, if the lats and lons in all the different files are the same, only >> load the lats and lons from one file. All these will not only help your >> program use less memory, but help it run faster. >> >> Finally, if that doesn't work, use the gc module and run gc.collect() >> after every loop iteration to make sure Python's cleaning up after itself >> like it should. I think the garbage collector might not always run during >> loops, which can create problems when you're loading a whole bunch of unused >> data. >> >> Tim >> >> On Tue, Aug 23, 2011 at 6:00 PM, questions anon > > wrote: >> >>> Hi All, >>> I am receiving a memory error when I try to calculate the Numpy mean >>> across many NetCDF files. >>> Is there a way to fix this? The code I am using is below. >>> Any feedback will be greatly appreciated. 
>>> >>> >>> from netCDF4 import Dataset >>> import matplotlib.pyplot as plt >>> import numpy as N >>> from mpl_toolkits.basemap import Basemap >>> from netcdftime import utime >>> from datetime import datetime >>> import os >>> >>> MainFolder=r"E:/GriddedData/T_SFC/" >>> >>> all_TSFC=[] >>> for (path, dirs, files) in os.walk(MainFolder): >>> for dir in dirs: >>> print dir >>> path=path+'/' >>> for ncfile in files: >>> if ncfile[-3:]=='.nc': >>> #print "dealing with ncfiles:", ncfile >>> ncfile=os.path.join(path,ncfile) >>> ncfile=Dataset(ncfile, 'r+', 'NETCDF4') >>> TSFC=ncfile.variables['T_SFC'][4::24,:,:] >>> LAT=ncfile.variables['latitude'][:] >>> LON=ncfile.variables['longitude'][:] >>> TIME=ncfile.variables['time'][:] >>> fillvalue=ncfile.variables['T_SFC']._FillValue >>> ncfile.close() >>> >>> #combine all TSFC to make one array for analyses >>> all_TSFC.append(TSFC) >>> >>> big_array=N.ma.concatenate(all_TSFC) >>> #calculate the mean of the combined array >>> Mean=big_array.mean(axis=0) >>> print "the mean is", Mean >>> >>> >>> #plot output summary stats >>> map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33, >>> llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i') >>> map.drawcoastlines() >>> map.drawstates() >>> x,y=map(*N.meshgrid(LON,LAT)) >>> plt.title('TSFC Mean at 3pm') >>> ticks=[-5,0,5,10,15,20,25,30,35,40,45,50] >>> CS = map.contourf(x,y,Mean, cmap=plt.cm.jet) >>> l,b,w,h =0.1,0.1,0.8,0.8 >>> cax = plt.axes([l+w+0.025, b, 0.025, h]) >>> plt.colorbar(CS,cax=cax, drawedges=True) >>> >>> plt.savefig((os.path.join(MainFolder, 'Mean.png'))) >>> plt.show() >>> plt.close() >>> >>> print "end processing" >>> >>> >>> >>> >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Thu Aug 25 03:05:03 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Thu, 25 Aug 2011 09:05:03 +0200 Subject: [SciPy-User] Speeding up Python Again In-Reply-To: References: Message-ID: On Tue, Aug 23, 2011 at 11:11 AM, Rajeev Singh wrote: > > > On Wed, Aug 10, 2011 at 6:48 PM, Rajeev Singh wrote: > > Hi, > > I was trying out the codes discussed > > at > http://technicaldiscovery.blogspot.com/2011/07/speeding-up-python-again.html > > Here is a summary of my results - > > Computer: Desktop imsc9 aravali annapurna > > NumPy: 7.651419 4.219105 5.576453 4.858640 > > Cython: 4.259419 3.477259 3.204909 2.357819 > > Weave: 4.302778 * 3.298551 2.400000 > > Looped Fortran: 4.199148 3.414484 3.202963 2.315644 > > Vectorized Fortran: 3.118410 2.131966 1.512303 1.460251 > > pure fortran update1: 1.205727 1.964857 2.034688 1.336086 > > pure fortran update2: 0.600848 0.604649 0.573593 0.721339 > > imsc9, aravali and annapurna are HPC machines at my institute > > * for some reason Weave didn't compile on imsc9 > > > > Indeed there is about a factor of 7 to 12 difference between pure fortran > > with update2 (vectorized) and the numpy version. 
> > I should mention that I changed N to 150 in laplace_for.f90 > > Rajeev > > Hi, > > Continuing the comparison of various ways of implementing solving laplace > equation, following result might interest you - > > Desktop imsc9 aravali annapurna > Octave (0): 20.7866 * 21.6179 * > Vectorized Fortran (pure) (1): 0.7487 0.6501 0.7507 1.1619 > Vectorized Fortran (f2py) (2): 0.7190 0.6089 0.6243 1.0312 > NumPy (3): 4.1343 2.5844 2.6565 3.7445 > Cython (4): 1.7273 1.9927 2.0471 1.3525 > Cython with C (5): 1.7248 1.9665 2.0354 1.3367 > Weave (6): 1.9818 * 2.1326 1.4003 > Looped Fortran (f2py) (7): 1.6996 1.9657 2.0429 1.3354 > Looped Fortran (pure) (8): 1.7189 2.0145 2.0917 1.5086 > C (pure) (9): 1.2820 1.9948 2.0527 1.4259 > > imsc9, aravali and annapurna are HPC machines at my institute > * for some reason Weave didn't compile on imsc9 > * octave isn't installed on imsc9 and annapurna > > The difference between numpy and fortran performance seems significant. > However f2py does as well as pure fortran now. The difference from earlier > case is that earlier there was a division inside the loop which I have > replaced by multiplication by reciprocal. This does not affect the result > but makes the execution faster in all cases except pure fortran (I guess > fortran compiler was already doing it). > > I would be happy to give all the codes if someone is interested. Should we > update the performance python page at scipy with these codes? > > It would be nice to this to http://www.scipy.org/PerformancePython. That page currently has only one problem, to see a few different ones compared with the same method gives a better impression of speed differences. It's a wiki page, so you should be able to add your code, problem description and results yourself. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From brockp at umich.edu Thu Aug 25 12:43:02 2011 From: brockp at umich.edu (Brock Palen) Date: Thu, 25 Aug 2011 12:43:02 -0400 Subject: [SciPy-User] SciPy on HPC Podcast Message-ID: I host an HPC podcast www.rce-cast.com we have had numpy featured on the show before and would like now to include SciPy into the show. Would a Scipy dev or two be willing to take an hour to speak to us for the show? Feel free to contact me off list. Brock Palen www.umich.edu/~brockp Center for Advanced Computing brockp at umich.edu (734)936-1985 From jrgray at gmail.com Fri Aug 26 12:06:19 2011 From: jrgray at gmail.com (Jeremy Gray) Date: Fri, 26 Aug 2011 12:06:19 -0400 Subject: [SciPy-User] obtaining residuals from linear regression? Message-ID: Hi, I'm new to scipy, and hope the answer is not trivial. I've searched the archives, googled, and looked at documentation for linalg.lstsq() numpy.polyfit(), and scipy.stats.linregress(), but it has not answered my question. my goal is to linearly adjust a set of observations (Y) for nuisance variables (X1 ... Xn), so I can use the adjusted Y values in further computations. One way to achieve what I want is to do a linear regression, regressing out the nuisance variables, and saving the residuals (being the part of Y that's not explained by X). I see the option full=True returns residuals, but its the sum of the residuals, whereas I am after the actual residuals on a case by case basis. is there an option to get the raw residuals? it would save me computing them again. --Jeremy -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From josef.pktd at gmail.com Fri Aug 26 12:29:28 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 26 Aug 2011 12:29:28 -0400 Subject: [SciPy-User] obtaining residuals from linear regression? In-Reply-To: References: Message-ID: On Fri, Aug 26, 2011 at 12:06 PM, Jeremy Gray wrote: > Hi, > > I'm new to scipy, and hope the answer is not trivial. I've searched the > archives, googled, and looked at documentation for linalg.lstsq() > numpy.polyfit(), and scipy.stats.linregress(), but it has not answered my > question. > > my goal is to linearly adjust a set of observations (Y) for nuisance > variables (X1 ... Xn), so I can use the adjusted Y values in further > computations. One way to achieve what I want is to do a linear regression, > regressing out the nuisance variables, and saving the residuals (being the > part of Y that's not explained by X). > > I see the option full=True returns residuals, but its the sum of the > residuals, whereas I am after the actual residuals on a case by case basis. > > is there an option to get the raw residuals? it would save me computing them > again. for a full answer for the linear (in parameter) case http://statsmodels.sourceforge.net/generated/scikits.statsmodels.regression.linear_model.OLS.html http://statsmodels.sourceforge.net/generated/scikits.statsmodels.regression.linear_model.RegressionResults.html Josef > > --Jeremy > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From Dharhas.Pothina at twdb.state.tx.us Fri Aug 26 12:52:27 2011 From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina) Date: Fri, 26 Aug 2011 11:52:27 -0500 Subject: [SciPy-User] Calculate surface area & volumes from delaunay triangulation Message-ID: <4E5788FB0200009B0003D940@GWWEB.twdb.state.tx.us> Hi, We have an old ArcGIS aml script that we are trying to replace. The original script takes the input from an ArcGIS TIN model (basically a delaunay triangulation of irregular xyz data points defining the bottom surface of the lake) and calculates the surface area and volume of the lake at different elevations (i.e. z cut planes) >From my googling it looks like I have options for the delaunay triangulation using scipy, matplotlib, cgal or mayavi. I'm not sure how to do the surface area and volume calculations once I have the triangulation. I would appreciate any pointers. thanks, - dharhas -------------- next part -------------- An HTML attachment was scrubbed... URL: From philmorefield at yahoo.com Fri Aug 26 13:31:00 2011 From: philmorefield at yahoo.com (Phil Morefield) Date: Fri, 26 Aug 2011 10:31:00 -0700 (PDT) Subject: [SciPy-User] memory error - numpy mean - netcdf4 In-Reply-To: References: Message-ID: <1314379860.63640.YahooMailNeo@web161317.mail.bf1.yahoo.com> First off, the netCDF4 module has a multi-file class that concatonates multiple netCDF files for you: http://netcdf4-python.googlecode.com/svn/trunk/docs/netCDF4.MFDataset-class.html. That will simplify things a bit. ? Second, usually the "TIME" dimension is axis=2. Axes 0 and 1 usually correspond to the X and Y dimensions. ? Finally, you're getting the MemoryError because you're trying to?put an ginormous array into memory all at once. Your OS can't handle it.?Just loop through each?time step and keep a running total and?counter. Then divide your total?(which is an array) by?your counter (which is an integer or float) and presto: you have your average. 
It's plenty fast, don't worry. ? ? ? From: questions anon To: scipy-user at scipy.org Sent: Tuesday, August 23, 2011 7:00 PM Subject: [SciPy-User] memory error - numpy mean - netcdf4 Hi All, I am receiving a memory error when I try to calculate the Numpy mean across many NetCDF files. Is there a way to fix this? The code I am using is below. Any feedback will be greatly appreciated. from netCDF4 import Dataset import matplotlib.pyplot as plt import numpy as N from mpl_toolkits.basemap import Basemap from netcdftime import utime from datetime import datetime import os MainFolder=r"E:/GriddedData/T_SFC/" all_TSFC=[]? for (path, dirs, files) in os.walk(MainFolder): ? ? for dir in dirs: ? ? ? ? print dir ? ? path=path+'/' ? ? for ncfile in files: ? ? ? ? if ncfile[-3:]=='.nc': ? ? ? ? ? ? #print "dealing with ncfiles:", ncfile ? ? ? ? ? ? ncfile=os.path.join(path,ncfile) ? ? ? ? ? ? ncfile=Dataset(ncfile, 'r+', 'NETCDF4') ? ? ? ? ? ? TSFC=ncfile.variables['T_SFC'][4::24,:,:] ? ? ? ? ? ? LAT=ncfile.variables['latitude'][:] ? ? ? ? ? ? LON=ncfile.variables['longitude'][:] ? ? ? ? ? ? TIME=ncfile.variables['time'][:] ? ? ? ? ? ? fillvalue=ncfile.variables['T_SFC']._FillValue ? ? ? ? ? ? ncfile.close() ? ? ? ? ? ? #combine all TSFC to make one array for analyses ? ? ? ? ? ? all_TSFC.append(TSFC) ? ? ? ? ? ? big_array=N.ma.concatenate(all_TSFC) #calculate the mean of the combined array Mean=big_array.mean(axis=0) print "the mean is", Mean #plot output summary stats map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33, ? ? ? ? ? ? ? llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i') map.drawcoastlines() map.drawstates() x,y=map(*N.meshgrid(LON,LAT)) plt.title('TSFC Mean at 3pm') ticks=[-5,0,5,10,15,20,25,30,35,40,45,50] CS = map.contourf(x,y,Mean, cmap=plt.cm.jet) l,b,w,h =0.1,0.1,0.8,0.8 cax = plt.axes([l+w+0.025, b, 0.025, h]) plt.colorbar(CS,cax=cax, drawedges=True) plt.savefig((os.path.join(MainFolder, 'Mean.png'))) plt.show() plt.close() print "end processing" ? ? ? ? ? _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From srean.list at gmail.com Fri Aug 26 14:00:18 2011 From: srean.list at gmail.com (srean) Date: Fri, 26 Aug 2011 13:00:18 -0500 Subject: [SciPy-User] memory error - numpy mean - netcdf4 In-Reply-To: <1314379860.63640.YahooMailNeo@web161317.mail.bf1.yahoo.com> References: <1314379860.63640.YahooMailNeo@web161317.mail.bf1.yahoo.com> Message-ID: > Finally, you're getting the MemoryError because you're trying to put an > ginormous array into memory all at once. Your OS can't handle it. Just loop > through each time step and keep a running total and counter. Then divide > your total (which is an array) by your counter (which is an integer or > float) and presto: you have your average. It's plenty fast, don't worry. > In fact one can even avoid keeping the running total. If the values are integers then the running total may overflow. Say you have the mean \mu computed from N points and you have a new collection of m points whose mean is t. Then the mean on the N + m points is: \mu_{new} = \mu + (m)/(N+m) ( t - \mu) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From philmorefield at yahoo.com Fri Aug 26 15:33:53 2011 From: philmorefield at yahoo.com (Phil Morefield) Date: Fri, 26 Aug 2011 12:33:53 -0700 (PDT) Subject: [SciPy-User] memory error - numpy mean - netcdf4 In-Reply-To: References: <1314379860.63640.YahooMailNeo@web161317.mail.bf1.yahoo.com> Message-ID: <1314387233.14273.YahooMailNeo@web161303.mail.bf1.yahoo.com> "If the values are integers then the running total may overflow." That's a good point. Though you could just do this: ? ################################### import numpy as np ? array = netcdf_variable[0] ? for i in xrange(1, len(netcdf_variable) - 1, 1): ????array = np.true_divide(np.add(array,?array[i]), 2.0) ################################### ? The formula you have written looks like you're collapsing everything into a single value. I think he's trying to average a bunch of 2D arrays into a single 2D array. ? ? ? ? From: srean To: Phil Morefield ; SciPy Users List Sent: Friday, August 26, 2011 2:00 PM Subject: Re: [SciPy-User] memory error - numpy mean - netcdf4 Finally, you're getting the MemoryError because you're trying to?put an ginormous array into memory all at once. Your OS can't handle it.?Just loop through each?time step and keep a running total and?counter. Then divide your total?(which is an array) by?your counter (which is an integer or float) and presto: you have your average. It's plenty fast, don't worry. > In fact one can even avoid keeping the running total. If the values are integers then the running total may overflow. Say you have the mean \mu computed from N points and you have a new collection of m points whose mean is t. Then the mean on the N + m points is: ? \mu_{new} = \mu + (m)/(N+m) ( t - \mu) -------------- next part -------------- An HTML attachment was scrubbed... URL: From philmorefield at yahoo.com Fri Aug 26 17:58:17 2011 From: philmorefield at yahoo.com (Phil Morefield) Date: Fri, 26 Aug 2011 14:58:17 -0700 (PDT) Subject: [SciPy-User] Fw: memory error - numpy mean - netcdf4 In-Reply-To: <1314387233.14273.YahooMailNeo@web161303.mail.bf1.yahoo.com> References: <1314379860.63640.YahooMailNeo@web161317.mail.bf1.yahoo.com> <1314387233.14273.YahooMailNeo@web161303.mail.bf1.yahoo.com> Message-ID: <1314395897.41715.YahooMailNeo@web161309.mail.bf1.yahoo.com> import numpy as np ? array = netcdf_variable[0] ? for i in xrange(1, len(netcdf_variable) - 1, 1): ????array = np.true_divide(np.add(array,?array[i]), 2.0) ? ? Oops. That's not right. That's what I get for being hasty. Something like this maybe: ? ######################################### import numpy as np ? array = np.true_divide(netcdf_variable[0], len(netcdf_variable)) ? for i in xrange(1, len(netcdf_variable) - 1, 1): ????array = np.add(array, np.true_divide(array[i], len(netcdf_variable))) ######################################### ? ----- Forwarded Message ----- From: Phil Morefield To: srean ; SciPy Users List Sent: Friday, August 26, 2011 3:33 PM Subject: Re: [SciPy-User] memory error - numpy mean - netcdf4 "If the values are integers then the running total may overflow." That's a good point. Though you could just do this: ################################### import numpy as np array = netcdf_variable[0] for i in xrange(1, len(netcdf_variable) - 1, 1): ????array = np.true_divide(np.add(array,?array[i]), 2.0) ################################### ? The formula you have written looks like you're collapsing everything into a single value. I think he's trying to average a bunch of 2D arrays into a single 2D array. ? ? ? ? 
From: srean To: Phil Morefield ; SciPy Users List Sent: Friday, August 26, 2011 2:00 PM Subject: Re: [SciPy-User] memory error - numpy mean - netcdf4 Finally, you're getting the MemoryError because you're trying to?put an ginormous array into memory all at once. Your OS can't handle it.?Just loop through each?time step and keep a running total and?counter. Then divide your total?(which is an array) by?your counter (which is an integer or float) and presto: you have your average. It's plenty fast, don't worry. > In fact one can even avoid keeping the running total. If the values are integers then the running total may overflow. Say you have the mean \mu computed from N points and you have a new collection of m points whose mean is t. Then the mean on the N + m points is: ? \mu_{new} = \mu + (m)/(N+m) ( t - \mu) -------------- next part -------------- An HTML attachment was scrubbed... URL: From srean.list at gmail.com Fri Aug 26 20:54:41 2011 From: srean.list at gmail.com (srean) Date: Fri, 26 Aug 2011 19:54:41 -0500 Subject: [SciPy-User] memory error - numpy mean - netcdf4 In-Reply-To: <1314387233.14273.YahooMailNeo@web161303.mail.bf1.yahoo.com> References: <1314379860.63640.YahooMailNeo@web161317.mail.bf1.yahoo.com> <1314387233.14273.YahooMailNeo@web161303.mail.bf1.yahoo.com> Message-ID: On Fri, Aug 26, 2011 at 2:33 PM, Phil Morefield wrote: > > The formula you have written looks like you're collapsing everything into a > single value. I think he's trying to average a bunch of 2D arrays into a > single 2D array. > You are correct, the form that I posted can be read as if it is for updating single mean vector \mu, but you can use the same for an nd-array trivially. Just have \mu and t as nd-arrays. m can be one too. Numpy broadcasting will take care of the rest. One advantage is that it requires only a constant amount of memory for the computation, you can even read the data in from an infinite pipe or generator that yields a single vector or a matrix at a time (or bundles them up m at a time). It will always be uptodate with the current estimate of the means. In fact will work for any moment too. --srean -------------- next part -------------- An HTML attachment was scrubbed... URL: From lutz.maibaum at gmail.com Fri Aug 26 21:39:54 2011 From: lutz.maibaum at gmail.com (Lutz Maibaum) Date: Fri, 26 Aug 2011 18:39:54 -0700 Subject: [SciPy-User] 0 * inf = 0 for some sparse matrix operations Message-ID: Hello, most of the time, trying to multiply 0 by infinity results in NaNs. However, in some sparse matrix operations such as matrix multiplication, such terms seem to be silently ignored (or interpreted as zeros). Is this the intended behavior? Is there a way to perform matrix multiplications such that 0*inf = NaN? Any help would be much appreciated. 
Best wishes, Lutz In [2]: import numpy as np In [3]: from scipy.sparse import csr_matrix In [4]: 0 * np.inf Out[4]: nan In [5]: np.array([0]) * np.array([np.inf]) /opt/local/bin/ipython-2.6:1: RuntimeWarning: invalid value encountered in multiply #!/opt/local/Library/Frameworks/Python.framework/Versions/2.6/Resources/Python.app/Contents/MacOS/Python Out[5]: array([ nan]) In [6]: np.array([0]).dot(np.array([np.inf])) Out[6]: nan In [7]: (csr_matrix([0]) * csr_matrix([np.inf])).toarray() Out[7]: array([[ 0.]]) In [8]: (csr_matrix([0]).dot(csr_matrix([np.inf]))).toarray() Out[8]: array([[ 0.]]) In [9]: (csr_matrix([0]).multiply(csr_matrix([np.inf]))).toarray() Out[9]: array([[ nan]]) From dbigbear at gmail.com Sat Aug 27 01:19:09 2011 From: dbigbear at gmail.com (Xiong Deng) Date: Sat, 27 Aug 2011 13:19:09 +0800 Subject: [SciPy-User] Install Scipy Errors: ImportError: /path_to/liblapack.so: undefined symbol: ztbsv_ Message-ID: Hi all, I am installing lapack, atlas, numpy, scipy on my LINUX for TEN times, but always encountering the problem: [work at XXX]$ python -c 'import scipy.optimize; scipy.optimize.test()' Traceback (most recent call last): File "", line 1, in File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/scipy/optimize/__init__.py", line 11, in from lbfgsb import fmin_l_bfgs_b File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/scipy/optimize/lbfgsb.py", line 28, in import _lbfgsb ImportError: /home/work/local/lib/liblapack.so: undefined symbol: ztbsv_ I can pass some other tests like: [work at XXX:~/local]$ python -c 'import scipy.ndimage; scipy.ndimage.test()' Running unit tests for scipy.ndimage NumPy version 1.6.1 NumPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy SciPy version 0.9.0 SciPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site-packages/scipy Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 20051201 (Red Hat 3.4.5-2)] nose version 1.1.2 .........S................................................................................................................................................................................................................................................................................................................................................................................................................. ---------------------------------------------------------------------- Ran 411 tests in 1.247s OK (SKIP=1) The problem seems due to the lib of Lapack. So I tried the solutions posted on the internet before. 1) The liblapack.so may be not complete...SO I tried this: # integrate lapack with atlas: cd lib/ mkdir tmp cd tmp/ ar x ../liblapack.a cp ~/path_to/lapack-3.1.1/lapack_LINUX.a ../liblapack.a ar r ../liblapack.a *.o cd ../.. make check make ptcheck cp include/* ~/include/ cp lib/*.a ~/lib/ That is, after installing atlas, there is another liblapack.a (in addition to the lapack_LINUX.a after Lapack) in its lib, but it is about 500k, so I integrate it with the lapack_LINUX.a from installing Lapack. The final liblapack.a is about 9.3m, The liblapack.so is about 5m 2) re-install Lapack and atlas many times....No use 3) I found there is a lapack.so under scipy/lib, and it is about 500K, but I think it may be not the problem, becaues the failure is "ImportError: /home/work/local/lib/liblapack.so: undefined symbol: ztbsv_". Scipy seemed to import liblapack.so in my general lib directory... 
4) One thing I am not sure is that I used gcc 4.7 and gfortran to compile lapack and atlas, but my python 2.7 was built using gcc 3.4.5.....Is this a problem? Anyone can help? _______________________________________________________________ My configuration of the installation: * ATLAS 3.8.4 * lapack 3.3.1 * numpy 1.6.1 * SciPy version 0.9.0 * dateutil 1.5 * Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 20051201 (Red Hat 3.4.5-2)] * nose version 1.1.2 * gcc (GCC) 4.7.0 20110820 (experimental) * LINUX: Linux XXX 2.6.9_5-9-0-0 #1 SMP Wed Jun 23 14:03:19 CST 2010 x86_64 x86_64 x86_64 GNU/Linux site.cfg of Scipy: [DEFAULT] library_dirs = /home/work/local/lib include_dirs = /home/work/local/include [lapack_opt] libraries = lapack, f77blas, cblas, atlas site.cfg of Numpy: [DEFAULT] library_dirs = /home/work/local/lib include_dirs = /home/work/local/include [lapack_opt] libraries = lapack, f77blas, cblas, atlas In addition, there are failures as well when test Numpy: >>> import numpy >>> numpy.test('1') Running unit tests for numpy NumPy version 1.6.1 NumPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 20051201 (Red Hat 3.4.5-2)] nose version 1.1.2 ====================================================================== FAIL: Test basic arithmetic function errors ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/testing/decorators.py", line 215, in knownfailer return f(*args, **kwargs) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/core/tests/test_numeric.py", line 367, in test_floating_exceptions_power np.power, ftype(2), ftype(2**fi.nexp)) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/core/tests/test_numeric.py", line 271, in assert_raises_fpe "Type %s did not raise fpe error '%s'." % (ftype, fpeerr)) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/testing/utils.py", line 34, in assert_ raise AssertionError(msg) AssertionError: Type did not raise fpe error 'overflow'. ====================================================================== FAIL: Test generic loops. 
---------------------------------------------------------------------- Traceback (most recent call last): File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/core/tests/test_ufunc.py", line 86, in test_generic_loops assert_almost_equal(fone(x), fone_val, err_msg=msg) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/testing/utils.py", line 448, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal to 7 decimals PyUFunc_F_F ACTUAL: array([ 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j], dtype=complex64) DESIRED: 1 ====================================================================== FAIL: test_umath.TestComplexFunctions.test_loss_of_precision(,) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/core/tests/test_umath.py", line 931, in check_loss_of_precision check(x_basic, 2*eps/1e-3) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/core/tests/test_umath.py", line 901, in check 'arcsinh') AssertionError: (0, 0.0010023052, 0.9987238, 'arcsinh') ====================================================================== FAIL: test_umath.TestComplexFunctions.test_precisions_consistent ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/core/tests/test_umath.py", line 812, in test_precisions_consistent assert_almost_equal(fcf, fcd, decimal=6, err_msg='fch-fcd %s'%f) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/testing/utils.py", line 448, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal to 6 decimals fch-fcd ACTUAL: 2.3561945j DESIRED: (0.66623943249251527+1.0612750619050355j) ====================================================================== FAIL: test_kind.TestKind.test_all ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/f2py/tests/test_kind.py", line 30, in test_all 'selectedrealkind(%s): expected %r but got %r' % (i, selected_real_kind(i), selectedrealkind(i))) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/testing/utils.py", line 34, in assert_ raise AssertionError(msg) AssertionError: selectedrealkind(19): expected -1 but got 16 ---------------------------------------------------------------------- Ran 3552 tests in 29.977s FAILED (KNOWNFAIL=3, failures=5) -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Sat Aug 27 03:53:56 2011 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 27 Aug 2011 07:53:56 +0000 (UTC) Subject: [SciPy-User] 3d convex hull References: <201108241338.50940.alexandre.fayolle@logilab.fr> <4E551F34.4@gmail.com> Message-ID: Wed, 24 Aug 2011 23:56:36 +0800, Ning Guo wrote: > I also want to try Delaunay function. But I cannot get enough info from > the documentation. 
I want to output the Delaunay tetrahedral in 3D and > need the vertex indices, facet areas and normals. How can I use the > function in scipy.spatial? Now I have all the points with id and > position. Suppose you have 20 points in 3-D: import numpy as np import scipy.spatial points = np.random.rand(20, 3) tri = scipy.spatial.Delaunay(points) The indices of the vertices of tetrahedron number `j` are in `tri.vertices[j]`. The facet areas and normals can be computed for each tetrahedron via vector cross products: tetra_points = tri.points[tri.vertices] # (N, 4, 3) array face_normals = np.empty_like(tetra_points) face_normals[:,0] = np.cross(tetra_points[:,0], tetra_points[:,1]) face_normals[:,1] = np.cross(tetra_points[:,0], tetra_points[:,2]) face_normals[:,2] = np.cross(tetra_points[:,0], tetra_points[:,3]) face_normals[:,3] = np.cross(tetra_points[:,1], tetra_points[:,2]) face_normal_lengths = np.sqrt(np.sum(face_normals**2, axis=2)) face_normals /= face_normal_lengths[:,:,np.newaxis] face_areas = 0.5 * face_normal_lengths One important point I don't know at the moment is if those normals actually point away from the center of the tetrahedra. You'd have to check the Qhull documentation to check whether they have a winding convention that guarantees certain ordering of the vertices of the simplices. (Note: the above is untested code, so check it works first :) -- Pauli Virtanen From ralf.gommers at googlemail.com Sat Aug 27 07:59:47 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sat, 27 Aug 2011 13:59:47 +0200 Subject: [SciPy-User] lmfit-py -- simple least squares minimization In-Reply-To: References: Message-ID: Hi Matt, On Mon, Aug 15, 2011 at 3:05 PM, Matt Newville wrote: > Hi, > > Having used on numpy and scipy for many years and being very pleased > with them, I've found an area which I think might benefit from a > modest improvement, and have tried to implement this. > > The scipy.optimize routines are robust, but seem a little unfriendly > to people coming from proprietary environments or Numerical > Recipes-level tools. Specifically, the Levenberg-Marquardt algorithm > is used heavily in many domains (including the x-ray spectroscopy > fields I am most familiar with), but the MINPACK and > scipy.optimize.leastsq implementation lack convenient ways to: > - turn on/off parameters for fitting, that is, to "fix" > certain parameters. > - place simple min/max bounds on parameters > - place simple mathematical constraints on parameters. > > While these limitations can be worked around, doing so requires > putting many options into the function to be minimized, which is > somewhat inconvenient. On the other hand, these features do exist > in less robust fitting code that is not based on directly on MINPACK > or as well-supported as scipy. > > I've written a module to do this so that the least-squares > minimization from scipy.optimize.leastsq can take bounded and > constrained parameters, and tried to make it of general use. This > code (BSD-licensed, somewhat documented) is at > http://github.com/newville/lmfit-py > > The constraint mechanism is a bit involved (using the ast module > instead of 'eval'), but the rest of the code is quite straightforward > and simple. Currently, this supports minimization with > scipy.optimize.leastsq, scipy.optimize.fmin_l_bfgs_b, and > scipy.optimize.anneal. Supporting other algorithms could be possible. > > If you find this interesting or useful, I'd appreciate any feedback > you might have. 
For example, this is not currently organized as a > scikit -- would that be preferable? > > This will probably be useful to me at some point. Whether or not you organize it as a scikit, it may be good to list your package at http://scipy.org/Topical_Software. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From otrov at hush.ai Fri Aug 26 13:25:10 2011 From: otrov at hush.ai (Kliment) Date: Fri, 26 Aug 2011 19:25:10 +0200 Subject: [SciPy-User] Return variable value by function value Message-ID: <20110826172510.2EC696F446@smtp.hushmail.com> Hello, this will be very simple to any of you I guess, but I don't know well numpy. I declare "x = arange(1,100)" and "y = sqrt(1 - x**2/10E+4)" How can I return x value when y = 0.95 for example? I hope I don't have to transform y equation by x as I know that and expect some more strait approach Thanks for your time From josef.pktd at gmail.com Sat Aug 27 10:35:44 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 27 Aug 2011 10:35:44 -0400 Subject: [SciPy-User] Return variable value by function value In-Reply-To: <20110826172510.2EC696F446@smtp.hushmail.com> References: <20110826172510.2EC696F446@smtp.hushmail.com> Message-ID: On Fri, Aug 26, 2011 at 1:25 PM, Kliment wrote: > Hello, > > this will be very simple to any of you I guess, but I don't know > well numpy. > > I declare "x = arange(1,100)" and "y = sqrt(1 - x**2/10E+4)" > How can I return x value when y = 0.95 for example? > > I hope I don't have to transform y equation by x as I know that and > expect some more strait approach numerically: scipy.optimize rootfinding, fsolve, brentq, ... see docs Josef > > > Thanks for your time > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From hhh.guo at gmail.com Sat Aug 27 11:12:24 2011 From: hhh.guo at gmail.com (Ning Guo) Date: Sat, 27 Aug 2011 23:12:24 +0800 Subject: [SciPy-User] 3d convex hull In-Reply-To: References: <201108241338.50940.alexandre.fayolle@logilab.fr> <4E551F34.4@gmail.com> Message-ID: <4E590958.30103@gmail.com> On Saturday, August 27, 2011 03:53 PM, Pauli Virtanen wrote: Thanks Pauli! You pointed out how to calculate the normals and areas using cross product. It's really smart and I will use this method if the Delaunay function cannot provide results directly. Also, the formula to calculate normal may be like this: face_normals[:,0] = np.cross(tetra_points[:,0]-tetra_points[:,2],tetra_points[:,1]-tetra_points[:,2]) // facet 0-1-2 face_normals[:,1] = np.cross(tetra_points[:,0]-tetra_points[:,3],tetra_points[:,2]-tetra_points[:,3]) // facet 0-2-3 face_normals[:,2] = np.cross(tetra_points[:,0]-tetra_points[:,1],tetra_points[:,3]-tetra_points[:,1]) // facet 0-3-1 face_normals[:,3] = np.cross(tetra_points[:,1]-tetra_points[:,3],tetra_points[:,2]-tetra_points[:,3]) // facet 1-2-3 Regarding to the order of the vertices, I'm also not sure about their convention. I'm trying to figure it out. Best regards! Ning > Wed, 24 Aug 2011 23:56:36 +0800, Ning Guo wrote: >> I also want to try Delaunay function. But I cannot get enough info from >> the documentation. I want to output the Delaunay tetrahedral in 3D and >> need the vertex indices, facet areas and normals. How can I use the >> function in scipy.spatial? Now I have all the points with id and >> position. 
> Suppose you have 20 points in 3-D: > > import numpy as np > import scipy.spatial > > points = np.random.rand(20, 3) > tri = scipy.spatial.Delaunay(points) > > The indices of the vertices of tetrahedron number `j` are > in `tri.vertices[j]`. The facet areas and normals can be computed > for each tetrahedron via vector cross products: > > tetra_points = tri.points[tri.vertices] # (N, 4, 3) array > > face_normals = np.empty_like(tetra_points) > face_normals[:,0] = np.cross(tetra_points[:,0], tetra_points[:,1]) > face_normals[:,1] = np.cross(tetra_points[:,0], tetra_points[:,2]) > face_normals[:,2] = np.cross(tetra_points[:,0], tetra_points[:,3]) > face_normals[:,3] = np.cross(tetra_points[:,1], tetra_points[:,2]) > > face_normal_lengths = np.sqrt(np.sum(face_normals**2, axis=2)) > > face_normals /= face_normal_lengths[:,:,np.newaxis] > face_areas = 0.5 * face_normal_lengths > > One important point I don't know at the moment is if those normals > actually point away from the center of the tetrahedra. You'd have > to check the Qhull documentation to check whether they have a winding > convention that guarantees certain ordering of the vertices of > the simplices. > > (Note: the above is untested code, so check it works first :) > -- Geotechnical Group Department of Civil and Environmental Engineering Hong Kong University of Science and Technology Clear Water Bay, Kowloon, Hong Kong From prabhu at aero.iitb.ac.in Sat Aug 27 12:48:33 2011 From: prabhu at aero.iitb.ac.in (Prabhu Ramachandran) Date: Sat, 27 Aug 2011 22:18:33 +0530 Subject: [SciPy-User] PySPH 0.9beta release Message-ID: <4E591FE1.2030304@aero.iitb.ac.in> Hi, We are pleased to announce a 0.9beta release of PySPH. This is our first public release. PySPH (http://pysph.googlecode.com) is an open source framework for Smoothed Particle Hydrodynamics (SPH) simulations. It is implemented in Python and the performance critical parts are implemented in Cython. The framework provides for load balanced, parallel execution of solvers. It is designed to be easy to extend. Check our homepage for more details. Quick Installation ------------------- The major prerequisite is NumPy (http://numpy.scipy.org) and a C++ compiler. To use the built-in viewer you will need to have Mayavi installed. If you need parallel support you must have mpi4py installed but this is optional. To install a released version do: $ easy_install pysph More information ----------------- Project home: http://pysph.googlecode.com Documentation: http://packages.python.org/PySPH PyPI: http://pypi.python.org/pypi/PySPH Cheers, PySPH developers From cjordan1 at uw.edu Sat Aug 27 14:19:24 2011 From: cjordan1 at uw.edu (Christopher Jordan-Squire) Date: Sat, 27 Aug 2011 14:19:24 -0400 Subject: [SciPy-User] R vs Python for simple interactive data analysis Message-ID: Hi--I've been a moderately heavy R user for the past two years, so about a month ago I took an (abbreviated) version of a simple data analysis I did in R and tried to rewrite as much of it as possible, line by line, into python using numpy and statsmodels. I didn't use pandas, and I can't comment on how much it might have simplified things. This comparison might be useful to some people, so I stuck it up on a github repo. My overall impression is that R is much stronger for interactive data analysis. Click on the link for more details why, which are summarized in the README file. 
https://github.com/chrisjordansquire/r_vs_py The code examples should run out of the box with no downloads (other than R, Python, numpy, scipy, and statsmodels) required. -Chris Jordan-Squire From matthew.brett at gmail.com Sat Aug 27 14:27:39 2011 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 27 Aug 2011 11:27:39 -0700 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: Message-ID: Hi, On Sat, Aug 27, 2011 at 11:19 AM, Christopher Jordan-Squire wrote: > Hi--I've been a moderately heavy R user for the past two years, so > about a month ago I took an (abbreviated) version of a simple data > analysis I did in R and tried to rewrite as much of it as possible, > line by line, into python using numpy and statsmodels. I didn't use > pandas, and I can't comment on how much it might have simplified > things. > > This comparison might be useful to some people, so I stuck it up on a > github repo. My overall impression is that R is much stronger for > interactive data analysis. Click on the link for more details why, > which are summarized in the README file. > > https://github.com/chrisjordansquire/r_vs_py > > The code examples should run out of the box with no downloads (other > than R, Python, numpy, scipy, and statsmodels) required. Thank you very much for doing that - it's a very useful exercise. I hope we can make use of it to discuss how to get better, in the true spirit of: Confront the Brutal Facts http://en.wikipedia.org/wiki/Good_to_Great See you, Matthew From cjordan1 at uw.edu Sat Aug 27 14:44:12 2011 From: cjordan1 at uw.edu (Christopher Jordan-Squire) Date: Sat, 27 Aug 2011 14:44:12 -0400 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: Message-ID: On Sat, Aug 27, 2011 at 2:27 PM, Matthew Brett wrote: > Hi, > > On Sat, Aug 27, 2011 at 11:19 AM, Christopher Jordan-Squire > wrote: >> Hi--I've been a moderately heavy R user for the past two years, so >> about a month ago I took an (abbreviated) version of a simple data >> analysis I did in R and tried to rewrite as much of it as possible, >> line by line, into python using numpy and statsmodels. I didn't use >> pandas, and I can't comment on how much it might have simplified >> things. >> >> This comparison might be useful to some people, so I stuck it up on a >> github repo. My overall impression is that R is much stronger for >> interactive data analysis. Click on the link for more details why, >> which are summarized in the README file. >> >> https://github.com/chrisjordansquire/r_vs_py >> >> The code examples should run out of the box with no downloads (other >> than R, Python, numpy, scipy, and statsmodels) required. > > Thank you very much for doing that - it's a very useful exercise. ?I > hope we can make use of it to discuss how to get better, in the true Hopefully. I suppose I should also mention, for those that don't want to click on the link, that the two largest reasons R was much simpler to use were because it was easier to construct models and easier to view entries I'd stuck into matrices. R's graphing capabilities seemed slightly more friendly, but that might have just been my familiarity with them. (As an aside, numpy arrays' print method don't make them friendly for interactive viewing. Even ipython couldn't make a few of the matrices I made very intelligible, and it's easy to construct examples that make numpy arrays hideous to behold. 
For example, x = np.arange(5).reshape(5,1) y = np.ones(5).reshape(1,5) z = x*y z[0,0] += 0.0001 print z [[ 1.00000000e-04 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00] [ 1.00000000e+00 1.00000000e+00 1.00000000e+00 1.00000000e+00 1.00000000e+00] [ 2.00000000e+00 2.00000000e+00 2.00000000e+00 2.00000000e+00 2.00000000e+00] [ 3.00000000e+00 3.00000000e+00 3.00000000e+00 3.00000000e+00 3.00000000e+00] [ 4.00000000e+00 4.00000000e+00 4.00000000e+00 4.00000000e+00 4.00000000e+00]] (Strangely, it looks much more tolerable if x = np.arange(1,6).reshape(5,1) instead.) If you do the same thing in R, x = rep(0:4,5) x = matrix(x,ncol=5) x[1,1] = 0.000001 x you get [,1] [,2] [,3] [,4] [,5] [1,] 1e-06 0 0 0 0 [2,] 1e+00 1 1 1 1 [3,] 2e+00 2 2 2 2 [4,] 3e+00 3 3 3 3 [5,] 4e+00 4 4 4 4 much more readable.) As a simple metric, my .r file was about 1/2 the size of the .py file, even though I couldn't do everything in python that I could in R. (These commands were meant to be entered interactively, so the length of the length of the file is, perhaps, a more valid metric then usual to be concerned about.) -Chris Jordan-Squire > spirit of: > > Confront the Brutal Facts > http://en.wikipedia.org/wiki/Good_to_Great > > See you, > > Matthew > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From dbigbear at gmail.com Sat Aug 27 15:16:16 2011 From: dbigbear at gmail.com (Xiong Deng) Date: Sun, 28 Aug 2011 03:16:16 +0800 Subject: [SciPy-User] How can I solve a equation like solve a function containint expressions like sqrt(log(x) - 1) = 2 and exp((log(x) - 1.5)**2 - 3) = 5 Message-ID: HI, Hi, I am trying to solve an equation containing both exp, log, erfc, and they may be embedded into each other....But sympy cannot handle this, as shown below: >>> from sympy import solve, exp, log, pi >>>from sympy.mpmath import * >>>from sympy import Symbol >>>x=Symbol('x') >>>sigma = 4 >>>mu = 1.5 >>>solve(x * ((1.0 / sqrt(2 * pi) * x * sigma) * exp(-0.5 * (log(x) - mu)**2 / sigma**2)) + 0.5 * erfc((mu - log(x)) / (sigma * sqrt(2))) - 1, x) Traceback (most recent call last): File "", line 1, in File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/sympy/mpmath/functions/functions.py", line 287, in log return ctx.ln(x) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/sympy/mpmath/ctx_mp_python.py", line 984, in f x = ctx.convert(x) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/sympy/mpmath/ctx_mp_python.py", line 662, in convert return ctx._convert_fallback(x, strings) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/sympy/mpmath/ctx_mp.py", line 556, in _convert_fallback raise TypeError("cannot create mpf from " + repr(x)) TypeError: cannot create mpf from x But sqrt, log, exp, itself is OK, as shown as below: >>> solve((1.0 / sqrt(2 * pi) * x * sigma) - 1, x) [0.626657068657750] SO, How can I solve an equation containint expressions like sqrt(log(x) - 1)=0 or exp((log(x) - mu)**2 - 3) = 0??? Thanks -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From josef.pktd at gmail.com Sat Aug 27 15:55:29 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 27 Aug 2011 15:55:29 -0400 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: Message-ID: On Sat, Aug 27, 2011 at 2:44 PM, Christopher Jordan-Squire wrote: > On Sat, Aug 27, 2011 at 2:27 PM, Matthew Brett wrote: >> Hi, >> >> On Sat, Aug 27, 2011 at 11:19 AM, Christopher Jordan-Squire >> wrote: >>> Hi--I've been a moderately heavy R user for the past two years, so >>> about a month ago I took an (abbreviated) version of a simple data >>> analysis I did in R and tried to rewrite as much of it as possible, >>> line by line, into python using numpy and statsmodels. I didn't use >>> pandas, and I can't comment on how much it might have simplified >>> things. >>> >>> This comparison might be useful to some people, so I stuck it up on a >>> github repo. My overall impression is that R is much stronger for >>> interactive data analysis. Click on the link for more details why, >>> which are summarized in the README file. >>> >>> https://github.com/chrisjordansquire/r_vs_py >>> >>> The code examples should run out of the box with no downloads (other >>> than R, Python, numpy, scipy, and statsmodels) required. >> >> Thank you very much for doing that - it's a very useful exercise. ?I >> hope we can make use of it to discuss how to get better, in the true > > Hopefully. I suppose I should also mention, for those that don't want > to click on the link, that the two largest reasons R was much simpler > to use were because it was easier to construct models and easier to > view entries I'd stuck into matrices. R's graphing capabilities seemed > slightly more friendly, but that might have just been my familiarity > with them. > > (As an aside, numpy arrays' print method don't make them friendly for > interactive viewing. Even ipython couldn't make a few of the matrices > I made very intelligible, and it's easy to construct examples that > make numpy arrays hideous to behold. For example, for interactive viewing spyder has an array viewer (variable explorer) similar to matlab > > x = np.arange(5).reshape(5,1) > y = np.ones(5).reshape(1,5) > z = x*y > z[0,0] += 0.0001 > print z > > [[ ?1.00000000e-04 ? 0.00000000e+00 ? 0.00000000e+00 ? 0.00000000e+00 > ? ?0.00000000e+00] > ?[ ?1.00000000e+00 ? 1.00000000e+00 ? 1.00000000e+00 ? 1.00000000e+00 > ? ?1.00000000e+00] > ?[ ?2.00000000e+00 ? 2.00000000e+00 ? 2.00000000e+00 ? 2.00000000e+00 > ? ?2.00000000e+00] > ?[ ?3.00000000e+00 ? 3.00000000e+00 ? 3.00000000e+00 ? 3.00000000e+00 > ? ?3.00000000e+00] > ?[ ?4.00000000e+00 ? 4.00000000e+00 ? 4.00000000e+00 ? 4.00000000e+00 > ? ?4.00000000e+00]] >>> from scikits.statsmodels.iolib import SimpleTable >>> print SimpleTable(z) ====================== 0.0001 0.0 0.0 0.0 0.0 1.0 1.0 1.0 1.0 1.0 2.0 2.0 2.0 2.0 2.0 3.0 3.0 3.0 3.0 3.0 4.0 4.0 4.0 4.0 4.0 ---------------------- >>> z[0,0] = 1e-6 >>> print SimpleTable(z) ===================== 1e-06 0.0 0.0 0.0 0.0 1.0 1.0 1.0 1.0 1.0 2.0 2.0 2.0 2.0 2.0 3.0 3.0 3.0 3.0 3.0 4.0 4.0 4.0 4.0 4.0 --------------------- > > (Strangely, it looks much more tolerable if x ?= > np.arange(1,6).reshape(5,1) instead.) > > If you do the same thing in R, > > x = rep(0:4,5) > x = matrix(x,ncol=5) > x[1,1] = 0.000001 > x > > you get > > ? ? ?[,1] [,2] [,3] [,4] [,5] > [1,] 1e-06 ? ?0 ? ?0 ? ?0 ? ?0 > [2,] 1e+00 ? ?1 ? ?1 ? ?1 ? ?1 > [3,] 2e+00 ? ?2 ? ?2 ? ?2 ? ?2 > [4,] 3e+00 ? ?3 ? ?3 ? ?3 ? ?3 > [5,] 4e+00 ? ?4 ? ?4 ? ?4 ? 
?4 > > much more readable.) > > > As a simple metric, my .r file was about 1/2 the size of the .py file, > even though I couldn't do everything in python that I could in R. > (These commands were meant to be entered interactively, so the length > of the length of the file is, perhaps, a more valid metric then usual > to be concerned about.) predefining your categorical variables would save quite a few lines. Josef > > -Chris Jordan-Squire > > >> spirit of: >> >> Confront the Brutal Facts >> http://en.wikipedia.org/wiki/Good_to_Great >> >> See you, >> >> Matthew >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From jason-sage at creativetrax.com Sat Aug 27 17:03:43 2011 From: jason-sage at creativetrax.com (Jason Grout) Date: Sat, 27 Aug 2011 16:03:43 -0500 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: Message-ID: <4E595BAF.1080509@creativetrax.com> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: > This comparison might be useful to some people, so I stuck it up on a > github repo. My overall impression is that R is much stronger for > interactive data analysis. Click on the link for more details why, > which are summarized in the README file. From the README: "In fact, using Python without the IPython qtconsole is practically impossible for this sort of cut and paste, interactive analysis. The shell IPython doesn't allow it because it automatically adds whitespace on multiline bits of code, breaking pre-formatted code's alignment. Cutting and pasting works for the standard python shell, but then you lose all the advantages of IPython." You might use %cpaste in the ipython normal shell to paste without it automatically inserting spaces: In [5]: %cpaste Pasting code; enter '--' alone on the line to stop. :if 1>0: : print 'hi' :-- hi Thanks, Jason From robert.kern at gmail.com Sat Aug 27 18:02:28 2011 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 27 Aug 2011 17:02:28 -0500 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: <4E595BAF.1080509@creativetrax.com> References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Sat, Aug 27, 2011 at 16:03, Jason Grout wrote: > On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >> This comparison might be useful to some people, so I stuck it up on a >> github repo. My overall impression is that R is much stronger for >> interactive data analysis. Click on the link for more details why, >> which are summarized in the README file. > > ?From the README: > > "In fact, using Python without the IPython qtconsole is practically > impossible for this sort of cut and paste, interactive analysis. > The shell IPython doesn't allow it because it automatically adds > whitespace on multiline bits of code, breaking pre-formatted code's > alignment. Cutting and pasting works for the standard python shell, > but then you lose all the advantages of IPython." > > > > You might use %cpaste in the ipython normal shell to paste without it > automatically inserting spaces: > > In [5]: %cpaste > Pasting code; enter '--' alone on the line to stop. > :if 1>0: > : ? ?print 'hi' > :-- > hi Or even just %paste! |1> %paste? 
Type: Magic function Base Class: String Form:> Namespace: IPython internal File: /Users/rkern/git/ipython/IPython/frontend/terminal/interactiveshell.py Definition: %paste(self, parameter_s='') Docstring: Paste & execute a pre-formatted code block from clipboard. The text is pulled directly from the clipboard without user intervention and printed back on the screen before execution (unless the -q flag is given to force quiet mode). The block is dedented prior to execution to enable execution of method definitions. '>' and '+' characters at the beginning of a line are ignored, to allow pasting directly from e-mails, diff files and doctests (the '...' continuation prompt is also stripped). The executed block is also assigned to variable named 'pasted_block' for later editing with '%edit pasted_block'. You can also pass a variable name as an argument, e.g. '%paste foo'. This assigns the pasted block to variable 'foo' as string, without dedenting or executing it (preceding >>> and + is still stripped) Options ------- -r: re-executes the block previously entered by cpaste. -q: quiet mode: do not echo the pasted text back to the terminal. IPython statements (magics, shell escapes) are not supported (yet). See also -------- cpaste: manually paste code into terminal until you mark its end. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From wesmckinn at gmail.com Sat Aug 27 18:06:47 2011 From: wesmckinn at gmail.com (Wes McKinney) Date: Sat, 27 Aug 2011 18:06:47 -0400 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: <4E595BAF.1080509@creativetrax.com> References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout wrote: > On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >> This comparison might be useful to some people, so I stuck it up on a >> github repo. My overall impression is that R is much stronger for >> interactive data analysis. Click on the link for more details why, >> which are summarized in the README file. > > ?From the README: > > "In fact, using Python without the IPython qtconsole is practically > impossible for this sort of cut and paste, interactive analysis. > The shell IPython doesn't allow it because it automatically adds > whitespace on multiline bits of code, breaking pre-formatted code's > alignment. Cutting and pasting works for the standard python shell, > but then you lose all the advantages of IPython." > > > > You might use %cpaste in the ipython normal shell to paste without it > automatically inserting spaces: > > In [5]: %cpaste > Pasting code; enter '--' alone on the line to stop. > :if 1>0: > : ? ?print 'hi' > :-- > hi > > Thanks, > > Jason > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > This strikes me as a textbook example of why we need an integrated formula framework in statsmodels. I'll make a pass through when I get a chance and see if there are some places where pandas would really help out. For example, the weighted average by sex and occupation is what groupby is all about: hrdf = DataFrame(hrdat) # note DataFrame allows you to change the dtype of a column! 
hrdf['sex'] = np.where(hrdf['sex'] == 1, 'male', 'female') def compute_stats(group): sum_weight = group['A_ERNLWT'].sum() wave_hrwage = (group['hrwage'] * group['A_ERNLWT']).sum() / sum_weight return Series({'sum_weight' : sum_weight, 'wave_hrwage' : wave_hrwage}) wocc = hrdf.groupby(['sex', 'occ']).apply(compute_stats) In [39]: wocc Out[39]: sum_weight wave_hrwage female 1 7.669e+05 23.41 2 1.541e+06 24.39 3 1.082e+06 10.02 4 6.996e+05 13.49 5 1.325e+06 16.28 8 5.796e+04 20.44 9 1.277e+05 12.27 10 1.12e+05 12.44 male 1 7.325e+05 34.96 2 1.198e+06 29.06 3 8.283e+05 13.45 4 5.013e+05 20.48 5 4.367e+05 14.96 7 6.484e+05 17.78 8 4.424e+05 20.39 9 6.064e+05 17.64 10 5.256e+05 17.76 (Of course I'm showing up some swank new pandas 0.4 stuff, i.e. hierarchical indexing and multi-key groupby) From pav at iki.fi Sat Aug 27 18:08:18 2011 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 27 Aug 2011 22:08:18 +0000 (UTC) Subject: [SciPy-User] 3d convex hull References: <201108241338.50940.alexandre.fayolle@logilab.fr> <4E551F34.4@gmail.com> <4E590958.30103@gmail.com> Message-ID: Sat, 27 Aug 2011 23:12:24 +0800, Ning Guo wrote: > On Saturday, August 27, 2011 03:53 PM, Pauli Virtanen wrote: [clip] > Also, the formula to calculate normal may be like this: > > face_normals[:,0] = > np.cross(tetra_points[:,0]-tetra_points[:,2],tetra_points[:,1]-tetra_points[:,2]) [clip] Ah yes, exactly like that, my brain apparently wasn't working properly. > Regarding to the order of the vertices, I'm also not sure about their > convention. I'm trying to figure it out. If you find it out, please let us know, as this would be an useful thing to mention in the documentation. However, I'm not sure at the moment whether Qhull provides such ordering guarantees. -- Pauli Virtanen From matthew.brett at gmail.com Sat Aug 27 18:56:29 2011 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 27 Aug 2011 15:56:29 -0700 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: Hi, On Sat, Aug 27, 2011 at 3:06 PM, Wes McKinney wrote: > On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout > wrote: >> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>> This comparison might be useful to some people, so I stuck it up on a >>> github repo. My overall impression is that R is much stronger for >>> interactive data analysis. Click on the link for more details why, >>> which are summarized in the README file. >> >> ?From the README: >> >> "In fact, using Python without the IPython qtconsole is practically >> impossible for this sort of cut and paste, interactive analysis. >> The shell IPython doesn't allow it because it automatically adds >> whitespace on multiline bits of code, breaking pre-formatted code's >> alignment. Cutting and pasting works for the standard python shell, >> but then you lose all the advantages of IPython." >> >> >> >> You might use %cpaste in the ipython normal shell to paste without it >> automatically inserting spaces: >> >> In [5]: %cpaste >> Pasting code; enter '--' alone on the line to stop. >> :if 1>0: >> : ? ?print 'hi' >> :-- >> hi >> >> Thanks, >> >> Jason >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > This strikes me as a textbook example of why we need an integrated > formula framework in statsmodels. 
Yes, at a superficial glance Chris' document sounded like an ideal use-case tester for the battle of the formulas, or, in a less martial mode, for defining what we want formulas to do. I got sidetracked on my document on that - and am still sidetracked, but should get to it soon, by which I mean, in the next month, unless someone prompts me earlier... See you, Matthew From cjordan1 at uw.edu Sat Aug 27 19:30:46 2011 From: cjordan1 at uw.edu (Christopher Jordan-Squire) Date: Sat, 27 Aug 2011 19:30:46 -0400 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Sat, Aug 27, 2011 at 6:06 PM, Wes McKinney wrote: > On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout > wrote: >> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>> This comparison might be useful to some people, so I stuck it up on a >>> github repo. My overall impression is that R is much stronger for >>> interactive data analysis. Click on the link for more details why, >>> which are summarized in the README file. >> >> ?From the README: >> >> "In fact, using Python without the IPython qtconsole is practically >> impossible for this sort of cut and paste, interactive analysis. >> The shell IPython doesn't allow it because it automatically adds >> whitespace on multiline bits of code, breaking pre-formatted code's >> alignment. Cutting and pasting works for the standard python shell, >> but then you lose all the advantages of IPython." >> >> >> >> You might use %cpaste in the ipython normal shell to paste without it >> automatically inserting spaces: >> >> In [5]: %cpaste >> Pasting code; enter '--' alone on the line to stop. >> :if 1>0: >> : ? ?print 'hi' >> :-- >> hi >> >> Thanks, >> >> Jason >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > This strikes me as a textbook example of why we need an integrated > formula framework in statsmodels. I'll make a pass through when I get > a chance and see if there are some places where pandas would really > help out. For example, the weighted average by sex and occupation is > what groupby is all about: > > hrdf = DataFrame(hrdat) > > # note DataFrame allows you to change the dtype of a column! > hrdf['sex'] = np.where(hrdf['sex'] == 1, 'male', 'female') > > def compute_stats(group): > ?sum_weight = group['A_ERNLWT'].sum() > ?wave_hrwage = (group['hrwage'] * group['A_ERNLWT']).sum() / sum_weight > ?return Series({'sum_weight' : sum_weight, > ? ? ? ? ? ? ? ? 'wave_hrwage' : wave_hrwage}) > > wocc = hrdf.groupby(['sex', 'occ']).apply(compute_stats) > > In [39]: wocc > Out[39]: > ? ? ? ? ? ?sum_weight ?wave_hrwage > female ?1 ? 7.669e+05 ? 23.41 > ? ? ? ?2 ? 1.541e+06 ? 24.39 > ? ? ? ?3 ? 1.082e+06 ? 10.02 > ? ? ? ?4 ? 6.996e+05 ? 13.49 > ? ? ? ?5 ? 1.325e+06 ? 16.28 > ? ? ? ?8 ? 5.796e+04 ? 20.44 > ? ? ? ?9 ? 1.277e+05 ? 12.27 > ? ? ? ?10 ?1.12e+05 ? ?12.44 > male ? ?1 ? 7.325e+05 ? 34.96 > ? ? ? ?2 ? 1.198e+06 ? 29.06 > ? ? ? ?3 ? 8.283e+05 ? 13.45 > ? ? ? ?4 ? 5.013e+05 ? 20.48 > ? ? ? ?5 ? 4.367e+05 ? 14.96 > ? ? ? ?7 ? 6.484e+05 ? 17.78 > ? ? ? ?8 ? 4.424e+05 ? 20.39 > ? ? ? ?9 ? 6.064e+05 ? 17.64 > ? ? ? ?10 ?5.256e+05 ? 17.76 > > (Of course I'm showing up some swank new pandas 0.4 stuff, i.e. > hierarchical indexing and multi-key groupby) > Nifty! I will have to look at these parts of pandas closer. 
-Chris JS _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From wesmckinn at gmail.com Sat Aug 27 19:32:22 2011 From: wesmckinn at gmail.com (Wes McKinney) Date: Sat, 27 Aug 2011 19:32:22 -0400 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Sat, Aug 27, 2011 at 7:30 PM, Christopher Jordan-Squire wrote: > On Sat, Aug 27, 2011 at 6:06 PM, Wes McKinney wrote: >> On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout >> wrote: >>> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>>> This comparison might be useful to some people, so I stuck it up on a >>>> github repo. My overall impression is that R is much stronger for >>>> interactive data analysis. Click on the link for more details why, >>>> which are summarized in the README file. >>> >>> ?From the README: >>> >>> "In fact, using Python without the IPython qtconsole is practically >>> impossible for this sort of cut and paste, interactive analysis. >>> The shell IPython doesn't allow it because it automatically adds >>> whitespace on multiline bits of code, breaking pre-formatted code's >>> alignment. Cutting and pasting works for the standard python shell, >>> but then you lose all the advantages of IPython." >>> >>> >>> >>> You might use %cpaste in the ipython normal shell to paste without it >>> automatically inserting spaces: >>> >>> In [5]: %cpaste >>> Pasting code; enter '--' alone on the line to stop. >>> :if 1>0: >>> : ? ?print 'hi' >>> :-- >>> hi >>> >>> Thanks, >>> >>> Jason >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> >> This strikes me as a textbook example of why we need an integrated >> formula framework in statsmodels. I'll make a pass through when I get >> a chance and see if there are some places where pandas would really >> help out. For example, the weighted average by sex and occupation is >> what groupby is all about: >> >> hrdf = DataFrame(hrdat) >> >> # note DataFrame allows you to change the dtype of a column! >> hrdf['sex'] = np.where(hrdf['sex'] == 1, 'male', 'female') >> >> def compute_stats(group): >> ?sum_weight = group['A_ERNLWT'].sum() >> ?wave_hrwage = (group['hrwage'] * group['A_ERNLWT']).sum() / sum_weight >> ?return Series({'sum_weight' : sum_weight, >> ? ? ? ? ? ? ? ? 'wave_hrwage' : wave_hrwage}) >> >> wocc = hrdf.groupby(['sex', 'occ']).apply(compute_stats) >> >> In [39]: wocc >> Out[39]: >> ? ? ? ? ? ?sum_weight ?wave_hrwage >> female ?1 ? 7.669e+05 ? 23.41 >> ? ? ? ?2 ? 1.541e+06 ? 24.39 >> ? ? ? ?3 ? 1.082e+06 ? 10.02 >> ? ? ? ?4 ? 6.996e+05 ? 13.49 >> ? ? ? ?5 ? 1.325e+06 ? 16.28 >> ? ? ? ?8 ? 5.796e+04 ? 20.44 >> ? ? ? ?9 ? 1.277e+05 ? 12.27 >> ? ? ? ?10 ?1.12e+05 ? ?12.44 >> male ? ?1 ? 7.325e+05 ? 34.96 >> ? ? ? ?2 ? 1.198e+06 ? 29.06 >> ? ? ? ?3 ? 8.283e+05 ? 13.45 >> ? ? ? ?4 ? 5.013e+05 ? 20.48 >> ? ? ? ?5 ? 4.367e+05 ? 14.96 >> ? ? ? ?7 ? 6.484e+05 ? 17.78 >> ? ? ? ?8 ? 4.424e+05 ? 20.39 >> ? ? ? ?9 ? 6.064e+05 ? 17.64 >> ? ? ? ?10 ?5.256e+05 ? 17.76 >> >> (Of course I'm showing up some swank new pandas 0.4 stuff, i.e. >> hierarchical indexing and multi-key groupby) >> > > Nifty! I will have to look at these parts of pandas closer. 
> > -Chris JS > > > > ?_______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > I am working hard on documentation for all the new stuff, but I am but one person :) I hope to have the docs (pandas.sourceforge.net , under heavy construction at the moment) in more complete shape within a week. - Wes From josef.pktd at gmail.com Sat Aug 27 20:00:29 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 27 Aug 2011 20:00:29 -0400 Subject: [SciPy-User] multivariate empirical distribution function, avoid double loop ? In-Reply-To: References: Message-ID: On Wed, Aug 24, 2011 at 9:23 PM, wrote: > On Wed, Aug 24, 2011 at 7:25 PM, Robert Kern wrote: >> On Wed, Aug 24, 2011 at 09:23, ? wrote: >>> Does anyone know whether there is an algorithm that avoids the double >>> loop to get a multivariate empirical distribution function? >>> >>> for point in data: >>> ? ? count how many points in data are smaller or equal to point >>> >>> with 1d data it's just argsort(argsort(data)) >>> >>> double loop version with some test cases is attached. >>> >>> I didn't see a way that sorting would help. >> >> If you can bear to make a few (nobs, nobs) bool arrays, you can do >> just a kvars-sized loop in Python: >> >> dominates = np.ones((len(data), len(data)), dtype=bool) >> for x in data.T: >> ? ?dominates &= x[:,np.newaxis] > x >> sorta_ranks = dominates.sum(axis=1) > > Thanks, quite a bit better, 14 times faster for (5000,2) and still 2.5 > times faster for (5000,20), > 12 times for (10000,3) compared to my original. attached a first draft of what I'm after Josef > > Josef > >> >> -- >> Robert Kern >> >> "I have come to believe that the whole world is an enigma, a harmless >> enigma that is made terrible by our own mad attempt to interpret it as >> though it had an underlying truth." >> ? -- Umberto Eco >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > -------------- next part -------------- A non-text attachment was scrubbed... Name: mvecdf.py Type: text/x-python Size: 5168 bytes Desc: not available URL: From bsouthey at gmail.com Sat Aug 27 22:15:01 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Sat, 27 Aug 2011 21:15:01 -0500 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Sat, Aug 27, 2011 at 5:06 PM, Wes McKinney wrote: > On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout > wrote: >> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>> This comparison might be useful to some people, so I stuck it up on a >>> github repo. My overall impression is that R is much stronger for >>> interactive data analysis. Click on the link for more details why, >>> which are summarized in the README file. >> >> ?From the README: >> >> "In fact, using Python without the IPython qtconsole is practically >> impossible for this sort of cut and paste, interactive analysis. >> The shell IPython doesn't allow it because it automatically adds >> whitespace on multiline bits of code, breaking pre-formatted code's >> alignment. Cutting and pasting works for the standard python shell, >> but then you lose all the advantages of IPython." 
>> >> >> >> You might use %cpaste in the ipython normal shell to paste without it >> automatically inserting spaces: >> >> In [5]: %cpaste >> Pasting code; enter '--' alone on the line to stop. >> :if 1>0: >> : ? ?print 'hi' >> :-- >> hi >> >> Thanks, >> >> Jason >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > This strikes me as a textbook example of why we need an integrated > formula framework in statsmodels. I'll make a pass through when I get > a chance and see if there are some places where pandas would really > help out. We used to have a formula class is scipy.stats and I do not follow nipy (http://nipy.sourceforge.net/nipy/stable/index.html) as it also had this (extremely flexible but very hard to comprehend). It was what I had argued was needed ages ago for statsmodel. But it needs a community effort because the syntax required serves multiple communities with different annotations and needs. That is also seen from the different approaches taken by the stats packages from S/R, SAS, Genstat (and those are just are ones I have used). Bruce From dbigbear at gmail.com Sun Aug 28 02:16:06 2011 From: dbigbear at gmail.com (Johnny) Date: Sat, 27 Aug 2011 23:16:06 -0700 (PDT) Subject: [SciPy-User] Install Scipy Errors: ImportError: /path_to/liblapack.so: undefined symbol: ztbsv_ Message-ID: <7b0b4836-fb15-4489-860a-c1684529f30c@p37g2000prp.googlegroups.com> Hi all, I am installing lapack, atlas, numpy, scipy on my LINUX for TEN times, but always encountering the problem: [work XXX]$ python -c 'import scipy.optimize; scipy.optimize.test()' Traceback (most recent call last): File "", line 1, in File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ scipy/optimize/__init__.py", line 11, in from lbfgsb import fmin_l_bfgs_b File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ scipy/optimize/lbfgsb.py", line 28, in import _lbfgsb ImportError: /home/work/local/lib/liblapack.so: undefined symbol: ztbsv_ I can pass some other tests like: [work XXX:~/local]$ python -c 'import scipy.ndimage; scipy.ndimage.test()' Running unit tests for scipy.ndimage NumPy version 1.6.1 NumPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site- packages/numpy SciPy version 0.9.0 SciPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site- packages/scipy Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 20051201 (Red Hat 3.4.5-2)] nose version 1.1.2 .........S................................................................................................................................................................................................................................................................................................................................................................................................................. ---------------------------------------------------------------------- Ran 411 tests in 1.247s OK (SKIP=1) The problem seems due to the lib of Lapack. So I tried the solutions posted on the internet before. 1) The liblapack.so may be not complete...SO I tried this: # integrate lapack with atlas: cd lib/ mkdir tmp cd tmp/ ar x ../liblapack.a cp ~/path_to/lapack-3.1.1/lapack_LINUX.a ../liblapack.a ar r ../liblapack.a *.o cd ../.. 
make check make ptcheck cp include/* ~/include/ cp lib/*.a ~/lib/ That is, after installing atlas, there is another liblapack.a (in addition to the lapack_LINUX.a after Lapack) in its lib, but it is about 500k, so I integrate it with the lapack_LINUX.a from installing Lapack. The final liblapack.a is about 9.3m, The liblapack.so is about 5m 2) re-install Lapack and atlas many times....No use 3) I found there is a lapack.so under scipy/lib, and it is about 500K, but I think it may be not the problem, becaues the failure is "ImportError: /home/work/local/lib/liblapack.so: undefined symbol: ztbsv_". Scipy seemed to import liblapack.so in my general lib directory... 4) One thing I am not sure is that I used gcc 4.7 and gfortran to compile lapack and atlas, but my python 2.7 was built using gcc 3.4.5.....Is this a problem? Anyone can help? _______________________________________________________________ My configuration of the installation: * ATLAS 3.8.4 * lapack 3.3.1 * numpy 1.6.1 * SciPy version 0.9.0 * dateutil 1.5 * Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 20051201 (Red Hat 3.4.5-2)] * nose version 1.1.2 * gcc (GCC) 4.7.0 20110820 (experimental) * LINUX: Linux XXX 2.6.9_5-9-0-0 #1 SMP Wed Jun 23 14:03:19 CST 2010 x86_64 x86_64 x86_64 GNU/Linux site.cfg of Scipy: [DEFAULT] library_dirs = /home/work/local/lib include_dirs = /home/work/local/include [lapack_opt] libraries = lapack, f77blas, cblas, atlas site.cfg of Numpy: [DEFAULT] library_dirs = /home/work/local/lib include_dirs = /home/work/local/include [lapack_opt] libraries = lapack, f77blas, cblas, atlas In addition, there are failures as well when test Numpy: >>> import numpy >>> numpy.test('1') Running unit tests for numpy NumPy version 1.6.1 NumPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site- packages/numpy Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 20051201 (Red Hat 3.4.5-2)] nose version 1.1.2 ====================================================================== FAIL: Test basic arithmetic function errors ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ numpy/testing/decorators.py", line 215, in knownfailer return f(*args, **kwargs) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ numpy/core/tests/test_numeric.py", line 367, in test_floating_exceptions_power np.power, ftype(2), ftype(2**fi.nexp)) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ numpy/core/tests/test_numeric.py", line 271, in assert_raises_fpe "Type %s did not raise fpe error '%s'." % (ftype, fpeerr)) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ numpy/testing/utils.py", line 34, in assert_ raise AssertionError(msg) AssertionError: Type did not raise fpe error 'overflow'. ====================================================================== FAIL: Test generic loops. 
---------------------------------------------------------------------- Traceback (most recent call last): File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ numpy/core/tests/test_ufunc.py", line 86, in test_generic_loops assert_almost_equal(fone(x), fone_val, err_msg=msg) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ numpy/testing/utils.py", line 448, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal to 7 decimals PyUFunc_F_F ACTUAL: array([ 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j], dtype=complex64) DESIRED: 1 ====================================================================== FAIL: test_umath.TestComplexFunctions.test_loss_of_precision(,) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/nose/ case.py", line 197, in runTest self.test(*self.arg) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ numpy/core/tests/test_umath.py", line 931, in check_loss_of_precision check(x_basic, 2*eps/1e-3) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ numpy/core/tests/test_umath.py", line 901, in check 'arcsinh') AssertionError: (0, 0.0010023052, 0.9987238, 'arcsinh') ====================================================================== FAIL: test_umath.TestComplexFunctions.test_precisions_consistent ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/nose/ case.py", line 197, in runTest self.test(*self.arg) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ numpy/core/tests/test_umath.py", line 812, in test_precisions_consistent assert_almost_equal(fcf, fcd, decimal=6, err_msg='fch-fcd %s'%f) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ numpy/testing/utils.py", line 448, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal to 6 decimals fch-fcd ACTUAL: 2.3561945j DESIRED: (0.66623943249251527+1.0612750619050355j) ====================================================================== FAIL: test_kind.TestKind.test_all ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/nose/ case.py", line 197, in runTest self.test(*self.arg) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ numpy/f2py/tests/test_kind.py", line 30, in test_all 'selectedrealkind(%s): expected %r but got %r' % (i, selected_real_kind(i), selectedrealkind(i))) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ numpy/testing/utils.py", line 34, in assert_ raise AssertionError(msg) AssertionError: selectedrealkind(19): expected -1 but got 16 ---------------------------------------------------------------------- Ran 3552 tests in 29.977s FAILED (KNOWNFAIL=3, failures=5) From dbigbear at gmail.com Sun Aug 28 02:18:54 2011 From: dbigbear at gmail.com (Johnny) Date: Sat, 27 Aug 2011 23:18:54 -0700 (PDT) Subject: [SciPy-User] How can I solve a equation like sqrt(log(x)*erfc(exp(log(x)+1)) - 1) = 2 and exp((log(x) - 1.5)**2 - 3) = 5 Message-ID: <57952d36-c803-451c-bdeb-b2f1299290e7@y39g2000prd.googlegroups.com> Hi, I am trying to solve the follow equation: solve(x * ((1.0 / sqrt(2 * pi) * x * sigma) * exp(-0.5 * (log(x) - mu)**2 / sigma**2)) + 0.5 * erfc((mu - log(x)) / (sigma * sqrt(2))) - 
1, x) I am not sure Scipy can do it and how it can do it ? Many thanks Xiong From robert.kern at gmail.com Sun Aug 28 03:36:06 2011 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 28 Aug 2011 02:36:06 -0500 Subject: [SciPy-User] How can I solve a equation like sqrt(log(x)*erfc(exp(log(x)+1)) - 1) = 2 and exp((log(x) - 1.5)**2 - 3) = 5 In-Reply-To: <57952d36-c803-451c-bdeb-b2f1299290e7@y39g2000prd.googlegroups.com> References: <57952d36-c803-451c-bdeb-b2f1299290e7@y39g2000prd.googlegroups.com> Message-ID: On Sun, Aug 28, 2011 at 01:18, Johnny wrote: > Hi, I am trying to solve the follow equation: > > solve(x * ((1.0 / sqrt(2 * pi) * x * sigma) * exp(-0.5 * (log(x) - > mu)**2 / sigma**2)) + 0.5 * erfc((mu - log(x)) / (sigma * sqrt(2))) - > 1, x) > > I am not sure Scipy can do it and how it can do it ? [~] |18> from numpy import sqrt, log, exp [~] |19> from scipy.special import erfc [~] |20> def f(x, mu=1.0, sigma=0.1): ...> return x * ((1.0 / sqrt(2 * pi) * x * sigma) * exp(-0.5 * (log(x) - ...> mu)**2 / sigma**2)) + 0.5 * erfc((mu - log(x)) / (sigma * sqrt(2))) - 1.0 ...> [~] |21> from scipy.optimize import fsolve [~] |22> fsolve(f, 3.0) array([ 2.88207063]) [~] |23> f(_) array([ 4.44089210e-16]) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From wilson.andrew.j at gmail.com Sat Aug 27 10:47:28 2011 From: wilson.andrew.j at gmail.com (Andy Wilson) Date: Sat, 27 Aug 2011 09:47:28 -0500 Subject: [SciPy-User] Return variable value by function value In-Reply-To: <20110826172510.2EC696F446@smtp.hushmail.com> References: <20110826172510.2EC696F446@smtp.hushmail.com> Message-ID: If an approximation is good enough, you can use scipy.interpolate.interp1d to get a function that returns interpolated values. Your example doesn't quite work because 0.95 is out of the range of the initial input. import numpy as np import scipy.interpolate x = np.arange(0,100) y = np.sqrt(1 - x**2/10E+4) interp_func = scipy.interpolate.interp1d(x, y, kind='quadratic') new_x = 0.95 interp_y = interp_func(new_x) actual_y = np.sqrt(1 - new_x**2/10E+4) print "actual value: %s" % actual_y print "interpolated value %s" % interp_y print "difference: %s" % (actual_y - interp_y) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ali.franco95 at gmail.com Sat Aug 27 23:21:40 2011 From: ali.franco95 at gmail.com (ali franco) Date: Sun, 28 Aug 2011 15:21:40 +1200 Subject: [SciPy-User] Sotring data for fast access Message-ID: There are two parts to my question. One: I have to do a double integration on a grid answer = integrate ( f(x,y) times besselfunction(x,y)) Now, I have read that the besselfunction can be precomputed and saved to disk for fast access. How do I do this? Right now, I am evaluating the besselfunction from scipy.special as it is required. Second question: I have numerically integrated a differential equation and I use the splined solution to solve other differential equations. However the splined solution is slow. Is there a way to make this faster? thanks guys -------------- next part -------------- An HTML attachment was scrubbed... URL: From ali.franco95 at gmail.com Sun Aug 28 07:02:54 2011 From: ali.franco95 at gmail.com (ali franco) Date: Sun, 28 Aug 2011 23:02:54 +1200 Subject: [SciPy-User] Does odeint return derivative Message-ID: I am solving a system of differential equations using odeint. 
Is there a simpler way of also getting the derivative other than calculating the derivative from the solutions obtained which in my case is going to take alot of extra time. thanks -------------- next part -------------- An HTML attachment was scrubbed... URL: From ali.franco95 at gmail.com Sun Aug 28 07:48:57 2011 From: ali.franco95 at gmail.com (ali franco) Date: Sun, 28 Aug 2011 23:48:57 +1200 Subject: [SciPy-User] RectBivariateSpline Message-ID: Can RectBivariateSpline be used to calculated derivatives and integrals? If it can't, can you please suggest some thing else that does. I have to use a two dimensional spline on a rectangular mesh. The problem with alternatives to RectBivariateSpline such as BivariateSpline , And UnivariateSpline I found was that they require the data be specified either on a square grid or that they be equally spaced neither of which my data points satisfy. thanks -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsouthey at gmail.com Sun Aug 28 10:40:15 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Sun, 28 Aug 2011 09:40:15 -0500 Subject: [SciPy-User] Install Scipy Errors: ImportError: /path_to/liblapack.so: undefined symbol: ztbsv_ In-Reply-To: <7b0b4836-fb15-4489-860a-c1684529f30c@p37g2000prp.googlegroups.com> References: <7b0b4836-fb15-4489-860a-c1684529f30c@p37g2000prp.googlegroups.com> Message-ID: On Sun, Aug 28, 2011 at 1:16 AM, Johnny wrote: > Hi all, > > I am installing lapack, atlas, numpy, scipy on my LINUX for TEN times, > but always encountering the problem: > > [work XXX]$ python -c 'import scipy.optimize; > scipy.optimize.test()' > Traceback (most recent call last): > ?File "", line 1, in > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > scipy/optimize/__init__.py", line 11, in > ? ?from lbfgsb import fmin_l_bfgs_b > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > scipy/optimize/lbfgsb.py", line 28, in > ? ?import _lbfgsb > ImportError: /home/work/local/lib/liblapack.so: undefined symbol: > ztbsv_ > > I can pass some other tests like: > > > [work XXX:~/local]$ python -c 'import scipy.ndimage; > scipy.ndimage.test()' > Running unit tests for scipy.ndimage > NumPy version 1.6.1 > NumPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site- > packages/numpy > SciPy version 0.9.0 > SciPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site- > packages/scipy > Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 > 20051201 (Red Hat 3.4.5-2)] > nose version 1.1.2 > .........S................................................................................................................................................................................................................................................................................................................................................................................................................. > ---------------------------------------------------------------------- > Ran 411 tests in 1.247s > > OK (SKIP=1) > > The problem seems due to the lib of Lapack. So I tried the solutions > posted on the internet before. > > 1) The liblapack.so may be not complete...SO I tried this: > ? ?# integrate lapack with atlas: > ? ?cd lib/ > ? ?mkdir tmp > ? ?cd tmp/ > ? ?ar x ../liblapack.a > ? ?cp ~/path_to/lapack-3.1.1/lapack_LINUX.a ../liblapack.a > ? ?ar r ../liblapack.a *.o > ? ?cd ../.. > ? ?make check > ? ?make ptcheck > ? ?cp include/* ~/include/ > ? 
?cp lib/*.a ~/lib/ > > That is, after installing atlas, there is another liblapack.a (in > addition to the lapack_LINUX.a after Lapack) in its lib, but it is > about 500k, so I integrate it with the lapack_LINUX.a from installing > Lapack. The final liblapack.a is about 9.3m, The liblapack.so is about > 5m > > 2) re-install Lapack and atlas many times....No use > > 3) I found there is a lapack.so under scipy/lib, and it is about 500K, > but I think it may be not the problem, becaues the failure is > "ImportError: /home/work/local/lib/liblapack.so: undefined symbol: > ztbsv_". Scipy seemed to import liblapack.so in my general lib > directory... > > 4) One thing ?I am not sure is that I used gcc 4.7 and gfortran to > compile lapack and atlas, but my python 2.7 was built using gcc > 3.4.5.....Is this a problem? > > > Anyone can help? > _______________________________________________________________ > My configuration of the installation: > > * ATLAS 3.8.4 > * lapack 3.3.1 > * numpy 1.6.1 > * SciPy version 0.9.0 > * dateutil 1.5 > * Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 > 20051201 (Red Hat 3.4.5-2)] > * nose version 1.1.2 > * gcc (GCC) 4.7.0 20110820 (experimental) > * LINUX: Linux XXX 2.6.9_5-9-0-0 #1 SMP Wed Jun 23 14:03:19 CST 2010 > x86_64 x86_64 x86_64 GNU/Linux > > site.cfg of Scipy: > > [DEFAULT] > library_dirs = /home/work/local/lib > include_dirs = /home/work/local/include > [lapack_opt] > libraries = lapack, f77blas, cblas, atlas > > site.cfg of Numpy: > > [DEFAULT] > library_dirs = /home/work/local/lib > include_dirs = /home/work/local/include > [lapack_opt] > libraries = lapack, f77blas, cblas, atlas > > > In addition, there are failures as well when test Numpy: > >>>> import numpy >>>> numpy.test('1') > Running unit tests for numpy > NumPy version 1.6.1 > NumPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site- > packages/numpy > Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 > 20051201 (Red Hat 3.4.5-2)] > nose version 1.1.2 > ====================================================================== > FAIL: Test basic arithmetic function errors > ---------------------------------------------------------------------- > Traceback (most recent call last): > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/testing/decorators.py", line 215, in knownfailer > ? ?return f(*args, **kwargs) > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/core/tests/test_numeric.py", line 367, in > test_floating_exceptions_power > ? ?np.power, ftype(2), ftype(2**fi.nexp)) > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/core/tests/test_numeric.py", line 271, in assert_raises_fpe > ? ?"Type %s did not raise fpe error '%s'." % (ftype, fpeerr)) > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/testing/utils.py", line 34, in assert_ > ? ?raise AssertionError(msg) > AssertionError: Type did not raise fpe error > 'overflow'. > > ====================================================================== > FAIL: Test generic loops. > ---------------------------------------------------------------------- > Traceback (most recent call last): > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/core/tests/test_ufunc.py", line 86, in test_generic_loops > ? 
?assert_almost_equal(fone(x), fone_val, err_msg=msg) > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/testing/utils.py", line 448, in assert_almost_equal > ? ?raise AssertionError(msg) > AssertionError: > Arrays are not almost equal to 7 decimals PyUFunc_F_F > ?ACTUAL: array([ 0.+0.j, ?0.+0.j, ?0.+0.j, ?0.+0.j, ?0.+0.j], > dtype=complex64) > ?DESIRED: 1 > > ====================================================================== > FAIL: test_umath.TestComplexFunctions.test_loss_of_precision( 'numpy.complex64'>,) > ---------------------------------------------------------------------- > Traceback (most recent call last): > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/nose/ > case.py", line 197, in runTest > ? ?self.test(*self.arg) > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/core/tests/test_umath.py", line 931, in check_loss_of_precision > ? ?check(x_basic, 2*eps/1e-3) > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/core/tests/test_umath.py", line 901, in check > ? ?'arcsinh') > AssertionError: (0, 0.0010023052, 0.9987238, 'arcsinh') > > ====================================================================== > FAIL: test_umath.TestComplexFunctions.test_precisions_consistent > ---------------------------------------------------------------------- > Traceback (most recent call last): > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/nose/ > case.py", line 197, in runTest > ? ?self.test(*self.arg) > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/core/tests/test_umath.py", line 812, in > test_precisions_consistent > ? ?assert_almost_equal(fcf, fcd, decimal=6, err_msg='fch-fcd %s'%f) > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/testing/utils.py", line 448, in assert_almost_equal > ? ?raise AssertionError(msg) > AssertionError: > Arrays are not almost equal to 6 decimals fch-fcd > ?ACTUAL: 2.3561945j > ?DESIRED: (0.66623943249251527+1.0612750619050355j) > > ====================================================================== > FAIL: test_kind.TestKind.test_all > ---------------------------------------------------------------------- > Traceback (most recent call last): > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/nose/ > case.py", line 197, in runTest > ? ?self.test(*self.arg) > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/f2py/tests/test_kind.py", line 30, in test_all > ? ?'selectedrealkind(%s): expected %r but got %r' % ?(i, > selected_real_kind(i), selectedrealkind(i))) > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/testing/utils.py", line 34, in assert_ > ? ?raise AssertionError(msg) > AssertionError: selectedrealkind(19): expected -1 but got 16 > > ---------------------------------------------------------------------- > Ran 3552 tests in 29.977s > > FAILED (KNOWNFAIL=3, failures=5) > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > Hi, What Linux distro are you actually using? Unless you have some issue, I would install the atlas version provided by the distro as I have long term success with Fedora's packages across multiple versions. If you still want to build it yourself, then you need to be using the same compiler version everywhere. The ztbsv_ error suggests that you have not build blas, lapack and atlas correctly. 
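One quick sanity check is to ask, in the same interpreter that fails, which BLAS/LAPACK the stack is actually picking up; a sketch, not a full diagnosis:

import numpy as np
from numpy.distutils.system_info import get_info

np.show_config()               # BLAS/LAPACK that numpy was built against
print get_info('lapack_opt')   # what the build machinery resolves from site.cfg right now

From the shell, "nm -D /home/work/local/lib/liblapack.so | grep ztbsv" will also show whether ztbsv_ is defined in that library (T) or merely expected from a BLAS at load time (U).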
It is hard to get those right, so very carefully check the build logs and run the associated tests.

Bruce

From rob.clewley at gmail.com  Sun Aug 28 12:32:42 2011
From: rob.clewley at gmail.com (Rob Clewley)
Date: Sun, 28 Aug 2011 12:32:42 -0400
Subject: [SciPy-User] Does odeint return derivative
In-Reply-To:
References:
Message-ID:

Hi,

On Sun, Aug 28, 2011 at 7:02 AM, ali franco wrote:
> I am solving a system of differential equations using odeint. Is there a
> simpler way of also getting the derivative other than calculating the
> derivative from the solutions obtained which in my case is going to take
> alot of extra time.

Maybe I misunderstand which derivative you are interested in, but if you have a system x' = f(x, t) then the rates of change of the state variables x at any given time and known state position are simply given by calling function f directly. That's what the ODE means by definition. So if you have solved for a trajectory and have an array of time t and state x values, just pass a pair of x, t values to f to find out how fast x is changing at that point. This *is* a very simple and cheap way to get the derivative (you can even vectorize it), but I'm guessing that you were considering doing some kind of finite differencing to obtain approximate derivatives from the trajectory.

Anyway, hope that helps.
Rob

From rob.clewley at gmail.com  Sun Aug 28 14:40:52 2011
From: rob.clewley at gmail.com (Rob Clewley)
Date: Sun, 28 Aug 2011 14:40:52 -0400
Subject: [SciPy-User] Storing data for fast access
In-Reply-To:
References:
Message-ID:

Hi Ali,

On Sat, Aug 27, 2011 at 11:21 PM, ali franco wrote:
> There are two parts to my question.

I'm not sure I understand enough about the first question to answer it with authority, but my gut instinct, FWIW, is that using a table of pre-computed values on some mesh of (x,y) means you'll have to accept interpolated values for the Bessel function when it's needed at new (x,y) values. I mean, if you already know all the (x,y) values you'll need, I don't see the benefit in any precomputation. Maybe the best splines to use in this and your ODE problem are ones where you impose the knots from the sampled values and their first derivatives there, since you can explicitly compute the derivatives for Bessel functions and for ODE right-hand sides. That guarantees pretty good accuracy of the fit, and you'll be using quadratics between every pair of knots. If you don't specify the derivatives and use cubics, you'll run the risk of nasty behavior such as Runge's phenomenon.

> Second question: I have numerically integrated a differential equation and I use the splined solution to solve other differential equations. However the splined solution is slow. Is there a way to make this faster?

What exactly do you mean by using the solution to solve other DEs? It would help for you to provide concrete examples when posting on forums. If you mean that you are using a spline-interpolated curve as a time-dependent term in another DE's right-hand side, then yes, that's going to be slower in pure Python. But if speed is such an issue, you shouldn't be using odeint. PyDSTool supports external input signals of a similar kind running C-based integrators very quickly compared to odeint, but currently only supports piecewise-linear interpolation of those signals. This can be accurate enough if you have a very high sampling rate for that signal relative to your new DE's time steps, particularly because the integrator is guaranteed to step to the knot points.
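A minimal sketch of the pattern being discussed, with a previously computed solution, splined, feeding the right-hand side of a second odeint call. The two toy systems and the grids are made up purely for illustration:

import numpy as np
from scipy.integrate import odeint
from scipy.interpolate import InterpolatedUnivariateSpline

def f1(y, t):
    return -0.5 * y

t1 = np.linspace(0, 10, 2001)        # fine grid keeps the interpolation error small
y1 = odeint(f1, 1.0, t1)[:, 0]
dy1 = f1(y1, t1)                     # derivative along the trajectory: just evaluate the RHS

u = InterpolatedUnivariateSpline(t1, y1, k=3)   # splined solution

def f2(y, t):
    return -y + u(t)                 # splined solution used as a time-dependent input

t2 = np.linspace(0, 10, 201)
y2 = odeint(f2, 0.0, t2)[:, 0]

Evaluating u(t) in pure Python on every right-hand-side call is exactly the overhead described above; the sketch only shows the wiring.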
Higher order splines in C are on my to do list, though. PyDSTool's interface to scipy's vode wrapper does support higher order splines but it's done in python, so probably won't be faster than what you're already doing. -Rob From jsseabold at gmail.com Sun Aug 28 15:07:19 2011 From: jsseabold at gmail.com (Skipper Seabold) Date: Sun, 28 Aug 2011 15:07:19 -0400 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: Message-ID: On Sat, Aug 27, 2011 at 2:44 PM, Christopher Jordan-Squire wrote: > On Sat, Aug 27, 2011 at 2:27 PM, Matthew Brett wrote: >> Hi, >> >> On Sat, Aug 27, 2011 at 11:19 AM, Christopher Jordan-Squire >> wrote: >>> Hi--I've been a moderately heavy R user for the past two years, so >>> about a month ago I took an (abbreviated) version of a simple data >>> analysis I did in R and tried to rewrite as much of it as possible, >>> line by line, into python using numpy and statsmodels. I didn't use >>> pandas, and I can't comment on how much it might have simplified >>> things. >>> >>> This comparison might be useful to some people, so I stuck it up on a >>> github repo. My overall impression is that R is much stronger for >>> interactive data analysis. Click on the link for more details why, >>> which are summarized in the README file. >>> >>> https://github.com/chrisjordansquire/r_vs_py >>> >>> The code examples should run out of the box with no downloads (other >>> than R, Python, numpy, scipy, and statsmodels) required. >> >> Thank you very much for doing that - it's a very useful exercise. ?I >> hope we can make use of it to discuss how to get better, in the true > > Hopefully. I suppose I should also mention, for those that don't want > to click on the link, that the two largest reasons R was much simpler > to use were because it was easier to construct models and easier to > view entries I'd stuck into matrices. R's graphing capabilities seemed > slightly more friendly, but that might have just been my familiarity > with them. > > (As an aside, numpy arrays' print method don't make them friendly for > interactive viewing. Even ipython couldn't make a few of the matrices > I made very intelligible, and it's easy to construct examples that > make numpy arrays hideous to behold. For example, > > x = np.arange(5).reshape(5,1) > y = np.ones(5).reshape(1,5) > z = x*y > z[0,0] += 0.0001 > print z > > [[ ?1.00000000e-04 ? 0.00000000e+00 ? 0.00000000e+00 ? 0.00000000e+00 > ? ?0.00000000e+00] > ?[ ?1.00000000e+00 ? 1.00000000e+00 ? 1.00000000e+00 ? 1.00000000e+00 > ? ?1.00000000e+00] > ?[ ?2.00000000e+00 ? 2.00000000e+00 ? 2.00000000e+00 ? 2.00000000e+00 > ? ?2.00000000e+00] > ?[ ?3.00000000e+00 ? 3.00000000e+00 ? 3.00000000e+00 ? 3.00000000e+00 > ? ?3.00000000e+00] > ?[ ?4.00000000e+00 ? 4.00000000e+00 ? 4.00000000e+00 ? 4.00000000e+00 > ? ?4.00000000e+00]] > My default [~/statsmodels/] [1]: [~/statsmodels/] [1]: x = np.arange(5).reshape(5,1) [~/statsmodels/] [2]: y = np.ones(5).reshape(1,5) [~/statsmodels/] [3]: z = x*y [~/statsmodels/] [4]: z[0,0] += 0.0001 [~/statsmodels/] [5]: print z [[ 0.0001 0. 0. 0. 0. ] [ 1. 1. 1. 1. 1. ] [ 2. 2. 2. 2. 2. ] [ 3. 3. 3. 3. 3. ] [ 4. 4. 4. 4. 4. 
]] [~/statsmodels/] [6]: np.set_printoptions(suppress=False) [~/statsmodels/] [7]: print z [[ 1.00000000e-04 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00] [ 1.00000000e+00 1.00000000e+00 1.00000000e+00 1.00000000e+00 1.00000000e+00] [ 2.00000000e+00 2.00000000e+00 2.00000000e+00 2.00000000e+00 2.00000000e+00] [ 3.00000000e+00 3.00000000e+00 3.00000000e+00 3.00000000e+00 3.00000000e+00] [ 4.00000000e+00 4.00000000e+00 4.00000000e+00 4.00000000e+00 4.00000000e+00]] Skipper > (Strangely, it looks much more tolerable if x ?= > np.arange(1,6).reshape(5,1) instead.) > > If you do the same thing in R, > > x = rep(0:4,5) > x = matrix(x,ncol=5) > x[1,1] = 0.000001 > x > > you get > > ? ? ?[,1] [,2] [,3] [,4] [,5] > [1,] 1e-06 ? ?0 ? ?0 ? ?0 ? ?0 > [2,] 1e+00 ? ?1 ? ?1 ? ?1 ? ?1 > [3,] 2e+00 ? ?2 ? ?2 ? ?2 ? ?2 > [4,] 3e+00 ? ?3 ? ?3 ? ?3 ? ?3 > [5,] 4e+00 ? ?4 ? ?4 ? ?4 ? ?4 > > much more readable.) > > > As a simple metric, my .r file was about 1/2 the size of the .py file, > even though I couldn't do everything in python that I could in R. > (These commands were meant to be entered interactively, so the length > of the length of the file is, perhaps, a more valid metric then usual > to be concerned about.) > > -Chris Jordan-Squire > > >> spirit of: >> >> Confront the Brutal Facts >> http://en.wikipedia.org/wiki/Good_to_Great >> >> See you, >> >> Matthew >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From jsseabold at gmail.com Sun Aug 28 15:54:49 2011 From: jsseabold at gmail.com (Skipper Seabold) Date: Sun, 28 Aug 2011 15:54:49 -0400 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Sat, Aug 27, 2011 at 10:15 PM, Bruce Southey wrote: > On Sat, Aug 27, 2011 at 5:06 PM, Wes McKinney wrote: >> On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout >> wrote: >>> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>>> This comparison might be useful to some people, so I stuck it up on a >>>> github repo. My overall impression is that R is much stronger for >>>> interactive data analysis. Click on the link for more details why, >>>> which are summarized in the README file. >>> >>> ?From the README: >>> >>> "In fact, using Python without the IPython qtconsole is practically >>> impossible for this sort of cut and paste, interactive analysis. >>> The shell IPython doesn't allow it because it automatically adds >>> whitespace on multiline bits of code, breaking pre-formatted code's >>> alignment. Cutting and pasting works for the standard python shell, >>> but then you lose all the advantages of IPython." >>> >>> >>> >>> You might use %cpaste in the ipython normal shell to paste without it >>> automatically inserting spaces: >>> >>> In [5]: %cpaste >>> Pasting code; enter '--' alone on the line to stop. >>> :if 1>0: >>> : ? ?print 'hi' >>> :-- >>> hi >>> >>> Thanks, >>> >>> Jason >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> >> This strikes me as a textbook example of why we need an integrated >> formula framework in statsmodels. 
I'll make a pass through when I get >> a chance and see if there are some places where pandas would really >> help out. > > We used to have a formula class is scipy.stats and I do not follow > nipy (http://nipy.sourceforge.net/nipy/stable/index.html) as it also > had this (extremely flexible but very hard to comprehend). It was what > I had argued was needed ages ago for statsmodel. But it needs a > community effort because the syntax required serves multiple > communities with different annotations and needs. That is also seen > from the different approaches taken by the stats packages from S/R, > SAS, Genstat (and those are just are ones I have used). > We have held this discussion at _great_ length multiple times on the statsmodels list and are in the process of trying to integrate Charlton (from Nathaniel) and/or Formula (from Jonathan / NiPy) into the statsmodels base. http://statsmodels.sourceforge.net/dev/roadmap_todo.html#formula-framework and more recently https://groups.google.com/group/pystatsmodels/browse_thread/thread/a76ea5de9e96964b/fd85b80ae46c4931? https://github.com/statsmodels/formula https://github.com/statsmodels/charlton Wes and I made some effort to go through this at SciPy. From where I sit, I think it's difficult to disentangle the data structures from the formula implementation, or maybe I'd just prefer to finish tackling the former because it's much more straightforward. So I'd like to first finish the pandas-integration branch that we've started and then focus on the formula support. This is on my (our, I hope...) immediate long-term goal list. Then I'd like to come back to the community and hash out the 'rules of the game' details for formulas after we have some code for people to play with, which promises to be "fun." https://github.com/statsmodels/statsmodels/tree/pandas-integration FWIW, I could also improve the categorical function to be much nicer for the given examples (ie., take a list, drop a reference category), but I don't know that it's worth it, because it's really just a stop-gap and ideally users shouldn't have to rely on it. Thoughts on more stop-gap? If I understand Chris' concerns, I think pandas + formula will go a long way towards bridging the gap between Python and R usability, but it's a large effort and there are only a handful (at best) of people writing code -- Wes being the only one who's more or less "full time" as far as I can tell. The 0.4 statsmodels release should be very exciting though, I hope. I'm looking forward to it, at least. Then there's only the small problem of building an infrastructure and community like CRAN so we can have specialists writing and maintaining code...but I hope once all the tools are in place this will seem much less daunting. There certainly seems to be the right sentiment for it. Skipper From bsouthey at gmail.com Sun Aug 28 21:16:08 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Sun, 28 Aug 2011 20:16:08 -0500 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Sun, Aug 28, 2011 at 2:54 PM, Skipper Seabold wrote: > On Sat, Aug 27, 2011 at 10:15 PM, Bruce Southey wrote: >> On Sat, Aug 27, 2011 at 5:06 PM, Wes McKinney wrote: >>> On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout >>> wrote: >>>> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>>>> This comparison might be useful to some people, so I stuck it up on a >>>>> github repo. 
My overall impression is that R is much stronger for >>>>> interactive data analysis. Click on the link for more details why, >>>>> which are summarized in the README file. >>>> >>>> ?From the README: >>>> >>>> "In fact, using Python without the IPython qtconsole is practically >>>> impossible for this sort of cut and paste, interactive analysis. >>>> The shell IPython doesn't allow it because it automatically adds >>>> whitespace on multiline bits of code, breaking pre-formatted code's >>>> alignment. Cutting and pasting works for the standard python shell, >>>> but then you lose all the advantages of IPython." >>>> >>>> >>>> >>>> You might use %cpaste in the ipython normal shell to paste without it >>>> automatically inserting spaces: >>>> >>>> In [5]: %cpaste >>>> Pasting code; enter '--' alone on the line to stop. >>>> :if 1>0: >>>> : ? ?print 'hi' >>>> :-- >>>> hi >>>> >>>> Thanks, >>>> >>>> Jason >>>> >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>> >>> This strikes me as a textbook example of why we need an integrated >>> formula framework in statsmodels. I'll make a pass through when I get >>> a chance and see if there are some places where pandas would really >>> help out. >> >> We used to have a formula class is scipy.stats and I do not follow >> nipy (http://nipy.sourceforge.net/nipy/stable/index.html) as it also >> had this (extremely flexible but very hard to comprehend). It was what >> I had argued was needed ages ago for statsmodel. But it needs a >> community effort because the syntax required serves multiple >> communities with different annotations and needs. That is also seen >> from the different approaches taken by the stats packages from S/R, >> SAS, Genstat (and those are just are ones I have used). >> > > We have held this discussion at _great_ length multiple times on the > statsmodels list and are in the process of trying to integrate > Charlton (from Nathaniel) and/or Formula (from Jonathan / NiPy) into > the statsmodels base. > > http://statsmodels.sourceforge.net/dev/roadmap_todo.html#formula-framework > > and more recently > > https://groups.google.com/group/pystatsmodels/browse_thread/thread/a76ea5de9e96964b/fd85b80ae46c4931? > > https://github.com/statsmodels/formula > https://github.com/statsmodels/charlton > > Wes and I made some effort to go through this at SciPy. From where I > sit, I think it's difficult to disentangle the data structures from > the formula implementation, or maybe I'd just prefer to finish > tackling the former because it's much more straightforward. So I'd > like to first finish the pandas-integration branch that we've started > and then focus on the formula support. This is on my (our, I hope...) > immediate long-term goal list. Then I'd like to come back to the > community and hash out the 'rules of the game' details for formulas > after we have some code for people to play with, which promises to be > "fun." > > https://github.com/statsmodels/statsmodels/tree/pandas-integration > > FWIW, I could also improve the categorical function to be much nicer > for the given examples (ie., take a list, drop a reference category), > but I don't know that it's worth it, because it's really just a > stop-gap and ideally users shouldn't have to rely on it. Thoughts on > more stop-gap? 
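A rough sketch of the kind of stop-gap helper described just above (take a list of labels, build an indicator matrix, drop a reference category). The function name and signature are illustrative, not the actual statsmodels API:

import numpy as np

def dummies(labels, drop_reference=True):
    # One 0/1 column per level; optionally drop the first level so the
    # remaining columns are interpreted relative to a reference category.
    labels = np.asarray(labels)
    levels = np.unique(labels)
    mat = (labels[:, None] == levels[None, :]).astype(float)
    if drop_reference:
        levels, mat = levels[1:], mat[:, 1:]
    return levels, mat

levels, d = dummies(['a', 'b', 'a', 'c', 'b'])
# levels -> ['b' 'c'], d is 5x2 with one column per non-reference level
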
> > If I understand Chris' concerns, I think pandas + formula will go a > long way towards bridging the gap between Python and R usability, but > it's a large effort and there are only a handful (at best) of people > writing code -- Wes being the only one who's more or less "full time" > as far as I can tell. The 0.4 statsmodels release should be very > exciting though, I hope. I'm looking forward to it, at least. Then > there's only the small problem of building an infrastructure and > community like CRAN so we can have specialists writing and maintaining > code...but I hope once all the tools are in place this will seem much > less daunting. There certainly seems to be the right sentiment for it. > > Skipper > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > Thanks for the info! Actually it is impossible to "disentangle the data structures from the formula implementation". You have to make design designs for example defining factors- R does that in the dataframe (as.factor() is not part of the formula), SAS using class statements etc. Bruce From dbigbear at gmail.com Mon Aug 29 02:54:42 2011 From: dbigbear at gmail.com (Xiong Deng) Date: Mon, 29 Aug 2011 14:54:42 +0800 Subject: [SciPy-User] SciPy-User Digest, Vol 96, Issue 47 In-Reply-To: References: Message-ID: Dear Bruce, My Linux is? Red Hat Enterprise Linux AS release 4 (Nahant Update 3) Many thanks John On 28 August 2011 22:39, wrote: > Send SciPy-User mailing list submissions to > scipy-user at scipy.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://mail.scipy.org/mailman/listinfo/scipy-user > or, via email, send a message with subject or body 'help' to > scipy-user-request at scipy.org > > You can reach the person managing the list at > scipy-user-owner at scipy.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of SciPy-User digest..." > > > Today's Topics: > > 1. Install Scipy Errors: ImportError: /path_to/liblapack.so: > undefined symbol: ztbsv_ (Johnny) > 2. How can I solve a equation like > sqrt(log(x)*erfc(exp(log(x)+1)) - 1) = 2 and exp((log(x) - > 1.5)**2 - 3) = 5 (Johnny) > 3. Re: How can I solve a equation like > sqrt(log(x)*erfc(exp(log(x)+1)) - 1) = 2 and exp((log(x) - > 1.5)**2 - 3) = 5 (Robert Kern) > 4. Re: Return variable value by function value (Andy Wilson) > 5. Sotring data for fast access (ali franco) > 6. Does odeint return derivative (ali franco) > 7. RectBivariateSpline (ali franco) > 8. 
Re: Install Scipy Errors: ImportError: /path_to/liblapack.so: > undefined symbol: ztbsv_ (Bruce Southey) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Sat, 27 Aug 2011 23:16:06 -0700 (PDT) > From: Johnny > Subject: [SciPy-User] Install Scipy Errors: ImportError: > /path_to/liblapack.so: undefined symbol: ztbsv_ > To: scipy-user at scipy.org > Message-ID: > <7b0b4836-fb15-4489-860a-c1684529f30c at p37g2000prp.googlegroups.com> > Content-Type: text/plain; charset=ISO-8859-1 > > Hi all, > > I am installing lapack, atlas, numpy, scipy on my LINUX for TEN times, > but always encountering the problem: > > [work XXX]$ python -c 'import scipy.optimize; > scipy.optimize.test()' > Traceback (most recent call last): > File "", line 1, in > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > scipy/optimize/__init__.py", line 11, in > from lbfgsb import fmin_l_bfgs_b > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > scipy/optimize/lbfgsb.py", line 28, in > import _lbfgsb > ImportError: /home/work/local/lib/liblapack.so: undefined symbol: > ztbsv_ > > I can pass some other tests like: > > > [work XXX:~/local]$ python -c 'import scipy.ndimage; > scipy.ndimage.test()' > Running unit tests for scipy.ndimage > NumPy version 1.6.1 > NumPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site- > packages/numpy > SciPy version 0.9.0 > SciPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site- > packages/scipy > Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 > 20051201 (Red Hat 3.4.5-2)] > nose version 1.1.2 > > .........S................................................................................................................................................................................................................................................................................................................................................................................................................. > ---------------------------------------------------------------------- > Ran 411 tests in 1.247s > > OK (SKIP=1) > > The problem seems due to the lib of Lapack. So I tried the solutions > posted on the internet before. > > 1) The liblapack.so may be not complete...SO I tried this: > # integrate lapack with atlas: > cd lib/ > mkdir tmp > cd tmp/ > ar x ../liblapack.a > cp ~/path_to/lapack-3.1.1/lapack_LINUX.a ../liblapack.a > ar r ../liblapack.a *.o > cd ../.. > make check > make ptcheck > cp include/* ~/include/ > cp lib/*.a ~/lib/ > > That is, after installing atlas, there is another liblapack.a (in > addition to the lapack_LINUX.a after Lapack) in its lib, but it is > about 500k, so I integrate it with the lapack_LINUX.a from installing > Lapack. The final liblapack.a is about 9.3m, The liblapack.so is about > 5m > > 2) re-install Lapack and atlas many times....No use > > 3) I found there is a lapack.so under scipy/lib, and it is about 500K, > but I think it may be not the problem, becaues the failure is > "ImportError: /home/work/local/lib/liblapack.so: undefined symbol: > ztbsv_". Scipy seemed to import liblapack.so in my general lib > directory... > > 4) One thing I am not sure is that I used gcc 4.7 and gfortran to > compile lapack and atlas, but my python 2.7 was built using gcc > 3.4.5.....Is this a problem? > > > Anyone can help? 
> _______________________________________________________________ > My configuration of the installation: > > * ATLAS 3.8.4 > * lapack 3.3.1 > * numpy 1.6.1 > * SciPy version 0.9.0 > * dateutil 1.5 > * Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 > 20051201 (Red Hat 3.4.5-2)] > * nose version 1.1.2 > * gcc (GCC) 4.7.0 20110820 (experimental) > * LINUX: Linux XXX 2.6.9_5-9-0-0 #1 SMP Wed Jun 23 14:03:19 CST 2010 > x86_64 x86_64 x86_64 GNU/Linux > > site.cfg of Scipy: > > [DEFAULT] > library_dirs = /home/work/local/lib > include_dirs = /home/work/local/include > [lapack_opt] > libraries = lapack, f77blas, cblas, atlas > > site.cfg of Numpy: > > [DEFAULT] > library_dirs = /home/work/local/lib > include_dirs = /home/work/local/include > [lapack_opt] > libraries = lapack, f77blas, cblas, atlas > > > In addition, there are failures as well when test Numpy: > > >>> import numpy > >>> numpy.test('1') > Running unit tests for numpy > NumPy version 1.6.1 > NumPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site- > packages/numpy > Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 > 20051201 (Red Hat 3.4.5-2)] > nose version 1.1.2 > ====================================================================== > FAIL: Test basic arithmetic function errors > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/testing/decorators.py", line 215, in knownfailer > return f(*args, **kwargs) > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/core/tests/test_numeric.py", line 367, in > test_floating_exceptions_power > np.power, ftype(2), ftype(2**fi.nexp)) > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/core/tests/test_numeric.py", line 271, in assert_raises_fpe > "Type %s did not raise fpe error '%s'." % (ftype, fpeerr)) > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/testing/utils.py", line 34, in assert_ > raise AssertionError(msg) > AssertionError: Type did not raise fpe error > 'overflow'. > > ====================================================================== > FAIL: Test generic loops. 
> ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/core/tests/test_ufunc.py", line 86, in test_generic_loops > assert_almost_equal(fone(x), fone_val, err_msg=msg) > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/testing/utils.py", line 448, in assert_almost_equal > raise AssertionError(msg) > AssertionError: > Arrays are not almost equal to 7 decimals PyUFunc_F_F > ACTUAL: array([ 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j], > dtype=complex64) > DESIRED: 1 > > ====================================================================== > FAIL: test_umath.TestComplexFunctions.test_loss_of_precision( 'numpy.complex64'>,) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/nose/ > case.py", line 197, in runTest > self.test(*self.arg) > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/core/tests/test_umath.py", line 931, in check_loss_of_precision > check(x_basic, 2*eps/1e-3) > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/core/tests/test_umath.py", line 901, in check > 'arcsinh') > AssertionError: (0, 0.0010023052, 0.9987238, 'arcsinh') > > ====================================================================== > FAIL: test_umath.TestComplexFunctions.test_precisions_consistent > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/nose/ > case.py", line 197, in runTest > self.test(*self.arg) > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/core/tests/test_umath.py", line 812, in > test_precisions_consistent > assert_almost_equal(fcf, fcd, decimal=6, err_msg='fch-fcd %s'%f) > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/testing/utils.py", line 448, in assert_almost_equal > raise AssertionError(msg) > AssertionError: > Arrays are not almost equal to 6 decimals fch-fcd > ACTUAL: 2.3561945j > DESIRED: (0.66623943249251527+1.0612750619050355j) > > ====================================================================== > FAIL: test_kind.TestKind.test_all > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/nose/ > case.py", line 197, in runTest > self.test(*self.arg) > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/f2py/tests/test_kind.py", line 30, in test_all > 'selectedrealkind(%s): expected %r but got %r' % (i, > selected_real_kind(i), selectedrealkind(i))) > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/testing/utils.py", line 34, in assert_ > raise AssertionError(msg) > AssertionError: selectedrealkind(19): expected -1 but got 16 > > ---------------------------------------------------------------------- > Ran 3552 tests in 29.977s > > FAILED (KNOWNFAIL=3, failures=5) > > > > ------------------------------ > > Message: 2 > Date: Sat, 27 Aug 2011 23:18:54 -0700 (PDT) > From: Johnny > Subject: [SciPy-User] How can I solve a equation like > sqrt(log(x)*erfc(exp(log(x)+1)) - 1) = 2 and exp((log(x) - 1.5)**2 - > 3) = 5 > To: scipy-user at scipy.org > Message-ID: > <57952d36-c803-451c-bdeb-b2f1299290e7 at y39g2000prd.googlegroups.com> > 
Content-Type: text/plain; charset=ISO-8859-1 > > Hi, I am trying to solve the follow equation: > > solve(x * ((1.0 / sqrt(2 * pi) * x * sigma) * exp(-0.5 * (log(x) - > mu)**2 / sigma**2)) + 0.5 * erfc((mu - log(x)) / (sigma * sqrt(2))) - > 1, x) > > I am not sure Scipy can do it and how it can do it ? > > > Many thanks > Xiong > > > > > > ------------------------------ > > Message: 3 > Date: Sun, 28 Aug 2011 02:36:06 -0500 > From: Robert Kern > Subject: Re: [SciPy-User] How can I solve a equation like > sqrt(log(x)*erfc(exp(log(x)+1)) - 1) = 2 and exp((log(x) - 1.5)**2 - > 3) = 5 > To: SciPy Users List > Message-ID: > > > Content-Type: text/plain; charset=UTF-8 > > On Sun, Aug 28, 2011 at 01:18, Johnny wrote: > > Hi, I am trying to solve the follow equation: > > > > solve(x * ((1.0 / sqrt(2 * pi) * x * sigma) * exp(-0.5 * (log(x) - > > mu)**2 / sigma**2)) + 0.5 * erfc((mu - log(x)) / (sigma * sqrt(2))) - > > 1, x) > > > > I am not sure Scipy can do it and how it can do it ? > > [~] > |18> from numpy import sqrt, log, exp > > [~] > |19> from scipy.special import erfc > > [~] > |20> def f(x, mu=1.0, sigma=0.1): > ...> return x * ((1.0 / sqrt(2 * pi) * x * sigma) * exp(-0.5 * (log(x) > - > ...> mu)**2 / sigma**2)) + 0.5 * erfc((mu - log(x)) / (sigma * > sqrt(2))) - 1.0 > ...> > > [~] > |21> from scipy.optimize import fsolve > > [~] > |22> fsolve(f, 3.0) > array([ 2.88207063]) > > [~] > |23> f(_) > array([ 4.44089210e-16]) > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > ? -- Umberto Eco > > > ------------------------------ > > Message: 4 > Date: Sat, 27 Aug 2011 09:47:28 -0500 > From: Andy Wilson > Subject: Re: [SciPy-User] Return variable value by function value > To: SciPy Users List > Message-ID: > > > Content-Type: text/plain; charset="iso-8859-1" > > If an approximation is good enough, you can use scipy.interpolate.interp1d > to get a function that returns interpolated values. Your example doesn't > quite work because 0.95 is out of the range of the initial input. > > > import numpy as np > import scipy.interpolate > > x = np.arange(0,100) > y = np.sqrt(1 - x**2/10E+4) > > interp_func = scipy.interpolate.interp1d(x, y, kind='quadratic') > > new_x = 0.95 > interp_y = interp_func(new_x) > actual_y = np.sqrt(1 - new_x**2/10E+4) > > print "actual value: %s" % actual_y > print "interpolated value %s" % interp_y > print "difference: %s" % (actual_y - interp_y) > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: > http://mail.scipy.org/pipermail/scipy-user/attachments/20110827/544ae7a2/attachment-0001.html > > ------------------------------ > > Message: 5 > Date: Sun, 28 Aug 2011 15:21:40 +1200 > From: ali franco > Subject: [SciPy-User] Sotring data for fast access > To: scipy-user at scipy.org > Message-ID: > > > Content-Type: text/plain; charset="iso-8859-1" > > There are two parts to my question. One: I have to do a double integration > on a grid > > answer = integrate ( f(x,y) times besselfunction(x,y)) > > Now, I have read that the besselfunction can be precomputed and saved to > disk for fast access. How do I do this? Right now, I am evaluating the > besselfunction from scipy.special as it is required. > > Second question: I have numerically integrated a differential equation and > I > use the splined solution to solve other differential equations. However the > splined solution is slow. 
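For the first question, a minimal sketch of precomputing a Bessel table and caching it on disk. The order (J0), the grid, the example kernel argument x*y and the file name are all assumptions made only to keep the snippet self-contained:

import numpy as np
from scipy import special

x = np.linspace(0.0, 50.0, 2001)
y = np.linspace(0.0, 50.0, 2001)
# Evaluate J0 over the whole grid once and store it.
table = special.j0(x[:, None] * y[None, :])
np.save('j0_table.npy', table)

# Later runs just reload the table instead of calling special.j0 again.
table = np.load('j0_table.npy')

The integration then multiplies the cached table by f(x, y) sampled on the same grid.
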
Is there a way to make this faster? > > thanks guys > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: > http://mail.scipy.org/pipermail/scipy-user/attachments/20110828/3f4acea4/attachment-0001.html > > ------------------------------ > > Message: 6 > Date: Sun, 28 Aug 2011 23:02:54 +1200 > From: ali franco > Subject: [SciPy-User] Does odeint return derivative > To: scipy-user at scipy.org > Message-ID: > > > Content-Type: text/plain; charset="iso-8859-1" > > I am solving a system of differential equations using odeint. Is there a > simpler way of also getting the derivative other than calculating the > derivative from the solutions obtained which in my case is going to take > alot of extra time. > > thanks > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: > http://mail.scipy.org/pipermail/scipy-user/attachments/20110828/f2f55861/attachment-0001.html > > ------------------------------ > > Message: 7 > Date: Sun, 28 Aug 2011 23:48:57 +1200 > From: ali franco > Subject: [SciPy-User] RectBivariateSpline > To: scipy-user at scipy.org > Message-ID: > > > Content-Type: text/plain; charset="iso-8859-1" > > Can RectBivariateSpline be used to calculated derivatives and integrals? If > it can't, can you please suggest some thing else that does. I have to use a > two dimensional spline on a rectangular mesh. The problem with alternatives > to RectBivariateSpline such as BivariateSpline , And UnivariateSpline I > found was that they require the data be specified either on a square grid > or > that they be equally spaced neither of which my data points satisfy. > thanks > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: > http://mail.scipy.org/pipermail/scipy-user/attachments/20110828/1251bbf9/attachment-0001.html > > ------------------------------ > > Message: 8 > Date: Sun, 28 Aug 2011 09:40:15 -0500 > From: Bruce Southey > Subject: Re: [SciPy-User] Install Scipy Errors: ImportError: > /path_to/liblapack.so: undefined symbol: ztbsv_ > To: SciPy Users List > Message-ID: > > > Content-Type: text/plain; charset=ISO-8859-1 > > On Sun, Aug 28, 2011 at 1:16 AM, Johnny wrote: > > Hi all, > > > > I am installing lapack, atlas, numpy, scipy on my LINUX for TEN times, > > but always encountering the problem: > > > > [work XXX]$ python -c 'import scipy.optimize; > > scipy.optimize.test()' > > Traceback (most recent call last): > > ?File "", line 1, in > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > > scipy/optimize/__init__.py", line 11, in > > ? ?from lbfgsb import fmin_l_bfgs_b > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > > scipy/optimize/lbfgsb.py", line 28, in > > ? 
?import _lbfgsb > > ImportError: /home/work/local/lib/liblapack.so: undefined symbol: > > ztbsv_ > > > > I can pass some other tests like: > > > > > > [work XXX:~/local]$ python -c 'import scipy.ndimage; > > scipy.ndimage.test()' > > Running unit tests for scipy.ndimage > > NumPy version 1.6.1 > > NumPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site- > > packages/numpy > > SciPy version 0.9.0 > > SciPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site- > > packages/scipy > > Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 > > 20051201 (Red Hat 3.4.5-2)] > > nose version 1.1.2 > > > .........S................................................................................................................................................................................................................................................................................................................................................................................................................. > > ---------------------------------------------------------------------- > > Ran 411 tests in 1.247s > > > > OK (SKIP=1) > > > > The problem seems due to the lib of Lapack. So I tried the solutions > > posted on the internet before. > > > > 1) The liblapack.so may be not complete...SO I tried this: > > ? ?# integrate lapack with atlas: > > ? ?cd lib/ > > ? ?mkdir tmp > > ? ?cd tmp/ > > ? ?ar x ../liblapack.a > > ? ?cp ~/path_to/lapack-3.1.1/lapack_LINUX.a ../liblapack.a > > ? ?ar r ../liblapack.a *.o > > ? ?cd ../.. > > ? ?make check > > ? ?make ptcheck > > ? ?cp include/* ~/include/ > > ? ?cp lib/*.a ~/lib/ > > > > That is, after installing atlas, there is another liblapack.a (in > > addition to the lapack_LINUX.a after Lapack) in its lib, but it is > > about 500k, so I integrate it with the lapack_LINUX.a from installing > > Lapack. The final liblapack.a is about 9.3m, The liblapack.so is about > > 5m > > > > 2) re-install Lapack and atlas many times....No use > > > > 3) I found there is a lapack.so under scipy/lib, and it is about 500K, > > but I think it may be not the problem, becaues the failure is > > "ImportError: /home/work/local/lib/liblapack.so: undefined symbol: > > ztbsv_". Scipy seemed to import liblapack.so in my general lib > > directory... > > > > 4) One thing ?I am not sure is that I used gcc 4.7 and gfortran to > > compile lapack and atlas, but my python 2.7 was built using gcc > > 3.4.5.....Is this a problem? > > > > > > Anyone can help? 
> > _______________________________________________________________ > > My configuration of the installation: > > > > * ATLAS 3.8.4 > > * lapack 3.3.1 > > * numpy 1.6.1 > > * SciPy version 0.9.0 > > * dateutil 1.5 > > * Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 > > 20051201 (Red Hat 3.4.5-2)] > > * nose version 1.1.2 > > * gcc (GCC) 4.7.0 20110820 (experimental) > > * LINUX: Linux XXX 2.6.9_5-9-0-0 #1 SMP Wed Jun 23 14:03:19 CST 2010 > > x86_64 x86_64 x86_64 GNU/Linux > > > > site.cfg of Scipy: > > > > [DEFAULT] > > library_dirs = /home/work/local/lib > > include_dirs = /home/work/local/include > > [lapack_opt] > > libraries = lapack, f77blas, cblas, atlas > > > > site.cfg of Numpy: > > > > [DEFAULT] > > library_dirs = /home/work/local/lib > > include_dirs = /home/work/local/include > > [lapack_opt] > > libraries = lapack, f77blas, cblas, atlas > > > > > > In addition, there are failures as well when test Numpy: > > > >>>> import numpy > >>>> numpy.test('1') > > Running unit tests for numpy > > NumPy version 1.6.1 > > NumPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site- > > packages/numpy > > Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 > > 20051201 (Red Hat 3.4.5-2)] > > nose version 1.1.2 > > ====================================================================== > > FAIL: Test basic arithmetic function errors > > ---------------------------------------------------------------------- > > Traceback (most recent call last): > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > > numpy/testing/decorators.py", line 215, in knownfailer > > ? ?return f(*args, **kwargs) > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > > numpy/core/tests/test_numeric.py", line 367, in > > test_floating_exceptions_power > > ? ?np.power, ftype(2), ftype(2**fi.nexp)) > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > > numpy/core/tests/test_numeric.py", line 271, in assert_raises_fpe > > ? ?"Type %s did not raise fpe error '%s'." % (ftype, fpeerr)) > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > > numpy/testing/utils.py", line 34, in assert_ > > ? ?raise AssertionError(msg) > > AssertionError: Type did not raise fpe error > > 'overflow'. > > > > ====================================================================== > > FAIL: Test generic loops. > > ---------------------------------------------------------------------- > > Traceback (most recent call last): > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > > numpy/core/tests/test_ufunc.py", line 86, in test_generic_loops > > ? ?assert_almost_equal(fone(x), fone_val, err_msg=msg) > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > > numpy/testing/utils.py", line 448, in assert_almost_equal > > ? ?raise AssertionError(msg) > > AssertionError: > > Arrays are not almost equal to 7 decimals PyUFunc_F_F > > ?ACTUAL: array([ 0.+0.j, ?0.+0.j, ?0.+0.j, ?0.+0.j, ?0.+0.j], > > dtype=complex64) > > ?DESIRED: 1 > > > > ====================================================================== > > FAIL: test_umath.TestComplexFunctions.test_loss_of_precision( > 'numpy.complex64'>,) > > ---------------------------------------------------------------------- > > Traceback (most recent call last): > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/nose/ > > case.py", line 197, in runTest > > ? 
?self.test(*self.arg) > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > > numpy/core/tests/test_umath.py", line 931, in check_loss_of_precision > > ? ?check(x_basic, 2*eps/1e-3) > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > > numpy/core/tests/test_umath.py", line 901, in check > > ? ?'arcsinh') > > AssertionError: (0, 0.0010023052, 0.9987238, 'arcsinh') > > > > ====================================================================== > > FAIL: test_umath.TestComplexFunctions.test_precisions_consistent > > ---------------------------------------------------------------------- > > Traceback (most recent call last): > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/nose/ > > case.py", line 197, in runTest > > ? ?self.test(*self.arg) > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > > numpy/core/tests/test_umath.py", line 812, in > > test_precisions_consistent > > ? ?assert_almost_equal(fcf, fcd, decimal=6, err_msg='fch-fcd %s'%f) > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > > numpy/testing/utils.py", line 448, in assert_almost_equal > > ? ?raise AssertionError(msg) > > AssertionError: > > Arrays are not almost equal to 6 decimals fch-fcd > > ?ACTUAL: 2.3561945j > > ?DESIRED: (0.66623943249251527+1.0612750619050355j) > > > > ====================================================================== > > FAIL: test_kind.TestKind.test_all > > ---------------------------------------------------------------------- > > Traceback (most recent call last): > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/nose/ > > case.py", line 197, in runTest > > ? ?self.test(*self.arg) > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > > numpy/f2py/tests/test_kind.py", line 30, in test_all > > ? ?'selectedrealkind(%s): expected %r but got %r' % ?(i, > > selected_real_kind(i), selectedrealkind(i))) > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > > numpy/testing/utils.py", line 34, in assert_ > > ? ?raise AssertionError(msg) > > AssertionError: selectedrealkind(19): expected -1 but got 16 > > > > ---------------------------------------------------------------------- > > Ran 3552 tests in 29.977s > > > > FAILED (KNOWNFAIL=3, failures=5) > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > Hi, > What Linux distro are you actually using? > Unless you have some issue, I would install the atlas version provided > by the distro as I have long term success with Fedora's packages > across multiple versions. > > If you still want to build it yourself, then you need to be using the > same compiler version everywhere. > > The ztbsv_ error suggests that you have not build blas, lapack and > atlas correctly. It is hard to get those right so very carefully check > the build logs and run the associated tests. > > Bruce > > > ------------------------------ > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > End of SciPy-User Digest, Vol 96, Issue 47 > ****************************************** > -------------- next part -------------- An HTML attachment was scrubbed... 
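One quick way to act on that advice from Python itself, before rebuilding anything, is to check which BLAS/LAPACK the installed numpy was configured against and whether the shared library in question really exports the missing symbol. The library path below is the one from the report, so adjust as needed:

import ctypes
import numpy as np

np.__config__.show()    # shows the BLAS/LAPACK numpy was built against

lib = ctypes.CDLL('/home/work/local/lib/liblapack.so')
try:
    lib.ztbsv_          # resolves only if the symbol is really exported
    print('ztbsv_ found')
except AttributeError:
    print('ztbsv_ missing -- this liblapack.so is incomplete')

If the symbol is missing, relinking scipy will not help; the LAPACK/ATLAS build itself has to be redone.
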
URL: From dbigbear at gmail.com Mon Aug 29 03:13:08 2011 From: dbigbear at gmail.com (Xiong Deng) Date: Mon, 29 Aug 2011 15:13:08 +0800 Subject: [SciPy-User] Error when building scipy.0.9.0 - "f951: undefined symbol: mpfr_get_z_exp" Message-ID: Hi ALL, I am trying to install numpy, scipy on my Linux. I have build and installed numpy on it kindof correctly with only one failure shown that: [work at tc-fcr-bid03.tc.baidu.com:~/xiongdeng/soft/scipy-0.9.0]$ python Python 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 20051201 (Red Hat 3.4.5-2)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy >>> numpy.test() Running unit tests for numpy NumPy version 1.6.1 NumPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 20051201 (Red Hat 3.4.5-2)] nose version 1.1.2 ..................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................F...........................................................................................................................................................................................................................................................................................................................................................................................................................K.................................................................................................K......................K.....................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................ ====================================================================== FAIL: Test basic arithmetic function errors ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/testing/decorators.py", line 215, in knownfailer return f(*args, **kwargs) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/core/tests/test_numeric.py", line 367, in test_floating_exceptions_power np.power, ftype(2), ftype(2**fi.nexp)) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/core/tests/test_numeric.py", line 271, in assert_raises_fpe "Type %s did not raise fpe error '%s'." % (ftype, fpeerr)) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/testing/utils.py", line 34, in assert_ raise AssertionError(msg) AssertionError: Type did not raise fpe error 'overflow'. ---------------------------------------------------------------------- Ran 3533 tests in 12.494s FAILED (KNOWNFAIL=3, failures=1) HOWEVER, when I am building my scipy, there is a big error, causing termination of the building process. The messages are as below: /home/work/local/gcc-4.7/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/f951: symbol lookup error: /home/work/local/gcc-4.7/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/f951: undefined symbol: mpfr_get_z_exp error: Command "/home/work/local/gcc-4.7/bin/gfortran -Wall -ffixed-form -fno-second-underscore -fPIC -O3 -funroll-loops -I/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/core/include -c -c scipy/special/specfun/specfun.f -o build/temp.linux-x86_64-2.7/scipy/special/specfun/specfun.o" failed with exit status 1 IN ADDITION: One thing I have to say: when I compiled and installed gcc-4.7 locally, I did not install GMP, *MPFR*, and MPC. They are installed after gcc-4.7....The problem may be due to this???? But How can I fix it without re-installing gcc-4.7 ??? 
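A small diagnostic sketch (not part of the build) to confirm whether the failure is in gfortran itself rather than in anything scipy does: compile a trivial Fortran file with the same compiler. If this already dies with the f951/mpfr error, the gcc-4.7 binary distribution cannot find a compatible libmpfr and no scipy-side change will fix it:

import subprocess, tempfile

src = tempfile.NamedTemporaryFile(suffix='.f90', delete=False)
src.write(b'program t\nend program t\n')
src.close()
ret = subprocess.call(['gfortran', '-c', src.name, '-o', src.name + '.o'])
print('gfortran works' if ret == 0 else 'gfortran itself is broken')
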
----------------------------------------------------------------------------------------------------------------- My configuration: My configuration of the installation: * ATLAS 3.8.4 * lapack 3.3.1 * numpy 1.6.1 * SciPy version 0.9.0 (when building scipy version 0.8.0, the error is the same ) * dateutil 1.5 * Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 20051201 (Red Hat 3.4.5-2)] * nose version 1.1.2 * gcc (GCC) 4.7.0 20110820 (experimental) * LINUX: Linux XXX 2.6.9_5-9-0-0 #1 SMP Wed Jun 23 14:03:19 CST 2010 x86_64 x86_64 x86_64 GNU/Linux Red Hat Enterprise Linux AS release 4 (Nahant Update 3) export: declare -x ATLAS="/home/work/local/lib/libatlas.so" declare -x G_BROKEN_FILENAMES="1" declare -x HISTSIZE="1000" declare -x HOME="/home/work" declare -x INPUTRC="/etc/inputrc" declare -x LANG="en_US" declare -x LAPACK="/home/work/local/lib/liblapack.so" declare -x LC_CTYPE="zh_CN.gb18030" declare -x LD_LIBRARY_PATH=":/home/work/local/lib/:/home/work/local/gcc-4.7/lib/" declare -x LESSOPEN="|/usr/bin/lesspipe.sh %s" declare -x LOGNAME="work" declare -x LS_COLORS="no=00:fi=00:di=01;34:ln=01;36:pi=40;33:so=01;35:bd=40;33;01:cd=40;33;01:or=01;05;37;41:mi=01;05;37;41:ex=01;32:*.cmd=01;32:*.exe=01;32:*.com=01;32:*.btm=01;32:*.bat=01;32:*.sh=01;32:*.csh=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.gz=01;31:*.bz2=01;31:*.bz=01;31:*.tz=01;31:*.rpm=01;31:*.cpio=01;31:*.jpg=01;35:*.gif=01;35:*.bmp=01;35:*.xbm=01;35:*.xpm=01;35:*.png=01;35:*.tif=01;35:" declare -x MAC="64" declare -x MAIL="/var/spool/mail/work" declare -x OLDPWD="/home/work/xiongdeng/soft" declare -x PATH="/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/usr/share/baidu/bin:/home/work/local/python-2.7.1/bin/:/home/work/local/vim/bin/:/home/work/local/svn/bin/:/home/work/local/hadoop-client/hadoop/bin/:/home/work/local/script/:/home/work/local/bin/:/home/work/local/gcc-4.7/bin/:/home/work/bin" declare -x PROMPT_COMMAND="echo -ne \"\\e]0;tc-fcr-bid03\\a\"" declare -x PWD="/home/work/xiongdeng/soft/scipy-0.9.0" declare -x SDIR_FILE="/home/work/.sdir_label" declare -x SHELL="/bin/bash" declare -x SHLVL="1" declare -x SVN_EDITOR="/home/work/local/vim/bin/vim" declare -x TERM="linux" declare -x USER="work" ll /home/work/local/lib -rw-r--r-- 1 work work 13424422 Aug 29 01:55 libatlas.a -rwxr-xr-x 1 work work 8463221 Aug 29 01:55 libatlas.so -rw-r--r-- 1 work work 466344 Aug 29 01:55 libcblas.a -rwxr-xr-x 1 work work 142677 Aug 29 01:55 libcblas.so -rw-r--r-- 1 work work 566602 Aug 29 01:55 libf77blas.a -rwxr-xr-x 1 work work 158579 Aug 29 01:55 libf77blas.so -rw-r--r-- 1 work work 515618 Aug 23 21:05 libgcc_s.so -rw-r--r-- 1 work work 515618 Aug 23 21:05 libgcc_s.so.1 -rw-r--r-- 1 work work 16135904 Aug 27 02:04 libgfortran.a -rwxr-xr-x 1 work work 1034 Aug 27 02:04 libgfortran.la -rw-r--r-- 1 work work 5962327 Aug 27 02:04 libgfortran.so -rwxr-xr-x 1 work work 5962327 Aug 27 02:04 libgfortran.so.3 -rwxr-xr-x 1 work work 5962327 Aug 27 02:04 libgfortran.so.3.0.0 -rw-r--r-- 1 work work 269 Aug 27 02:04 libgfortran.spec -rw-r--r-- 1 work work 1134848 Aug 24 16:01 libgmp.a -rwxr-xr-x 1 work work 923 Aug 24 16:01 libgmp.la lrwxrwxrwx 1 work work 16 Aug 24 16:01 libgmp.so -> libgmp.so.10.0.2 lrwxrwxrwx 1 work work 16 Aug 24 16:01 libgmp.so.10 -> libgmp.so.10.0.2 -rw-r--r-- 1 work work 471045 Aug 24 16:01 libgmp.so.10.0.2 -rw-r--r-- 1 work work 642892 Aug 23 21:05 libgomp.a -rwxr-xr-x 1 work work 958 Aug 23 21:05 libgomp.la -rw-r--r-- 1 work work 329946 Aug 
23 21:05 libgomp.so -rwxr-xr-x 1 work work 329946 Aug 23 21:05 libgomp.so.1 -rwxr-xr-x 1 work work 329946 Aug 23 21:05 libgomp.so.1.0.0 -rw-r--r-- 1 work work 170 Aug 23 21:05 libgomp.spec -rw-r--r-- 1 work work 1275386 Aug 23 21:05 libiberty.a -rw-r--r-- 1 work work 10561756 Aug 29 01:55 liblapack.a -rwxr-xr-x 1 work work 5683831 Aug 29 01:55 liblapack.so -rw-r--r-- 1 work work 1133984 Aug 23 20:48 liblzma.a -rwxr-xr-x 1 work work 924 Aug 23 20:48 liblzma.la lrwxrwxrwx 1 work work 16 Aug 23 20:48 liblzma.so -> liblzma.so.5.0.3 lrwxrwxrwx 1 work work 16 Aug 23 20:48 liblzma.so.5 -> liblzma.so.5.0.3 -rw-r--r-- 1 work work 582278 Aug 23 20:48 liblzma.so.5.0.3 -rw-r--r-- 1 work work 199486 Aug 24 16:06 libmpc.a -rwxr-xr-x 1 work work 1027 Aug 24 16:06 libmpc.la lrwxrwxrwx 1 work work 15 Aug 24 16:06 libmpc.so -> libmpc.so.2.0.0 lrwxrwxrwx 1 work work 15 Aug 24 16:06 libmpc.so.2 -> libmpc.so.2.0.0 -rw-r--r-- 1 work work 91064 Aug 24 16:06 libmpc.so.2.0.0 -rw-r--r-- 1 work work 2905606 Aug 24 15:53 libmpfr.a -rwxr-xr-x 1 work work 948 Aug 24 15:53 libmpfr.la lrwxrwxrwx 1 work work 16 Aug 24 15:53 libmpfr.so -> libmpfr.so.4.0.1 lrwxrwxrwx 1 work work 16 Aug 24 20:58 libmpfr.so.1 -> libmpfr.so.4.0.1 lrwxrwxrwx 1 work work 16 Aug 24 15:53 libmpfr.so.4 -> libmpfr.so.4.0.1 -rw-r--r-- 1 work work 1317215 Aug 24 15:53 libmpfr.so.4.0.1 -rw-r--r-- 1 work work 466902 Aug 29 01:55 libptcblas.a -rw-r--r-- 1 work work 567162 Aug 29 01:55 libptf77blas.a -rw-r--r-- 1 work work 1583884 Aug 23 21:05 libquadmath.a -rwxr-xr-x 1 work work 985 Aug 23 21:05 libquadmath.la -rw-r--r-- 1 work work 903165 Aug 23 21:05 libquadmath.so -rwxr-xr-x 1 work work 903165 Aug 23 21:05 libquadmath.so.0 -rwxr-xr-x 1 work work 903165 Aug 23 21:05 libquadmath.so.0.0.0 -rw-r--r-- 1 work work 103546 Aug 23 21:05 libssp.a -rwxr-xr-x 1 work work 946 Aug 23 21:05 libssp.la -rw-r--r-- 1 work work 3534 Aug 23 21:05 libssp_nonshared.a -rwxr-xr-x 1 work work 928 Aug 23 21:05 libssp_nonshared.la -rw-r--r-- 1 work work 48981 Aug 23 21:05 libssp.so -rwxr-xr-x 1 work work 48981 Aug 23 21:05 libssp.so.0 -rwxr-xr-x 1 work work 48981 Aug 23 21:05 libssp.so.0.0.0 -rw-r--r-- 1 work work 15669078 Aug 23 21:05 libstdc++.a -rwxr-xr-x 1 work work 973 Aug 23 21:05 libstdc++.la -rw-r--r-- 1 work work 6408505 Aug 23 21:05 libstdc++.so -rwxr-xr-x 1 work work 6408505 Aug 23 21:05 libstdc++.so.6 -rwxr-xr-x 1 work work 6408505 Aug 23 21:05 libstdc++.so.6.0.17 -rw-r--r-- 1 work work 2330 Aug 23 21:05 libstdc++.so.6.0.17-gdb.py -rw-r--r-- 1 work work 1092892 Aug 23 21:05 libsupc++.a -rwxr-xr-x 1 work work 907 Aug 23 21:05 libsupc++.la -rw-rw-r-- 1 work work 490328 Aug 29 01:55 libtstatlas.a -rw-rw-r-- 1 work work 515296 Aug 27 12:15 lilapack.a ll /home/work/local/include drwxr-xr-x 2 work work 4096 Aug 24 14:43 atlas -rw-rw-r-- 1 work work 1773 Aug 27 11:48 atlas_buildinfo.h -rw-rw-r-- 1 work work 90 Aug 27 11:48 atlas_cacheedge.h -rw-rw-r-- 1 work work 147 Aug 27 11:48 atlas_cmv.h -rw-rw-r-- 1 work work 451 Aug 27 11:48 atlas_cmvN.h -rw-rw-r-- 1 work work 387 Aug 27 11:48 atlas_cmvS.h -rw-rw-r-- 1 work work 481 Aug 27 11:48 atlas_cmvT.h -rw-rw-r-- 1 work work 455 Aug 27 11:48 atlas_cNCmm.h -rw-rw-r-- 1 work work 353 Aug 27 11:48 atlas_cr1.h -rw-rw-r-- 1 work work 77 Aug 27 11:48 atlas_csNKB.h -rw-rw-r-- 1 work work 195 Aug 27 11:48 atlas_csysinfo.h -rw-rw-r-- 1 work work 0 Aug 27 11:48 atlas_ctrsmXover.h -rw-rw-r-- 1 work work 147 Aug 27 11:48 atlas_dmv.h -rw-rw-r-- 1 work work 451 Aug 27 11:48 atlas_dmvN.h -rw-rw-r-- 1 work work 387 Aug 27 11:48 atlas_dmvS.h 
-rw-rw-r-- 1 work work 482 Aug 27 11:48 atlas_dmvT.h -rw-rw-r-- 1 work work 455 Aug 27 11:48 atlas_dNCmm.h -rw-rw-r-- 1 work work 352 Aug 27 11:48 atlas_dr1.h -rw-rw-r-- 1 work work 195 Aug 27 11:48 atlas_dsysinfo.h -rw-rw-r-- 1 work work 112 Aug 27 11:48 atlas_dtrsmXover.h -rw-rw-r-- 1 work work 112 Aug 27 11:48 atlas_pthreads.h -rw-rw-r-- 1 work work 147 Aug 27 11:48 atlas_smv.h -rw-rw-r-- 1 work work 451 Aug 27 11:48 atlas_smvN.h -rw-rw-r-- 1 work work 438 Aug 27 11:48 atlas_smvS.h -rw-rw-r-- 1 work work 637 Aug 27 11:48 atlas_smvT.h -rw-rw-r-- 1 work work 455 Aug 27 11:48 atlas_sNCmm.h -rw-rw-r-- 1 work work 353 Aug 27 11:48 atlas_sr1.h -rw-rw-r-- 1 work work 195 Aug 27 11:48 atlas_ssysinfo.h -rw-rw-r-- 1 work work 112 Aug 27 11:48 atlas_strsmXover.h -rw-rw-r-- 1 work work 191 Aug 27 11:48 atlas_trsmNB.h -rw-rw-r-- 1 work work 562 Aug 27 11:48 atlas_type.h -rw-rw-r-- 1 work work 77 Aug 27 11:48 atlas_zdNKB.h -rw-rw-r-- 1 work work 147 Aug 27 11:48 atlas_zmv.h -rw-rw-r-- 1 work work 451 Aug 27 11:48 atlas_zmvN.h -rw-rw-r-- 1 work work 386 Aug 27 11:48 atlas_zmvS.h -rw-rw-r-- 1 work work 481 Aug 27 11:48 atlas_zmvT.h -rw-rw-r-- 1 work work 455 Aug 27 11:48 atlas_zNCmm.h -rw-rw-r-- 1 work work 353 Aug 27 11:48 atlas_zr1.h -rw-rw-r-- 1 work work 195 Aug 27 11:48 atlas_zsysinfo.h -rw-rw-r-- 1 work work 0 Aug 27 11:48 atlas_ztrsmXover.h -rw-r--r-- 1 work work 33895 Aug 28 16:17 cblas.h -rw-r--r-- 1 work work 8225 Aug 28 16:17 clapack.h -rw-rw-r-- 1 work work 2719 Aug 27 11:48 cmm.h -rw-rw-r-- 1 work work 540 Aug 27 11:48 cXover.h -rw-rw-r-- 1 work work 657 Aug 27 11:48 dmm.h -rw-rw-r-- 1 work work 526 Aug 27 11:48 dXover.h -rw-r--r-- 1 work work 86216 Aug 24 16:01 gmp.h drwxrwxr-x 2 work work 4096 Aug 23 20:48 lzma -rw-r--r-- 1 work work 9274 Aug 23 20:48 lzma.h -rw-r--r-- 1 work work 13049 Aug 24 16:06 mpc.h -rw-r--r-- 1 work work 6288 Aug 24 15:53 mpf2mpfr.h -rw-r--r-- 1 work work 47981 Aug 24 15:53 mpfr.h -rw-rw-r-- 1 work work 658 Aug 27 11:48 smm.h -rw-rw-r-- 1 work work 523 Aug 27 11:48 sXover.h -rw-rw-r-- 1 work work 2718 Aug 27 11:48 zmm.h -rw-rw-r-- 1 work work 555 Aug 27 11:48 zXover.h Many thanks John -------------- next part -------------- An HTML attachment was scrubbed... URL: From hhh.guo at gmail.com Mon Aug 29 03:36:21 2011 From: hhh.guo at gmail.com (Ning Guo) Date: Mon, 29 Aug 2011 15:36:21 +0800 Subject: [SciPy-User] 3d convex hull In-Reply-To: References: <201108241338.50940.alexandre.fayolle@logilab.fr> <4E551F34.4@gmail.com> <4E590958.30103@gmail.com> Message-ID: <4E5B4175.10300@gmail.com> On Sunday, August 28, 2011 06:08 AM, Pauli Virtanen wrote: It seems qhull does not output the vertices in a consistent order. I have to use the inner product of the normal with a side edge to determine the sign :-( > Sat, 27 Aug 2011 23:12:24 +0800, Ning Guo wrote: >> On Saturday, August 27, 2011 03:53 PM, Pauli Virtanen wrote: > [clip] >> Also, the formula to calculate normal may be like this: >> >> face_normals[:,0] = >> np.cross(tetra_points[:,0]-tetra_points[:,2],tetra_points[:,1]-tetra_points[:,2]) > [clip] > > Ah yes, exactly like that, my brain apparently wasn't working properly. > >> Regarding to the order of the vertices, I'm also not sure about their >> convention. I'm trying to figure it out. > If you find it out, please let us know, as this would be an useful thing > to mention in the documentation. However, I'm not sure at the moment > whether Qhull provides such ordering guarantees. 
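A small sketch of a variant of that sign fix: orient each face normal of a tetrahedron by testing it against the vector towards the opposite vertex (rather than a side edge), treating the vertex order returned by Qhull as arbitrary. Only numpy is needed; with scipy.spatial.Delaunay the (4, 3) point array for each simplex would come from indexing the input points with that simplex's vertex indices:

import numpy as np

def outward_normals(tetra):
    # tetra: (4, 3) array with the vertex coordinates of one tetrahedron.
    tetra = np.asarray(tetra, dtype=float)
    normals = np.empty((4, 3))
    for i in range(4):
        face = np.delete(tetra, i, axis=0)   # the 3 vertices opposite vertex i
        n = np.cross(face[1] - face[0], face[2] - face[0])
        if np.dot(n, tetra[i] - face[0]) > 0:
            n = -n                           # was pointing inwards; flip it
        normals[i] = n / np.linalg.norm(n)
    return normals
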
> -- Geotechnical Group Department of Civil and Environmental Engineering Hong Kong University of Science and Technology Clear Water Bay, Kowloon, Hong Kong From cournape at gmail.com Mon Aug 29 04:12:03 2011 From: cournape at gmail.com (David Cournapeau) Date: Mon, 29 Aug 2011 10:12:03 +0200 Subject: [SciPy-User] Error when building scipy.0.9.0 - "f951: undefined symbol: mpfr_get_z_exp" In-Reply-To: References: Message-ID: On Mon, Aug 29, 2011 at 9:13 AM, Xiong Deng wrote: > Hi ALL, > > I am trying to install numpy, scipy on my Linux. I have build and installed > numpy on it kindof correctly with only one failure shown that: > > [work at tc-fcr-bid03.tc.baidu.com:~/xiongdeng/soft/scipy-0.9.0]$ python > Python 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) > [GCC 3.4.5 20051201 (Red Hat 3.4.5-2)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. >>>> import numpy >>>> numpy.test() > Running unit tests for numpy > NumPy version 1.6.1 > NumPy is installed in > /home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy > Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 20051201 > (Red Hat 3.4.5-2)] > nose version 1.1.2 > ..................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................F...........................................................................................................................................................................................................................................................................................................................................................................................................................K.................................................................................................K......................K.......................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
.............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................. > ====================================================================== > FAIL: Test basic arithmetic function errors > ---------------------------------------------------------------------- > Traceback (most recent call last): > ? File > "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/testing/decorators.py", > line 215, in knownfailer > ??? return f(*args, **kwargs) > ? File > "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/core/tests/test_numeric.py", > line 367, in test_floating_exceptions_power > ??? np.power, ftype(2), ftype(2**fi.nexp)) > ? File > "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/core/tests/test_numeric.py", > line 271, in assert_raises_fpe > ??? "Type %s did not raise fpe error '%s'." % (ftype, fpeerr)) > ? File > "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/testing/utils.py", > line 34, in assert_ > ??? raise AssertionError(msg) > AssertionError: Type did not raise fpe error > 'overflow'. > > ---------------------------------------------------------------------- > Ran 3533 tests in 12.494s > > FAILED (KNOWNFAIL=3, failures=1) > > > > HOWEVER, when I am building my scipy, there is a big error, causing > termination of the building process. The messages are as below: > > /home/work/local/gcc-4.7/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/f951: > symbol lookup error: > /home/work/local/gcc-4.7/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/f951: > undefined symbol: mpfr_get_z_exp > error: Command "/home/work/local/gcc-4.7/bin/gfortran -Wall -ffixed-form > -fno-second-underscore -fPIC -O3 -funroll-loops > -I/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/core/include > -c -c scipy/special/specfun/specfun.f -o > build/temp.linux-x86_64-2.7/scipy/special/specfun/specfun.o" failed with > exit status 1 > > IN ADDITION: One thing I have to say: when I compiled and installed gcc-4.7 > locally, I did not install GMP, MPFR, and MPC. They are installed after > gcc-4.7....The problem may be due to this???? But How can I fix it without > re-installing gcc-4.7 ??? Most likely, you did not build gcc and gfortran correctly. Why don't you use the gcc included on your system ? 
cheers,

David

From collinstocks at gmail.com Mon Aug 29 04:49:02 2011
From: collinstocks at gmail.com (Collin Stocks)
Date: Mon, 29 Aug 2011 04:49:02 -0400
Subject: [SciPy-User] SciPy-User Digest, Vol 96, Issue 47
In-Reply-To: References: Message-ID: <1314607742.4449.14.camel@SietchTabr>

When you use digest email for the list, it makes it very difficult to keep the
thread of the conversation. Please try to keep this in mind when replying; it is
just common courtesy when being active on mailing lists. Most email clients
support filters of some sort, so that is generally a better alternative to using
digest email. My $0.02 USD.

--
Collin
-------------- next part --------------
An embedded message was scrubbed...
From: Xiong Deng
Subject: Re: [SciPy-User] SciPy-User Digest, Vol 96, Issue 47
Date: Mon, 29 Aug 2011 14:54:42 +0800
Size: 68581
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: This is a digitally signed message part
URL:

From dbigbear at gmail.com Mon Aug 29 05:13:53 2011
From: dbigbear at gmail.com (Xiong Deng)
Date: Mon, 29 Aug 2011 17:13:53 +0800
Subject: [SciPy-User] Error when building scipy.0.9.0 - "f951: undefined symbol: mpfr_get_z_exp"
Message-ID:

Hi,

I just found out that the gcc-4.7 is downloaded as a binary distribution. I did not
compile gcc-4.7 myself... The gcc included on my system is gcc 3.4.5 and there seems
no gfortran built on it (however, there is a g77 on it, which causes problems while
building numpy/scipy). In addition, there are no mpc, mpfr, gmp on it with gcc 3.4.5,
so I need a new gcc with gfortran, mpc, mpfr, gmp, which is necessary for numpy/scipy.

Btw: I am not sure if there is a way to install mpc, mpfr, and gmp after gcc has been
installed, and then link gcc with mpc, mpfr, gmp...

Many thanks
Xiong

Message: 2
Date: Mon, 29 Aug 2011 10:12:03 +0200
From: David Cournapeau
Subject: Re: [SciPy-User] Error when building scipy.0.9.0 - "f951: undefined symbol: mpfr_get_z_exp"
To: SciPy Users List
Message-ID:
Content-Type: text/plain; charset=UTF-8

On Mon, Aug 29, 2011 at 9:13 AM, Xiong Deng wrote:
> Hi ALL,
>
> I am trying to install numpy, scipy on my Linux. I have build and installed
> numpy on it kindof correctly with only one failure shown that:
>
> [work at tc-fcr-bid03.tc.baidu.com:~/xiongdeng/soft/scipy-0.9.0]$ python
>
> Python 2.7.1 (r271:86832, Jan 13 2011, 22:17:56)
>
> [GCC 3.4.5 20051201 (Red Hat 3.4.5-2)] on linux2
>
> Type "help", "copyright", "credits" or "license" for more information.
> [snip: quoted numpy.test() output and scipy build error, reproduced in full earlier in this digest]
>
> Most likely, you did not build gcc and gfortran correctly. Why don't
> you use the gcc included on your system ?
>
> cheers,
>
> David

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From cournape at gmail.com Mon Aug 29 07:20:17 2011
From: cournape at gmail.com (David Cournapeau)
Date: Mon, 29 Aug 2011 13:20:17 +0200
Subject: [SciPy-User] Error when building scipy.0.9.0 - "f951: undefined symbol: mpfr_get_z_exp"
In-Reply-To: References: Message-ID:

On Mon, Aug 29, 2011 at 11:13 AM, Xiong Deng wrote:
> Hi,
>
> I just find out that the gcc-4.7 is downloaded as a binary distri. I did not
> compile gcc-4.7 myself...

Then the binary is buggy or not adapted to your platform.
> > The gcc included on my system is gcc 3.4.5 and there seems no gfortran built > on it (However there is a g77 on it, which cause problems while building > numpy/scipy....). You should be able to build numpy and scipy with g77. You should not try mixing compiler versions unlesss you are willing to spend quite some time debugging subtle mismatches issues. >..In addition, there are not mpc, mpfr, gmp on it with gcc > 3.4.5, so I need a new gcc with gfortran, mpc ,mpfr ,gmp, which is necessay > for numpy/scipy.... The usual way to do this is to first build mpfr and gmp with whatever compiler you have (gcc 3.4.5 here), and then build the new version of gcc and gfortran. But again, you would be better just using the compilers you have on your machine. cheers, David From bsouthey at gmail.com Mon Aug 29 09:42:44 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Mon, 29 Aug 2011 08:42:44 -0500 Subject: [SciPy-User] Error when building scipy.0.9.0 - "f951: undefined symbol: mpfr_get_z_exp" In-Reply-To: References: Message-ID: <4E5B9754.1070204@gmail.com> On 08/29/2011 06:20 AM, David Cournapeau wrote: > On Mon, Aug 29, 2011 at 11:13 AM, Xiong Deng wrote: >> Hi, >> >> I just find out that the gcc-4.7 is downloaded as a binary distri. I did not >> compile gcc-4.7 myself... > Then the binary is buggy or not adapted to your platform. > >> The gcc included on my system is gcc 3.4.5 and there seems no gfortran built >> on it (However there is a g77 on it, which cause problems while building >> numpy/scipy....). > You should be able to build numpy and scipy with g77. You should not > try mixing compiler versions unlesss you are willing to spend quite > some time debugging subtle mismatches issues. > >> ..In addition, there are not mpc, mpfr, gmp on it with gcc >> 3.4.5, so I need a new gcc with gfortran, mpc ,mpfr ,gmp, which is necessay >> for numpy/scipy.... > The usual way to do this is to first build mpfr and gmp with whatever > compiler you have (gcc 3.4.5 here), and then build the new version of > gcc and gfortran. But again, you would be better just using the > compilers you have on your machine. > > cheers, > > David > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user Assuming this is the same John or Johnny as before, RHEL does provide Atlas as a package so you should get it from Red Hat package manager. If you build things yourself, you must ensure that all previous versions have been removed from everywhere and that you are linking paths are to the correct locations. Bruce PS please use one name and one thread From cjordan1 at uw.edu Mon Aug 29 10:57:43 2011 From: cjordan1 at uw.edu (Christopher Jordan-Squire) Date: Mon, 29 Aug 2011 09:57:43 -0500 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Sun, Aug 28, 2011 at 2:54 PM, Skipper Seabold wrote: > On Sat, Aug 27, 2011 at 10:15 PM, Bruce Southey wrote: >> On Sat, Aug 27, 2011 at 5:06 PM, Wes McKinney wrote: >>> On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout >>> wrote: >>>> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>>>> This comparison might be useful to some people, so I stuck it up on a >>>>> github repo. My overall impression is that R is much stronger for >>>>> interactive data analysis. Click on the link for more details why, >>>>> which are summarized in the README file. 
>>>> >>>> ?From the README: >>>> >>>> "In fact, using Python without the IPython qtconsole is practically >>>> impossible for this sort of cut and paste, interactive analysis. >>>> The shell IPython doesn't allow it because it automatically adds >>>> whitespace on multiline bits of code, breaking pre-formatted code's >>>> alignment. Cutting and pasting works for the standard python shell, >>>> but then you lose all the advantages of IPython." >>>> >>>> >>>> >>>> You might use %cpaste in the ipython normal shell to paste without it >>>> automatically inserting spaces: >>>> >>>> In [5]: %cpaste >>>> Pasting code; enter '--' alone on the line to stop. >>>> :if 1>0: >>>> : ? ?print 'hi' >>>> :-- >>>> hi >>>> >>>> Thanks, >>>> >>>> Jason >>>> >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>> >>> This strikes me as a textbook example of why we need an integrated >>> formula framework in statsmodels. I'll make a pass through when I get >>> a chance and see if there are some places where pandas would really >>> help out. >> >> We used to have a formula class is scipy.stats and I do not follow >> nipy (http://nipy.sourceforge.net/nipy/stable/index.html) as it also >> had this (extremely flexible but very hard to comprehend). It was what >> I had argued was needed ages ago for statsmodel. But it needs a >> community effort because the syntax required serves multiple >> communities with different annotations and needs. That is also seen >> from the different approaches taken by the stats packages from S/R, >> SAS, Genstat (and those are just are ones I have used). >> > > We have held this discussion at _great_ length multiple times on the > statsmodels list and are in the process of trying to integrate > Charlton (from Nathaniel) and/or Formula (from Jonathan / NiPy) into > the statsmodels base. > > http://statsmodels.sourceforge.net/dev/roadmap_todo.html#formula-framework > > and more recently > > https://groups.google.com/group/pystatsmodels/browse_thread/thread/a76ea5de9e96964b/fd85b80ae46c4931? > > https://github.com/statsmodels/formula > https://github.com/statsmodels/charlton > > Wes and I made some effort to go through this at SciPy. From where I > sit, I think it's difficult to disentangle the data structures from > the formula implementation, or maybe I'd just prefer to finish > tackling the former because it's much more straightforward. So I'd > like to first finish the pandas-integration branch that we've started > and then focus on the formula support. This is on my (our, I hope...) > immediate long-term goal list. Then I'd like to come back to the > community and hash out the 'rules of the game' details for formulas > after we have some code for people to play with, which promises to be > "fun." > > https://github.com/statsmodels/statsmodels/tree/pandas-integration > > FWIW, I could also improve the categorical function to be much nicer > for the given examples (ie., take a list, drop a reference category), > but I don't know that it's worth it, because it's really just a > stop-gap and ideally users shouldn't have to rely on it. Thoughts on > more stop-gap? > I want more usability, but I agree that a stop-gap probably isn't the right way to go, unless it has things we'd eventually want anyways. > If I understand Chris' concerns, I think pandas + formula will go a > long way towards bridging the gap between Python and R usability, but Yes, I agree. 
pandas + formulas would go a long, long way towards more usability. Though I really, really want a scatterplot smoother (i.e., lowess) in statsmodels. I use it a lot, and the final part of my R file was entirely lowess. (And, I should add, that was the part people liked best since one of the main goals of the assignment was to generate nifty pictures that could be used to summarize the data.) > it's a large effort and there are only a handful (at best) of people > writing code -- Wes being the only one who's more or less "full time" > as far as I can tell. The 0.4 statsmodels release should be very > exciting though, I hope. I'm looking forward to it, at least. Then > there's only the small problem of building an infrastructure and > community like CRAN so we can have specialists writing and maintaining > code...but I hope once all the tools are in place this will seem much > less daunting. There certainly seems to be the right sentiment for it. > At the very least creating and testing models would be much simpler. For weeks I've been wanting to see if gmm is the same as gee by fitting both models to the same dataset, but I've been putting it off because I didn't want to construct the design matrices by hand for such a simple question. (GMM--Generalized Method of Moments--is a standard econometrics model and GEE--Generalized Estimating Equations--is a standard biostatics model. They're both generalizations of quasi-likelihood and appear very similar, but I want to fit some models to figure out if they're exactly the same.) -Chris JS > Skipper > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From jsseabold at gmail.com Mon Aug 29 11:10:17 2011 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 29 Aug 2011 11:10:17 -0400 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Mon, Aug 29, 2011 at 10:57 AM, Christopher Jordan-Squire wrote: > On Sun, Aug 28, 2011 at 2:54 PM, Skipper Seabold wrote: >> On Sat, Aug 27, 2011 at 10:15 PM, Bruce Southey wrote: >>> On Sat, Aug 27, 2011 at 5:06 PM, Wes McKinney wrote: >>>> On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout >>>> wrote: >>>>> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>>>>> This comparison might be useful to some people, so I stuck it up on a >>>>>> github repo. My overall impression is that R is much stronger for >>>>>> interactive data analysis. Click on the link for more details why, >>>>>> which are summarized in the README file. >>>>> >>>>> ?From the README: >>>>> >>>>> "In fact, using Python without the IPython qtconsole is practically >>>>> impossible for this sort of cut and paste, interactive analysis. >>>>> The shell IPython doesn't allow it because it automatically adds >>>>> whitespace on multiline bits of code, breaking pre-formatted code's >>>>> alignment. Cutting and pasting works for the standard python shell, >>>>> but then you lose all the advantages of IPython." >>>>> >>>>> >>>>> >>>>> You might use %cpaste in the ipython normal shell to paste without it >>>>> automatically inserting spaces: >>>>> >>>>> In [5]: %cpaste >>>>> Pasting code; enter '--' alone on the line to stop. >>>>> :if 1>0: >>>>> : ? 
?print 'hi' >>>>> :-- >>>>> hi >>>>> >>>>> Thanks, >>>>> >>>>> Jason >>>>> >>>>> _______________________________________________ >>>>> SciPy-User mailing list >>>>> SciPy-User at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>> >>>> >>>> This strikes me as a textbook example of why we need an integrated >>>> formula framework in statsmodels. I'll make a pass through when I get >>>> a chance and see if there are some places where pandas would really >>>> help out. >>> >>> We used to have a formula class is scipy.stats and I do not follow >>> nipy (http://nipy.sourceforge.net/nipy/stable/index.html) as it also >>> had this (extremely flexible but very hard to comprehend). It was what >>> I had argued was needed ages ago for statsmodel. But it needs a >>> community effort because the syntax required serves multiple >>> communities with different annotations and needs. That is also seen >>> from the different approaches taken by the stats packages from S/R, >>> SAS, Genstat (and those are just are ones I have used). >>> >> >> We have held this discussion at _great_ length multiple times on the >> statsmodels list and are in the process of trying to integrate >> Charlton (from Nathaniel) and/or Formula (from Jonathan / NiPy) into >> the statsmodels base. >> >> http://statsmodels.sourceforge.net/dev/roadmap_todo.html#formula-framework >> >> and more recently >> >> https://groups.google.com/group/pystatsmodels/browse_thread/thread/a76ea5de9e96964b/fd85b80ae46c4931? >> >> https://github.com/statsmodels/formula >> https://github.com/statsmodels/charlton >> >> Wes and I made some effort to go through this at SciPy. From where I >> sit, I think it's difficult to disentangle the data structures from >> the formula implementation, or maybe I'd just prefer to finish >> tackling the former because it's much more straightforward. So I'd >> like to first finish the pandas-integration branch that we've started >> and then focus on the formula support. This is on my (our, I hope...) >> immediate long-term goal list. Then I'd like to come back to the >> community and hash out the 'rules of the game' details for formulas >> after we have some code for people to play with, which promises to be >> "fun." >> >> https://github.com/statsmodels/statsmodels/tree/pandas-integration >> >> FWIW, I could also improve the categorical function to be much nicer >> for the given examples (ie., take a list, drop a reference category), >> but I don't know that it's worth it, because it's really just a >> stop-gap and ideally users shouldn't have to rely on it. Thoughts on >> more stop-gap? >> > > I want more usability, but I agree that a stop-gap probably isn't the > right way to go, unless it has things we'd eventually want anyways. > >> If I understand Chris' concerns, I think pandas + formula will go a >> long way towards bridging the gap between Python and R usability, but > > Yes, I agree. pandas + formulas would go a long, long way towards more > usability. > > Though I really, really want a scatterplot smoother (i.e., lowess) in > statsmodels. I use it a lot, and the final part of my R file was > entirely lowess. (And, I should add, that was the part people liked > best since one of the main goals of the assignment was to generate > nifty pictures that could be used to summarize the data.) > Working my way through the pull requests. Very time poor... 
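(For anyone who wants to experiment before a lowess lands in statsmodels, here is a
minimal, self-contained sketch of the kind of scatterplot smoother under discussion:
tricube-weighted local linear fits, without the robustifying iterations of real lowess.
It is only an illustration in plain numpy; the name lowess_sketch and its frac argument
are made up here and are not the interface of the pull request above.)

import numpy as np

def lowess_sketch(x, y, frac=0.5):
    # Smooth y as a function of x with local linear fits, weighted by the
    # tricube kernel over the frac*n nearest neighbours of each point.
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    k = max(2, int(np.ceil(frac * n)))         # neighbourhood size
    y_smooth = np.empty(n)
    for i in range(n):
        dist = np.abs(x - x[i])
        idx = np.argsort(dist)[:k]             # k nearest neighbours of x[i]
        h = dist[idx].max()
        if h == 0:                             # all neighbours share the same x
            y_smooth[i] = y[idx].mean()
            continue
        w = (1.0 - (dist[idx] / h) ** 3) ** 3  # tricube weights in [0, 1]
        sw = np.sqrt(w)
        # weighted least squares for a local line a + b*x
        A = np.column_stack((np.ones(k), x[idx])) * sw[:, None]
        coef = np.linalg.lstsq(A, y[idx] * sw)[0]
        y_smooth[i] = coef[0] + coef[1] * x[i]
    return y_smooth

Calling it on noisy data, e.g. lowess_sketch(x, y, frac=0.3), returns the smoothed
values at the original x positions; a production implementation would add the
robust reweighting passes that make lowess resistant to outliers.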
>> it's a large effort and there are only a handful (at best) of people >> writing code -- Wes being the only one who's more or less "full time" >> as far as I can tell. The 0.4 statsmodels release should be very >> exciting though, I hope. I'm looking forward to it, at least. Then >> there's only the small problem of building an infrastructure and >> community like CRAN so we can have specialists writing and maintaining >> code...but I hope once all the tools are in place this will seem much >> less daunting. There certainly seems to be the right sentiment for it. >> > > At the very least creating and testing models would be much simpler. > For weeks I've been wanting to see if gmm is the same as gee by > fitting both models to the same dataset, but I've been putting it off > because I didn't want to construct the design matrices by hand for > such a simple question. (GMM--Generalized Method of Moments--is a > standard econometrics model and GEE--Generalized Estimating > Equations--is a standard biostatics model. They're both > generalizations of quasi-likelihood and appear very similar, but I > want to fit some models to figure out if they're exactly the same.) > Oh, it's not *that* bad. I agree, of course, that it could be better, but I've been using mainly Python for my work, including GMM and estimating equations models (mainly empirical likelihood and generalized maximum entropy) for the last ~two years. Skipper From cjordan1 at uw.edu Mon Aug 29 11:21:35 2011 From: cjordan1 at uw.edu (Christopher Jordan-Squire) Date: Mon, 29 Aug 2011 10:21:35 -0500 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Mon, Aug 29, 2011 at 10:10 AM, Skipper Seabold wrote: > On Mon, Aug 29, 2011 at 10:57 AM, Christopher Jordan-Squire > wrote: >> On Sun, Aug 28, 2011 at 2:54 PM, Skipper Seabold wrote: >>> On Sat, Aug 27, 2011 at 10:15 PM, Bruce Southey wrote: >>>> On Sat, Aug 27, 2011 at 5:06 PM, Wes McKinney wrote: >>>>> On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout >>>>> wrote: >>>>>> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>>>>>> This comparison might be useful to some people, so I stuck it up on a >>>>>>> github repo. My overall impression is that R is much stronger for >>>>>>> interactive data analysis. Click on the link for more details why, >>>>>>> which are summarized in the README file. >>>>>> >>>>>> ?From the README: >>>>>> >>>>>> "In fact, using Python without the IPython qtconsole is practically >>>>>> impossible for this sort of cut and paste, interactive analysis. >>>>>> The shell IPython doesn't allow it because it automatically adds >>>>>> whitespace on multiline bits of code, breaking pre-formatted code's >>>>>> alignment. Cutting and pasting works for the standard python shell, >>>>>> but then you lose all the advantages of IPython." >>>>>> >>>>>> >>>>>> >>>>>> You might use %cpaste in the ipython normal shell to paste without it >>>>>> automatically inserting spaces: >>>>>> >>>>>> In [5]: %cpaste >>>>>> Pasting code; enter '--' alone on the line to stop. >>>>>> :if 1>0: >>>>>> : ? ?print 'hi' >>>>>> :-- >>>>>> hi >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Jason >>>>>> >>>>>> _______________________________________________ >>>>>> SciPy-User mailing list >>>>>> SciPy-User at scipy.org >>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>>> >>>>> >>>>> This strikes me as a textbook example of why we need an integrated >>>>> formula framework in statsmodels. 
I'll make a pass through when I get >>>>> a chance and see if there are some places where pandas would really >>>>> help out. >>>> >>>> We used to have a formula class is scipy.stats and I do not follow >>>> nipy (http://nipy.sourceforge.net/nipy/stable/index.html) as it also >>>> had this (extremely flexible but very hard to comprehend). It was what >>>> I had argued was needed ages ago for statsmodel. But it needs a >>>> community effort because the syntax required serves multiple >>>> communities with different annotations and needs. That is also seen >>>> from the different approaches taken by the stats packages from S/R, >>>> SAS, Genstat (and those are just are ones I have used). >>>> >>> >>> We have held this discussion at _great_ length multiple times on the >>> statsmodels list and are in the process of trying to integrate >>> Charlton (from Nathaniel) and/or Formula (from Jonathan / NiPy) into >>> the statsmodels base. >>> >>> http://statsmodels.sourceforge.net/dev/roadmap_todo.html#formula-framework >>> >>> and more recently >>> >>> https://groups.google.com/group/pystatsmodels/browse_thread/thread/a76ea5de9e96964b/fd85b80ae46c4931? >>> >>> https://github.com/statsmodels/formula >>> https://github.com/statsmodels/charlton >>> >>> Wes and I made some effort to go through this at SciPy. From where I >>> sit, I think it's difficult to disentangle the data structures from >>> the formula implementation, or maybe I'd just prefer to finish >>> tackling the former because it's much more straightforward. So I'd >>> like to first finish the pandas-integration branch that we've started >>> and then focus on the formula support. This is on my (our, I hope...) >>> immediate long-term goal list. Then I'd like to come back to the >>> community and hash out the 'rules of the game' details for formulas >>> after we have some code for people to play with, which promises to be >>> "fun." >>> >>> https://github.com/statsmodels/statsmodels/tree/pandas-integration >>> >>> FWIW, I could also improve the categorical function to be much nicer >>> for the given examples (ie., take a list, drop a reference category), >>> but I don't know that it's worth it, because it's really just a >>> stop-gap and ideally users shouldn't have to rely on it. Thoughts on >>> more stop-gap? >>> >> >> I want more usability, but I agree that a stop-gap probably isn't the >> right way to go, unless it has things we'd eventually want anyways. >> >>> If I understand Chris' concerns, I think pandas + formula will go a >>> long way towards bridging the gap between Python and R usability, but >> >> Yes, I agree. pandas + formulas would go a long, long way towards more >> usability. >> >> Though I really, really want a scatterplot smoother (i.e., lowess) in >> statsmodels. I use it a lot, and the final part of my R file was >> entirely lowess. (And, I should add, that was the part people liked >> best since one of the main goals of the assignment was to generate >> nifty pictures that could be used to summarize the data.) >> > > Working my way through the pull requests. Very time poor... :-) Thanks Skipper! > >>> it's a large effort and there are only a handful (at best) of people >>> writing code -- Wes being the only one who's more or less "full time" >>> as far as I can tell. The 0.4 statsmodels release should be very >>> exciting though, I hope. I'm looking forward to it, at least. 
Then >>> there's only the small problem of building an infrastructure and >>> community like CRAN so we can have specialists writing and maintaining >>> code...but I hope once all the tools are in place this will seem much >>> less daunting. There certainly seems to be the right sentiment for it. >>> >> >> At the very least creating and testing models would be much simpler. >> For weeks I've been wanting to see if gmm is the same as gee by >> fitting both models to the same dataset, but I've been putting it off >> because I didn't want to construct the design matrices by hand for >> such a simple question. (GMM--Generalized Method of Moments--is a >> standard econometrics model and GEE--Generalized Estimating >> Equations--is a standard biostatics model. They're both >> generalizations of quasi-likelihood and appear very similar, but I >> want to fit some models to figure out if they're exactly the same.) >> > > Oh, it's not *that* bad. I agree, of course, that it could be better, > but I've been using mainly Python for my work, including GMM and > estimating equations models (mainly empirical likelihood and > generalized maximum entropy) for the last ~two years. > Yes, I didn't mean to imply it was unusable. Merely that it's kinda time consuming but not fun to think about design matrices. I'm sure it becomes easier if you keep doing it for awhile. My main point was that it would be a simpler to try to put new models into statsmodels with the formula because it'd make testing easier. Since you could add/remove terms and interactions from the model in attempts to break the fitting procedure. -Chris JS > Skipper > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Mon Aug 29 11:27:06 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 29 Aug 2011 11:27:06 -0400 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Mon, Aug 29, 2011 at 11:10 AM, Skipper Seabold wrote: > On Mon, Aug 29, 2011 at 10:57 AM, Christopher Jordan-Squire > wrote: >> On Sun, Aug 28, 2011 at 2:54 PM, Skipper Seabold wrote: >>> On Sat, Aug 27, 2011 at 10:15 PM, Bruce Southey wrote: >>>> On Sat, Aug 27, 2011 at 5:06 PM, Wes McKinney wrote: >>>>> On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout >>>>> wrote: >>>>>> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>>>>>> This comparison might be useful to some people, so I stuck it up on a >>>>>>> github repo. My overall impression is that R is much stronger for >>>>>>> interactive data analysis. Click on the link for more details why, >>>>>>> which are summarized in the README file. >>>>>> >>>>>> ?From the README: >>>>>> >>>>>> "In fact, using Python without the IPython qtconsole is practically >>>>>> impossible for this sort of cut and paste, interactive analysis. >>>>>> The shell IPython doesn't allow it because it automatically adds >>>>>> whitespace on multiline bits of code, breaking pre-formatted code's >>>>>> alignment. Cutting and pasting works for the standard python shell, >>>>>> but then you lose all the advantages of IPython." >>>>>> >>>>>> >>>>>> >>>>>> You might use %cpaste in the ipython normal shell to paste without it >>>>>> automatically inserting spaces: >>>>>> >>>>>> In [5]: %cpaste >>>>>> Pasting code; enter '--' alone on the line to stop. >>>>>> :if 1>0: >>>>>> : ? 
?print 'hi' >>>>>> :-- >>>>>> hi >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Jason >>>>>> >>>>>> _______________________________________________ >>>>>> SciPy-User mailing list >>>>>> SciPy-User at scipy.org >>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>>> >>>>> >>>>> This strikes me as a textbook example of why we need an integrated >>>>> formula framework in statsmodels. I'll make a pass through when I get >>>>> a chance and see if there are some places where pandas would really >>>>> help out. >>>> >>>> We used to have a formula class is scipy.stats and I do not follow >>>> nipy (http://nipy.sourceforge.net/nipy/stable/index.html) as it also >>>> had this (extremely flexible but very hard to comprehend). It was what >>>> I had argued was needed ages ago for statsmodel. But it needs a >>>> community effort because the syntax required serves multiple >>>> communities with different annotations and needs. That is also seen >>>> from the different approaches taken by the stats packages from S/R, >>>> SAS, Genstat (and those are just are ones I have used). >>>> >>> >>> We have held this discussion at _great_ length multiple times on the >>> statsmodels list and are in the process of trying to integrate >>> Charlton (from Nathaniel) and/or Formula (from Jonathan / NiPy) into >>> the statsmodels base. >>> >>> http://statsmodels.sourceforge.net/dev/roadmap_todo.html#formula-framework >>> >>> and more recently >>> >>> https://groups.google.com/group/pystatsmodels/browse_thread/thread/a76ea5de9e96964b/fd85b80ae46c4931? >>> >>> https://github.com/statsmodels/formula >>> https://github.com/statsmodels/charlton >>> >>> Wes and I made some effort to go through this at SciPy. From where I >>> sit, I think it's difficult to disentangle the data structures from >>> the formula implementation, or maybe I'd just prefer to finish >>> tackling the former because it's much more straightforward. So I'd >>> like to first finish the pandas-integration branch that we've started >>> and then focus on the formula support. This is on my (our, I hope...) >>> immediate long-term goal list. Then I'd like to come back to the >>> community and hash out the 'rules of the game' details for formulas >>> after we have some code for people to play with, which promises to be >>> "fun." >>> >>> https://github.com/statsmodels/statsmodels/tree/pandas-integration >>> >>> FWIW, I could also improve the categorical function to be much nicer >>> for the given examples (ie., take a list, drop a reference category), >>> but I don't know that it's worth it, because it's really just a >>> stop-gap and ideally users shouldn't have to rely on it. Thoughts on >>> more stop-gap? >>> >> >> I want more usability, but I agree that a stop-gap probably isn't the >> right way to go, unless it has things we'd eventually want anyways. >> >>> If I understand Chris' concerns, I think pandas + formula will go a >>> long way towards bridging the gap between Python and R usability, but >> >> Yes, I agree. pandas + formulas would go a long, long way towards more >> usability. >> >> Though I really, really want a scatterplot smoother (i.e., lowess) in >> statsmodels. I use it a lot, and the final part of my R file was >> entirely lowess. (And, I should add, that was the part people liked >> best since one of the main goals of the assignment was to generate >> nifty pictures that could be used to summarize the data.) >> > > Working my way through the pull requests. Very time poor... 
> >>> it's a large effort and there are only a handful (at best) of people >>> writing code -- Wes being the only one who's more or less "full time" >>> as far as I can tell. The 0.4 statsmodels release should be very >>> exciting though, I hope. I'm looking forward to it, at least. Then >>> there's only the small problem of building an infrastructure and >>> community like CRAN so we can have specialists writing and maintaining >>> code...but I hope once all the tools are in place this will seem much >>> less daunting. There certainly seems to be the right sentiment for it. >>> >> >> At the very least creating and testing models would be much simpler. >> For weeks I've been wanting to see if gmm is the same as gee by >> fitting both models to the same dataset, but I've been putting it off >> because I didn't want to construct the design matrices by hand for >> such a simple question. (GMM--Generalized Method of Moments--is a >> standard econometrics model and GEE--Generalized Estimating >> Equations--is a standard biostatics model. They're both >> generalizations of quasi-likelihood and appear very similar, but I >> want to fit some models to figure out if they're exactly the same.) Since GMM is still in the sandbox, the interface is not very polished, and it's missing some enhancements. I recommend asking on the mailing list if it's not clear. Note GMM itself is very general and will never be a quick interactive method. The main work will always be to define the moment conditions (a bit similar to non-linear function estimation, optimize.leastsq). There are and will be special subclasses, eg. IV2SLS, that have predefined moment conditions, but, still, it's up to the user do construct design and instrument arrays. And as far as I remember, the GMM/GEE package in R doesn't have a formula interface either. Josef >> > > Oh, it's not *that* bad. I agree, of course, that it could be better, > but I've been using mainly Python for my work, including GMM and > estimating equations models (mainly empirical likelihood and > generalized maximum entropy) for the last ~two years. > > Skipper > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From cjordan1 at uw.edu Mon Aug 29 11:34:06 2011 From: cjordan1 at uw.edu (Christopher Jordan-Squire) Date: Mon, 29 Aug 2011 10:34:06 -0500 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Mon, Aug 29, 2011 at 10:27 AM, wrote: > On Mon, Aug 29, 2011 at 11:10 AM, Skipper Seabold wrote: >> On Mon, Aug 29, 2011 at 10:57 AM, Christopher Jordan-Squire >> wrote: >>> On Sun, Aug 28, 2011 at 2:54 PM, Skipper Seabold wrote: >>>> On Sat, Aug 27, 2011 at 10:15 PM, Bruce Southey wrote: >>>>> On Sat, Aug 27, 2011 at 5:06 PM, Wes McKinney wrote: >>>>>> On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout >>>>>> wrote: >>>>>>> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>>>>>>> This comparison might be useful to some people, so I stuck it up on a >>>>>>>> github repo. My overall impression is that R is much stronger for >>>>>>>> interactive data analysis. Click on the link for more details why, >>>>>>>> which are summarized in the README file. >>>>>>> >>>>>>> ?From the README: >>>>>>> >>>>>>> "In fact, using Python without the IPython qtconsole is practically >>>>>>> impossible for this sort of cut and paste, interactive analysis. 
>>>>>>> The shell IPython doesn't allow it because it automatically adds >>>>>>> whitespace on multiline bits of code, breaking pre-formatted code's >>>>>>> alignment. Cutting and pasting works for the standard python shell, >>>>>>> but then you lose all the advantages of IPython." >>>>>>> >>>>>>> >>>>>>> >>>>>>> You might use %cpaste in the ipython normal shell to paste without it >>>>>>> automatically inserting spaces: >>>>>>> >>>>>>> In [5]: %cpaste >>>>>>> Pasting code; enter '--' alone on the line to stop. >>>>>>> :if 1>0: >>>>>>> : ? ?print 'hi' >>>>>>> :-- >>>>>>> hi >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Jason >>>>>>> >>>>>>> _______________________________________________ >>>>>>> SciPy-User mailing list >>>>>>> SciPy-User at scipy.org >>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>>>> >>>>>> >>>>>> This strikes me as a textbook example of why we need an integrated >>>>>> formula framework in statsmodels. I'll make a pass through when I get >>>>>> a chance and see if there are some places where pandas would really >>>>>> help out. >>>>> >>>>> We used to have a formula class is scipy.stats and I do not follow >>>>> nipy (http://nipy.sourceforge.net/nipy/stable/index.html) as it also >>>>> had this (extremely flexible but very hard to comprehend). It was what >>>>> I had argued was needed ages ago for statsmodel. But it needs a >>>>> community effort because the syntax required serves multiple >>>>> communities with different annotations and needs. That is also seen >>>>> from the different approaches taken by the stats packages from S/R, >>>>> SAS, Genstat (and those are just are ones I have used). >>>>> >>>> >>>> We have held this discussion at _great_ length multiple times on the >>>> statsmodels list and are in the process of trying to integrate >>>> Charlton (from Nathaniel) and/or Formula (from Jonathan / NiPy) into >>>> the statsmodels base. >>>> >>>> http://statsmodels.sourceforge.net/dev/roadmap_todo.html#formula-framework >>>> >>>> and more recently >>>> >>>> https://groups.google.com/group/pystatsmodels/browse_thread/thread/a76ea5de9e96964b/fd85b80ae46c4931? >>>> >>>> https://github.com/statsmodels/formula >>>> https://github.com/statsmodels/charlton >>>> >>>> Wes and I made some effort to go through this at SciPy. From where I >>>> sit, I think it's difficult to disentangle the data structures from >>>> the formula implementation, or maybe I'd just prefer to finish >>>> tackling the former because it's much more straightforward. So I'd >>>> like to first finish the pandas-integration branch that we've started >>>> and then focus on the formula support. This is on my (our, I hope...) >>>> immediate long-term goal list. Then I'd like to come back to the >>>> community and hash out the 'rules of the game' details for formulas >>>> after we have some code for people to play with, which promises to be >>>> "fun." >>>> >>>> https://github.com/statsmodels/statsmodels/tree/pandas-integration >>>> >>>> FWIW, I could also improve the categorical function to be much nicer >>>> for the given examples (ie., take a list, drop a reference category), >>>> but I don't know that it's worth it, because it's really just a >>>> stop-gap and ideally users shouldn't have to rely on it. Thoughts on >>>> more stop-gap? >>>> >>> >>> I want more usability, but I agree that a stop-gap probably isn't the >>> right way to go, unless it has things we'd eventually want anyways. 
>>> >>>> If I understand Chris' concerns, I think pandas + formula will go a >>>> long way towards bridging the gap between Python and R usability, but >>> >>> Yes, I agree. pandas + formulas would go a long, long way towards more >>> usability. >>> >>> Though I really, really want a scatterplot smoother (i.e., lowess) in >>> statsmodels. I use it a lot, and the final part of my R file was >>> entirely lowess. (And, I should add, that was the part people liked >>> best since one of the main goals of the assignment was to generate >>> nifty pictures that could be used to summarize the data.) >>> >> >> Working my way through the pull requests. Very time poor... >> >>>> it's a large effort and there are only a handful (at best) of people >>>> writing code -- Wes being the only one who's more or less "full time" >>>> as far as I can tell. The 0.4 statsmodels release should be very >>>> exciting though, I hope. I'm looking forward to it, at least. Then >>>> there's only the small problem of building an infrastructure and >>>> community like CRAN so we can have specialists writing and maintaining >>>> code...but I hope once all the tools are in place this will seem much >>>> less daunting. There certainly seems to be the right sentiment for it. >>>> >>> >>> At the very least creating and testing models would be much simpler. >>> For weeks I've been wanting to see if gmm is the same as gee by >>> fitting both models to the same dataset, but I've been putting it off >>> because I didn't want to construct the design matrices by hand for >>> such a simple question. (GMM--Generalized Method of Moments--is a >>> standard econometrics model and GEE--Generalized Estimating >>> Equations--is a standard biostatics model. They're both >>> generalizations of quasi-likelihood and appear very similar, but I >>> want to fit some models to figure out if they're exactly the same.) > > Since GMM is still in the sandbox, the interface is not very polished, > and it's missing some enhancements. I recommend asking on the mailing > list if it's not clear. > > Note GMM itself is very general and will never be a quick interactive > method. The main work will always be to define the moment conditions > (a bit similar to non-linear function estimation, optimize.leastsq). > > There are and will be special subclasses, eg. IV2SLS, that have > predefined moment conditions, but, still, it's up to the user do > construct design and instrument arrays. > And as far as I remember, the GMM/GEE package in R doesn't have a > formula interface either. > Both of the two gee packages in R I know of have formula interfaces. http://cran.r-project.org/web/packages/geepack/ http://cran.r-project.org/web/packages/gee/index.html -Chris JS > Josef > >>> >> >> Oh, it's not *that* bad. I agree, of course, that it could be better, >> but I've been using mainly Python for my work, including GMM and >> estimating equations models (mainly empirical likelihood and >> generalized maximum entropy) for the last ~two years. 
>> >> Skipper >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From jjstickel at vcn.com Mon Aug 29 11:36:32 2011 From: jjstickel at vcn.com (Jonathan Stickel) Date: Mon, 29 Aug 2011 09:36:32 -0600 Subject: [SciPy-User] R vs Python for simple interactive data, analysis In-Reply-To: References: Message-ID: <4E5BB200.1060300@vcn.com> On 8/29/11 08:57 , scipy-user-request at scipy.org wrote: > Though I really, really want a scatterplot smoother (i.e., lowess) in > statsmodels. I use it a lot, and the final part of my R file was > entirely lowess. (And, I should add, that was the part people liked > best since one of the main goals of the assignment was to generate > nifty pictures that could be used to summarize the data.) I have an interest in smoothing methods and created the scikits.datasmooth package: http://pypi.python.org/pypi/scikits.datasmooth/ Right now it just contains a regularization method, but it might be a good place for loess/lowess if someone is interested in contributing it there. From a google search it seems that there are some implementations floating around. Alternatively, I would be satisfied with moving my smoothing by regularization code over to another module/package if it would get more use. Regards, Jonathan From josef.pktd at gmail.com Mon Aug 29 11:42:32 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 29 Aug 2011 11:42:32 -0400 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Mon, Aug 29, 2011 at 11:34 AM, Christopher Jordan-Squire wrote: > On Mon, Aug 29, 2011 at 10:27 AM, ? wrote: >> On Mon, Aug 29, 2011 at 11:10 AM, Skipper Seabold wrote: >>> On Mon, Aug 29, 2011 at 10:57 AM, Christopher Jordan-Squire >>> wrote: >>>> On Sun, Aug 28, 2011 at 2:54 PM, Skipper Seabold wrote: >>>>> On Sat, Aug 27, 2011 at 10:15 PM, Bruce Southey wrote: >>>>>> On Sat, Aug 27, 2011 at 5:06 PM, Wes McKinney wrote: >>>>>>> On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout >>>>>>> wrote: >>>>>>>> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>>>>>>>> This comparison might be useful to some people, so I stuck it up on a >>>>>>>>> github repo. My overall impression is that R is much stronger for >>>>>>>>> interactive data analysis. Click on the link for more details why, >>>>>>>>> which are summarized in the README file. >>>>>>>> >>>>>>>> ?From the README: >>>>>>>> >>>>>>>> "In fact, using Python without the IPython qtconsole is practically >>>>>>>> impossible for this sort of cut and paste, interactive analysis. >>>>>>>> The shell IPython doesn't allow it because it automatically adds >>>>>>>> whitespace on multiline bits of code, breaking pre-formatted code's >>>>>>>> alignment. Cutting and pasting works for the standard python shell, >>>>>>>> but then you lose all the advantages of IPython." >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> You might use %cpaste in the ipython normal shell to paste without it >>>>>>>> automatically inserting spaces: >>>>>>>> >>>>>>>> In [5]: %cpaste >>>>>>>> Pasting code; enter '--' alone on the line to stop. >>>>>>>> :if 1>0: >>>>>>>> : ? 
?print 'hi' >>>>>>>> :-- >>>>>>>> hi >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Jason >>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> SciPy-User mailing list >>>>>>>> SciPy-User at scipy.org >>>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>>>>> >>>>>>> >>>>>>> This strikes me as a textbook example of why we need an integrated >>>>>>> formula framework in statsmodels. I'll make a pass through when I get >>>>>>> a chance and see if there are some places where pandas would really >>>>>>> help out. >>>>>> >>>>>> We used to have a formula class is scipy.stats and I do not follow >>>>>> nipy (http://nipy.sourceforge.net/nipy/stable/index.html) as it also >>>>>> had this (extremely flexible but very hard to comprehend). It was what >>>>>> I had argued was needed ages ago for statsmodel. But it needs a >>>>>> community effort because the syntax required serves multiple >>>>>> communities with different annotations and needs. That is also seen >>>>>> from the different approaches taken by the stats packages from S/R, >>>>>> SAS, Genstat (and those are just are ones I have used). >>>>>> >>>>> >>>>> We have held this discussion at _great_ length multiple times on the >>>>> statsmodels list and are in the process of trying to integrate >>>>> Charlton (from Nathaniel) and/or Formula (from Jonathan / NiPy) into >>>>> the statsmodels base. >>>>> >>>>> http://statsmodels.sourceforge.net/dev/roadmap_todo.html#formula-framework >>>>> >>>>> and more recently >>>>> >>>>> https://groups.google.com/group/pystatsmodels/browse_thread/thread/a76ea5de9e96964b/fd85b80ae46c4931? >>>>> >>>>> https://github.com/statsmodels/formula >>>>> https://github.com/statsmodels/charlton >>>>> >>>>> Wes and I made some effort to go through this at SciPy. From where I >>>>> sit, I think it's difficult to disentangle the data structures from >>>>> the formula implementation, or maybe I'd just prefer to finish >>>>> tackling the former because it's much more straightforward. So I'd >>>>> like to first finish the pandas-integration branch that we've started >>>>> and then focus on the formula support. This is on my (our, I hope...) >>>>> immediate long-term goal list. Then I'd like to come back to the >>>>> community and hash out the 'rules of the game' details for formulas >>>>> after we have some code for people to play with, which promises to be >>>>> "fun." >>>>> >>>>> https://github.com/statsmodels/statsmodels/tree/pandas-integration >>>>> >>>>> FWIW, I could also improve the categorical function to be much nicer >>>>> for the given examples (ie., take a list, drop a reference category), >>>>> but I don't know that it's worth it, because it's really just a >>>>> stop-gap and ideally users shouldn't have to rely on it. Thoughts on >>>>> more stop-gap? >>>>> >>>> >>>> I want more usability, but I agree that a stop-gap probably isn't the >>>> right way to go, unless it has things we'd eventually want anyways. >>>> >>>>> If I understand Chris' concerns, I think pandas + formula will go a >>>>> long way towards bridging the gap between Python and R usability, but >>>> >>>> Yes, I agree. pandas + formulas would go a long, long way towards more >>>> usability. >>>> >>>> Though I really, really want a scatterplot smoother (i.e., lowess) in >>>> statsmodels. I use it a lot, and the final part of my R file was >>>> entirely lowess. 
(And, I should add, that was the part people liked >>>> best since one of the main goals of the assignment was to generate >>>> nifty pictures that could be used to summarize the data.) >>>> >>> >>> Working my way through the pull requests. Very time poor... >>> >>>>> it's a large effort and there are only a handful (at best) of people >>>>> writing code -- Wes being the only one who's more or less "full time" >>>>> as far as I can tell. The 0.4 statsmodels release should be very >>>>> exciting though, I hope. I'm looking forward to it, at least. Then >>>>> there's only the small problem of building an infrastructure and >>>>> community like CRAN so we can have specialists writing and maintaining >>>>> code...but I hope once all the tools are in place this will seem much >>>>> less daunting. There certainly seems to be the right sentiment for it. >>>>> >>>> >>>> At the very least creating and testing models would be much simpler. >>>> For weeks I've been wanting to see if gmm is the same as gee by >>>> fitting both models to the same dataset, but I've been putting it off >>>> because I didn't want to construct the design matrices by hand for >>>> such a simple question. (GMM--Generalized Method of Moments--is a >>>> standard econometrics model and GEE--Generalized Estimating >>>> Equations--is a standard biostatics model. They're both >>>> generalizations of quasi-likelihood and appear very similar, but I >>>> want to fit some models to figure out if they're exactly the same.) >> >> Since GMM is still in the sandbox, the interface is not very polished, >> and it's missing some enhancements. I recommend asking on the mailing >> list if it's not clear. >> >> Note GMM itself is very general and will never be a quick interactive >> method. The main work will always be to define the moment conditions >> (a bit similar to non-linear function estimation, optimize.leastsq). >> >> There are and will be special subclasses, eg. IV2SLS, that have >> predefined moment conditions, but, still, it's up to the user do >> construct design and instrument arrays. >> And as far as I remember, the GMM/GEE package in R doesn't have a >> formula interface either. >> > > Both of the two gee packages in R I know of have formula interfaces. > > http://cran.r-project.org/web/packages/geepack/ > http://cran.r-project.org/web/packages/gee/index.html I have to look at this. I mixed up some acronyms, I meant GEL and GMM http://cran.r-project.org/web/packages/gmm/index.html the vignette was one of my readings, and the STATA description for GMM. I never really looked at GEE. (That's Skipper's private work so far.) Josef > > -Chris JS > >> Josef >> >>>> >>> >>> Oh, it's not *that* bad. I agree, of course, that it could be better, >>> but I've been using mainly Python for my work, including GMM and >>> estimating equations models (mainly empirical likelihood and >>> generalized maximum entropy) for the last ~two years. 
>>> >>> Skipper >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From jsseabold at gmail.com Mon Aug 29 12:27:49 2011 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 29 Aug 2011 12:27:49 -0400 Subject: [SciPy-User] R vs Python for simple interactive data, analysis In-Reply-To: <4E5BB200.1060300@vcn.com> References: <4E5BB200.1060300@vcn.com> Message-ID: On Mon, Aug 29, 2011 at 11:36 AM, Jonathan Stickel wrote: > On 8/29/11 08:57 , scipy-user-request at scipy.org wrote: >> Though I really, really want a scatterplot smoother (i.e., lowess) in >> statsmodels. I use it a lot, and the final part of my R file was >> entirely lowess. (And, I should add, that was the part people liked >> best since one of the main goals of the assignment was to generate >> nifty pictures that could be used to summarize the data.) > > I have an interest in smoothing methods and created the > scikits.datasmooth package: > > http://pypi.python.org/pypi/scikits.datasmooth/ > > Right now it just contains a regularization method, but it might be a > good place for loess/lowess if someone is interested in contributing it > there. ?From a google search it seems that there are some > implementations floating around. ?Alternatively, I would be satisfied > with moving my smoothing by regularization code over to another > module/package if it would get more use. > Chris has a pending pull request for lowess in statsmodels. https://github.com/statsmodels/statsmodels/pull/5 Perhaps there is some desire for keeping these tools together? I don't know what's out there well enough. Skipper From josef.pktd at gmail.com Mon Aug 29 12:59:08 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 29 Aug 2011 12:59:08 -0400 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Mon, Aug 29, 2011 at 11:42 AM, wrote: > On Mon, Aug 29, 2011 at 11:34 AM, Christopher Jordan-Squire > wrote: >> On Mon, Aug 29, 2011 at 10:27 AM, ? wrote: >>> On Mon, Aug 29, 2011 at 11:10 AM, Skipper Seabold wrote: >>>> On Mon, Aug 29, 2011 at 10:57 AM, Christopher Jordan-Squire >>>> wrote: >>>>> On Sun, Aug 28, 2011 at 2:54 PM, Skipper Seabold wrote: >>>>>> On Sat, Aug 27, 2011 at 10:15 PM, Bruce Southey wrote: >>>>>>> On Sat, Aug 27, 2011 at 5:06 PM, Wes McKinney wrote: >>>>>>>> On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout >>>>>>>> wrote: >>>>>>>>> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>>>>>>>>> This comparison might be useful to some people, so I stuck it up on a >>>>>>>>>> github repo. My overall impression is that R is much stronger for >>>>>>>>>> interactive data analysis. Click on the link for more details why, >>>>>>>>>> which are summarized in the README file. >>>>>>>>> >>>>>>>>> ?From the README: >>>>>>>>> >>>>>>>>> "In fact, using Python without the IPython qtconsole is practically >>>>>>>>> impossible for this sort of cut and paste, interactive analysis. 
>>>>>>>>> The shell IPython doesn't allow it because it automatically adds >>>>>>>>> whitespace on multiline bits of code, breaking pre-formatted code's >>>>>>>>> alignment. Cutting and pasting works for the standard python shell, >>>>>>>>> but then you lose all the advantages of IPython." >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> You might use %cpaste in the ipython normal shell to paste without it >>>>>>>>> automatically inserting spaces: >>>>>>>>> >>>>>>>>> In [5]: %cpaste >>>>>>>>> Pasting code; enter '--' alone on the line to stop. >>>>>>>>> :if 1>0: >>>>>>>>> : ? ?print 'hi' >>>>>>>>> :-- >>>>>>>>> hi >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Jason >>>>>>>>> >>>>>>>>> _______________________________________________ >>>>>>>>> SciPy-User mailing list >>>>>>>>> SciPy-User at scipy.org >>>>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>>>>>> >>>>>>>> >>>>>>>> This strikes me as a textbook example of why we need an integrated >>>>>>>> formula framework in statsmodels. I'll make a pass through when I get >>>>>>>> a chance and see if there are some places where pandas would really >>>>>>>> help out. >>>>>>> >>>>>>> We used to have a formula class is scipy.stats and I do not follow >>>>>>> nipy (http://nipy.sourceforge.net/nipy/stable/index.html) as it also >>>>>>> had this (extremely flexible but very hard to comprehend). It was what >>>>>>> I had argued was needed ages ago for statsmodel. But it needs a >>>>>>> community effort because the syntax required serves multiple >>>>>>> communities with different annotations and needs. That is also seen >>>>>>> from the different approaches taken by the stats packages from S/R, >>>>>>> SAS, Genstat (and those are just are ones I have used). >>>>>>> >>>>>> >>>>>> We have held this discussion at _great_ length multiple times on the >>>>>> statsmodels list and are in the process of trying to integrate >>>>>> Charlton (from Nathaniel) and/or Formula (from Jonathan / NiPy) into >>>>>> the statsmodels base. >>>>>> >>>>>> http://statsmodels.sourceforge.net/dev/roadmap_todo.html#formula-framework >>>>>> >>>>>> and more recently >>>>>> >>>>>> https://groups.google.com/group/pystatsmodels/browse_thread/thread/a76ea5de9e96964b/fd85b80ae46c4931? >>>>>> >>>>>> https://github.com/statsmodels/formula >>>>>> https://github.com/statsmodels/charlton >>>>>> >>>>>> Wes and I made some effort to go through this at SciPy. From where I >>>>>> sit, I think it's difficult to disentangle the data structures from >>>>>> the formula implementation, or maybe I'd just prefer to finish >>>>>> tackling the former because it's much more straightforward. So I'd >>>>>> like to first finish the pandas-integration branch that we've started >>>>>> and then focus on the formula support. This is on my (our, I hope...) >>>>>> immediate long-term goal list. Then I'd like to come back to the >>>>>> community and hash out the 'rules of the game' details for formulas >>>>>> after we have some code for people to play with, which promises to be >>>>>> "fun." >>>>>> >>>>>> https://github.com/statsmodels/statsmodels/tree/pandas-integration >>>>>> >>>>>> FWIW, I could also improve the categorical function to be much nicer >>>>>> for the given examples (ie., take a list, drop a reference category), >>>>>> but I don't know that it's worth it, because it's really just a >>>>>> stop-gap and ideally users shouldn't have to rely on it. Thoughts on >>>>>> more stop-gap? 
>>>>>> >>>>> >>>>> I want more usability, but I agree that a stop-gap probably isn't the >>>>> right way to go, unless it has things we'd eventually want anyways. >>>>> >>>>>> If I understand Chris' concerns, I think pandas + formula will go a >>>>>> long way towards bridging the gap between Python and R usability, but >>>>> >>>>> Yes, I agree. pandas + formulas would go a long, long way towards more >>>>> usability. >>>>> >>>>> Though I really, really want a scatterplot smoother (i.e., lowess) in >>>>> statsmodels. I use it a lot, and the final part of my R file was >>>>> entirely lowess. (And, I should add, that was the part people liked >>>>> best since one of the main goals of the assignment was to generate >>>>> nifty pictures that could be used to summarize the data.) >>>>> >>>> >>>> Working my way through the pull requests. Very time poor... >>>> >>>>>> it's a large effort and there are only a handful (at best) of people >>>>>> writing code -- Wes being the only one who's more or less "full time" >>>>>> as far as I can tell. The 0.4 statsmodels release should be very >>>>>> exciting though, I hope. I'm looking forward to it, at least. Then >>>>>> there's only the small problem of building an infrastructure and >>>>>> community like CRAN so we can have specialists writing and maintaining >>>>>> code...but I hope once all the tools are in place this will seem much >>>>>> less daunting. There certainly seems to be the right sentiment for it. >>>>>> >>>>> >>>>> At the very least creating and testing models would be much simpler. >>>>> For weeks I've been wanting to see if gmm is the same as gee by >>>>> fitting both models to the same dataset, but I've been putting it off >>>>> because I didn't want to construct the design matrices by hand for >>>>> such a simple question. (GMM--Generalized Method of Moments--is a >>>>> standard econometrics model and GEE--Generalized Estimating >>>>> Equations--is a standard biostatics model. They're both >>>>> generalizations of quasi-likelihood and appear very similar, but I >>>>> want to fit some models to figure out if they're exactly the same.) >>> >>> Since GMM is still in the sandbox, the interface is not very polished, >>> and it's missing some enhancements. I recommend asking on the mailing >>> list if it's not clear. >>> >>> Note GMM itself is very general and will never be a quick interactive >>> method. The main work will always be to define the moment conditions >>> (a bit similar to non-linear function estimation, optimize.leastsq). >>> >>> There are and will be special subclasses, eg. IV2SLS, that have >>> predefined moment conditions, but, still, it's up to the user do >>> construct design and instrument arrays. >>> And as far as I remember, the GMM/GEE package in R doesn't have a >>> formula interface either. >>> >> >> Both of the two gee packages in R I know of have formula interfaces. >> >> http://cran.r-project.org/web/packages/geepack/ >> http://cran.r-project.org/web/packages/gee/index.html This is very different from what's in GMM in statsmodels so far. The help file is very short, so I'm mostly guessing. It seems to be for (a subset) of generalized linear models with longitudinal/panel covariance structures. Something like this will eventually (once we get panel data models) as a special case of GMM in statsmodels, assuming it's similar to what I know from the econometrics literature. Most of the subclasses of GMM that I currently have, are focused on instrumental variable estimation, including non-linear regression. 
This should be expanded over time. But GMM itself is designed for subclassing by someone who wants to use her/his own moment conditions, as in http://cran.r-project.org/web/packages/gmm/index.html or for us to implement specific models with it. If someone wants to use it, then I have to quickly add the options for the kernels of the weighting matrix, which I keep postponing. Currently there is only a truncated, uniform kernel that assumes observations are order by time, but users can provide their own weighting function. Josef > > I have to look at this. I mixed up some acronyms, I meant GEL and GMM > http://cran.r-project.org/web/packages/gmm/index.html > the vignette was one of my readings, and the STATA description for GMM. > > I never really looked at GEE. (That's Skipper's private work so far.) > > Josef > >> >> -Chris JS >> >>> Josef >>> >>>>> >>>> >>>> Oh, it's not *that* bad. I agree, of course, that it could be better, >>>> but I've been using mainly Python for my work, including GMM and >>>> estimating equations models (mainly empirical likelihood and >>>> generalized maximum entropy) for the last ~two years. >>>> >>>> Skipper >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > From josef.pktd at gmail.com Mon Aug 29 13:13:48 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 29 Aug 2011 13:13:48 -0400 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Mon, Aug 29, 2011 at 12:59 PM, wrote: > On Mon, Aug 29, 2011 at 11:42 AM, ? wrote: >> On Mon, Aug 29, 2011 at 11:34 AM, Christopher Jordan-Squire >> wrote: >>> On Mon, Aug 29, 2011 at 10:27 AM, ? wrote: >>>> On Mon, Aug 29, 2011 at 11:10 AM, Skipper Seabold wrote: >>>>> On Mon, Aug 29, 2011 at 10:57 AM, Christopher Jordan-Squire >>>>> wrote: >>>>>> On Sun, Aug 28, 2011 at 2:54 PM, Skipper Seabold wrote: >>>>>>> On Sat, Aug 27, 2011 at 10:15 PM, Bruce Southey wrote: >>>>>>>> On Sat, Aug 27, 2011 at 5:06 PM, Wes McKinney wrote: >>>>>>>>> On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout >>>>>>>>> wrote: >>>>>>>>>> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>>>>>>>>>> This comparison might be useful to some people, so I stuck it up on a >>>>>>>>>>> github repo. My overall impression is that R is much stronger for >>>>>>>>>>> interactive data analysis. Click on the link for more details why, >>>>>>>>>>> which are summarized in the README file. >>>>>>>>>> >>>>>>>>>> ?From the README: >>>>>>>>>> >>>>>>>>>> "In fact, using Python without the IPython qtconsole is practically >>>>>>>>>> impossible for this sort of cut and paste, interactive analysis. >>>>>>>>>> The shell IPython doesn't allow it because it automatically adds >>>>>>>>>> whitespace on multiline bits of code, breaking pre-formatted code's >>>>>>>>>> alignment. Cutting and pasting works for the standard python shell, >>>>>>>>>> but then you lose all the advantages of IPython." 
>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> You might use %cpaste in the ipython normal shell to paste without it >>>>>>>>>> automatically inserting spaces: >>>>>>>>>> >>>>>>>>>> In [5]: %cpaste >>>>>>>>>> Pasting code; enter '--' alone on the line to stop. >>>>>>>>>> :if 1>0: >>>>>>>>>> : ? ?print 'hi' >>>>>>>>>> :-- >>>>>>>>>> hi >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Jason >>>>>>>>>> >>>>>>>>>> _______________________________________________ >>>>>>>>>> SciPy-User mailing list >>>>>>>>>> SciPy-User at scipy.org >>>>>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>>>>>>> >>>>>>>>> >>>>>>>>> This strikes me as a textbook example of why we need an integrated >>>>>>>>> formula framework in statsmodels. I'll make a pass through when I get >>>>>>>>> a chance and see if there are some places where pandas would really >>>>>>>>> help out. >>>>>>>> >>>>>>>> We used to have a formula class is scipy.stats and I do not follow >>>>>>>> nipy (http://nipy.sourceforge.net/nipy/stable/index.html) as it also >>>>>>>> had this (extremely flexible but very hard to comprehend). It was what >>>>>>>> I had argued was needed ages ago for statsmodel. But it needs a >>>>>>>> community effort because the syntax required serves multiple >>>>>>>> communities with different annotations and needs. That is also seen >>>>>>>> from the different approaches taken by the stats packages from S/R, >>>>>>>> SAS, Genstat (and those are just are ones I have used). >>>>>>>> >>>>>>> >>>>>>> We have held this discussion at _great_ length multiple times on the >>>>>>> statsmodels list and are in the process of trying to integrate >>>>>>> Charlton (from Nathaniel) and/or Formula (from Jonathan / NiPy) into >>>>>>> the statsmodels base. >>>>>>> >>>>>>> http://statsmodels.sourceforge.net/dev/roadmap_todo.html#formula-framework >>>>>>> >>>>>>> and more recently >>>>>>> >>>>>>> https://groups.google.com/group/pystatsmodels/browse_thread/thread/a76ea5de9e96964b/fd85b80ae46c4931? >>>>>>> >>>>>>> https://github.com/statsmodels/formula >>>>>>> https://github.com/statsmodels/charlton >>>>>>> >>>>>>> Wes and I made some effort to go through this at SciPy. From where I >>>>>>> sit, I think it's difficult to disentangle the data structures from >>>>>>> the formula implementation, or maybe I'd just prefer to finish >>>>>>> tackling the former because it's much more straightforward. So I'd >>>>>>> like to first finish the pandas-integration branch that we've started >>>>>>> and then focus on the formula support. This is on my (our, I hope...) >>>>>>> immediate long-term goal list. Then I'd like to come back to the >>>>>>> community and hash out the 'rules of the game' details for formulas >>>>>>> after we have some code for people to play with, which promises to be >>>>>>> "fun." >>>>>>> >>>>>>> https://github.com/statsmodels/statsmodels/tree/pandas-integration >>>>>>> >>>>>>> FWIW, I could also improve the categorical function to be much nicer >>>>>>> for the given examples (ie., take a list, drop a reference category), >>>>>>> but I don't know that it's worth it, because it's really just a >>>>>>> stop-gap and ideally users shouldn't have to rely on it. Thoughts on >>>>>>> more stop-gap? >>>>>>> >>>>>> >>>>>> I want more usability, but I agree that a stop-gap probably isn't the >>>>>> right way to go, unless it has things we'd eventually want anyways. 
>>>>>> >>>>>>> If I understand Chris' concerns, I think pandas + formula will go a >>>>>>> long way towards bridging the gap between Python and R usability, but >>>>>> >>>>>> Yes, I agree. pandas + formulas would go a long, long way towards more >>>>>> usability. >>>>>> >>>>>> Though I really, really want a scatterplot smoother (i.e., lowess) in >>>>>> statsmodels. I use it a lot, and the final part of my R file was >>>>>> entirely lowess. (And, I should add, that was the part people liked >>>>>> best since one of the main goals of the assignment was to generate >>>>>> nifty pictures that could be used to summarize the data.) >>>>>> >>>>> >>>>> Working my way through the pull requests. Very time poor... >>>>> >>>>>>> it's a large effort and there are only a handful (at best) of people >>>>>>> writing code -- Wes being the only one who's more or less "full time" >>>>>>> as far as I can tell. The 0.4 statsmodels release should be very >>>>>>> exciting though, I hope. I'm looking forward to it, at least. Then >>>>>>> there's only the small problem of building an infrastructure and >>>>>>> community like CRAN so we can have specialists writing and maintaining >>>>>>> code...but I hope once all the tools are in place this will seem much >>>>>>> less daunting. There certainly seems to be the right sentiment for it. >>>>>>> >>>>>> >>>>>> At the very least creating and testing models would be much simpler. >>>>>> For weeks I've been wanting to see if gmm is the same as gee by >>>>>> fitting both models to the same dataset, but I've been putting it off >>>>>> because I didn't want to construct the design matrices by hand for >>>>>> such a simple question. (GMM--Generalized Method of Moments--is a >>>>>> standard econometrics model and GEE--Generalized Estimating >>>>>> Equations--is a standard biostatics model. They're both >>>>>> generalizations of quasi-likelihood and appear very similar, but I >>>>>> want to fit some models to figure out if they're exactly the same.) >>>> >>>> Since GMM is still in the sandbox, the interface is not very polished, >>>> and it's missing some enhancements. I recommend asking on the mailing >>>> list if it's not clear. >>>> >>>> Note GMM itself is very general and will never be a quick interactive >>>> method. The main work will always be to define the moment conditions >>>> (a bit similar to non-linear function estimation, optimize.leastsq). >>>> >>>> There are and will be special subclasses, eg. IV2SLS, that have >>>> predefined moment conditions, but, still, it's up to the user do >>>> construct design and instrument arrays. >>>> And as far as I remember, the GMM/GEE package in R doesn't have a >>>> formula interface either. >>>> >>> >>> Both of the two gee packages in R I know of have formula interfaces. >>> >>> http://cran.r-project.org/web/packages/geepack/ >>> http://cran.r-project.org/web/packages/gee/index.html > > This is very different from what's in GMM in statsmodels so far. The > help file is very short, so I'm mostly guessing. > It seems to be for (a subset) of generalized linear models with > longitudinal/panel covariance structures. Something like this will > eventually (once we get panel data models) ?as a special case of GMM > in statsmodels, assuming it's similar to what I know from the > econometrics literature. > > Most of the subclasses of GMM that I currently have, are focused on > instrumental variable estimation, including non-linear regression. > This should be expanded over time. 
> > But GMM itself is designed for subclassing by someone who wants to use > her/his own moment conditions, as in > http://cran.r-project.org/web/packages/gmm/index.html > or for us to implement specific models with it. > > If someone wants to use it, then I have to quickly add the options for > the kernels of the weighting matrix, which I keep postponing. > Currently there is only a truncated, uniform kernel that assumes > observations are order by time, but users can provide their own > weighting function. > > Josef > >> >> I have to look at this. I mixed up some acronyms, I meant GEL and GMM >> http://cran.r-project.org/web/packages/gmm/index.html >> the vignette was one of my readings, and the STATA description for GMM. >> >> I never really looked at GEE. (That's Skipper's private work so far.) >> >> Josef >> >>> >>> -Chris JS >>> >>>> Josef >>>> >>>>>> >>>>> >>>>> Oh, it's not *that* bad. I agree, of course, that it could be better, >>>>> but I've been using mainly Python for my work, including GMM and >>>>> estimating equations models (mainly empirical likelihood and >>>>> generalized maximum entropy) for the last ~two years. >>>>> >>>>> Skipper >>>>> _______________________________________________ >>>>> SciPy-User mailing list >>>>> SciPy-User at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>> >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> > just to make another point: Without someone adding mixed effects, hierachical, panel/longitudinal models, and .... it will not help to have a formula interface to them. (Thanks to Scott we will soon have survival) Josef From alacast at gmail.com Mon Aug 29 13:38:09 2011 From: alacast at gmail.com (Alacast) Date: Mon, 29 Aug 2011 18:38:09 +0100 Subject: [SciPy-User] Hilbert transform Message-ID: I'm doing some analyses on sets of real-valued time series in which I want to know the envelope/instantaneous amplitude of each series in the set. Consequently, I've been taking the Hilbert transform (using scipy.signal.hilbert), then taking the absolute value of the result. The problem is that sometimes this process is far too slow. These time series can have on the order of 10^5 to 10^6 data points, and the sets can have up to 128 time series. Some datasets have been taking an hour or hours to compute on a perfectly modern computing node (1TB of RAM, plenty of 2.27Ghz cores, etc.). Is this expected behavior? I learned that Scipy's Hilbert transform implementation uses FFT, and that Scipy's FFT implementation can run in O(n^2) time when the number of time points is prime. This happened in a few of my datasets, but I've now included a check and correction for that (drop the last data point, so now the number is even and consequently not prime). Still, I observe a good amount of variability in run times, and they are rather long. Thoughts? Thanks! -------------- next part -------------- An HTML attachment was scrubbed... 
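A minimal sketch of the zero-padding workaround suggested in the reply below: pad each series to the next power of two so the FFT inside the transform never sees a length with large prime factors, then keep only the original samples. This assumes scipy.signal.hilbert's optional N argument; the helper name is made up, and the envelope is only approximate near the ends of the series because of the padding:

import numpy as np
from scipy.signal import hilbert

def envelope(x):
    # Pad to the next power of two (a cheap FFT length), then truncate
    # the analytic signal back to the original number of samples.
    n = len(x)
    nfft = 2 ** int(np.ceil(np.log2(n)))
    return np.abs(hilbert(x, N=nfft)[:n])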
URL: From robert.kern at gmail.com Mon Aug 29 14:06:02 2011 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 29 Aug 2011 13:06:02 -0500 Subject: [SciPy-User] Hilbert transform In-Reply-To: References: Message-ID: On Mon, Aug 29, 2011 at 12:38, Alacast wrote: > I'm doing some analyses on sets of real-valued time series in which I want > to know the envelope/instantaneous amplitude of each series in the set. > Consequently, I've been taking the Hilbert transform (using > scipy.signal.hilbert), then taking the absolute value of the result. > The problem is that sometimes this process is far too slow. These time > series can have on the order of 10^5 to 10^6 data points, and the sets can > have up to 128 time series. Some datasets have been taking an hour or hours > to compute on a perfectly modern computing node (1TB of RAM, plenty of > 2.27Ghz cores, etc.). Is this expected behavior? > I learned that Scipy's Hilbert transform implementation uses FFT, and that > Scipy's FFT implementation can run in O(n^2) time when the number of time > points is prime. This happened in a few of my datasets, but I've now > included a check and correction for that (drop the last data point, so now > the number is even and consequently not prime). Still, I observe a good > amount of variability in run times, and they are rather long. Thoughts? Having N be prime is just the extreme case. Basically, the FFT recursively computes the DFT. It can only recurse on integral factors of N, so any prime factor M must be computed the slow way, taking O(M^2) steps. You probably have large prime factors sitting around. A typical approach is to pad your signal with 0s until the next power of 2 or other reasonably-factorable size. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From otrov at hush.ai Sun Aug 28 10:04:25 2011 From: otrov at hush.ai (Kliment) Date: Sun, 28 Aug 2011 16:04:25 +0200 Subject: [SciPy-User] Return variable value by function value Message-ID: <20110828140425.E64D9E6719@smtp.hushmail.com> Thanks for your input guys So in similar cases I should use interpolation function (or solver depending on initial function) from SciPy package Example I provided was from scratch of course, but it seems that 0.95 is still in y range: >>> sqrt(1 - 98**2/10E+4) 0.95076811052958654 >>> sqrt(1 - 99**2/10E+4) 0.94973154101567037 Regards, Kliment From cjordan1 at uw.edu Mon Aug 29 16:55:08 2011 From: cjordan1 at uw.edu (Christopher Jordan-Squire) Date: Mon, 29 Aug 2011 15:55:08 -0500 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: Message-ID: I've just pushed an updated version of the .r and .py files to github, as well as a summary of the corrections/suggestions from the mailing list. I'd appreciate any further comments/suggestions. 
Compared to the original .r and .py files, in these revised versions: -The R code was cleaned up because I realized I didn't need to use as.factor if I made the relevant variables into factors -The python code was cleaned up by computing the 'sub-design matrices' associated with each factor variable beforehand and stashing them in a dictionary -Names were added to the variables in the regression by creating them from the calls to sm.categorical and stashing them in a dictionary Notably, the helper functions and stashing of the pieces of design matrices simplified the calls for model fitting, but they didn't noticeably shorten the code. They also required a small increase in complexity. (In terms of the data structures and function calls used to create the list of names and the design matrices.) I also added some comments to the effect that: *one can use %paste or %cpaste in the IPython shell *np.set_printoptions or sm.iolib.SimpleTable can be used to help with printing of numpy arrays *names can be added by the user to regression model summaries *one can make helper functions to construct design matrices and keep track of names, but the simplest way of doing it isn't robust to subsetting the data in the presence of categorical variables Did I miss anything? -Chris JS On Sat, Aug 27, 2011 at 1:19 PM, Christopher Jordan-Squire wrote: > Hi--I've been a moderately heavy R user for the past two years, so > about a month ago I took an (abbreviated) version of a simple data > analysis I did in R and tried to rewrite as much of it as possible, > line by line, into python using numpy and statsmodels. I didn't use > pandas, and I can't comment on how much it might have simplified > things. > > This comparison might be useful to some people, so I stuck it up on a > github repo. My overall impression is that R is much stronger for > interactive data analysis. Click on the link for more details why, > which are summarized in the README file. > > https://github.com/chrisjordansquire/r_vs_py > > The code examples should run out of the box with no downloads (other > than R, Python, numpy, scipy, and statsmodels) required. > > -Chris Jordan-Squire > From cjordan1 at uw.edu Mon Aug 29 17:03:00 2011 From: cjordan1 at uw.edu (Christopher Jordan-Squire) Date: Mon, 29 Aug 2011 16:03:00 -0500 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Mon, Aug 29, 2011 at 12:13 PM, wrote: > On Mon, Aug 29, 2011 at 12:59 PM, ? wrote: >> On Mon, Aug 29, 2011 at 11:42 AM, ? wrote: >>> On Mon, Aug 29, 2011 at 11:34 AM, Christopher Jordan-Squire >>> wrote: >>>> On Mon, Aug 29, 2011 at 10:27 AM, ? wrote: >>>>> On Mon, Aug 29, 2011 at 11:10 AM, Skipper Seabold wrote: >>>>>> On Mon, Aug 29, 2011 at 10:57 AM, Christopher Jordan-Squire >>>>>> wrote: >>>>>>> On Sun, Aug 28, 2011 at 2:54 PM, Skipper Seabold wrote: >>>>>>>> On Sat, Aug 27, 2011 at 10:15 PM, Bruce Southey wrote: >>>>>>>>> On Sat, Aug 27, 2011 at 5:06 PM, Wes McKinney wrote: >>>>>>>>>> On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout >>>>>>>>>> wrote: >>>>>>>>>>> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>>>>>>>>>>> This comparison might be useful to some people, so I stuck it up on a >>>>>>>>>>>> github repo. My overall impression is that R is much stronger for >>>>>>>>>>>> interactive data analysis. Click on the link for more details why, >>>>>>>>>>>> which are summarized in the README file.
>>>>>>>>>>> >>>>>>>>>>> ?From the README: >>>>>>>>>>> >>>>>>>>>>> "In fact, using Python without the IPython qtconsole is practically >>>>>>>>>>> impossible for this sort of cut and paste, interactive analysis. >>>>>>>>>>> The shell IPython doesn't allow it because it automatically adds >>>>>>>>>>> whitespace on multiline bits of code, breaking pre-formatted code's >>>>>>>>>>> alignment. Cutting and pasting works for the standard python shell, >>>>>>>>>>> but then you lose all the advantages of IPython." >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> You might use %cpaste in the ipython normal shell to paste without it >>>>>>>>>>> automatically inserting spaces: >>>>>>>>>>> >>>>>>>>>>> In [5]: %cpaste >>>>>>>>>>> Pasting code; enter '--' alone on the line to stop. >>>>>>>>>>> :if 1>0: >>>>>>>>>>> : ? ?print 'hi' >>>>>>>>>>> :-- >>>>>>>>>>> hi >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Jason >>>>>>>>>>> >>>>>>>>>>> _______________________________________________ >>>>>>>>>>> SciPy-User mailing list >>>>>>>>>>> SciPy-User at scipy.org >>>>>>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> This strikes me as a textbook example of why we need an integrated >>>>>>>>>> formula framework in statsmodels. I'll make a pass through when I get >>>>>>>>>> a chance and see if there are some places where pandas would really >>>>>>>>>> help out. >>>>>>>>> >>>>>>>>> We used to have a formula class is scipy.stats and I do not follow >>>>>>>>> nipy (http://nipy.sourceforge.net/nipy/stable/index.html) as it also >>>>>>>>> had this (extremely flexible but very hard to comprehend). It was what >>>>>>>>> I had argued was needed ages ago for statsmodel. But it needs a >>>>>>>>> community effort because the syntax required serves multiple >>>>>>>>> communities with different annotations and needs. That is also seen >>>>>>>>> from the different approaches taken by the stats packages from S/R, >>>>>>>>> SAS, Genstat (and those are just are ones I have used). >>>>>>>>> >>>>>>>> >>>>>>>> We have held this discussion at _great_ length multiple times on the >>>>>>>> statsmodels list and are in the process of trying to integrate >>>>>>>> Charlton (from Nathaniel) and/or Formula (from Jonathan / NiPy) into >>>>>>>> the statsmodels base. >>>>>>>> >>>>>>>> http://statsmodels.sourceforge.net/dev/roadmap_todo.html#formula-framework >>>>>>>> >>>>>>>> and more recently >>>>>>>> >>>>>>>> https://groups.google.com/group/pystatsmodels/browse_thread/thread/a76ea5de9e96964b/fd85b80ae46c4931? >>>>>>>> >>>>>>>> https://github.com/statsmodels/formula >>>>>>>> https://github.com/statsmodels/charlton >>>>>>>> >>>>>>>> Wes and I made some effort to go through this at SciPy. From where I >>>>>>>> sit, I think it's difficult to disentangle the data structures from >>>>>>>> the formula implementation, or maybe I'd just prefer to finish >>>>>>>> tackling the former because it's much more straightforward. So I'd >>>>>>>> like to first finish the pandas-integration branch that we've started >>>>>>>> and then focus on the formula support. This is on my (our, I hope...) >>>>>>>> immediate long-term goal list. Then I'd like to come back to the >>>>>>>> community and hash out the 'rules of the game' details for formulas >>>>>>>> after we have some code for people to play with, which promises to be >>>>>>>> "fun." 
>>>>>>>> >>>>>>>> https://github.com/statsmodels/statsmodels/tree/pandas-integration >>>>>>>> >>>>>>>> FWIW, I could also improve the categorical function to be much nicer >>>>>>>> for the given examples (ie., take a list, drop a reference category), >>>>>>>> but I don't know that it's worth it, because it's really just a >>>>>>>> stop-gap and ideally users shouldn't have to rely on it. Thoughts on >>>>>>>> more stop-gap? >>>>>>>> >>>>>>> >>>>>>> I want more usability, but I agree that a stop-gap probably isn't the >>>>>>> right way to go, unless it has things we'd eventually want anyways. >>>>>>> >>>>>>>> If I understand Chris' concerns, I think pandas + formula will go a >>>>>>>> long way towards bridging the gap between Python and R usability, but >>>>>>> >>>>>>> Yes, I agree. pandas + formulas would go a long, long way towards more >>>>>>> usability. >>>>>>> >>>>>>> Though I really, really want a scatterplot smoother (i.e., lowess) in >>>>>>> statsmodels. I use it a lot, and the final part of my R file was >>>>>>> entirely lowess. (And, I should add, that was the part people liked >>>>>>> best since one of the main goals of the assignment was to generate >>>>>>> nifty pictures that could be used to summarize the data.) >>>>>>> >>>>>> >>>>>> Working my way through the pull requests. Very time poor... >>>>>> >>>>>>>> it's a large effort and there are only a handful (at best) of people >>>>>>>> writing code -- Wes being the only one who's more or less "full time" >>>>>>>> as far as I can tell. The 0.4 statsmodels release should be very >>>>>>>> exciting though, I hope. I'm looking forward to it, at least. Then >>>>>>>> there's only the small problem of building an infrastructure and >>>>>>>> community like CRAN so we can have specialists writing and maintaining >>>>>>>> code...but I hope once all the tools are in place this will seem much >>>>>>>> less daunting. There certainly seems to be the right sentiment for it. >>>>>>>> >>>>>>> >>>>>>> At the very least creating and testing models would be much simpler. >>>>>>> For weeks I've been wanting to see if gmm is the same as gee by >>>>>>> fitting both models to the same dataset, but I've been putting it off >>>>>>> because I didn't want to construct the design matrices by hand for >>>>>>> such a simple question. (GMM--Generalized Method of Moments--is a >>>>>>> standard econometrics model and GEE--Generalized Estimating >>>>>>> Equations--is a standard biostatics model. They're both >>>>>>> generalizations of quasi-likelihood and appear very similar, but I >>>>>>> want to fit some models to figure out if they're exactly the same.) >>>>> >>>>> Since GMM is still in the sandbox, the interface is not very polished, >>>>> and it's missing some enhancements. I recommend asking on the mailing >>>>> list if it's not clear. >>>>> >>>>> Note GMM itself is very general and will never be a quick interactive >>>>> method. The main work will always be to define the moment conditions >>>>> (a bit similar to non-linear function estimation, optimize.leastsq). >>>>> >>>>> There are and will be special subclasses, eg. IV2SLS, that have >>>>> predefined moment conditions, but, still, it's up to the user do >>>>> construct design and instrument arrays. >>>>> And as far as I remember, the GMM/GEE package in R doesn't have a >>>>> formula interface either. >>>>> >>>> >>>> Both of the two gee packages in R I know of have formula interfaces. 
>>>> >>>> http://cran.r-project.org/web/packages/geepack/ >>>> http://cran.r-project.org/web/packages/gee/index.html >> >> This is very different from what's in GMM in statsmodels so far. The >> help file is very short, so I'm mostly guessing. >> It seems to be for (a subset) of generalized linear models with >> longitudinal/panel covariance structures. Something like this will >> eventually (once we get panel data models) ?as a special case of GMM >> in statsmodels, assuming it's similar to what I know from the >> econometrics literature. >> >> Most of the subclasses of GMM that I currently have, are focused on >> instrumental variable estimation, including non-linear regression. >> This should be expanded over time. >> >> But GMM itself is designed for subclassing by someone who wants to use >> her/his own moment conditions, as in >> http://cran.r-project.org/web/packages/gmm/index.html >> or for us to implement specific models with it. >> >> If someone wants to use it, then I have to quickly add the options for >> the kernels of the weighting matrix, which I keep postponing. >> Currently there is only a truncated, uniform kernel that assumes >> observations are order by time, but users can provide their own >> weighting function. >> >> Josef >> >>> >>> I have to look at this. I mixed up some acronyms, I meant GEL and GMM >>> http://cran.r-project.org/web/packages/gmm/index.html >>> the vignette was one of my readings, and the STATA description for GMM. >>> >>> I never really looked at GEE. (That's Skipper's private work so far.) >>> >>> Josef >>> >>>> >>>> -Chris JS >>>> >>>>> Josef >>>>> >>>>>>> >>>>>> >>>>>> Oh, it's not *that* bad. I agree, of course, that it could be better, >>>>>> but I've been using mainly Python for my work, including GMM and >>>>>> estimating equations models (mainly empirical likelihood and >>>>>> generalized maximum entropy) for the last ~two years. >>>>>> >>>>>> Skipper >>>>>> _______________________________________________ >>>>>> SciPy-User mailing list >>>>>> SciPy-User at scipy.org >>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>>> >>>>> _______________________________________________ >>>>> SciPy-User mailing list >>>>> SciPy-User at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>> >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>> >> > > just to make another point: > > Without someone adding mixed effects, hierachical, panel/longitudinal > models, and .... it will not help to have a formula interface to them. > (Thanks to Scott we will soon have survival) > I don't think I understand. I assumed that the formula framework is essentially orthogonal to the models themselves. In the sense that it should be simple to adapt a formula framework to new models. At least if they're some variety of linear model, and provided the formula framework is designed to allow for grouping syntax from the beginning. I think easy of extension to new models is a major goal, in fact, since we want it to be easy for people to contribute new models. 
-Chris JS > Josef > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Mon Aug 29 17:51:02 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 29 Aug 2011 17:51:02 -0400 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Mon, Aug 29, 2011 at 5:03 PM, Christopher Jordan-Squire wrote: > On Mon, Aug 29, 2011 at 12:13 PM, ? wrote: >> On Mon, Aug 29, 2011 at 12:59 PM, ? wrote: >>> On Mon, Aug 29, 2011 at 11:42 AM, ? wrote: >>>> On Mon, Aug 29, 2011 at 11:34 AM, Christopher Jordan-Squire >>>> wrote: >>>>> On Mon, Aug 29, 2011 at 10:27 AM, ? wrote: >>>>>> On Mon, Aug 29, 2011 at 11:10 AM, Skipper Seabold wrote: >>>>>>> On Mon, Aug 29, 2011 at 10:57 AM, Christopher Jordan-Squire >>>>>>> wrote: >>>>>>>> On Sun, Aug 28, 2011 at 2:54 PM, Skipper Seabold wrote: >>>>>>>>> On Sat, Aug 27, 2011 at 10:15 PM, Bruce Southey wrote: >>>>>>>>>> On Sat, Aug 27, 2011 at 5:06 PM, Wes McKinney wrote: >>>>>>>>>>> On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout >>>>>>>>>>> wrote: >>>>>>>>>>>> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>>>>>>>>>>>> This comparison might be useful to some people, so I stuck it up on a >>>>>>>>>>>>> github repo. My overall impression is that R is much stronger for >>>>>>>>>>>>> interactive data analysis. Click on the link for more details why, >>>>>>>>>>>>> which are summarized in the README file. >>>>>>>>>>>> >>>>>>>>>>>> ?From the README: >>>>>>>>>>>> >>>>>>>>>>>> "In fact, using Python without the IPython qtconsole is practically >>>>>>>>>>>> impossible for this sort of cut and paste, interactive analysis. >>>>>>>>>>>> The shell IPython doesn't allow it because it automatically adds >>>>>>>>>>>> whitespace on multiline bits of code, breaking pre-formatted code's >>>>>>>>>>>> alignment. Cutting and pasting works for the standard python shell, >>>>>>>>>>>> but then you lose all the advantages of IPython." >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> You might use %cpaste in the ipython normal shell to paste without it >>>>>>>>>>>> automatically inserting spaces: >>>>>>>>>>>> >>>>>>>>>>>> In [5]: %cpaste >>>>>>>>>>>> Pasting code; enter '--' alone on the line to stop. >>>>>>>>>>>> :if 1>0: >>>>>>>>>>>> : ? ?print 'hi' >>>>>>>>>>>> :-- >>>>>>>>>>>> hi >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Jason >>>>>>>>>>>> >>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>> SciPy-User mailing list >>>>>>>>>>>> SciPy-User at scipy.org >>>>>>>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> This strikes me as a textbook example of why we need an integrated >>>>>>>>>>> formula framework in statsmodels. I'll make a pass through when I get >>>>>>>>>>> a chance and see if there are some places where pandas would really >>>>>>>>>>> help out. >>>>>>>>>> >>>>>>>>>> We used to have a formula class is scipy.stats and I do not follow >>>>>>>>>> nipy (http://nipy.sourceforge.net/nipy/stable/index.html) as it also >>>>>>>>>> had this (extremely flexible but very hard to comprehend). It was what >>>>>>>>>> I had argued was needed ages ago for statsmodel. But it needs a >>>>>>>>>> community effort because the syntax required serves multiple >>>>>>>>>> communities with different annotations and needs. 
That is also seen >>>>>>>>>> from the different approaches taken by the stats packages from S/R, >>>>>>>>>> SAS, Genstat (and those are just are ones I have used). >>>>>>>>>> >>>>>>>>> >>>>>>>>> We have held this discussion at _great_ length multiple times on the >>>>>>>>> statsmodels list and are in the process of trying to integrate >>>>>>>>> Charlton (from Nathaniel) and/or Formula (from Jonathan / NiPy) into >>>>>>>>> the statsmodels base. >>>>>>>>> >>>>>>>>> http://statsmodels.sourceforge.net/dev/roadmap_todo.html#formula-framework >>>>>>>>> >>>>>>>>> and more recently >>>>>>>>> >>>>>>>>> https://groups.google.com/group/pystatsmodels/browse_thread/thread/a76ea5de9e96964b/fd85b80ae46c4931? >>>>>>>>> >>>>>>>>> https://github.com/statsmodels/formula >>>>>>>>> https://github.com/statsmodels/charlton >>>>>>>>> >>>>>>>>> Wes and I made some effort to go through this at SciPy. From where I >>>>>>>>> sit, I think it's difficult to disentangle the data structures from >>>>>>>>> the formula implementation, or maybe I'd just prefer to finish >>>>>>>>> tackling the former because it's much more straightforward. So I'd >>>>>>>>> like to first finish the pandas-integration branch that we've started >>>>>>>>> and then focus on the formula support. This is on my (our, I hope...) >>>>>>>>> immediate long-term goal list. Then I'd like to come back to the >>>>>>>>> community and hash out the 'rules of the game' details for formulas >>>>>>>>> after we have some code for people to play with, which promises to be >>>>>>>>> "fun." >>>>>>>>> >>>>>>>>> https://github.com/statsmodels/statsmodels/tree/pandas-integration >>>>>>>>> >>>>>>>>> FWIW, I could also improve the categorical function to be much nicer >>>>>>>>> for the given examples (ie., take a list, drop a reference category), >>>>>>>>> but I don't know that it's worth it, because it's really just a >>>>>>>>> stop-gap and ideally users shouldn't have to rely on it. Thoughts on >>>>>>>>> more stop-gap? >>>>>>>>> >>>>>>>> >>>>>>>> I want more usability, but I agree that a stop-gap probably isn't the >>>>>>>> right way to go, unless it has things we'd eventually want anyways. >>>>>>>> >>>>>>>>> If I understand Chris' concerns, I think pandas + formula will go a >>>>>>>>> long way towards bridging the gap between Python and R usability, but >>>>>>>> >>>>>>>> Yes, I agree. pandas + formulas would go a long, long way towards more >>>>>>>> usability. >>>>>>>> >>>>>>>> Though I really, really want a scatterplot smoother (i.e., lowess) in >>>>>>>> statsmodels. I use it a lot, and the final part of my R file was >>>>>>>> entirely lowess. (And, I should add, that was the part people liked >>>>>>>> best since one of the main goals of the assignment was to generate >>>>>>>> nifty pictures that could be used to summarize the data.) >>>>>>>> >>>>>>> >>>>>>> Working my way through the pull requests. Very time poor... >>>>>>> >>>>>>>>> it's a large effort and there are only a handful (at best) of people >>>>>>>>> writing code -- Wes being the only one who's more or less "full time" >>>>>>>>> as far as I can tell. The 0.4 statsmodels release should be very >>>>>>>>> exciting though, I hope. I'm looking forward to it, at least. Then >>>>>>>>> there's only the small problem of building an infrastructure and >>>>>>>>> community like CRAN so we can have specialists writing and maintaining >>>>>>>>> code...but I hope once all the tools are in place this will seem much >>>>>>>>> less daunting. There certainly seems to be the right sentiment for it. 
>>>>>>>>> >>>>>>>> >>>>>>>> At the very least creating and testing models would be much simpler. >>>>>>>> For weeks I've been wanting to see if gmm is the same as gee by >>>>>>>> fitting both models to the same dataset, but I've been putting it off >>>>>>>> because I didn't want to construct the design matrices by hand for >>>>>>>> such a simple question. (GMM--Generalized Method of Moments--is a >>>>>>>> standard econometrics model and GEE--Generalized Estimating >>>>>>>> Equations--is a standard biostatics model. They're both >>>>>>>> generalizations of quasi-likelihood and appear very similar, but I >>>>>>>> want to fit some models to figure out if they're exactly the same.) >>>>>> >>>>>> Since GMM is still in the sandbox, the interface is not very polished, >>>>>> and it's missing some enhancements. I recommend asking on the mailing >>>>>> list if it's not clear. >>>>>> >>>>>> Note GMM itself is very general and will never be a quick interactive >>>>>> method. The main work will always be to define the moment conditions >>>>>> (a bit similar to non-linear function estimation, optimize.leastsq). >>>>>> >>>>>> There are and will be special subclasses, eg. IV2SLS, that have >>>>>> predefined moment conditions, but, still, it's up to the user do >>>>>> construct design and instrument arrays. >>>>>> And as far as I remember, the GMM/GEE package in R doesn't have a >>>>>> formula interface either. >>>>>> >>>>> >>>>> Both of the two gee packages in R I know of have formula interfaces. >>>>> >>>>> http://cran.r-project.org/web/packages/geepack/ >>>>> http://cran.r-project.org/web/packages/gee/index.html >>> >>> This is very different from what's in GMM in statsmodels so far. The >>> help file is very short, so I'm mostly guessing. >>> It seems to be for (a subset) of generalized linear models with >>> longitudinal/panel covariance structures. Something like this will >>> eventually (once we get panel data models) ?as a special case of GMM >>> in statsmodels, assuming it's similar to what I know from the >>> econometrics literature. >>> >>> Most of the subclasses of GMM that I currently have, are focused on >>> instrumental variable estimation, including non-linear regression. >>> This should be expanded over time. >>> >>> But GMM itself is designed for subclassing by someone who wants to use >>> her/his own moment conditions, as in >>> http://cran.r-project.org/web/packages/gmm/index.html >>> or for us to implement specific models with it. >>> >>> If someone wants to use it, then I have to quickly add the options for >>> the kernels of the weighting matrix, which I keep postponing. >>> Currently there is only a truncated, uniform kernel that assumes >>> observations are order by time, but users can provide their own >>> weighting function. >>> >>> Josef >>> >>>> >>>> I have to look at this. I mixed up some acronyms, I meant GEL and GMM >>>> http://cran.r-project.org/web/packages/gmm/index.html >>>> the vignette was one of my readings, and the STATA description for GMM. >>>> >>>> I never really looked at GEE. (That's Skipper's private work so far.) >>>> >>>> Josef >>>> >>>>> >>>>> -Chris JS >>>>> >>>>>> Josef >>>>>> >>>>>>>> >>>>>>> >>>>>>> Oh, it's not *that* bad. I agree, of course, that it could be better, >>>>>>> but I've been using mainly Python for my work, including GMM and >>>>>>> estimating equations models (mainly empirical likelihood and >>>>>>> generalized maximum entropy) for the last ~two years. 
>>>>>>> >>>>>>> Skipper >>>>>>> _______________________________________________ >>>>>>> SciPy-User mailing list >>>>>>> SciPy-User at scipy.org >>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>>>> >>>>>> _______________________________________________ >>>>>> SciPy-User mailing list >>>>>> SciPy-User at scipy.org >>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>>> >>>>> _______________________________________________ >>>>> SciPy-User mailing list >>>>> SciPy-User at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>> >>>> >>> >> >> just to make another point: >> >> Without someone adding mixed effects, hierachical, panel/longitudinal >> models, and .... it will not help to have a formula interface to them. >> (Thanks to Scott we will soon have survival) >> > > I don't think I understand. > > I assumed that the formula framework is essentially orthogonal to the > models themselves. In the sense that it should be simple to adapt a > formula framework to new models. At least if they're some variety of > linear model, and provided the formula framework is designed to allow > for grouping syntax from the beginning. I think easy of extension to > new models is a major goal, in fact, since we want it to be easy for > people to contribute new models. We still need to program the linear algebra to find the estimator, and we need to define and calculate all the result statistics for the different models. (generic GLS won't work well because of the nobs*nobs covariance matrix, I tried a little bit in the sandbox.) As an example: mixed effects model with REML, ... y = X*b + Z*g, with X fixed regressors/effects and Z random effects. assume design matrices X and Z are already constructed. Since I don't know the statistics literature well (in contrast to econometrics panel data), I started to translate a matlab version to help me understand this. But the results don't match up, and I haven't had access to matlab for a while now. And I think now literal translation of long matlab functions doesn't really help, compared to writing from a good textbook with checking of some crucial steps. It's only 250 lines of code, but dense, and I had spent quite some time on this. The standard solution of normal equation looks still simple, but that's just the beginning and writing the tests often takes almost as much time as writing the code. My experience for the things I don't know well: It takes 2 weeks of staring at it and playing with it, and then it ends up just as a few lines (or a few hundred lines) of code. The old mixed effects model with repeated measurements (EM algorithm) based on the original formula code still sits in the sandbox. It doesn't quite work, but the formula code makes it difficult to understand, and it would require a week or five to cleanup, enhance, test, ... Since neither Skipper nor I are specifically interested (in the sense of: It is not what we know and would use ourselves), it is still waiting there. The old survival is also still sitting in the sandbox, but Scott wrote a new version without formula, I looks like it is also soon ready for a pull request, or review leading up to a pull request. (I find Scott's version much easier to read because it uses basic python and numpy data structures, instead of several layers of formula abstraction.) 
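For reference, here is a minimal sketch of just that normal-equation step (Henderson's mixed-model equations) in plain numpy. It assumes dense X and Z and, more importantly, that the variance components are already known -- estimating them is exactly the REML part under discussion and is not shown. The function name and the simple R = sigma2_e*I, G = sigma2_g*I covariance structure are illustrative assumptions, not code from statsmodels:

import numpy as np

def solve_mme(y, X, Z, sigma2_e, sigma2_g):
    # Henderson's mixed-model equations for y = X*b + Z*g with
    # R = sigma2_e * I (residual) and G = sigma2_g * I (random effect):
    #   [ X'X   X'Z          ] [b]   [ X'y ]
    #   [ Z'X   Z'Z + lam*I  ] [g] = [ Z'y ],   lam = sigma2_e / sigma2_g
    lam = sigma2_e / float(sigma2_g)
    k, q = X.shape[1], Z.shape[1]
    lhs = np.empty((k + q, k + q))
    lhs[:k, :k] = np.dot(X.T, X)
    lhs[:k, k:] = np.dot(X.T, Z)
    lhs[k:, :k] = lhs[:k, k:].T
    lhs[k:, k:] = np.dot(Z.T, Z) + lam * np.eye(q)
    rhs = np.concatenate([np.dot(X.T, y), np.dot(Z.T, y)])
    sol = np.linalg.solve(lhs, rhs)
    return sol[:k], sol[k:]  # fixed-effect estimates b, predicted random effects g

Iterating between solving these equations and updating the variance components is the slow-but-simple route mentioned further down the thread; the tests and result statistics around it are where the real work is.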
Josef > > -Chris JS > > >> Josef >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- A non-text attachment was scrubbed... Name: mixed.py Type: text/x-python Size: 15805 bytes Desc: not available URL: From questions.anon at gmail.com Mon Aug 29 18:55:50 2011 From: questions.anon at gmail.com (questions anon) Date: Tue, 30 Aug 2011 08:55:50 +1000 Subject: [SciPy-User] memory error - numpy mean - netcdf4 In-Reply-To: References: <1314379860.63640.YahooMailNeo@web161317.mail.bf1.yahoo.com> <1314387233.14273.YahooMailNeo@web161303.mail.bf1.yahoo.com> Message-ID: Thanks for all of the responses. I have tried adding in the code you mentioned (see below). I am not sure if I am putting it in the correct place? and I am now receiving another error: "UserWarning: Warning: converting a masked element to nan." Not sure if that is bringing me any closer? Any feedback will be greatly appreciated. from netCDF4 import Dataset import matplotlib.pyplot as plt import numpy as N from mpl_toolkits.basemap import Basemap import os MainFolder=r"E:/DSE_BushfireClimatologyProject/griddeddatasamples/GriddedData/T_SFC/" all_TSFC=[] for (path, dirs, files) in os.walk(MainFolder): for dir in dirs: print dir path=path+'/' for ncfile in files: if ncfile[-3:]=='.nc': ncfile=os.path.join(path,ncfile) ncfile=Dataset(ncfile, 'r+', 'NETCDF4') TSFC=ncfile.variables['T_SFC'][4::24,:,:] LAT=ncfile.variables['latitude'][:] LON=ncfile.variables['longitude'][:] #TIME=ncfile.variables['time'][:] fillvalue=ncfile.variables['T_SFC']._FillValue ncfile.close() array=N.true_divide(TSFC[0],len(TSFC)) for i in xrange(1, len(TSFC)-1,1): array=N.add(array, N.true_divide(array[i],len(TSFC))) #plot output summary stats map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33, llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i') x,y=map(*N.meshgrid(LON,LAT)) plt.title('TSFC Mean at 3pm') ticks=[-5,0,5,10,15,20,25,30,35,40,45,50] CS = map.contourf(x,y,array, cmap=plt.cm.jet) l,b,w,h =0.1,0.1,0.8,0.8 cax = plt.axes([l+w+0.025, b, 0.025, h]) plt.colorbar(CS,cax=cax, drawedges=True) plt.savefig((os.path.join(MainFolder, 'Mean.png'))) plt.show() plt.close() On Sat, Aug 27, 2011 at 10:54 AM, srean wrote: > > On Fri, Aug 26, 2011 at 2:33 PM, Phil Morefield wrote: > >> >> The formula you have written looks like you're collapsing everything into >> a single value. I think he's trying to average a bunch of 2D arrays into a >> single 2D array. >> > > You are correct, the form that I posted can be read as if it is for > updating single mean vector \mu, but you can use the same for an nd-array > trivially. Just have \mu and t as nd-arrays. m can be one too. Numpy > broadcasting will take care of the rest. > > One advantage is that it requires only a constant amount of memory for the > computation, you can even read the data in from an infinite pipe or > generator that yields a single vector or a matrix at a time (or bundles them > up m at a time). It will always be uptodate with the current estimate of the > means. In fact will work for any moment too. 
> > --srean > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sbassi at gmail.com Mon Aug 29 18:58:44 2011 From: sbassi at gmail.com (Sebastian Bassi) Date: Mon, 29 Aug 2011 19:58:44 -0300 Subject: [SciPy-User] density map? Message-ID: Hello, I have a 2-D Numpy array with intensity data. I'd like to plot it like this http://crocdoc.ifas.ufl.edu/images/posters/ecologyofgatorholes/9_figure6.gif For each value in a position, it will be colored with a color, if the value is higher the color will be more intense (maybe from blue to red). All examples I found on http://www.scipy.org/Cookbook/Matplotlib/ were using functions instead of data from a matrix/array. Any idea? Best, SB. From josef.pktd at gmail.com Mon Aug 29 19:03:14 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 29 Aug 2011 19:03:14 -0400 Subject: [SciPy-User] density map? In-Reply-To: References: Message-ID: On Mon, Aug 29, 2011 at 6:58 PM, Sebastian Bassi wrote: > Hello, > > I have a 2-D Numpy array with intensity data. > I'd like to plot it like this > http://crocdoc.ifas.ufl.edu/images/posters/ecologyofgatorholes/9_figure6.gif > For each value in a position, it will be colored with a color, if the > value is higher the color will be more intense (maybe from blue to > red). > All examples I found on http://www.scipy.org/Cookbook/Matplotlib/ were > using functions instead of data from a matrix/array. > Any idea? scipy.stats.gaussian_kde https://picasaweb.google.com/106983885143680349926/Joepy#5611180522655961714 or some other non-parametric density estimator Josef > Best, > SB. > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Mon Aug 29 19:05:29 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 29 Aug 2011 19:05:29 -0400 Subject: [SciPy-User] density map? In-Reply-To: References: Message-ID: On Mon, Aug 29, 2011 at 7:03 PM, wrote: > On Mon, Aug 29, 2011 at 6:58 PM, Sebastian Bassi wrote: >> Hello, >> >> I have a 2-D Numpy array with intensity data. >> I'd like to plot it like this >> http://crocdoc.ifas.ufl.edu/images/posters/ecologyofgatorholes/9_figure6.gif >> For each value in a position, it will be colored with a color, if the >> value is higher the color will be more intense (maybe from blue to >> red). >> All examples I found on http://www.scipy.org/Cookbook/Matplotlib/ were >> using functions instead of data from a matrix/array. >> Any idea? > > scipy.stats.gaussian_kde > > https://picasaweb.google.com/106983885143680349926/Joepy#5611180522655961714 > > or some other non-parametric density estimator That's not the right answer, I guess if you have already intensities, then you don't need to estimate the density anymore. Is it interpolation to a meshgrid that you need? Josef > > Josef > > >> Best, >> SB. 
>> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > From njs at pobox.com Mon Aug 29 19:19:55 2011 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 29 Aug 2011 16:19:55 -0700 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Mon, Aug 29, 2011 at 2:51 PM, wrote: > As an example: ? mixed effects model with REML, ... > > y = X*b + Z*g, with X fixed regressors/effects and Z random effects. > assume design matrices X and Z are already constructed. > > Since I don't know the statistics literature well (in contrast to > econometrics panel data), I started to translate a matlab version to > help me understand this. > But the results don't match up, and I haven't had access to matlab for > a while now. > And I think now literal translation of long matlab functions doesn't > really help, compared to writing from a good textbook with checking of > some crucial steps. I found the "vignettes" that Doug Bates wrote alongside the lme4 package to be pretty good descriptions of the relevant implementation tricks: http://cran.r-project.org/web/packages/lme4/index.html -- Nathaniel From fspaolo at gmail.com Mon Aug 29 19:35:37 2011 From: fspaolo at gmail.com (Fernando Paolo) Date: Mon, 29 Aug 2011 16:35:37 -0700 Subject: [SciPy-User] density map? In-Reply-To: References: Message-ID: If what you want is to plot the intensities ("on a grid"), and you have a 2-D Numpy array (`data`) where the columns are (say) `x`, `y`, `z`, you can do: import numpy as np import matplotlib.pyplot as plt from matplotlib.mlab import griddata x = data[:,0] y = data[:,1] z = data[:,2] # define the grid: nx, ny == number of grid points xi = np.linspace(x.min(), x.max(), nx) yi = np.linspace(y.min(), y.max(), ny) # interpolate your data to a regular grid Zi = ml.griddata(x, y, z, xi, yi) # plot a continuous surface plt.contourf(xi, yi, Zi, 15, cmap=plt.cm.jet) plt.colorbar() plt.show() you can check: http://www.scipy.org/Cookbook/Matplotlib/Gridding_irregularly_spaced_data -Fernando On Mon, Aug 29, 2011 at 4:05 PM, wrote: > On Mon, Aug 29, 2011 at 7:03 PM, ? wrote: >> On Mon, Aug 29, 2011 at 6:58 PM, Sebastian Bassi wrote: >>> Hello, >>> >>> I have a 2-D Numpy array with intensity data. >>> I'd like to plot it like this >>> http://crocdoc.ifas.ufl.edu/images/posters/ecologyofgatorholes/9_figure6.gif >>> For each value in a position, it will be colored with a color, if the >>> value is higher the color will be more intense (maybe from blue to >>> red). >>> All examples I found on http://www.scipy.org/Cookbook/Matplotlib/ were >>> using functions instead of data from a matrix/array. >>> Any idea? >> >> scipy.stats.gaussian_kde >> >> https://picasaweb.google.com/106983885143680349926/Joepy#5611180522655961714 >> >> or some other non-parametric density estimator > > That's not the right answer, I guess if you have already intensities, > then you don't need to estimate the density anymore. > > Is it interpolation to a meshgrid that you need? > > Josef > >> >> Josef >> >> >>> Best, >>> SB. 
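One small inconsistency in the snippet above: griddata is imported directly but then called as ml.griddata. A consistent version of the interpolation and plotting step, under the same assumptions (columns x, y, z and grid vectors xi, yi already defined as shown):

from matplotlib.mlab import griddata
import matplotlib.pyplot as plt

Zi = griddata(x, y, z, xi, yi)          # scattered points onto the regular grid
plt.contourf(xi, yi, Zi, 15, cmap=plt.cm.jet)
plt.colorbar()
plt.show()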
>>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Fernando Paolo Institute of Geophysics & Planetary Physics Scripps Institution of Oceanography University of California, San Diego 9500 Gilman Drive La Jolla, CA 92093-0225 From bsouthey at gmail.com Mon Aug 29 20:57:03 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Mon, 29 Aug 2011 19:57:03 -0500 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Mon, Aug 29, 2011 at 6:19 PM, Nathaniel Smith wrote: > On Mon, Aug 29, 2011 at 2:51 PM, ? wrote: >> As an example: ? mixed effects model with REML, ... >> >> y = X*b + Z*g, with X fixed regressors/effects and Z random effects. >> assume design matrices X and Z are already constructed. >> >> Since I don't know the statistics literature well (in contrast to >> econometrics panel data), I started to translate a matlab version to >> help me understand this. >> But the results don't match up, and I haven't had access to matlab for >> a while now. >> And I think now literal translation of long matlab functions doesn't >> really help, compared to writing from a good textbook with checking of >> some crucial steps. > > I found the "vignettes" that Doug Bates wrote alongside the lme4 > package to be pretty good descriptions of the relevant implementation > tricks: http://cran.r-project.org/web/packages/lme4/index.html > > -- Nathaniel > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > Lots of memories... As Josef said, you need the formula to create: 1) The design matrix of the fixed effects - nothing special 2) The design matrix for the random effects - somewhat interesting 3) The variance-covariance structure of the random effects - 'lots of fun' 4) The variance-covariance structure of the residual effects - 'lots of fun' The combination of 3) and 4) addresses a huge range of models but it gets hard really quickly. That excludes methodology: 1) Maximum likelihood and restricted maximum likelihood are done via iterative MIVQUE in the file Josef provided. Basically you are iterating the mixed model equations so somewhat easy but rather slow. 2) R (Bates' with Lindstrom or Pinheiro) and SAS use second derivative methods (here Mixed procedure with REML or ML) - probably the fast approach 3) ASReml uses average information REML - neat approach but probably rather uncommon for the vast majority of people. But I don' recall Jonathan's approach with his formula code. Bruce From dbigbear at gmail.com Tue Aug 30 01:42:57 2011 From: dbigbear at gmail.com (Xiong Deng) Date: Tue, 30 Aug 2011 13:42:57 +0800 Subject: [SciPy-User] How to install cpufreq-selector Message-ID: Hi, I am installing numpy, scipy, atlas which requires disabling CPU throttling. http://math-atlas.sourceforge.net/atlas_install/ OS: * LINUX: Linux XXX 2.6.9_5-9-0-0 #1 SMP Wed Jun 23 14:03:19 CST 2010 x86_64 x86_64 x86_64 GNU/Linux Red Hat Enterprise Linux AS release 4 (Nahant Update 3) It is very strange that cpufreq-selector seems not exist in the system...I tried to install it myself, but cannot find any source code or install package of it on the internet... 
So how can get the CPU throttling disabled and how can I have cpufreq-selector installed ? Tried to manipulate the file /proc/acpi/processor/CPU/throttling but it does not exist. Thank you John -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Aug 30 02:28:02 2011 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 30 Aug 2011 01:28:02 -0500 Subject: [SciPy-User] How to install cpufreq-selector In-Reply-To: References: Message-ID: On Tue, Aug 30, 2011 at 00:42, Xiong Deng wrote: > Hi, > > I am installing numpy, scipy, atlas which requires disabling CPU throttling. > > http://math-atlas.sourceforge.net/atlas_install/ > > OS: > * LINUX: Linux XXX 2.6.9_5-9-0-0 #1 SMP Wed Jun 23 14:03:19 CST 2010 > x86_64 x86_64 x86_64 GNU/Linux > Red Hat Enterprise Linux AS release 4 (Nahant Update 3) > > It is very strange that cpufreq-selector seems not exist in the system...I > tried to install it myself, but cannot find any source code or install > package of it on the internet... Googling suggests that it may be available on some Red Hat versions as cpufreq-utils: http://forums.fedoraforum.org/archive/index.php/t-92619.html Or it simply may not exist on RHEL4. > So how can get the CPU throttling disabled and how can I have > cpufreq-selector installed ? > > Tried to manipulate the file > /proc/acpi/processor/CPU/throttling > > but it does not exist. The above link has other locations on the filesystem where these settings may be manipulated directly. It can vary from version to version of the Linux kernel and even the configuration of the particular build of the kernel. You may need to ask on the RHEL4 support mailing lists to get better information. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From lists at hilboll.de Tue Aug 30 05:29:34 2011 From: lists at hilboll.de (lists at hilboll.de) Date: Tue, 30 Aug 2011 11:29:34 +0200 Subject: [SciPy-User] 2d spline interpolation with periodic boundaries Message-ID: <6b2c33c620e2aeb2757283797f6223be.squirrel@srv2.s4y.tournesol-consulting.eu> Hi there, I want to do 2d interpolation with periodic boundary conditions. Basically, I have data on a global (as in Earth) rectangular (as in degrees latitude/longitude) grid, and I'd like to interpolate to arbitrary points on the Earth's surface. So I need periodic boundary conditions in the zonal direction. Now, I'd like to look into ``scipy.interpolate.interp2d`` and ``scipy.interpolate.RectBivariateSpline``. Now my question is: Is it enough to give an Western boundary at 360? in addition to the Eastern values at 0? to really give me period boundary conditions? Another question is: How does RectBivariateSpline work? There's not much info in the docs as to what the function actually does, math-wise. Any help is greatly appreciated :) Cheers, Andreas. From lists at hilboll.de Tue Aug 30 06:01:06 2011 From: lists at hilboll.de (Andreas H.) Date: Tue, 30 Aug 2011 12:01:06 +0200 Subject: [SciPy-User] Calculation of weights depending on area Message-ID: Hi, again a question coming from analysis of geodata. Say, I have 3d (lat/lon/z) data, in the easiest case on a rectangular grid. Now I would like to re-grid these data to a new (again rectangular, in the simplest case) grid by calculating the volume-weighted mean of the original grid. 
So for each cell of the new grid, the algorithm should take the volume-weighted average of those grid cells from the first grid which "are part of" the new cell. Is there any algorithm in SciPy to do this? If not, do you have any suggestion on where to start? Perhaps there's some library from a more low-level language that could be wrapped? Any help is greatly appreciated :) Cheers, Andreas. From ralf.gommers at googlemail.com Tue Aug 30 08:11:18 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 30 Aug 2011 14:11:18 +0200 Subject: [SciPy-User] RectBivariateSpline In-Reply-To: References: Message-ID: On Sun, Aug 28, 2011 at 1:48 PM, ali franco wrote: > Can RectBivariateSpline be used to calculated derivatives and integrals? RectBivariateSpline has an "integral" method that should do what it says. bisplev should be able to evaluate derivates for you, you can feed it the RectBivariateSpline.tck attribute (which I just notice is undocumented). It may be useful for BivariateSpline to grow a "derivative" method that does this. A patch would be very welcome. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From Jerome.Kieffer at esrf.fr Tue Aug 30 09:06:38 2011 From: Jerome.Kieffer at esrf.fr (ESRF) Date: Tue, 30 Aug 2011 15:06:38 +0200 Subject: [SciPy-User] 2d spline interpolation with periodic boundaries In-Reply-To: <6b2c33c620e2aeb2757283797f6223be.squirrel@srv2.s4y.tournesol-consulting.eu> References: <6b2c33c620e2aeb2757283797f6223be.squirrel@srv2.s4y.tournesol-consulting.eu> Message-ID: <20110830150638.5510422d.Jerome.Kieffer@esrf.fr> On Tue, 30 Aug 2011 11:29:34 +0200 lists at hilboll.de wrote: > Another question is: How does RectBivariateSpline work? There's not much > info in the docs as to what the function actually does, math-wise. > > Any help is greatly appreciated :) Hi, it is a wrapper for "FITPACK" from Dierckx http://www.netlib.org/dierckx/ Have a look at Fitpack's documentation to understand how it works (control points have to be ordered and other oddities ...) Cheers -- ESRF From pav at iki.fi Tue Aug 30 09:10:01 2011 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 30 Aug 2011 13:10:01 +0000 (UTC) Subject: [SciPy-User] Calculation of weights depending on area References: Message-ID: Hi, Tue, 30 Aug 2011 12:01:06 +0200, Andreas H. wrote: > again a question coming from analysis of geodata. Say, I have 3d > (lat/lon/z) data, in the easiest case on a rectangular grid. Now I would > like to re-grid these data to a new (again rectangular, in the simplest > case) grid by calculating the volume-weighted mean of the original grid. [clip] Some suggestions: (i) For a rectangular grid, the operation in 3D seems to be a tensor product of 1D operations. If so, you can write it as follows data = regrid_volume_1d(x, x_new, data, axis=0) data = regrid_volume_1d(y, y_new, data, axis=1) data = regrid_volume_1d(z, z_new, data, axis=2) So it would be enough to first write a 1D version of the algorithm, and make it such that it can operate on one axis at a time. (ii) An implementation of the 1D version can probably done first in Python. Because it will (for 3D data) operate across slices with many points, the result should be fast enough. 
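A rough sketch of the 1D step along the lines Pauli describes, assuming the cell edges (not centres) of both grids are known and the new grid lies inside the old one; the name regrid_volume_1d is taken from the pseudo-code above. For clarity it builds the full overlap matrix between old and new cells, whereas numpy.searchsorted would be the memory-friendlier route:

import numpy as np

def regrid_volume_1d(edges_old, edges_new, values, axis=0):
    edges_old = np.asarray(edges_old, dtype=float)
    edges_new = np.asarray(edges_new, dtype=float)
    # Length of the overlap between every new cell and every old cell.
    lo = np.maximum(edges_new[:-1, None], edges_old[None, :-1])
    hi = np.minimum(edges_new[1:, None], edges_old[None, 1:])
    overlap = np.maximum(hi - lo, 0.0)                 # shape (m_new, n_old)
    # Normalise so each new cell averages the old cells it covers.
    weights = overlap / overlap.sum(axis=1)[:, None]
    # Contract the old-cell axis of `values` against the weight matrix.
    values = np.swapaxes(values, 0, axis)
    out = np.tensordot(weights, values, axes=(1, 0))
    return np.swapaxes(out, 0, axis)

# applied once per dimension, as suggested:
# data = regrid_volume_1d(x_edges, x_edges_new, data, axis=0)
# data = regrid_volume_1d(y_edges, y_edges_new, data, axis=1)
# data = regrid_volume_1d(z_edges, z_edges_new, data, axis=2)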
A function that may be useful here: numpy.searchsorted -- Pauli Virtanen From Dharhas.Pothina at twdb.state.tx.us Tue Aug 30 09:44:27 2011 From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina) Date: Tue, 30 Aug 2011 08:44:27 -0500 Subject: [SciPy-User] Calculate surface area and volume from intersection of volume and plane. Message-ID: <4E5CA2EB0200009B0003DA97@GWWEB.twdb.state.tx.us> Hi, We have an old ArcGIS aml script that we are trying to replace. The original script takes the input from an ArcGIS TIN model (basically a 2D delaunay triangulation of irregular xy data points with z's defining the depth at each xy) and calculates the surface area and volume of the lake at different elevations (i.e. z cut planes) >From my googling it looks like I have options for the delaunay triangulation using scipy, matplotlib, cgal or mayavi. I'm not sure how to do the surface area and volume calculations at various z planes once I have the triangulation. I would appreciate any pointers. thanks, - dharhas -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Aug 30 09:47:22 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 30 Aug 2011 07:47:22 -0600 Subject: [SciPy-User] Calculation of weights depending on area In-Reply-To: References: Message-ID: On Tue, Aug 30, 2011 at 4:01 AM, Andreas H. wrote: > Hi, > > again a question coming from analysis of geodata. Say, I have 3d > (lat/lon/z) data, in the easiest case on a rectangular grid. Now I would > like to re-grid these data to a new (again rectangular, in the simplest > case) grid by calculating the volume-weighted mean of the original grid. > > So for each cell of the new grid, the algorithm should take the > volume-weighted average of those grid cells from the first grid which "are > part of" the new cell. > > Is there any algorithm in SciPy to do this? If not, do you have any > suggestion on where to start? Perhaps there's some library from a more > low-level language that could be wrapped? > > Any help is greatly appreciated :) > > Sounds vaguely like the drizzle algorithm from astronomy. Another approach would be to subsample and convolve, or smooth and resample. Choosing a suitable method will depend on the smoothness/sampling of the original data. For the original approach, if your sample points are on an evenly spaced grid you can use an fft approach. The sampled data gives rise to a periodic spectrum, multiplication by the transform of a rectangular spot gives the data convolved by 'pillars', essentially subsampling in the Fourier Domain. Or you can compute the overlaps as you originally proposed. I don't know of any software for that but someone is bound to have done it before. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Aug 30 10:42:33 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 30 Aug 2011 08:42:33 -0600 Subject: [SciPy-User] Calculation of weights depending on area In-Reply-To: References: Message-ID: On Tue, Aug 30, 2011 at 7:47 AM, Charles R Harris wrote: > > > On Tue, Aug 30, 2011 at 4:01 AM, Andreas H. wrote: > >> Hi, >> >> again a question coming from analysis of geodata. Say, I have 3d >> (lat/lon/z) data, in the easiest case on a rectangular grid. Now I would >> like to re-grid these data to a new (again rectangular, in the simplest >> case) grid by calculating the volume-weighted mean of the original grid. 
>> >> So for each cell of the new grid, the algorithm should take the >> volume-weighted average of those grid cells from the first grid which "are >> part of" the new cell. >> >> Is there any algorithm in SciPy to do this? If not, do you have any >> suggestion on where to start? Perhaps there's some library from a more >> low-level language that could be wrapped? >> >> Any help is greatly appreciated :) >> >> > Sounds vaguely like the drizzle algorithm from astronomy. Another approach > would be to subsample and convolve, or smooth and resample. Choosing a > suitable method will depend on the smoothness/sampling of the original data. > > For the original approach, if your sample points are on an evenly spaced > grid you can use an fft approach. The sampled data gives rise to a periodic > spectrum, multiplication by the transform of a rectangular spot gives the > data convolved by 'pillars', essentially subsampling in the Fourier Domain. > > Or you can compute the overlaps as you originally proposed. I don't know of > any software for that but someone is bound to have done it before. > > I should mention that if you have a rectangular grid and the overlap is with a rectangle of the same shape as the basic grid, then I think bilinear interpolation will do what you want. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Aug 30 14:41:33 2011 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 30 Aug 2011 13:41:33 -0500 Subject: [SciPy-User] Calculate surface area and volume from intersection of volume and plane. In-Reply-To: <4E5CA2EB0200009B0003DA97@GWWEB.twdb.state.tx.us> References: <4E5CA2EB0200009B0003DA97@GWWEB.twdb.state.tx.us> Message-ID: On Tue, Aug 30, 2011 at 08:44, Dharhas Pothina wrote: > Hi, > > We have an old ArcGIS aml script that we are trying to replace. The original > script takes the input from an ArcGIS TIN model (basically a 2D delaunay > triangulation of irregular xy data points with z's defining the depth at > each xy) and calculates the surface area and volume of the lake at different > elevations (i.e. z cut planes) > > From my googling it looks like I have options for the delaunay triangulation > using scipy, matplotlib, cgal or mayavi. I'm not sure how to do the surface > area and volume calculations at various z planes once I have the > triangulation. I would appreciate any pointers. Your previous email came through fine. There is no need to repeat it. It's relatively straightforward to find the polygon of intersection between the Z plane and the TIN. Just loop through the triangles and check each of the 3 sides to see if one end is above while the other end is below. Simple geometry determines the point of contact of that side. Join up the two sides into a line segment and add that to your list of line segments. The line segments join up into an irregular polygon, probably with holes. The area of this polygon can be found by a formula that you can Google for. E.g.: http://paulbourke.net/geometry/polyarea/ The volume can be calculated similarly. You can break up the volume into triangular prisms projecting up from each of the triangles in the TIN below the Z-plane. You can calculate the volume of each of those prisms easily. Just be sure to properly take into account the triangles that intersect the Z-plane. You only want to count the part that's below the Z-plane. 
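A rough sketch along these lines, assuming the TIN is available as an (N, 3) array of x, y, bed elevation plus an (M, 3) array of triangle vertex indices (for instance from a Delaunay triangulation). It takes a slightly different route from assembling the shoreline polygon: each triangle is clipped against the plane and the clipped pieces are summed directly, which deals with the partially submerged triangles in the same pass:

import numpy as np

def clip_triangle_below(tri_xyz, z0):
    # Part of a single TIN triangle lying at or below the plane z = z0,
    # returned as an ordered list of (x, y, z) vertices.  Because z varies
    # linearly over the triangle, the extra vertices sit exactly where an
    # edge crosses the plane.
    poly = []
    for i in range(3):
        p, q = tri_xyz[i], tri_xyz[(i + 1) % 3]
        if p[2] <= z0:
            poly.append(p)
        if (p[2] - z0) * (q[2] - z0) < 0:           # edge crosses the plane
            t = (z0 - p[2]) / (q[2] - p[2])
            poly.append(p + t * (q - p))
    return poly

def area_volume_at_stage(points, triangles, z0):
    # points:    (N, 3) array of x, y, bed elevation
    # triangles: (M, 3) vertex indices into `points`
    points = np.asarray(points, dtype=float)
    area = 0.0
    volume = 0.0
    for tri in triangles:
        poly = clip_triangle_below(points[tri], z0)
        # Fan-triangulate the clipped piece; the plan area and the integral
        # of (z0 - z) over each fan triangle are exact since z is linear.
        for k in range(1, len(poly) - 1):
            a, b, c = poly[0], poly[k], poly[k + 1]
            tri_area = 0.5 * abs((b[0] - a[0]) * (c[1] - a[1])
                                 - (c[0] - a[0]) * (b[1] - a[1]))
            area += tri_area
            volume += tri_area * (z0 - (a[2] + b[2] + c[2]) / 3.0)
    return area, volume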
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From deil.christoph at googlemail.com Tue Aug 30 16:15:17 2011 From: deil.christoph at googlemail.com (Christoph Deil) Date: Tue, 30 Aug 2011 22:15:17 +0200 Subject: [SciPy-User] Unexpected covariance matrix from scipy.optimize.curve_fit Message-ID: <089E4569-0C53-4FC9-9B0F-353C4EF64478@googlemail.com> I noticed that scipy.optimize.curve_fit returns parameter errors that don't scale with sigma, the standard deviation of ydata, as I expected. Here is a code snippet to illustrate my point, which fits a straight line to five data points: import numpy as np from scipy.optimize import curve_fit x = np.arange(5) y = np.array([1, -2, 1, -2, 1]) sigma = np.array([1, 2, 1, 2, 1]) def f(x, a, b): return a + b * x popt, pcov = curve_fit(f, x, y, p0=(0.42, 0.42), sigma=sigma) perr = np.sqrt(pcov.diagonal()) print('*** sigma = {0} ***'.format(sigma)) print('popt: {0}'.format(popt)) print('perr: {0}'.format(perr)) I get the following result: *** sigma = [1 2 1 2 1] *** popt: [ 5.71428536e-01 1.19956213e-08] perr: [ 0.93867933 0.40391117] Increasing sigma by a factor of 10, sigma = 10 * np.array([1, 2, 1, 2, 1]) I get the following result: *** sigma = [10 20 10 20 10] *** popt: [ 5.71428580e-01 -2.27625699e-09] perr: [ 0.93895295 0.37079075] The best-fit values stayed the same as expected. But the error on the slope b decreased by 8% (the error on the offset a didn't change much) I would have expected fit parameter errors to increase with increasing errors on the data!? Is this a bug? Looking at the source code I see that scipy.optimize.curve_fit multiplies the pcov obtained from scipy.optimize.leastsq by a factor s_sq: https://github.com/scipy/scipy/blob/master/scipy/optimize/minpack.py#L438 if (len(ydata) > len(p0)) and pcov is not None: s_sq = (func(popt, *args)**2).sum()/(len(ydata)-len(p0)) pcov = pcov * s_sq If so is it possible to add an explanation to http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html that pcov is multiplied with this s_sq factor and why that will give correct errors? After I noticed this issue I saw that this s_sq factor is mentioned in the cov_x return parameter description of leastsq, but I think it should be explained in curve_fit where it is applied, maybe leaving a reference in the cov_x leastsq description. Also it would be nice to mention the full_output option in the curve_fit docu, I only realized after looking at the source code that this was possible. Christoph -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue Aug 30 17:25:21 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 30 Aug 2011 17:25:21 -0400 Subject: [SciPy-User] Unexpected covariance matrix from scipy.optimize.curve_fit In-Reply-To: <089E4569-0C53-4FC9-9B0F-353C4EF64478@googlemail.com> References: <089E4569-0C53-4FC9-9B0F-353C4EF64478@googlemail.com> Message-ID: On Tue, Aug 30, 2011 at 4:15 PM, Christoph Deil wrote: > I noticed that scipy.optimize.curve_fit returns parameter errors that don't > scale with sigma, the standard deviation of ydata, as I expected. 
> Here is a code snippet to illustrate my point, which fits a straight line to > five data points: > import numpy as np > from scipy.optimize import curve_fit > x = np.arange(5) > y = np.array([1, -2, 1, -2, 1]) > sigma = np.array([1,? 2, 1,? 2, 1]) > def f(x, a, b): > ? ? return a + b * x > popt, pcov = curve_fit(f, x, y, p0=(0.42, 0.42), sigma=sigma) > perr = np.sqrt(pcov.diagonal()) > print('*** sigma = {0} ***'.format(sigma)) > print('popt: {0}'.format(popt)) > print('perr: {0}'.format(perr)) > I get the following result: > *** sigma = [1 2 1 2 1] *** > popt: [? 5.71428536e-01 ? 1.19956213e-08] > perr: [ 0.93867933? 0.40391117] > Increasing sigma by a factor of 10, > sigma = 10 * np.array([1,? 2, 1,? 2, 1]) > I get the following result: > *** sigma = [10 20 10 20 10] *** > popt: [? 5.71428580e-01? -2.27625699e-09] > perr: [ 0.93895295? 0.37079075] > The best-fit values stayed the same as expected. > But the error on the slope b?decreased by 8% (the error on the offset a > didn't change much) > I would have expected fit parameter errors to increase with increasing > errors on the data!? > Is this a bug? No bug in the formulas. I tested all of them when curve_fit was added. However in your example the numerical cov lacks quite a bit of precision. Trying your example with different starting values, I get a 0.05 difference in your perr (std of parameter estimates). Trying smaller xtol and ftol doesn't change anything. (?) Since it's linear >>> import scikits.statsmodels.api as sm >>> x = np.arange(5.) >>> y = np.array([1, -2, 1, -2, 1.]) >>> sigma = np.array([1, 2, 1, 2, 1.]) >>> res = sm.WLS(y, sm.add_constant(x, prepend=True), weights=1./sigma**2).fit() >>> res.params array([ 5.71428571e-01, 1.11022302e-16]) >>> res.bse array([ 0.98609784, 0.38892223]) >>> res = sm.WLS(y, sm.add_constant(x, prepend=True), weights=1./(sigma*10)**2).fit() >>> res.params array([ 5.71428571e-01, 1.94289029e-16]) >>> res.bse array([ 0.98609784, 0.38892223]) rescaling doesn't change parameter estimates nor perr Josef > Looking at the source code I see that scipy.optimize.curve_fit multiplies > the pcov obtained from scipy.optimize.leastsq by a factor s_sq: > https://github.com/scipy/scipy/blob/master/scipy/optimize/minpack.py#L438 > > ????if (len(ydata) > len(p0)) and pcov is not None: > ????????s_sq = (func(popt, *args)**2).sum()/(len(ydata)-len(p0)) > ????????pcov = pcov * s_sq > > If so is it possible to add an explanation to > http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html > that pcov is multiplied with this s_sq factor and why that will give correct > errors? > After I noticed this issue I saw that this s_sq factor is mentioned in the > cov_x return parameter description of leastsq, > but I think it should be explained in curve_fit where it is applied, maybe > leaving a reference in the cov_x leastsq description. > > Also it would be nice to mention the full_output option in the curve_fit > docu, I only realized after looking at the source code that this was > possible. 
> Christoph > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From charlesr.harris at gmail.com Tue Aug 30 23:19:36 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 30 Aug 2011 21:19:36 -0600 Subject: [SciPy-User] Unexpected covariance matrix from scipy.optimize.curve_fit In-Reply-To: <089E4569-0C53-4FC9-9B0F-353C4EF64478@googlemail.com> References: <089E4569-0C53-4FC9-9B0F-353C4EF64478@googlemail.com> Message-ID: On Tue, Aug 30, 2011 at 2:15 PM, Christoph Deil < deil.christoph at googlemail.com> wrote: > I noticed that scipy.optimize.curve_fit returns parameter errors that don't > scale with sigma, the standard deviation of ydata, as I expected. > > Here is a code snippet to illustrate my point, which fits a straight line > to five data points: > import numpy as np > from scipy.optimize import curve_fit > x = np.arange(5) > y = np.array([1, -2, 1, -2, 1]) > sigma = np.array([1, 2, 1, 2, 1]) > def f(x, a, b): > return a + b * x > popt, pcov = curve_fit(f, x, y, p0=(0.42, 0.42), sigma=sigma) > perr = np.sqrt(pcov.diagonal()) > print('*** sigma = {0} ***'.format(sigma)) > print('popt: {0}'.format(popt)) > print('perr: {0}'.format(perr)) > > I get the following result: > *** sigma = [1 2 1 2 1] *** > popt: [ 5.71428536e-01 1.19956213e-08] > perr: [ 0.93867933 0.40391117] > > Increasing sigma by a factor of 10, > sigma = 10 * np.array([1, 2, 1, 2, 1]) > I get the following result: > *** sigma = [10 20 10 20 10] *** > popt: [ 5.71428580e-01 -2.27625699e-09] > perr: [ 0.93895295 0.37079075] > > The best-fit values stayed the same as expected. > But the error on the slope b decreased by 8% (the error on the offset a > didn't change much) > I would have expected fit parameter errors to increase with increasing > errors on the data!? > Is this a bug? > > Looking at the source code I see that scipy.optimize.curve_fit multiplies > the pcov obtained from scipy.optimize.leastsq by a factor s_sq: > https://github.com/scipy/scipy/blob/master/scipy/optimize/minpack.py#L438 > > if (len(ydata) > len(p0)) and pcov is not None: > s_sq = (func(popt, *args)**2).sum()/(len(ydata)-len(p0)) > pcov = pcov * s_sq > > If so is it possible to add an explanation to > > http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html > that pcov is multiplied with this s_sq factor and why that will give > correct errors? > > After I noticed this issue I saw that this s_sq factor is mentioned in the > cov_x return parameter description of leastsq, > but I think it should be explained in curve_fit where it is applied, maybe > leaving a reference in the cov_x leastsq description. > > Also it would be nice to mention the full_output option in the curve_fit > docu, I only realized after looking at the source code that this was > possible. > > Five points, minus two parameters, doesn't give you much accuracy in estimating the variance, look at the \Chi^2 distributionfor three degrees of freedom. Generally, you would like a few hundred points for this sort of thing. Note that the leastsq documentation about the cov is incorrect, it needs to be multiplied by the variance fo the residuals, not the standard deviation. Not to say that there isn't a bug here, just that the evidence is thin. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
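For readers who do want errors that scale with the supplied sigma, one workaround is simply to divide the s_sq factor quoted from minpack.py back out of pcov; a sketch using the example from the first message:

import numpy as np
from scipy.optimize import curve_fit

x = np.arange(5)
y = np.array([1., -2., 1., -2., 1.])
sigma = np.array([1., 2., 1., 2., 1.])

def f(x, a, b):
    return a + b * x

popt, pcov = curve_fit(f, x, y, p0=(0.42, 0.42), sigma=sigma)

# curve_fit rescales the covariance by s_sq = chi2 / (N - p), which is why
# the reported errors do not change when sigma is multiplied by a constant.
# Dividing that factor back out treats sigma as absolute errors instead.
resid = (y - f(x, *popt)) / sigma
s_sq = (resid ** 2).sum() / (len(y) - len(popt))
pcov_absolute = pcov / s_sq
print(np.sqrt(np.diag(pcov_absolute)))

With the rescaling undone, doubling sigma doubles the reported parameter errors, as one would expect from absolute measurement errors.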
From alacast at gmail.com  Wed Aug 31 06:30:04 2011
From: alacast at gmail.com (Alacast)
Date: Wed, 31 Aug 2011 11:30:04 +0100
Subject: [SciPy-User] SciPy-User Digest, Vol 96, Issue 55
In-Reply-To:
References:
Message-ID:

Hilbert transform: Padding with zeros to the next power of 2 sped it up
greatly. Thanks!
Is there any reason hilbert doesn't do that automatically, then remove the
padding before returning the analytic signal?
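A sketch of the padding workaround for the envelope calculation, done by hand since hilbert itself does not pad automatically; the zeros do perturb the analytic signal a little near the ends, which is worth keeping in mind:

import numpy as np
from scipy.signal import hilbert

def envelope(x):
    # Pad to the next power of two so the FFT length has no large prime
    # factors, then drop the padded part of the analytic signal again.
    n = len(x)
    nfft = 1 << int(np.ceil(np.log2(n)))
    return np.abs(hilbert(x, N=nfft)[:n])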
> >>> > >>> Josef > >>> > >>>> > >>>> -Chris JS > >>>> > >>>>> Josef > >>>>> > >>>>>>> > >>>>>> > >>>>>> Oh, it's not *that* bad. I agree, of course, that it could be > better, > >>>>>> but I've been using mainly Python for my work, including GMM and > >>>>>> estimating equations models (mainly empirical likelihood and > >>>>>> generalized maximum entropy) for the last ~two years. > >>>>>> > >>>>>> Skipper > >>>>>> _______________________________________________ > >>>>>> SciPy-User mailing list > >>>>>> SciPy-User at scipy.org > >>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user > >>>>>> > >>>>> _______________________________________________ > >>>>> SciPy-User mailing list > >>>>> SciPy-User at scipy.org > >>>>> http://mail.scipy.org/mailman/listinfo/scipy-user > >>>>> > >>>> _______________________________________________ > >>>> SciPy-User mailing list > >>>> SciPy-User at scipy.org > >>>> http://mail.scipy.org/mailman/listinfo/scipy-user > >>>> > >>> > >> > > > > just to make another point: > > > > Without someone adding mixed effects, hierachical, panel/longitudinal > > models, and .... it will not help to have a formula interface to them. > > (Thanks to Scott we will soon have survival) > > > > I don't think I understand. > > I assumed that the formula framework is essentially orthogonal to the > models themselves. In the sense that it should be simple to adapt a > formula framework to new models. At least if they're some variety of > linear model, and provided the formula framework is designed to allow > for grouping syntax from the beginning. I think easy of extension to > new models is a major goal, in fact, since we want it to be easy for > people to contribute new models. > > -Chris JS > > > > Josef > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > ------------------------------ > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > End of SciPy-User Digest, Vol 96, Issue 55 > ****************************************** > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lists at hilboll.de Wed Aug 31 08:17:51 2011 From: lists at hilboll.de (Andreas H.) Date: Wed, 31 Aug 2011 14:17:51 +0200 Subject: [SciPy-User] 2d spline interpolation with periodic boundaries In-Reply-To: <20110830150638.5510422d.Jerome.Kieffer@esrf.fr> References: <6b2c33c620e2aeb2757283797f6223be.squirrel@srv2.s4y.tournesol-consulting.eu> <20110830150638.5510422d.Jerome.Kieffer@esrf.fr> Message-ID: Hi Jerome, >> Another question is: How does RectBivariateSpline work? There's not much >> info in the docs as to what the function actually does, math-wise. > > it is a wrapper for "FITPACK" from Dierckx > http://www.netlib.org/dierckx/ > > Have a look at Fitpack's documentation to understand how it works (control > points have to be ordered and other oddities ...) RectBivariateSpline is defined in scipy/interpolate/fitpack2.py, where all I can find is a call to dfitpack.regrid_smth. However, in the FITPACK library, I cannot find a function by that name -- and I couldn't really find the source code to dfitpack.so to check how FITPACK actually gets called ... Any ideas? Cheers, Andreas. 
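dfitpack is an f2py-generated extension module, so one quick way to see how FITPACK actually gets called is to print the generated docstrings. This assumes a standard SciPy install and only reports what the wrapper itself exposes:

from scipy.interpolate import dfitpack

print(dfitpack.__doc__)               # module docstring lists the wrapped FITPACK routines
print(dfitpack.regrid_smth.__doc__)   # f2py-generated call signature for the regrid wrapper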
From Dharhas.Pothina at twdb.state.tx.us Wed Aug 31 08:20:52 2011 From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina) Date: Wed, 31 Aug 2011 07:20:52 -0500 Subject: [SciPy-User] Calculate surface area and volume from intersection of volume and plane. In-Reply-To: References: <4E5CA2EB0200009B0003DA97@GWWEB.twdb.state.tx.us> Message-ID: <4E5DE0D40200009B0003DB7B@GWWEB.twdb.state.tx.us> Robert, sorry for the dup post and thanks for the pointers I think that gives me enough ideas to build something. - dharhas >>> Robert Kern 8/30/2011 1:41 PM >>> On Tue, Aug 30, 2011 at 08:44, Dharhas Pothina wrote: > Hi, > > We have an old ArcGIS aml script that we are trying to replace. The original > script takes the input from an ArcGIS TIN model (basically a 2D delaunay > triangulation of irregular xy data points with z's defining the depth at > each xy) and calculates the surface area and volume of the lake at different > elevations (i.e. z cut planes) > > From my googling it looks like I have options for the delaunay triangulation > using scipy, matplotlib, cgal or mayavi. I'm not sure how to do the surface > area and volume calculations at various z planes once I have the > triangulation. I would appreciate any pointers. Your previous email came through fine. There is no need to repeat it. It's relatively straightforward to find the polygon of intersection between the Z plane and the TIN. Just loop through the triangles and check each of the 3 sides to see if one end is above while the other end is below. Simple geometry determines the point of contact of that side. Join up the two sides into a line segment and add that to your list of line segments. The line segments join up into an irregular polygon, probably with holes. The area of this polygon can be found by a formula that you can Google for. E.g.: http://paulbourke.net/geometry/polyarea/ The volume can be calculated similarly. You can break up the volume into triangular prisms projecting up from each of the triangles in the TIN below the Z-plane. You can calculate the volume of each of those prisms easily. Just be sure to properly take into account the triangles that intersect the Z-plane. You only want to count the part that's below the Z-plane. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... 
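A minimal sketch of the two steps Robert describes above: intersecting the TIN with the plane z = z0 and summing vertical prism volumes. The (points, triangles) array layout and the function names are assumptions made for illustration, not an existing API; vertices lying exactly on the plane and the assembly of segments into ordered rings are left out.

import numpy as np

def contour_segments(points, triangles, z0):
    """Line segments where the TIN crosses the plane z = z0.

    points    : (N, 3) array of x, y, z vertex coordinates
    triangles : (M, 3) integer array of vertex indices
    Joined end to end, the segments form the (possibly multi-part)
    contour polygon bounding the lake surface at elevation z0.
    """
    segments = []
    for tri in triangles:
        p = points[tri]
        crossings = []
        for i, j in ((0, 1), (1, 2), (2, 0)):
            zi, zj = p[i, 2], p[j, 2]
            if (zi - z0) * (zj - z0) < 0:        # endpoints on opposite sides of the plane
                t = (z0 - zi) / (zj - zi)        # linear interpolation along the edge
                crossings.append(p[i, :2] + t * (p[j, :2] - p[i, :2]))
        if len(crossings) == 2:
            segments.append((tuple(crossings[0]), tuple(crossings[1])))
    return segments

def polygon_area(xy):
    """Shoelace formula for an ordered polygon given as a (K, 2) array."""
    x, y = np.asarray(xy, dtype=float).T
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def submerged_volume(points, triangles, z0):
    """Sum of vertical prisms between the TIN and the plane z = z0.

    Only triangles lying entirely below the plane are counted here;
    triangles cut by the plane must be clipped to the part below z0
    first, as noted in the post.
    """
    vol = 0.0
    for tri in triangles:
        p = points[tri]
        if np.all(p[:, 2] <= z0):
            base = polygon_area(p[:, :2])          # plan-view area of the triangle
            vol += base * np.mean(z0 - p[:, 2])    # prism volume = base area * mean depth
    return vol

For the surface area, chain the segments end to end into closed rings and pass each ring to polygon_area; rings around islands are subtracted rather than added.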
URL: From dbigbear at gmail.com Wed Aug 31 08:29:30 2011 From: dbigbear at gmail.com (Xiong Deng) Date: Wed, 31 Aug 2011 20:29:30 +0800 Subject: [SciPy-User] Problem with Python + Hadoop: how to link .so outside Python Message-ID: Hi, I have successfully installed scipy on my Python 2.7 on my local Linux, and I want to pack my Python2.7 (with scipy) onto Hadoop and run my Python MapReduce scipts, like this: 20 ${HADOOP_HOME}/bin/hadoop streaming \$ 21 -input "${input}" \$ 22 -output "${output}" \$ 23 -mapper "python27/bin/python27.sh rp_extractMap.py" \$ 24 -reducer "python27/bin/python27.sh rp_extractReduce.py" \$ 25 -partitioner org.apache.hadoop.mapred.lib.KeyFieldBasedPartitioner \$ 26 -file rp_extractMap.py \$ 27 -file rp_extractReduce.py \$ 28 -file shitu_conf.py \$ 29 -cacheArchive "/share/python27.tar.gz#python27" \$ 30 -outputformat org.apache.hadoop.mapred.TextOutputFormat \$ 31 -inputformat org.apache.hadoop.mapred.CombineTextInputFormat \$ 32 -jobconf mapred.max.split.size="512000000" \$ 33 -jobconf mapred.job.name="[reserve_price][rp_extract]" \$ 34 -jobconf mapred.job.priority=HIGH \$ 35 -jobconf mapred.job.map.capacity=1000 \$ 36 -jobconf mapred.job.reduce.capacity=200 \$ 37 -jobconf mapred.reduce.tasks=200$ 38 -jobconf num.key.fields.for.partition=2$ I have to do this, because the Hadoop server installed its own python of very low version which may not support some of my python scripts, and I do not have privelege to install scipy lib on the server. So,I have to use the -cacheArchieve command to use my own python2.7 with scipy.... But, I find out that some of the .so in scipy are linked to other dynamic libs outside Python2.7.. For example $ ldd ~/local/python-2.7.2/lib/python2.7/site-packages/scipy/linalg/flapack.so liblapack.so => /usr/local/atlas/lib/liblapack.so (0x0000002a956fd000) libatlas.so => /usr/local/atlas/lib/libatlas.so (0x0000002a95df3000) libgfortran.so.3 => /home/work/local/gcc-4.6.1/lib64/libgfortran.so.3 (0x0000002a9668d000) libm.so.6 => /lib64/tls/libm.so.6 (0x0000002a968b6000) libgcc_s.so.1 => /home/work/local/gcc-4.6.1/lib64/libgcc_s.so.1 (0x0000002a96a3c000) libquadmath.so.0 => /home/work/local/gcc-4.6.1/lib64/libquadmath.so.0 (0x0000002a96b51000) libc.so.6 => /lib64/tls/libc.so.6 (0x0000002a96c87000) libpthread.so.0 => /lib64/tls/libpthread.so.0 (0x0000002a96ebb000) /lib64/ld-linux-x86-64.so.2 (0x000000552aaaa000) So, my question is: how can I include this libs? Should I search for all the linked .so and .a under my local linux and pack them together with Python2.7??? If yes, How can I get a full list of the libs needed and How can make Python2.7 know where to find the new libs?? Thanks Xiong -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Aug 31 09:54:01 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 31 Aug 2011 09:54:01 -0400 Subject: [SciPy-User] 2d spline interpolation with periodic boundaries In-Reply-To: References: <6b2c33c620e2aeb2757283797f6223be.squirrel@srv2.s4y.tournesol-consulting.eu> <20110830150638.5510422d.Jerome.Kieffer@esrf.fr> Message-ID: On Wed, Aug 31, 2011 at 8:17 AM, Andreas H. wrote: > Hi Jerome, > >>> Another question is: How does RectBivariateSpline work? There's not much >>> info in the docs as to what the function actually does, math-wise. 
>> >> it is a wrapper for "FITPACK" from Dierckx >> http://www.netlib.org/dierckx/ >> >> Have a look at Fitpack's documentation to understand how it works (control >> points have to be ordered and other oddities ?...) > > RectBivariateSpline is defined in scipy/interpolate/fitpack2.py, where all > I can find is a call to dfitpack.regrid_smth. However, in the FITPACK > library, I cannot find a function by that name -- and I couldn't really > find the source code to dfitpack.so to check how FITPACK actually gets > called ... > > Any ideas? dfitpack is created by f2py regrid_smth is defined in "scipy\interpolate\src\fitpack.pyf" and points to fortranname regrid "\scipy\interpolate\fitpack\regrid.f" Josef > > Cheers, > Andreas. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From deil.christoph at googlemail.com Wed Aug 31 11:09:20 2011 From: deil.christoph at googlemail.com (Christoph Deil) Date: Wed, 31 Aug 2011 17:09:20 +0200 Subject: [SciPy-User] Unexpected covariance matrix from scipy.optimize.curve_fit In-Reply-To: References: <089E4569-0C53-4FC9-9B0F-353C4EF64478@googlemail.com> Message-ID: On Aug 30, 2011, at 11:25 PM, josef.pktd at gmail.com wrote: > On Tue, Aug 30, 2011 at 4:15 PM, Christoph Deil > wrote: >> I noticed that scipy.optimize.curve_fit returns parameter errors that don't >> scale with sigma, the standard deviation of ydata, as I expected. >> Here is a code snippet to illustrate my point, which fits a straight line to >> five data points: >> import numpy as np >> from scipy.optimize import curve_fit >> x = np.arange(5) >> y = np.array([1, -2, 1, -2, 1]) >> sigma = np.array([1, 2, 1, 2, 1]) >> def f(x, a, b): >> return a + b * x >> popt, pcov = curve_fit(f, x, y, p0=(0.42, 0.42), sigma=sigma) >> perr = np.sqrt(pcov.diagonal()) >> print('*** sigma = {0} ***'.format(sigma)) >> print('popt: {0}'.format(popt)) >> print('perr: {0}'.format(perr)) >> I get the following result: >> *** sigma = [1 2 1 2 1] *** >> popt: [ 5.71428536e-01 1.19956213e-08] >> perr: [ 0.93867933 0.40391117] >> Increasing sigma by a factor of 10, >> sigma = 10 * np.array([1, 2, 1, 2, 1]) >> I get the following result: >> *** sigma = [10 20 10 20 10] *** >> popt: [ 5.71428580e-01 -2.27625699e-09] >> perr: [ 0.93895295 0.37079075] >> The best-fit values stayed the same as expected. >> But the error on the slope b decreased by 8% (the error on the offset a >> didn't change much) >> I would have expected fit parameter errors to increase with increasing >> errors on the data!? >> Is this a bug? > > No bug in the formulas. I tested all of them when curve_fit was added. > > However in your example the numerical cov lacks quite a bit of > precision. Trying your example with different starting values, I get a > 0.05 difference in your perr (std of parameter estimates). > > Trying smaller xtol and ftol doesn't change anything. (?) Making ftol = 1e-15 very small I get a different wrong result: popt: [ 5.71428580e-01 -2.27625699e-09] perr: [ 0.92582011 0.59868281] What do I have to do to get a correct answer (say to 5 significant digits) from curve_fit for this simple example? > > Since it's linear > >>>> import scikits.statsmodels.api as sm >>>> x = np.arange(5.) 
>>>> y = np.array([1, -2, 1, -2, 1.]) >>>> sigma = np.array([1, 2, 1, 2, 1.]) >>>> res = sm.WLS(y, sm.add_constant(x, prepend=True), weights=1./sigma**2).fit() >>>> res.params > array([ 5.71428571e-01, 1.11022302e-16]) >>>> res.bse > array([ 0.98609784, 0.38892223]) > >>>> res = sm.WLS(y, sm.add_constant(x, prepend=True), weights=1./(sigma*10)**2).fit() >>>> res.params > array([ 5.71428571e-01, 1.94289029e-16]) >>>> res.bse > array([ 0.98609784, 0.38892223]) > > rescaling doesn't change parameter estimates nor perr This is what I don't understand. Why don't the parameter estimate errors increase with increasing errors sigma on the data points? If I have less precise measurements, the model parameters should be less constrained?! I was using MINUIT before I learned Scipy and the error definition for a chi2 fit given in the MINUIT User Guide http://wwwasdoc.web.cern.ch/wwwasdoc/minuit/node7.html as well as the example results here http://code.google.com/p/pyminuit/wiki/GettingStartedGuide don't mention the factor s_sq that is used in curve_fit to scale pcov. Is the error definition in the MINUIT manual wrong? Can you point me to a web resource that explains why the s_sq factor needs to be applied to the covariance matrix? > > Josef > > PS: I've attached a script to fit the two examples using statsmodels, scipy and minuit (applying the s_sq factor myself). Here are the results I get (who's right for the first example? why does statsmodels only return on parameter value and error?): """Example from http://code.google.com/p/pyminuit/wiki/GettingStartedGuide""" x = np.array([1 , 2 , 3 , 4 ]) y = np.array([1.1, 2.1, 2.4, 4.3]) sigma = np.array([0.1, 0.1, 0.2, 0.1]) statsmodels.api.WLS popt: [ 1.04516129] perr: [ 0.0467711] scipy.optimize.curve_fit popt: [ 8.53964011e-08 1.04516128e+00] perr: [ 0.27452122 0.09784324] minuit popt: [-4.851674617611934e-14, 1.0451612903225629] perr: [ 0.33828315 0.12647671] """Example from http://mail.scipy.org/pipermail/scipy-user/2011-August/030412.html""" x = np.arange(5) y = np.array([1, -2, 1, -2, 1]) sigma = 10 * np.array([1, 2, 1, 2, 1]) statsmodels.api.WLS popt: [ 5.71428571e-01 7.63278329e-17] perr: [ 0.98609784 0.38892223] scipy.optimize.curve_fit popt: [ 5.71428662e-01 -8.73679511e-08] perr: [ 0.97804034 0.3818681 ] minuit popt: [0.5714285714294132, 2.1449508835758024e-13] perr: [ 0.98609784 0.38892223] > > >> Looking at the source code I see that scipy.optimize.curve_fit multiplies >> the pcov obtained from scipy.optimize.leastsq by a factor s_sq: >> https://github.com/scipy/scipy/blob/master/scipy/optimize/minpack.py#L438 >> >> if (len(ydata) > len(p0)) and pcov is not None: >> s_sq = (func(popt, *args)**2).sum()/(len(ydata)-len(p0)) >> pcov = pcov * s_sq >> >> If so is it possible to add an explanation to >> http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html >> that pcov is multiplied with this s_sq factor and why that will give correct >> errors? >> After I noticed this issue I saw that this s_sq factor is mentioned in the >> cov_x return parameter description of leastsq, >> but I think it should be explained in curve_fit where it is applied, maybe >> leaving a reference in the cov_x leastsq description. >> >> Also it would be nice to mention the full_output option in the curve_fit >> docu, I only realized after looking at the source code that this was >> possible. 
>> Christoph >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: chi2_example.py Type: text/x-python-script Size: 1802 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Aug 31 12:10:52 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 31 Aug 2011 12:10:52 -0400 Subject: [SciPy-User] Unexpected covariance matrix from scipy.optimize.curve_fit In-Reply-To: References: <089E4569-0C53-4FC9-9B0F-353C4EF64478@googlemail.com> Message-ID: On Wed, Aug 31, 2011 at 11:09 AM, Christoph Deil wrote: > > On Aug 30, 2011, at 11:25 PM, josef.pktd at gmail.com wrote: > > On Tue, Aug 30, 2011 at 4:15 PM, Christoph Deil > wrote: > > I noticed that scipy.optimize.curve_fit returns parameter errors that don't > > scale with sigma, the standard deviation of ydata, as I expected. > > Here is a code snippet to illustrate my point, which fits a straight line to > > five data points: > > import numpy as np > > from scipy.optimize import curve_fit > > x = np.arange(5) > > y = np.array([1, -2, 1, -2, 1]) > > sigma = np.array([1,? 2, 1,? 2, 1]) > > def f(x, a, b): > > ? ? return a + b * x > > popt, pcov = curve_fit(f, x, y, p0=(0.42, 0.42), sigma=sigma) > > perr = np.sqrt(pcov.diagonal()) > > print('*** sigma = {0} ***'.format(sigma)) > > print('popt: {0}'.format(popt)) > > print('perr: {0}'.format(perr)) > > I get the following result: > > *** sigma = [1 2 1 2 1] *** > > popt: [? 5.71428536e-01 ? 1.19956213e-08] > > perr: [ 0.93867933? 0.40391117] > > Increasing sigma by a factor of 10, > > sigma = 10 * np.array([1,? 2, 1,? 2, 1]) > > I get the following result: > > *** sigma = [10 20 10 20 10] *** > > popt: [? 5.71428580e-01? -2.27625699e-09] > > perr: [ 0.93895295? 0.37079075] > > The best-fit values stayed the same as expected. > > But the error on the slope b?decreased by 8% (the error on the offset a > > didn't change much) > > I would have expected fit parameter errors to increase with increasing > > errors on the data!? > > Is this a bug? > > No bug in the formulas. I tested all of them when curve_fit was added. > > However in your example the numerical cov lacks quite a bit of > precision. Trying your example with different starting values, I get a > 0.05 difference in your perr (std of parameter estimates). > > Trying smaller xtol and ftol doesn't change anything. (?) > > Making ftol = 1e-15 very small I get a different wrong result: > popt: [? 5.71428580e-01? -2.27625699e-09] > perr: [ 0.92582011? 0.59868281] > What do I have to do to get a correct answer (say to 5 significant digits) > from curve_fit for this simple example? > > Since it's linear > > import scikits.statsmodels.api as sm > > x = np.arange(5.) 
> > y = np.array([1, -2, 1, -2, 1.]) > > sigma = np.array([1, ?2, 1, ?2, 1.]) > > res = sm.WLS(y, sm.add_constant(x, prepend=True), weights=1./sigma**2).fit() > > res.params > > array([ ?5.71428571e-01, ??1.11022302e-16]) > > res.bse > > array([ 0.98609784, ?0.38892223]) > > res = sm.WLS(y, sm.add_constant(x, prepend=True), > weights=1./(sigma*10)**2).fit() > > res.params > > array([ ?5.71428571e-01, ??1.94289029e-16]) > > res.bse > > array([ 0.98609784, ?0.38892223]) > > rescaling doesn't change parameter estimates nor perr > > This is what I don't understand. > Why don't the parameter estimate errors increase with increasing errors > sigma on the data points? > If I have less precise measurements, the model parameters should be less > constrained?! > I was using MINUIT before I learned Scipy and the error definition for a > chi2 fit given in the MINUIT User Guide > http://wwwasdoc.web.cern.ch/wwwasdoc/minuit/node7.html > as well as the example results here > http://code.google.com/p/pyminuit/wiki/GettingStartedGuide > don't mention the factor s_sq that is used in curve_fit to scale pcov. > Is the error definition in the MINUIT manual wrong? > Can you point me to a web resource that explains why the s_sq factor needs > to be applied to the covariance matrix? It's standard text book information, but Wikipedia seems to be lacking a bit in this. for the linear case http://en.wikipedia.org/wiki/Ordinary_least_squares#Assuming_normality cov_params = sigma^2 (X'X)^{-1} for the non-linear case with leastsq, X is replaced by Jacobian, otherwise everything is the same. However, in your minuit links I saw only the Hessian mentioned (from very fast skimming the pages) With maximum likelihood, the inverse Hessian is the complete covariance matrix, no additional multiplication is necessary. Essentially, these are implementation details depending on how the estimation is calculated, and there are various ways of numerically approximating the Hessian. That's why this is described for optimize.leastsq (incorrectly as Chuck pointed out) and but not in optimize.curve_fit. With leastsquares are maximum likelihood, rescaling both y and f(x,params) has no effect on the parameter estimates, it's just like changing units of y, meters instead of centimeters. I guess scipy.odr would work differently, since it is splitting up the errors between y and x's, but I never looked at the details. > > Josef > > > > PS: I've attached a script to fit the two examples using statsmodels, scipy > and minuit (applying the s_sq factor myself). > Here are the results I get (who's right for the first example? why does > statsmodels only return on parameter value and error?): > ? ??"""Example from > http://code.google.com/p/pyminuit/wiki/GettingStartedGuide""" > ? ? x = np.array([1? , 2? , 3? , 4? ]) > ? ? y = np.array([1.1, 2.1, 2.4, 4.3]) > ? ? sigma = np.array([0.1, 0.1, 0.2, 0.1]) > statsmodels.api.WLS > popt: [ 1.04516129] > perr: [ 0.0467711] > scipy.optimize.curve_fit > popt: [? 8.53964011e-08 ? 1.04516128e+00] > perr: [ 0.27452122? 0.09784324] that's what I get with example 1 when I run your script, I don't know why you have one params in your case (full_output threw an exception in curve_fit with scipy.__version__ '0.9.0' statsmodels.api.WLS popt: [ -6.66133815e-16 1.04516129e+00] perr: [ 0.33828314 0.12647671] scipy.optimize.curve_fit popt: [ 8.53964011e-08 1.04516128e+00] perr: [ 0.27452122 0.09784324] > minuit > popt: [-4.851674617611934e-14, 1.0451612903225629] > perr: [ 0.33828315? 
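To put numbers on the formula: writing out the textbook WLS covariance for the five-point example in this thread reproduces the statsmodels bse quoted above, and it shows why multiplying sigma by 10 changes nothing, since s_sq shrinks by exactly the factor 100 that (X'WX)^{-1} grows by. This is only an illustrative sketch of the algebra, not the statsmodels or curve_fit internals.

import numpy as np

x = np.arange(5.)
y = np.array([1., -2., 1., -2., 1.])
X = np.column_stack((np.ones_like(x), x))       # design matrix: constant and slope

def wls_params_bse(sigma):
    W = np.diag(1.0 / sigma**2)                 # weights = 1 / sigma**2
    XtWX_inv = np.linalg.inv(np.dot(X.T, np.dot(W, X)))
    beta = np.dot(XtWX_inv, np.dot(X.T, np.dot(W, y)))
    resid = y - np.dot(X, beta)
    s_sq = np.dot(resid, np.dot(W, resid)) / (len(y) - len(beta))
    cov = s_sq * XtWX_inv                       # cov_params = s_sq * (X'WX)^{-1}
    return beta, np.sqrt(np.diag(cov))

sigma = np.array([1., 2., 1., 2., 1.])
for scale in (1., 10.):
    beta, bse = wls_params_bse(scale * sigma)
    print('scale {0}: params {1} bse {2}'.format(scale, beta, bse))
# both scales print bse close to [0.98609784  0.38892223], matching WLS above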
0.12647671] statsmodels and minuit agree pretty well > ? ??"""Example from > http://mail.scipy.org/pipermail/scipy-user/2011-August/030412.html""" > ? ? x = np.arange(5) > ? ? y = np.array([1, -2, 1, -2, 1]) > ? ? sigma = 10 * np.array([1,? 2, 1,? 2, 1]) > statsmodels.api.WLS > popt: [? 5.71428571e-01 ? 7.63278329e-17] > perr: [ 0.98609784? 0.38892223] > scipy.optimize.curve_fit > popt: [? 5.71428662e-01? -8.73679511e-08] > perr: [ 0.97804034? 0.3818681 ] > minuit > popt: [0.5714285714294132, 2.1449508835758024e-13] > perr: [ 0.98609784? 0.38892223] statsmodels and minuit agree, my guess is that the jacobian calculation of leastsq (curve_fit) is not very good in these examples. Maybe trying Dfun or the other options, epsfcn, will help. I was trying to see whether I get better results calculation the numerical derivatives in a different way, but had to spend the time fixing bugs. (NonlinearLS didn't work correctly with weights.) Josef > > > > > Looking at the source code I see that scipy.optimize.curve_fit multiplies > > the pcov obtained from scipy.optimize.leastsq by a factor s_sq: > > https://github.com/scipy/scipy/blob/master/scipy/optimize/minpack.py#L438 > > ????if (len(ydata) > len(p0)) and pcov is not None: > > ????????s_sq = (func(popt, *args)**2).sum()/(len(ydata)-len(p0)) > > ????????pcov = pcov * s_sq > > If so is it possible to add an explanation to > > http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html > > that pcov is multiplied with this s_sq factor and why that will give correct > > errors? > > After I noticed this issue I saw that this s_sq factor is mentioned in the > > cov_x return parameter description of leastsq, > > but I think it should be explained in curve_fit where it is applied, maybe > > leaving a reference in the cov_x leastsq description. > > Also it would be nice to mention the full_output option in the curve_fit > > docu, I only realized after looking at the source code that this was > > possible. > > Christoph > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From robert.kern at gmail.com Wed Aug 31 15:12:37 2011 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 31 Aug 2011 14:12:37 -0500 Subject: [SciPy-User] SciPy-User Digest, Vol 96, Issue 55 In-Reply-To: References: Message-ID: Please do no reply to digest messages. Please consider the digests to be read-only. If you wish to participate in the mailing list, please subscribe normally. If you must reply to digest messages, please edit what you quote to just the portion that you respond to and adjust the Subject line accordingly. Thank you. On Wed, Aug 31, 2011 at 05:30, Alacast wrote: > Hilbert transform: > Padding with zeros to the next power of 2 sped it up greatly. Thanks! Is > there any reason hilbert doesn't do that automatically, then remove the > padding before returning the analytic signal? It's not always the right thing to do. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? 
-- Umberto Eco From robert.kern at gmail.com Wed Aug 31 15:26:08 2011 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 31 Aug 2011 14:26:08 -0500 Subject: [SciPy-User] Problem with Python + Hadoop: how to link .so outside Python In-Reply-To: References: Message-ID: On Wed, Aug 31, 2011 at 07:29, Xiong Deng wrote: > So, my question is: how can I include this libs? Should I search for all the > linked .so and .a under my local linux and pack them together with > Python2.7??? If yes, How can I get a full list of the libs needed and How > can make Python2.7 know where to find the new libs?? You may get the best advice on a Hadoop mailing list. Some of this depends on how -cacheArchive will unpack the archive and how Hadoop Streaming will set up the environment for the subprocesses. You may be able to use this tool to help you bundle up everything that is necessary: http://stanford.edu/~pgbovine/cde.html -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From cweisiger at msg.ucsf.edu Wed Aug 31 18:35:57 2011 From: cweisiger at msg.ucsf.edu (Chris Weisiger) Date: Wed, 31 Aug 2011 15:35:57 -0700 Subject: [SciPy-User] Projecting volumes down to 2D Message-ID: Briefly, I'm working on a visualization tool for five-dimensional microscopy data (X/Y/Z/time/wavelength). Different wavelengths can be transformed with respect to each other: X/Y/Z translation, rotation about the Z axis, and uniform scaling in X and Y. We can then show various 2D slices of the data that pass through a specific XYZT point: an X-Y slice, an X-Z slice, a Y-Z slice, and slices through time. These slices are generated by transforming the view coordinates and using scipy.ndimage.map_coordinates. Now we want to be able to project an entire row/column/etc. of pixels into a single pixel. For example, in the X-Y slice, each pixel shown is actually the brightest pixel from the entire Z column. This example is easily done by taking the maximum along the Z axis and then proceeding as normal with generating the slice, albeit with a Z transformation of 0. That's because the other transformation parameters don't move data through the Z axis. Thus I still only have to transform X by Y pixels. I'm having trouble with an edge case for transformed data, though: if the projection axis is X or Y, and there is a rotation/scale factor, then I can't see a way to avoid having to transform every single pixel in a 3D volume to obtain the projection -- that is, transforming X by Y by Z pixels. This is expensive. Obviously each pixel in the volume must be considered to generate these projections, but does every pixel have to be transformed? I don't suppose anyone knows of a way to simplify the problem? -Chris From justinbois at gmail.com Wed Aug 31 18:47:27 2011 From: justinbois at gmail.com (Justin Bois) Date: Wed, 31 Aug 2011 15:47:27 -0700 Subject: [SciPy-User] Importing OpenCV makes Python segfault on Mac OS X Message-ID: I am trying to use the OpenCV library with Python bindings on Mac OS X. I am using the Enthought Python Distribution for my Python/NumPy/etc. and installed OpenCV 2.2.0 using Homebrew. The installation of OpenCV seems to work ok, and but when I try to import OpenCV, I get a segmentation fault. I get the same behavior if I build OpenCV 2.3.1 from source. Below is what I see. 
(Note: when I use Python installed from MacPorts, I do not have this problem, but I would like to stick with EPD.) Any help with this would be greatly appreciated! % echo $PYTHONPATH /Library/Frameworks/EPD64.framework/Versions/Current/lib/python2.7/site-packages:/usr/local/lib/python2.7/site-packages % which python /Library/Frameworks/EPD64.framework/Versions/Current/bin/python % more test.py import cv print 'hello world' % gdb python Starting program: /Library/Frameworks/EPD64.framework/Versions/7.1/bin/python test.py Reading symbols for shared libraries .+++..... done Reading symbols for shared libraries ............................................................................................................. done Program received signal EXC_BAD_ACCESS, Could not access memory. Reason: KERN_INVALID_ADDRESS at address: 0x0000000000000000 0x0000000000000000 in ?? () (gdb) backtrace #0 0x0000000000000000 in ?? () #1 0x00000001006814a4 in PyEval_GetGlobals () #2 0x00000001006977f2 in PyImport_Import () #3 0x000000010069799f in PyImport_ImportModule () #4 0x00000001004c2ee2 in initcv () #5 0x00000001000e4b9a in import_submodule () #6 0x00000001000e4dea in load_next () #7 0x00000001000e5778 in PyImport_ImportModuleLevel () #8 0x00000001000be2b3 in builtin___import__ () #9 0x000000010000d002 in PyObject_Call () #10 0x00000001000c3d27 in PyEval_CallObjectWithKeywords () #11 0x00000001000c72ae in PyEval_EvalFrameEx () #12 0x00000001000cca15 in PyEval_EvalCodeEx () #13 0x00000001000ccd16 in PyEval_EvalCode () #14 0x00000001000f11ee in PyRun_FileExFlags () #15 0x00000001000f2001 in PyRun_SimpleFileExFlags () #16 0x0000000100107c65 in Py_Main () #17 0x0000000100000f54 in start () (gdb) -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Aug 31 19:13:34 2011 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 31 Aug 2011 18:13:34 -0500 Subject: [SciPy-User] Importing OpenCV makes Python segfault on Mac OS X In-Reply-To: References: Message-ID: On Wed, Aug 31, 2011 at 17:47, Justin Bois wrote: > I am trying to use the OpenCV library with Python bindings on Mac OS X.? I > am using the Enthought Python Distribution for my Python/NumPy/etc. and > installed OpenCV 2.2.0 using Homebrew. Bug reports for EPD should be directed to epd.support at enthought.com. Thank you. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From josef.pktd at gmail.com Wed Aug 31 21:45:14 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 31 Aug 2011 21:45:14 -0400 Subject: [SciPy-User] Unexpected covariance matrix from scipy.optimize.curve_fit In-Reply-To: References: <089E4569-0C53-4FC9-9B0F-353C4EF64478@googlemail.com> Message-ID: On Wed, Aug 31, 2011 at 12:10 PM, wrote: > On Wed, Aug 31, 2011 at 11:09 AM, Christoph Deil > wrote: >> >> On Aug 30, 2011, at 11:25 PM, josef.pktd at gmail.com wrote: >> >> On Tue, Aug 30, 2011 at 4:15 PM, Christoph Deil >> wrote: >> >> I noticed that scipy.optimize.curve_fit returns parameter errors that don't >> >> scale with sigma, the standard deviation of ydata, as I expected. >> >> Here is a code snippet to illustrate my point, which fits a straight line to >> >> five data points: >> >> import numpy as np >> >> from scipy.optimize import curve_fit >> >> x = np.arange(5) >> >> y = np.array([1, -2, 1, -2, 1]) >> >> sigma = np.array([1,? 2, 1,? 
2, 1]) >> >> def f(x, a, b): >> >> ? ? return a + b * x >> >> popt, pcov = curve_fit(f, x, y, p0=(0.42, 0.42), sigma=sigma) >> >> perr = np.sqrt(pcov.diagonal()) >> >> print('*** sigma = {0} ***'.format(sigma)) >> >> print('popt: {0}'.format(popt)) >> >> print('perr: {0}'.format(perr)) >> >> I get the following result: >> >> *** sigma = [1 2 1 2 1] *** >> >> popt: [? 5.71428536e-01 ? 1.19956213e-08] >> >> perr: [ 0.93867933? 0.40391117] >> >> Increasing sigma by a factor of 10, >> >> sigma = 10 * np.array([1,? 2, 1,? 2, 1]) >> >> I get the following result: >> >> *** sigma = [10 20 10 20 10] *** >> >> popt: [? 5.71428580e-01? -2.27625699e-09] >> >> perr: [ 0.93895295? 0.37079075] >> >> The best-fit values stayed the same as expected. >> >> But the error on the slope b?decreased by 8% (the error on the offset a >> >> didn't change much) >> >> I would have expected fit parameter errors to increase with increasing >> >> errors on the data!? >> >> Is this a bug? >> >> No bug in the formulas. I tested all of them when curve_fit was added. >> >> However in your example the numerical cov lacks quite a bit of >> precision. Trying your example with different starting values, I get a >> 0.05 difference in your perr (std of parameter estimates). >> >> Trying smaller xtol and ftol doesn't change anything. (?) >> >> Making ftol = 1e-15 very small I get a different wrong result: >> popt: [? 5.71428580e-01? -2.27625699e-09] >> perr: [ 0.92582011? 0.59868281] >> What do I have to do to get a correct answer (say to 5 significant digits) >> from curve_fit for this simple example? >> >> Since it's linear >> >> import scikits.statsmodels.api as sm >> >> x = np.arange(5.) >> >> y = np.array([1, -2, 1, -2, 1.]) >> >> sigma = np.array([1, ?2, 1, ?2, 1.]) >> >> res = sm.WLS(y, sm.add_constant(x, prepend=True), weights=1./sigma**2).fit() >> >> res.params >> >> array([ ?5.71428571e-01, ??1.11022302e-16]) >> >> res.bse >> >> array([ 0.98609784, ?0.38892223]) >> >> res = sm.WLS(y, sm.add_constant(x, prepend=True), >> weights=1./(sigma*10)**2).fit() >> >> res.params >> >> array([ ?5.71428571e-01, ??1.94289029e-16]) >> >> res.bse >> >> array([ 0.98609784, ?0.38892223]) >> >> rescaling doesn't change parameter estimates nor perr >> >> This is what I don't understand. >> Why don't the parameter estimate errors increase with increasing errors >> sigma on the data points? >> If I have less precise measurements, the model parameters should be less >> constrained?! >> I was using MINUIT before I learned Scipy and the error definition for a >> chi2 fit given in the MINUIT User Guide >> http://wwwasdoc.web.cern.ch/wwwasdoc/minuit/node7.html >> as well as the example results here >> http://code.google.com/p/pyminuit/wiki/GettingStartedGuide >> don't mention the factor s_sq that is used in curve_fit to scale pcov. >> Is the error definition in the MINUIT manual wrong? >> Can you point me to a web resource that explains why the s_sq factor needs >> to be applied to the covariance matrix? > > It's standard text book information, but Wikipedia seems to be lacking > a bit in this. > > for the linear case > http://en.wikipedia.org/wiki/Ordinary_least_squares#Assuming_normality > > cov_params = sigma^2 (X'X)^{-1} > > for the non-linear case with leastsq, X is replaced by Jacobian, > otherwise everything is the same. 
> > However, in your minuit links I saw only the Hessian mentioned (from > very fast skimming the pages) > > With maximum likelihood, the inverse Hessian is the complete > covariance matrix, no additional multiplication is necessary. > > Essentially, these are implementation details depending on how the > estimation is calculated, and there are various ways of numerically > approximating the Hessian. > That's why this is described for optimize.leastsq (incorrectly as > Chuck pointed out) and but not in optimize.curve_fit. > > With leastsquares are maximum likelihood, rescaling both y and > f(x,params) has no effect on the parameter estimates, it's just like > changing units of y, meters instead of centimeters. > > I guess scipy.odr would work differently, since it is splitting up the > errors between y and x's, but I never looked at the details. > > >> >> Josef >> >> >> >> PS: I've attached a script to fit the two examples using statsmodels, scipy >> and minuit (applying the s_sq factor myself). >> Here are the results I get (who's right for the first example? why does >> statsmodels only return on parameter value and error?): >> ? ??"""Example from >> http://code.google.com/p/pyminuit/wiki/GettingStartedGuide""" >> ? ? x = np.array([1? , 2? , 3? , 4? ]) >> ? ? y = np.array([1.1, 2.1, 2.4, 4.3]) >> ? ? sigma = np.array([0.1, 0.1, 0.2, 0.1]) >> statsmodels.api.WLS >> popt: [ 1.04516129] >> perr: [ 0.0467711] >> scipy.optimize.curve_fit >> popt: [? 8.53964011e-08 ? 1.04516128e+00] >> perr: [ 0.27452122? 0.09784324] > > that's what I get with example 1 when I run your script, > I don't know why you have one params in your case > (full_output threw an exception in curve_fit with scipy.__version__ '0.9.0' > > statsmodels.api.WLS > popt: [ -6.66133815e-16 ? 1.04516129e+00] > perr: [ 0.33828314 ?0.12647671] > scipy.optimize.curve_fit > popt: [ ?8.53964011e-08 ? 1.04516128e+00] > perr: [ 0.27452122 ?0.09784324] > > >> minuit >> popt: [-4.851674617611934e-14, 1.0451612903225629] >> perr: [ 0.33828315? 0.12647671] statsmodels.api.WLS popt: [ -4.90926744e-16 1.04516129e+00] perr: [ 0.33828314 0.12647671] statsmodels NonlinearLS popt: [ -3.92166386e-08 1.04516130e+00] perr: [ 0.33828314 0.12647671] finally, I got some bugs out of the weights handling, but still not fully tested def run_nonlinearls(): from scikits.statsmodels.miscmodels.nonlinls import NonlinearLS class Myfunc(NonlinearLS): def _predict(self, params): x = self.exog a, b = params return a + b*x mod = Myfunc(y, x, sigma=sigma**2) res = mod.fit(start_value=(0.042, 0.42)) print ('statsmodels NonlinearLS') print('popt: {0}'.format(res.params)) print('perr: {0}'.format(res.bse)) The basics is the same as curve_fit using leastsq, but it uses complex derivatives which are usually numerically very good. So it looks like the problems with curve_fit in your example are only in the numerically derivatives that leastsq is using for the Jacobian. If leastsq is using only forward differences, then it might be better to calculate the final Jacobian with centered differences. just a guess. > > statsmodels and minuit agree pretty well > >> ? ??"""Example from >> http://mail.scipy.org/pipermail/scipy-user/2011-August/030412.html""" >> ? ? x = np.arange(5) >> ? ? y = np.array([1, -2, 1, -2, 1]) >> ? ? sigma = 10 * np.array([1,? 2, 1,? 2, 1]) >> statsmodels.api.WLS >> popt: [? 5.71428571e-01 ? 7.63278329e-17] >> perr: [ 0.98609784? 0.38892223] >> scipy.optimize.curve_fit >> popt: [? 5.71428662e-01? -8.73679511e-08] >> perr: [ 0.97804034? 
0.3818681 ] >> minuit >> popt: [0.5714285714294132, 2.1449508835758024e-13] >> perr: [ 0.98609784? 0.38892223] statsmodels.api.WLS popt: [ 5.71428571e-01 1.94289029e-16] perr: [ 0.98609784 0.38892223] statsmodels NonlinearLS popt: [ 5.71428387e-01 8.45750929e-08] perr: [ 0.98609784 0.38892223] Josef > > statsmodels and minuit agree, > > my guess is that the jacobian calculation of leastsq (curve_fit) is > not very good in these examples. Maybe trying Dfun or the other > options, epsfcn, will help. > > I was trying to see whether I get better results calculation the > numerical derivatives in a different way, but had to spend the time > fixing bugs. > (NonlinearLS didn't work correctly with weights.) > > Josef > >> >> >> >> >> Looking at the source code I see that scipy.optimize.curve_fit multiplies >> >> the pcov obtained from scipy.optimize.leastsq by a factor s_sq: >> >> https://github.com/scipy/scipy/blob/master/scipy/optimize/minpack.py#L438 >> >> ????if (len(ydata) > len(p0)) and pcov is not None: >> >> ????????s_sq = (func(popt, *args)**2).sum()/(len(ydata)-len(p0)) >> >> ????????pcov = pcov * s_sq >> >> If so is it possible to add an explanation to >> >> http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html >> >> that pcov is multiplied with this s_sq factor and why that will give correct >> >> errors? >> >> After I noticed this issue I saw that this s_sq factor is mentioned in the >> >> cov_x return parameter description of leastsq, >> >> but I think it should be explained in curve_fit where it is applied, maybe >> >> leaving a reference in the cov_x leastsq description. >> >> Also it would be nice to mention the full_output option in the curve_fit >> >> docu, I only realized after looking at the source code that this was >> >> possible. >> >> Christoph >> >> _______________________________________________ >> >> SciPy-User mailing list >> >> SciPy-User at scipy.org >> >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >
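A small, self-contained illustration of Josef's point above that complex ("complex-step") derivatives behave much better numerically than the forward differences leastsq presumably uses for its Jacobian. The function here is arbitrary; this is not the NonlinearLS or MINPACK code, just the idea:

import numpy as np

def g(b):
    return np.sin(b)                               # any smooth function; exact derivative is cos(b)

b0 = 0.42
h = np.sqrt(np.finfo(float).eps)                   # typical forward-difference step size

d_forward = (g(b0 + h) - g(b0)) / h                # subtractive cancellation loses about half the digits
d_cstep = np.imag(g(b0 + 1j * 1e-20)) / 1e-20      # complex step: no cancellation

exact = np.cos(b0)
print('forward difference error: {0:.1e}'.format(abs(d_forward - exact)))   # typically ~1e-8
print('complex-step error:       {0:.1e}'.format(abs(d_cstep - exact)))     # near machine precision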