From aarchiba at physics.mcgill.ca Mon Aug 1 01:20:16 2011 From: aarchiba at physics.mcgill.ca (Anne Archibald) Date: Mon, 1 Aug 2011 01:20:16 -0400 Subject: [SciPy-User] deconvolution of 1-D signals In-Reply-To: <1312159888.10394.YahooMailRC@web113403.mail.gq1.yahoo.com> References: <1312159888.10394.YahooMailRC@web113403.mail.gq1.yahoo.com> Message-ID: I realize this discussion has gone rather far afield from efficient 1D deconvolution, but we do a funny thing in radio interferometry, and I'm curious whether this is normal for other kinds of deconvolution as well. In radio interferometry we obtain our images convolved with the so-called "dirty beam", a convolution kernel that has a nice narrow peak but usually a chaos of monstrous sidelobes often only marginally smaller than the main lobe. We use a different regularization condition to do our deconvolution: we treat the underlying image as a modest collection of point sources. (One can see why this appeals to astronomers.) Through an iterative process (the "CLEAN" algorithm and its many descendants) we obtain an estimate of this underlying image. But we very rarely actually work with this image directly. We normally convolve it with a sort of idealized version of our kernel without all the sidelobes. This then gives an image one might have obtained from a normal telescope the size of the interferometer array. (Apart from all the CLEAN artifacts.) What I'm wondering is, is this final step of convolving with an idealized version of the kernel standard practice elsewhere? >From one point of view it could just be parochiality, astronomers being so accustomed to smudgy images that we have to convert anything else to this format. But I think that at the least it softens the effect of the rather strict regularization assumption behind CLEAN - which amounts to "no extended sources". It probably makes us less sensitive to shortcuts in CLEAN implementations. I think, though, that this trick may be useful for many applications of deconvolution. Rather than try to translate the image from the observed kernel to some ideal Dirac-delta kernel, this tries to convert it from the observed kernel to a similar but simpler kernel; one would expect the impact of a deconvolution artifact to be related to the magnitude of the difference between kernels. In terms of 1D Fourier deconvolution, this is saying, after deconvolution, that we don't really need all those high frequencies amplified so much anyway, and smoothing them back down with a nice clean easy-to-understand kernel. In these terms, in fact, it makes perfect sense to use a wider kernel than necessary for this smoothing if one is interested in larger-scale features. Anne On 31 July 2011 20:51, David Baddeley wrote: > Hi Ralf, > I do a reasonable?amount?of (2 & 3D) deconvolution of microscopy images and > the method I use depends quite a lot on the exact properties of the signal. > You can usually get away with fft based convolutions even if your signal is > not periodic as long as your kernel is significantly smaller than the signal > extent. > As Joe mentioned, for a noisy signal convolving with the inverse or > performing fourier domain division doesn't work as you end up amplifying > high frequency noise components. You thus need some form of regularisation. > The thresholding of fourier components that Joe suggests does this, but you > might also want to explore more sophisticated options, the simplest of which > is probably Wiener filtering > (http://en.wikipedia.org/wiki/Wiener_deconvolution). 
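A minimal, self-contained sketch of the 1-D Wiener deconvolution idea referred to above (an illustration only, not code from this thread: it assumes a known kernel, circular/periodic boundaries and white noise, and the regularisation constant lam is a free parameter tuned by hand):

import numpy as np

def wiener_deconvolve(measured, kernel, lam):
    # FFT of the kernel, zero-padded to the signal length.
    n = len(measured)
    H = np.fft.fft(kernel, n)
    S = np.fft.fft(measured)
    # Wiener filter: conj(H) / (|H|^2 + lam^2).  Small lam approaches
    # plain inverse filtering (amplifies high-frequency noise); large
    # lam gives a smoother but more biased estimate.
    G = np.conj(H) / (np.abs(H)**2 + lam**2)
    # The result is the deconvolved estimate, up to a circular shift
    # set by where the kernel's peak sits in the padded kernel array.
    return np.real(np.fft.ifft(S * G))

As noted further down in the thread, lam is usually chosen by trial and error (or by a standard criterion such as the discrepancy principle).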
> If you've got a signal which is constrained to be positive, it's often > useful to introduce a positivity constraint on the deconvolution result > which generally means you need an iterative algorithm. The choice of > algorithm should also depend on the type of noise that is present in your > signal - ?my image data is constrained to be +ve and typically has either > Poisson or a mixture of Poisson and Gaussian noise and I use either the > Richardson-Lucy or a weighted version of ICTM (Iterative Constrained > Tikhonov-Miller) algorithm. I can provide more details of these if required. > cheers, > David > > > > ________________________________ > From: Ralf Gommers > To: SciPy Users List > Sent: Mon, 1 August, 2011 5:56:49 AM > Subject: [SciPy-User] deconvolution of 1-D signals > > Hi, > > For a measured signal that is the convolution of a real signal with a > response function, plus measurement noise on top, I want to recover the real > signal. Since I know what the response function is and the noise is > high-frequency compared to the real signal, a straightforward approach is to > smooth the measured signal (or fit a spline to it), then remove the response > function by deconvolution. See example code below. > > Can anyone point me towards code that does the deconvolution efficiently? > Perhaps signal.deconvolve would do the trick, but I can't seem to make it > work (except for directly on the output of np.convolve(y, window, > mode='valid')). > > Thanks, > Ralf > > > import numpy as np > from scipy import interpolate, signal > import matplotlib.pyplot as plt > > # Real signal > x = np.linspace(0, 10, num=201) > y = np.sin(x + np.pi/5) > > # Noisy signal > mode = 'valid' > window_len = 11. > window = np.ones(window_len) / window_len > y_meas = np.convolve(y, window, mode=mode) > y_meas += 0.2 * np.random.rand(y_meas.size) - 0.1 > if mode == 'full': > ??? xstep = x[1] - x[0] > ??? x_meas = np.concatenate([ \ > ??????? np.linspace(x[0] - window_len//2 * xstep, x[0] - xstep, > num=window_len//2), > ??????? x, > ??????? np.linspace(x[-1] + xstep, x[-1] + window_len//2 * xstep, > num=window_len//2)]) > elif mode == 'valid': > ??? x_meas = x[window_len//2:-window_len//2+1] > elif mode == 'same': > ??? x_meas = x > > # Approximating spline > xs = np.linspace(0, 10, num=500) > knots = np.array([1, 3, 5, 7, 9]) > tck = interpolate.splrep(x_meas, y_meas, s=0, k=3, t=knots, task=-1) > ys = interpolate.splev(xs, tck, der=0) > > # Find (low-frequency part of) original signal by deconvolution of smoothed > # measured signal and known window. > y_deconv = signal.deconvolve(ys, window)[0]? #FIXME > > # Plot all signals > fig = plt.figure() > ax = fig.add_subplot(111) > > ax.plot(x, y, 'b-', label="Original signal") > ax.plot(x_meas, y_meas, 'r-', label="Measured, noisy signal") > ax.plot(xs, ys, 'g-', label="Approximating spline") > ax.plot(xs[window.size//2-1:-window.size//2], y_deconv, 'k-', > ??????? 
label="signal.deconvolve result") > ax.set_ylim([-1.2, 2]) > ax.legend() > > plt.show() > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From johann.cohentanugi at gmail.com Mon Aug 1 01:50:33 2011 From: johann.cohentanugi at gmail.com (Johann Cohen-Tanugi) Date: Mon, 01 Aug 2011 07:50:33 +0200 Subject: [SciPy-User] ipython-qtconsole with current master Message-ID: <4E363EA9.60804@gmail.com> hi there, I get : (mypy)cohen at jarrett:~/sources/python/pyvault/ipython$ ipython-qtconsole Traceback (most recent call last): File "/home/cohen/sources/python/mypy/bin/ipython-qtconsole", line 5, in from pkg_resources import load_entry_point File "/home/cohen/sources/python/mypy/lib/python2.6/site-packages/distribute-0.6.10-py2.6.egg/pkg_resources.py", line 2659, in parse_requirements(__requires__), Environment() File "/home/cohen/sources/python/mypy/lib/python2.6/site-packages/distribute-0.6.10-py2.6.egg/pkg_resources.py", line 546, in resolve raise DistributionNotFound(req) pkg_resources.DistributionNotFound: ipython==0.11.dev ? Johann From ralf.gommers at googlemail.com Mon Aug 1 02:03:10 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 1 Aug 2011 08:03:10 +0200 Subject: [SciPy-User] deconvolution of 1-D signals In-Reply-To: <1312159888.10394.YahooMailRC@web113403.mail.gq1.yahoo.com> References: <1312159888.10394.YahooMailRC@web113403.mail.gq1.yahoo.com> Message-ID: On Mon, Aug 1, 2011 at 2:51 AM, David Baddeley wrote: > Hi Ralf, > > I do a reasonable amount of (2 & 3D) deconvolution of microscopy images > and the method I use depends quite a lot on the exact properties of the > signal. You can usually get away with fft based convolutions even if your > signal is not periodic as long as your kernel is significantly smaller than > the signal extent. > The kernel is typically about 5 to 15 times smaller than the signal extent, so I guess that may be problematic. > > As Joe mentioned, for a noisy signal convolving with the inverse or > performing fourier domain division doesn't work as you end up amplifying > high frequency noise components. You thus need some form of regularisation. > The thresholding of fourier components that Joe suggests does this, but you > might also want to explore more sophisticated options, the simplest of which > is probably Wiener filtering ( > http://en.wikipedia.org/wiki/Wiener_deconvolution). > I'm aware of the problems with high frequency noise. This is why I tried the spline fitting - I figured that on a spline the deconvolution would be okay because the spline is very smooth. This should be fine for my data because the noise is much higher-frequency than the underlying signal, and the SNR is high to start with. But maybe there are better ways. I looked for a Python implementation of Wiener deconvolution but couldn't find one so quickly. Is there a package out there that has it? > > If you've got a signal which is constrained to be positive, it's often > useful to introduce a positivity constraint on the deconvolution result > which generally means you need an iterative algorithm. The choice of > algorithm should also depend on the type of noise that is present in your > signal - my image data is constrained to be +ve and typically has either > Poisson or a mixture of Poisson and Gaussian noise and I use either the > Richardson-Lucy or a weighted version of ICTM (Iterative Constrained > Tikhonov-Miller) algorithm. 
I can provide more details of these if required. > > By constrained to be positive I'm guessing you mean monotonic? Otherwise I could just add a constant offset, but that's probably not what you mean. What's typically the speed penalty for an iterative method? Ralf > > cheers, > David > > > > > ------------------------------ > *From:* Ralf Gommers > *To:* SciPy Users List > *Sent:* Mon, 1 August, 2011 5:56:49 AM > *Subject:* [SciPy-User] deconvolution of 1-D signals > > Hi, > > For a measured signal that is the convolution of a real signal with a > response function, plus measurement noise on top, I want to recover the real > signal. Since I know what the response function is and the noise is > high-frequency compared to the real signal, a straightforward approach is to > smooth the measured signal (or fit a spline to it), then remove the response > function by deconvolution. See example code below. > > Can anyone point me towards code that does the deconvolution efficiently? > Perhaps signal.deconvolve would do the trick, but I can't seem to make it > work (except for directly on the output of np.convolve(y, window, > mode='valid')). > > Thanks, > Ralf > > > import numpy as np > from scipy import interpolate, signal > import matplotlib.pyplot as plt > > # Real signal > x = np.linspace(0, 10, num=201) > y = np.sin(x + np.pi/5) > > # Noisy signal > mode = 'valid' > window_len = 11. > window = np.ones(window_len) / window_len > y_meas = np.convolve(y, window, mode=mode) > y_meas += 0.2 * np.random.rand(y_meas.size) - 0.1 > if mode == 'full': > xstep = x[1] - x[0] > x_meas = np.concatenate([ \ > np.linspace(x[0] - window_len//2 * xstep, x[0] - xstep, > num=window_len//2), > x, > np.linspace(x[-1] + xstep, x[-1] + window_len//2 * xstep, > num=window_len//2)]) > elif mode == 'valid': > x_meas = x[window_len//2:-window_len//2+1] > elif mode == 'same': > x_meas = x > > # Approximating spline > xs = np.linspace(0, 10, num=500) > knots = np.array([1, 3, 5, 7, 9]) > tck = interpolate.splrep(x_meas, y_meas, s=0, k=3, t=knots, task=-1) > ys = interpolate.splev(xs, tck, der=0) > > # Find (low-frequency part of) original signal by deconvolution of smoothed > # measured signal and known window. > y_deconv = signal.deconvolve(ys, window)[0] #FIXME > > # Plot all signals > fig = plt.figure() > ax = fig.add_subplot(111) > > ax.plot(x, y, 'b-', label="Original signal") > ax.plot(x_meas, y_meas, 'r-', label="Measured, noisy signal") > ax.plot(xs, ys, 'g-', label="Approximating spline") > ax.plot(xs[window.size//2-1:-window.size//2], y_deconv, 'k-', > label="signal.deconvolve result") > ax.set_ylim([-1.2, 2]) > ax.legend() > > plt.show() > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From david_baddeley at yahoo.com.au Mon Aug 1 03:03:18 2011 From: david_baddeley at yahoo.com.au (David Baddeley) Date: Mon, 1 Aug 2011 00:03:18 -0700 (PDT) Subject: [SciPy-User] deconvolution of 1-D signals In-Reply-To: References: <1312159888.10394.YahooMailRC@web113403.mail.gq1.yahoo.com> Message-ID: <1312182198.97186.YahooMailRC@web113409.mail.gq1.yahoo.com> Hi Ralf, 5-15 times smaller would probably be fine, although you might want to watch the edges in the reconstruction - if they're at different dc levels you'll get edge artifacts (within ~ 1 kernel width of the edges). 
I'd tend to avoid spline filtering (or any form of noise reduction) before deconvolution, as this will also transform the data in a way not explained by the model you're using to deconvolve with. Weiner filtering is a 2 liner -> H = fft(kernel) deconvolved = ifftshift(ifft(fft(signal)*np.conj(H)/(H*np.conj(H) + lambda**2))) where lambda is your regularisation parameter, and white noise is assumed. There are various methods for choosing lambda optimally, but most people tend to use trial and error. Iterative methods are typically ~1-2 orders of magnitude slower than a Weiner filter, but with fast fft libraries and modern computers still quite reasonable for modest data sizes (a 3D image stack of ~ 512x512x50 pixels will tend to be done in a bit under a minute, can't really comment on 1D data, but unless your signal is very long I'd expect it to be significantly quicker). Ffts scale with O(nlogn) so will generally dramatically outperform things based on a simple convolution or filtering approaches (O(n**2)) for large n. This might make an iterative approach using ffts faster than something like scipy.signal.deconvolve if your kernel is large. cheers, David ________________________________ From: Ralf Gommers To: David Baddeley ; SciPy Users List Sent: Mon, 1 August, 2011 6:03:10 PM Subject: Re: [SciPy-User] deconvolution of 1-D signals On Mon, Aug 1, 2011 at 2:51 AM, David Baddeley wrote: Hi Ralf, > > >I do a reasonable amount of (2 & 3D) deconvolution of microscopy images and the >method I use depends quite a lot on the exact properties of the signal. You can >usually get away with fft based convolutions even if your signal is not periodic >as long as your kernel is significantly smaller than the signal extent. The kernel is typically about 5 to 15 times smaller than the signal extent, so I guess that may be problematic. > >As Joe mentioned, for a noisy signal convolving with the inverse or performing >fourier domain division doesn't work as you end up amplifying high frequency >noise components. You thus need some form of regularisation. The thresholding of >fourier components that Joe suggests does this, but you might also want to >explore more sophisticated options, the simplest of which is probably Wiener >filtering (http://en.wikipedia.org/wiki/Wiener_deconvolution). I'm aware of the problems with high frequency noise. This is why I tried the spline fitting - I figured that on a spline the deconvolution would be okay because the spline is very smooth. This should be fine for my data because the noise is much higher-frequency than the underlying signal, and the SNR is high to start with. But maybe there are better ways. I looked for a Python implementation of Wiener deconvolution but couldn't find one so quickly. Is there a package out there that has it? > >If you've got a signal which is constrained to be positive, it's often useful to >introduce a positivity constraint on the deconvolution result which generally >means you need an iterative algorithm. The choice of algorithm should also >depend on the type of noise that is present in your signal - my image data is >constrained to be +ve and typically has either Poisson or a mixture of Poisson >and Gaussian noise and I use either the Richardson-Lucy or a weighted version of >ICTM (Iterative Constrained Tikhonov-Miller) algorithm. I can provide more >details of these if required. > By constrained to be positive I'm guessing you mean monotonic? Otherwise I could just add a constant offset, but that's probably not what you mean. 
What's typically the speed penalty for an iterative method? Ralf > cheers, >David > > > > > > > > > ________________________________ From: Ralf Gommers >To: SciPy Users List >Sent: Mon, 1 August, 2011 5:56:49 AM >Subject: [SciPy-User] deconvolution of 1-D signals > > >Hi, > >For a measured signal that is the convolution of a real signal with a response >function, plus measurement noise on top, I want to recover the real signal. >Since I know what the response function is and the noise is high-frequency >compared to the real signal, a straightforward approach is to smooth the >measured signal (or fit a spline to it), then remove the response function by >deconvolution. See example code below. > >Can anyone point me towards code that does the deconvolution efficiently? >Perhaps signal.deconvolve would do the trick, but I can't seem to make it work >(except for directly on the output of np.convolve(y, window, mode='valid')). > >Thanks, >Ralf > > >import numpy as np >from scipy import interpolate, signal >import matplotlib.pyplot as plt > ># Real signal >x = np.linspace(0, 10, num=201) >y = np.sin(x + np.pi/5) > ># Noisy signal >mode = 'valid' >window_len = 11. >window = np.ones(window_len) / window_len >y_meas = np.convolve(y, window, mode=mode) >y_meas += 0.2 * np.random.rand(y_meas.size) - 0.1 >if mode == 'full': > xstep = x[1] - x[0] > x_meas = np.concatenate([ \ > np.linspace(x[0] - window_len//2 * xstep, x[0] - xstep, >num=window_len//2), > x, > np.linspace(x[-1] + xstep, x[-1] + window_len//2 * xstep, >num=window_len//2)]) >elif mode == 'valid': > x_meas = x[window_len//2:-window_len//2+1] >elif mode == 'same': > x_meas = x > ># Approximating spline >xs = np.linspace(0, 10, num=500) >knots = np.array([1, 3, 5, 7, 9]) >tck = interpolate.splrep(x_meas, y_meas, s=0, k=3, t=knots, task=-1) >ys = interpolate.splev(xs, tck, der=0) > ># Find (low-frequency part of) original signal by deconvolution of smoothed ># measured signal and known window. >y_deconv = signal.deconvolve(ys, window)[0] #FIXME > ># Plot all signals >fig = plt.figure() >ax = fig.add_subplot(111) > >ax.plot(x, y, 'b-', label="Original signal") >ax.plot(x_meas, y_meas, 'r-', label="Measured, noisy signal") >ax.plot(xs, ys, 'g-', label="Approximating spline") >ax.plot(xs[window.size//2-1:-window.size//2], y_deconv, 'k-', > label="signal.deconvolve result") >ax.set_ylim([-1.2, 2]) >ax.legend() > >plt.show() > > >_______________________________________________ >SciPy-User mailing list >SciPy-User at scipy.org >http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Aug 1 10:14:13 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 1 Aug 2011 08:14:13 -0600 Subject: [SciPy-User] deconvolution of 1-D signals In-Reply-To: References: <1312159888.10394.YahooMailRC@web113403.mail.gq1.yahoo.com> Message-ID: On Sun, Jul 31, 2011 at 11:20 PM, Anne Archibald wrote: > I realize this discussion has gone rather far afield from efficient 1D > deconvolution, but we do a funny thing in radio interferometry, and > I'm curious whether this is normal for other kinds of deconvolution as > well. > > In radio interferometry we obtain our images convolved with the > so-called "dirty beam", a convolution kernel that has a nice narrow > peak but usually a chaos of monstrous sidelobes often only marginally > smaller than the main lobe. 
We use a different regularization > condition to do our deconvolution: we treat the underlying image as a > modest collection of point sources. (One can see why this appeals to > astronomers.) Through an iterative process (the "CLEAN" algorithm and > its many descendants) we obtain an estimate of this underlying image. > But we very rarely actually work with this image directly. We normally > convolve it with a sort of idealized version of our kernel without all > the sidelobes. This then gives an image one might have obtained from a > normal telescope the size of the interferometer array. (Apart from all > the CLEAN artifacts.) > > What I'm wondering is, is this final step of convolving with an > idealized version of the kernel standard practice elsewhere? > > That's interesting. It sounds like fitting a parametric model, which yields points, followed by a smoothing that in some sense represents the error. Are there frequency aliasing problems associated with the deconvolution? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Aug 1 10:19:34 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 1 Aug 2011 08:19:34 -0600 Subject: [SciPy-User] deconvolution of 1-D signals In-Reply-To: <1312182198.97186.YahooMailRC@web113409.mail.gq1.yahoo.com> References: <1312159888.10394.YahooMailRC@web113403.mail.gq1.yahoo.com> <1312182198.97186.YahooMailRC@web113409.mail.gq1.yahoo.com> Message-ID: On Mon, Aug 1, 2011 at 1:03 AM, David Baddeley wrote: > Hi Ralf, > > 5-15 times smaller would probably be fine, although you might want to watch > the edges in the reconstruction - if they're at different dc levels you'll > get edge artifacts (within ~ 1 kernel width of the edges). I'd tend to avoid > spline filtering (or any form of noise reduction) before deconvolution, as > this will also transform the data in a way not explained by the model you're > using to deconvolve with. > > Weiner filtering is a 2 liner -> > > H = fft(kernel) > deconvolved = ifftshift(ifft(fft(signal)*np.conj(H)/(H*np.conj(H) + > lambda**2))) > > where lambda is your regularisation parameter, and white noise is assumed. > There are various methods for choosing lambda optimally, but most people > tend to use trial and error. > > Iterative methods are typically ~1-2 orders of magnitude slower than a > Weiner filter, but with fast fft libraries and modern computers still quite > reasonable for modest data sizes (a 3D image stack of ~ 512x512x50 pixels > will tend to be done in a bit under a minute, can't really comment on 1D > data, but unless your signal is very long I'd expect it to be significantly > quicker). Ffts scale with O(nlogn) so will generally dramatically outperform > things based on a simple convolution or filtering approaches (O(n**2)) for > large n. This might make an iterative approach using ffts faster than > something like scipy.signal.deconvolve if your kernel is large. > > The main problem with Weiner filtering is that it assumes that both the signal and noise are Gaussian. For instance, if you are looking for spikes in noise, the amplitudes of the spikes would have a Gaussian distribution. The Weiner filter is then the Bayesian estimate that follows from those assumptions, but those might not be the best assumptions for the data. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rigal at rapideye.de Mon Aug 1 13:47:30 2011 From: rigal at rapideye.de (Matthieu Rigal) Date: Mon, 1 Aug 2011 19:47:30 +0200 Subject: [SciPy-User] wrong output shape calculation in scipy.ndimage.interpolation.zoom Message-ID: <201108011947.30560.rigal@rapideye.de> Hi guys, I just detected a problem with the output shape calculation when running a zoom function. Sometimes it returns an odd value, here is an example : >>> import numpy >>> from scipy.ndimage.interpolation import zoom >>> aT = numpy.ones((5000,5000)) >>> aT2 = numpy.ones((556,463)) >>> aT3 = zoom(aT2, (float(aT.shape[0])/aT2.shape[0], float(aT.shape[1])/aT2.shape[1])) >>> aT3.shape (4999, 5000) Whereas adding a very little incrementation factor produces it right : >>> aT3 = zoom(aT2, (1.00001*float(aT.shape[0])/aT2.shape[0], 1.00001*float(aT.shape[1])/aT2.shape[1])) >>> aT3.shape (5000, 5000) There must be somewhere a problem with the roundings... should I file a ticket ? Regards, Matthieu RapidEye AG Molkenmarkt 30 14776 Brandenburg an der Havel Germany Follow us on Twitter! www.twitter.com/rapideye_ag Head Office/Sitz der Gesellschaft: Brandenburg an der Havel Management Board/Vorstand: Wolfgang G. Biedermann, Frederik Jung-Rothenhaeusler Chairman of Supervisory Board/Vorsitzender des Aufsichtsrates: Juergen Breitkopf Commercial Register/Handelsregister Potsdam HRB 17 796 Tax Number/Steuernummer: 048/100/00053 VAT-Ident-Number/Ust.-ID: DE 199331235 DIN EN ISO 9001 certified ************************************************************************* Diese E-Mail enthaelt vertrauliche und/oder rechtlich geschuetzte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese E-Mail irrtuemlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese E-Mail. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser E-Mail ist nicht gestattet. The information in this e-mail is intended for the named recipients only. It may contain privileged and confidential information. If you have received this communication in error, any use, copying or dissemination of its contents is strictly prohibited. Please erase all copies of the message along with any included attachments and notify RapidEye AG or the sender immediately by telephone at the number indicated on this page. From gustavo.goretkin at gmail.com Mon Aug 1 15:23:38 2011 From: gustavo.goretkin at gmail.com (Gustavo Goretkin) Date: Mon, 1 Aug 2011 15:23:38 -0400 Subject: [SciPy-User] optimize.fmin_cobyla giving nan to objective function Message-ID: I am using the Gaussian Process module in scikit-learn. It uses optimize.fmin_cobyla to find the best hyper-parameters. It looks like, though, that fmin_cobyla is, after a couple of iterations, feeding nan to the objective function. Any ideas? scipy.__version__ = '0.10.0.dev7180' Thanks, Gustavo -------------- next part -------------- An HTML attachment was scrubbed... URL: From cjordan1 at uw.edu Mon Aug 1 15:49:20 2011 From: cjordan1 at uw.edu (Christopher Jordan-Squire) Date: Mon, 1 Aug 2011 14:49:20 -0500 Subject: [SciPy-User] optimize.fmin_cobyla giving nan to objective function In-Reply-To: References: Message-ID: Could you send the code that's causing the problem? -Chris Jordan-Squire On Mon, Aug 1, 2011 at 2:23 PM, Gustavo Goretkin wrote: > I am using the Gaussian Process module in scikit-learn. It uses > optimize.fmin_cobyla to find the best hyper-parameters. It looks like, > though, that fmin_cobyla is, after a couple of iterations, feeding nan to > the objective function. 
Any ideas? > > scipy.__version__ = '0.10.0.dev7180' > > Thanks, > Gustavo > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From friedrichromstedt at gmail.com Mon Aug 1 16:22:18 2011 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Mon, 1 Aug 2011 22:22:18 +0200 Subject: [SciPy-User] deconvolution of 1-D signals In-Reply-To: References: Message-ID: Hi Ralf, 2011/7/31 Ralf Gommers : > For a measured signal that is the convolution of a real signal with a > response function, plus measurement noise on top, I want to recover the real > signal. Since I know what the response function is and the noise is > high-frequency compared to the real signal, a straightforward approach is to > smooth the measured signal (or fit a spline to it), then remove the response > function by deconvolution. See example code below. I ran across this (see below) soon ago since I'm dealing with information theory recently. It has an deconvolution example included in 1D, and it compares some different general methods in a kind-of "unified framework", as far as this exists. I found it quite informative and helpful. If you can't get access I can get it from the library in 2 weeks. The citation is: Robert L. Fry (ed.), Bayesian Inference and Maximum Entropy Methods in Science and Engineering: 21st International Workshop, Baltimore, Maryland, AIP Conf. Proc. 617 (2002) ISBN 0-7354-0063-6; ISSN 0094-243X Tutorial "Bayesian Inference for Inverse Problems" (A. Mohammad-Djafari) on page 477ff. It includes different noise models, afair, at least the structure how to deal with this. If I'm not mistaken the problem discussed there was a mass-spectrometry spectrum, so should been shot noise mainly, and of course the psf. The tutorial covers (in short) maximum entropy as well as maximum likelihood, and a combination of both (hence the "unification"). I cannot help much with this since I'm new to it myself. But I did a reasonable literature search, and this was one of the best outcomes. But as said, I was about information theory. Hope this is a useful pointer, Friedrich > Can anyone point me towards code that does the deconvolution efficiently? > Perhaps signal.deconvolve would do the trick, but I can't seem to make it > work (except for directly on the output of np.convolve(y, window, > mode='valid')). No. In fact, I don't think there is an automagical solution anywhere. :-) Good luck! From aarchiba at physics.mcgill.ca Mon Aug 1 17:07:48 2011 From: aarchiba at physics.mcgill.ca (Anne Archibald) Date: Mon, 1 Aug 2011 17:07:48 -0400 Subject: [SciPy-User] deconvolution of 1-D signals In-Reply-To: References: <1312159888.10394.YahooMailRC@web113403.mail.gq1.yahoo.com> Message-ID: On 1 August 2011 10:14, Charles R Harris wrote: > > > On Sun, Jul 31, 2011 at 11:20 PM, Anne Archibald > wrote: >> >> I realize this discussion has gone rather far afield from efficient 1D >> deconvolution, but we do a funny thing in radio interferometry, and >> I'm curious whether this is normal for other kinds of deconvolution as >> well. >> >> In radio interferometry we obtain our images convolved with the >> so-called "dirty beam", a convolution kernel that has a nice narrow >> peak but usually a chaos of monstrous sidelobes often only marginally >> smaller than the main lobe. 
We use a different regularization >> condition to do our deconvolution: we treat the underlying image as a >> modest collection of point sources. (One can see why this appeals to >> astronomers.) Through an iterative process (the "CLEAN" algorithm and >> its many descendants) we obtain an estimate of this underlying image. >> But we very rarely actually work with this image directly. We normally >> convolve it with a sort of idealized version of our kernel without all >> the sidelobes. This then gives an image one might have obtained from a >> normal telescope the size of the interferometer array. (Apart from all >> the CLEAN artifacts.) >> >> What I'm wondering is, is this final step of convolving with an >> idealized version of the kernel standard practice elsewhere? >> > > That's interesting. It sounds like fitting a parametric model, which yields > points, followed by a smoothing that in some sense represents the error. Are > there frequency aliasing problems associated with the deconvolution? It's very like fitting a parametric model, yes, except that we don't care much about the model parameters. In fact we often end up with models that have clusters of "point sources" with positive and negative emissions trying to match up with what is in reality a single point source. This can be due to inadequacies of the dirty beam model (though usually we have a decent estimate) or simply noise. In any case smoothing with an idealized main lobe makes us much less sensitive to this kind of junk. Plus if you're going to do this anyway, it can make life much easier to constrain your point sources to a grid. (As an aside, this trick - of fitting a parametric model but then extracting "observational" parameters for comparison to reduce model-sensitivity - came up with some X-ray spectral data I was looking at: you need to use a model to pull out the instrumental effects, but if you report (say) the model luminosity in a band your instrument can detect, then it doesn't much matter whether your model thinks the photons are thermal or power-law. In principle you can even do this trick with published model parameters, but you run into the problem that people don't give full covariance matrices for the fitted parameters so you get spurious uncertainties.) As far as frequency aliasing, there's not so much coming from the deconvolution, since our beam is so irregular. The actual observation samples image spatial frequencies rather badly; it's the price we pay for not having a filled aperture. So we're often simply missing information on spatial frequencies, most often the lowest ones (because there's a limit on how close you can put tracking dishes together without shadowing). But I don't think this is a deconvolution issue; in fact in situations where people are really pushing the limits of interferometry, like the millimeter-wave interferometric observations of the black hole at the center of our galaxy, you often give up on producing an image at all and fit (say) an emission model including the event horizon to the observed spatial frequencies directly. Anne From gustavo.goretkin at gmail.com Mon Aug 1 18:06:37 2011 From: gustavo.goretkin at gmail.com (Gustavo Goretkin) Date: Mon, 1 Aug 2011 18:06:37 -0400 Subject: [SciPy-User] optimize.fmin_cobyla giving nan to objective function In-Reply-To: References: Message-ID: The code depends on scikit-learn. I'll post the issue there if you think the problem is related to that module. 
My thinking is that fmin_cobyla shouldn't be feeding nan to the objective function. The code that causes the exception is gp_error.py . I made a change to one of the functions in scikit-learn, so I just included that file too. Just keep both files in the same directory. Thanks for the help. Gustavo On Mon, Aug 1, 2011 at 3:49 PM, Christopher Jordan-Squire wrote: > Could you send the code that's causing the problem? > > -Chris Jordan-Squire > > On Mon, Aug 1, 2011 at 2:23 PM, Gustavo Goretkin < > gustavo.goretkin at gmail.com> wrote: > >> I am using the Gaussian Process module in scikit-learn. It uses >> optimize.fmin_cobyla to find the best hyper-parameters. It looks like, >> though, that fmin_cobyla is, after a couple of iterations, feeding nan to >> the objective function. Any ideas? >> >> scipy.__version__ = '0.10.0.dev7180' >> >> Thanks, >> Gustavo >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Aug 1 19:48:06 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 1 Aug 2011 17:48:06 -0600 Subject: [SciPy-User] deconvolution of 1-D signals In-Reply-To: References: <1312159888.10394.YahooMailRC@web113403.mail.gq1.yahoo.com> Message-ID: On Mon, Aug 1, 2011 at 3:07 PM, Anne Archibald wrote: > On 1 August 2011 10:14, Charles R Harris > wrote: > > > > > > On Sun, Jul 31, 2011 at 11:20 PM, Anne Archibald > > wrote: > >> > >> I realize this discussion has gone rather far afield from efficient 1D > >> deconvolution, but we do a funny thing in radio interferometry, and > >> I'm curious whether this is normal for other kinds of deconvolution as > >> well. > >> > >> In radio interferometry we obtain our images convolved with the > >> so-called "dirty beam", a convolution kernel that has a nice narrow > >> peak but usually a chaos of monstrous sidelobes often only marginally > >> smaller than the main lobe. We use a different regularization > >> condition to do our deconvolution: we treat the underlying image as a > >> modest collection of point sources. (One can see why this appeals to > >> astronomers.) Through an iterative process (the "CLEAN" algorithm and > >> its many descendants) we obtain an estimate of this underlying image. > >> But we very rarely actually work with this image directly. We normally > >> convolve it with a sort of idealized version of our kernel without all > >> the sidelobes. This then gives an image one might have obtained from a > >> normal telescope the size of the interferometer array. (Apart from all > >> the CLEAN artifacts.) > >> > >> What I'm wondering is, is this final step of convolving with an > >> idealized version of the kernel standard practice elsewhere? > >> > > > > That's interesting. It sounds like fitting a parametric model, which > yields > > points, followed by a smoothing that in some sense represents the error. > Are > > there frequency aliasing problems associated with the deconvolution? > > It's very like fitting a parametric model, yes, except that we don't > care much about the model parameters. 
In fact we often end up with > models that have clusters of "point sources" with positive and > negative emissions trying to match up with what is in reality a single > point source. This can be due to inadequacies of the dirty beam model > (though usually we have a decent estimate) or simply noise. In any > case smoothing with an idealized main lobe makes us much less > sensitive to this kind of junk. Plus if you're going to do this > anyway, it can make life much easier to constrain your point sources > to a grid. > > (As an aside, this trick - of fitting a parametric model but then > extracting "observational" parameters for comparison to reduce > model-sensitivity - came up with some X-ray spectral data I was > looking at: you need to use a model to pull out the instrumental > effects, but if you report (say) the model luminosity in a band your > instrument can detect, then it doesn't much matter whether your model > thinks the photons are thermal or power-law. In principle you can even > do this trick with published model parameters, but you run into the > problem that people don't give full covariance matrices for the fitted > parameters so you get spurious uncertainties.) > > As far as frequency aliasing, there's not so much coming from the > deconvolution, since our beam is so irregular. The actual observation > samples image spatial frequencies rather badly; it's the price we pay > for not having a filled aperture. So we're often simply missing > information on spatial frequencies, most often the lowest ones > (because there's a limit on how close you can put tracking dishes > together without shadowing). But I don't think this is a deconvolution > issue; in fact in situations where people are really pushing the > limits of interferometry, like the millimeter-wave interferometric > observations of the black hole at the center of our galaxy, you often > give up on producing an image at all and fit (say) an emission model > including the event horizon to the observed spatial frequencies > directly. > > Thanks Anne, it's a good trick to know about. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From R.Springuel at umit.maine.edu Mon Aug 1 20:36:07 2011 From: R.Springuel at umit.maine.edu (R. Padraic Springuel) Date: Mon, 01 Aug 2011 20:36:07 -0400 Subject: [SciPy-User] numpy, scipy, and python 3 Message-ID: <4E374677.7010709@umit.maine.edu> I read in the readme for numpy 1.6.1 and scipy 0.9.0 that both support python 3, but I can't find and Mac installation files for either package that work with any version of python 3. Anyone know where I can get some or do I need to build from source (something I haven't done in a while, but should, theoretically, be able to do)? -- R. Padraic Springuel, PhD From wardefar at iro.umontreal.ca Mon Aug 1 20:52:08 2011 From: wardefar at iro.umontreal.ca (David Warde-Farley) Date: Mon, 1 Aug 2011 20:52:08 -0400 Subject: [SciPy-User] numpy, scipy, and python 3 In-Reply-To: <4E374677.7010709@umit.maine.edu> References: <4E374677.7010709@umit.maine.edu> Message-ID: <4F2259D3-F112-4104-8269-33E6DA5F148D@iro.umontreal.ca> On 2011-08-01, at 8:36 PM, R. Padraic Springuel wrote: > I read in the readme for numpy 1.6.1 and scipy 0.9.0 that both support > python 3, but I can't find and Mac installation files for either package > that work with any version of python 3. Anyone know where I can get > some or do I need to build from source (something I haven't done in a > while, but should, theoretically, be able to do)? 
I don't think Ralf has been building Python 3 binaries. I'm guessing that there isn't much demand, as other parts of the tool stack have yet to make the jump to Python 3 (notably matplotlib). However, OS X is probably the easiest platform on which to build NumPy. SciPy, at last glance, required you grab a certain version of gfortran (this one: http://r.research.att.com/tools/ but check the docs in case this has changed), but is otherwise a straightforward "python setup.py build && sudo python setup.py install" affair as well. Let the list know if you have problems. David From jason.heeris at gmail.com Mon Aug 1 22:19:02 2011 From: jason.heeris at gmail.com (Jason Heeris) Date: Tue, 2 Aug 2011 10:19:02 +0800 Subject: [SciPy-User] Vectorised convolution Message-ID: I'm using the scipy.signal.convolve function on an ndarray that represents independent sets of data (each set is a row). It seems that with this function I need to manually split up the rows to work on them independently, otherwise it does a 2D convolution: for idx in xrange(0, S): conv[idx] = sp.signal.convolve(inputs[idx], other, mode='full') Is there a vectorised version of this function? In other words, if I were doing an FFT I'd use np.fft.fft(inputs, axis=1) ? is it possible to do a single axis convolution on a 2D array? Cheers, Jason -------------- next part -------------- An HTML attachment was scrubbed... URL: From david_baddeley at yahoo.com.au Mon Aug 1 22:37:54 2011 From: david_baddeley at yahoo.com.au (David Baddeley) Date: Mon, 1 Aug 2011 19:37:54 -0700 (PDT) Subject: [SciPy-User] Vectorised convolution In-Reply-To: References: Message-ID: <1312252674.23043.YahooMailRC@web113401.mail.gq1.yahoo.com> try scipy.ndimage.convolve1d It doesn't seem to support mode='full' though cheers, David ________________________________ From: Jason Heeris To: SciPy Users List Sent: Tue, 2 August, 2011 2:19:02 PM Subject: [SciPy-User] Vectorised convolution I'm using the scipy.signal.convolve function on an ndarray that represents independent sets of data (each set is a row). It seems that with this function I need to manually split up the rows to work on them independently, otherwise it does a 2D convolution: for idx in xrange(0, S): conv[idx] = sp.signal.convolve(inputs[idx], other, mode='full') Is there a vectorised version of this function? In other words, if I were doing an FFT I'd use np.fft.fft(inputs, axis=1) ? is it possible to do a single axis convolution on a 2D array? Cheers, Jason -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at enthought.com Mon Aug 1 22:57:25 2011 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Mon, 1 Aug 2011 21:57:25 -0500 Subject: [SciPy-User] Vectorised convolution In-Reply-To: References: Message-ID: On Mon, Aug 1, 2011 at 9:19 PM, Jason Heeris wrote: > I'm using the scipy.signal.convolve function on an ndarray that represents > independent sets of data (each set is a row). It seems that with this > function I need to manually split up the rows to work on them independently, > otherwise it does a 2D convolution: > > ? ? for idx in xrange(0, S): > ? ? ? ? conv[idx] = sp.signal.convolve(inputs[idx], other, mode='full') > Is there a vectorised version of this function? In other words, if I were > doing an FFT I'd use?np.fft.fft(inputs, axis=1) ? is it possible to do a > single axis convolution on a 2D array? 
I show one way to do this in the following SciPy cookbook entry: http://www.scipy.org/Cookbook/ApplyFIRFilter In particular, see the second paragraph. Warren > Cheers, > Jason > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From jason.heeris at gmail.com Mon Aug 1 23:02:18 2011 From: jason.heeris at gmail.com (Jason Heeris) Date: Tue, 2 Aug 2011 11:02:18 +0800 Subject: [SciPy-User] Vectorised convolution In-Reply-To: <1312252674.23043.YahooMailRC@web113401.mail.gq1.yahoo.com> References: <1312252674.23043.YahooMailRC@web113401.mail.gq1.yahoo.com> Message-ID: On 2 August 2011 10:37, David Baddeley wrote: > try scipy.ndimage.convolve1d > > It doesn't seem to support mode='full' though > > That's easy to implement by zero padding both input arrays. Between this and Warren's answer I should be able to do it. Thanks! ? Jason -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Tue Aug 2 02:26:52 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 2 Aug 2011 08:26:52 +0200 Subject: [SciPy-User] numpy, scipy, and python 3 In-Reply-To: <4F2259D3-F112-4104-8269-33E6DA5F148D@iro.umontreal.ca> References: <4E374677.7010709@umit.maine.edu> <4F2259D3-F112-4104-8269-33E6DA5F148D@iro.umontreal.ca> Message-ID: On Tue, Aug 2, 2011 at 2:52 AM, David Warde-Farley < wardefar at iro.umontreal.ca> wrote: > On 2011-08-01, at 8:36 PM, R. Padraic Springuel wrote: > > > I read in the readme for numpy 1.6.1 and scipy 0.9.0 that both support > > python 3, but I can't find and Mac installation files for either package > > that work with any version of python 3. Anyone know where I can get > > some or do I need to build from source (something I haven't done in a > > while, but should, theoretically, be able to do)? > > I don't think Ralf has been building Python 3 binaries. I'm guessing that > there isn't much demand, as other parts of the tool stack have yet to make > the jump to Python 3 (notably matplotlib). > > I haven't. It still requires some work since bdist_mpkg (used for the 2.x binaries) doesn't support python 3.x. > However, OS X is probably the easiest platform on which to build NumPy. > SciPy, at last glance, required you grab a certain version of gfortran (this > one: http://r.research.att.com/tools/ but check the docs in case this has > changed), but is otherwise a straightforward "python setup.py build && sudo > python setup.py install" affair as well. Let the list know if you have > problems. > > That's the right site to grab gfortran from, but note that if you're on Lion you need a new binary that wasn't on the site last time I checked. Easiest is to install it through homebrew. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From jason.heeris at gmail.com Tue Aug 2 02:35:48 2011 From: jason.heeris at gmail.com (Jason Heeris) Date: Tue, 2 Aug 2011 14:35:48 +0800 Subject: [SciPy-User] Vectorised convolution In-Reply-To: References: Message-ID: On 2 August 2011 10:57, Warren Weckesser wrote: > I show one way to do this in the following SciPy cookbook entry: > Interesting ? I just tried that approach and found that it was actually slower than the looped version, which seems weird. But the convolve1d version works great (less than a tenth of the time as my loop) and the lfilter version is almost as good. 
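For reference, a small self-contained sketch of the row-wise comparison being discussed (not code from the thread; the array shapes, the odd-length kernel and mode='same' are assumptions, and reproducing mode='full' would additionally require zero-padding the rows):

import numpy as np
from scipy import ndimage, signal

inputs = np.random.rand(50, 2000)     # each row is an independent signal
kernel = np.ones(11) / 11.0           # odd-length smoothing kernel

# Row-by-row loop, as in the original code but with mode='same'.
loop_out = np.array([signal.convolve(row, kernel, mode='same')
                     for row in inputs])

# Single vectorised call along axis 1.  For an odd-length kernel with
# zero boundaries this matches the 'same'-mode loop above.
vec_out = ndimage.convolve1d(inputs, kernel, axis=1,
                             mode='constant', cval=0.0)

assert np.allclose(loop_out, vec_out)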
Thanks, Jason -------------- next part -------------- An HTML attachment was scrubbed... URL: From aarchiba at physics.mcgill.ca Tue Aug 2 02:56:10 2011 From: aarchiba at physics.mcgill.ca (Anne Archibald) Date: Tue, 2 Aug 2011 02:56:10 -0400 Subject: [SciPy-User] Vectorised convolution In-Reply-To: References: Message-ID: The blunt-instrument approach is to pad your image with zeros then flatten it. If you're careful about memory layouts this can even be done with a view. Either way you get a single long one-dimensional array you can apply unvectorized 1D convolutions to. You can then reshape the output back to two dimensions, clipping out the padding as appropriate in the process. The big drawback is that you have to pad the whole image at once, which could be a memory hog if your kernel is large. Anne On 2 August 2011 02:35, Jason Heeris wrote: > On 2 August 2011 10:57, Warren Weckesser > wrote: >> >> I show one way to do this in the following SciPy cookbook entry: > > Interesting ? I just tried that approach and found that it was actually > slower than the looped version, which seems weird. > But the convolve1d version works great (less than a tenth of the time as my > loop) and the lfilter version is almost as good. > Thanks, > Jason > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From timmichelsen at gmx-topmail.de Tue Aug 2 03:37:29 2011 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Tue, 2 Aug 2011 07:37:29 +0000 (UTC) Subject: [SciPy-User] Status of TimeSeries SciKit References: <08B2C8B1-DD0B-4D02-82F0-4CBCD304AA31@bilokon.co.uk> <7B9AF0B6-8015-4736-AE31-53725695DE40@gmail.com> <7B0D4803-D6E3-4451-B60E-957966CCC73D@gmail.com> <20110726222843.GB8920@phare.normalesup.org> <20110727141251.GB30024@phare.normalesup.org> Message-ID: > >> I agree. I already have 50% or more of the features in > >> scikits.timeseries, so this gets back to my fragmentation argument > >> (users being stuck with a confusing choice between multiple > >> libraries). Let's make it happen! > > So what needs to be done to move things forward? > > Do we need to draw up a roadmap? > > A table with functions that respond to common use cases in natual > > science, computing, and economics? > Having a place to collect concrete use cases (like your list from the > prior e-mail, but with illustrative code snippets) would be good. > You're welcome to start doing it here: > > https://github.com/wesm/pandas/wiki Here goes: https://github.com/wesm/pandas/wiki/Time-Series-Manipulation I will fill it with my stuff. Shall we file feature request directly as issues? > A good place to start, which I can do when I have some time, would be > to start moving the scikits.timeseries code into pandas. There are > several key components > > - Date and DateArray stuff, frequency implementations > - masked array time series implementations (record array and not) > - plotting > - reporting, moving window functions, etc. > > We need to evaluate Date/DateArray as they relate to numpy.datetime64 > and see what can be done. I haven't looked closely but I'm not sure if > all the convenient attribute access stuff (day, month, day_of_week, > weekday, etc.) is available in NumPy yet. I suspect it would be > reasonably straightforward to wrap DateArray so it can be an Index for > a pandas object. > > I won't have much time for this until mid-August, but a couple days' > hacking should get most of the pieces into place. 
I guess we can just > keep around the masked array classes for legacy API support and for > feature completeness. I value very much the work of Pierre and Matt. But my difficulti with the scikit was that the code is too complex. So I was only able to contribute helper functions for doc fixes. Please, lets make it happen that this effort is not a on or 3 man show but results in something whcih can be maintained by the whole community. Nevertheless, the timeseries scikit made my work more comfortable and understadable than I was able to manage with R. Regards, Timmie From pgmdevlist at gmail.com Tue Aug 2 03:49:05 2011 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 2 Aug 2011 09:49:05 +0200 Subject: [SciPy-User] Status of TimeSeries SciKit In-Reply-To: References: <08B2C8B1-DD0B-4D02-82F0-4CBCD304AA31@bilokon.co.uk> <7B9AF0B6-8015-4736-AE31-53725695DE40@gmail.com> <7B0D4803-D6E3-4451-B60E-957966CCC73D@gmail.com> <20110726222843.GB8920@phare.normalesup.org> <20110727141251.GB30024@phare.normalesup.org> Message-ID: On Aug 2, 2011, at 9:37 AM, Tim Michelsen wrote: >>>> I agree. I already have 50% or more of the features in >>>> scikits.timeseries, so this gets back to my fragmentation argument >>>> (users being stuck with a confusing choice between multiple >>>> libraries). Let's make it happen! >>> So what needs to be done to move things forward? >>> Do we need to draw up a roadmap? >>> A table with functions that respond to common use cases in natual >>> science, computing, and economics? >> Having a place to collect concrete use cases (like your list from the >> prior e-mail, but with illustrative code snippets) would be good. >> You're welcome to start doing it here: >> >> https://github.com/wesm/pandas/wiki > Here goes: > https://github.com/wesm/pandas/wiki/Time-Series-Manipulation > > I will fill it with my stuff. > Shall we file feature request directly as issues? > >> A good place to start, which I can do when I have some time, would be >> to start moving the scikits.timeseries code into pandas. There are >> several key components >> >> - Date and DateArray stuff, frequency implementations >> - masked array time series implementations (record array and not) >> - plotting >> - reporting, moving window functions, etc. >> >> We need to evaluate Date/DateArray as they relate to numpy.datetime64 >> and see what can be done. I haven't looked closely but I'm not sure if >> all the convenient attribute access stuff (day, month, day_of_week, >> weekday, etc.) is available in NumPy yet. I suspect it would be >> reasonably straightforward to wrap DateArray so it can be an Index for >> a pandas object. >> >> I won't have much time for this until mid-August, but a couple days' >> hacking should get most of the pieces into place. I guess we can just >> keep around the masked array classes for legacy API support and for >> feature completeness. > I value very much the work of Pierre and Matt. > But my difficulti with the scikit was that the code is too complex. So I was > only able to contribute helper functions for doc fixes. > Please, lets make it happen that this effort is not a on or 3 man show but > results in something whcih can be maintained by the whole community. The apparent complexity of the code comes likely from the fact that some features were coded directly in C (not even Cython) for efficiency. That, and that it relied on MaskedArray, of course ;) > Nevertheless, the timeseries scikit made my work more comfortable and > understadable than I was able to manage with R. Great ! 
That was the purpose. From warren.weckesser at enthought.com Tue Aug 2 08:47:53 2011 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Tue, 2 Aug 2011 07:47:53 -0500 Subject: [SciPy-User] Vectorised convolution In-Reply-To: References: Message-ID: On Tue, Aug 2, 2011 at 1:35 AM, Jason Heeris wrote: > On 2 August 2011 10:57, Warren Weckesser > wrote: >> >> I show one way to do this in the following SciPy cookbook entry: > > Interesting ? I just tried that approach and found that it was actually > slower than the looped version, which seems weird. > But the convolve1d version works great (less than a tenth of the time as my > loop) and the lfilter version is almost as good. That's good to know. I just updated http://www.scipy.org/Cookbook/ApplyFIRFilter to include ndimage.convolve1d. Warren > Thanks, > Jason > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From ralf.gommers at googlemail.com Tue Aug 2 09:58:48 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 2 Aug 2011 15:58:48 +0200 Subject: [SciPy-User] wrong output shape calculation in scipy.ndimage.interpolation.zoom In-Reply-To: <201108011947.30560.rigal@rapideye.de> References: <201108011947.30560.rigal@rapideye.de> Message-ID: On Mon, Aug 1, 2011 at 7:47 PM, Matthieu Rigal wrote: > Hi guys, > > I just detected a problem with the output shape calculation when running a > zoom function. > Sometimes it returns an odd value, here is an example : > >>> import numpy > >>> from scipy.ndimage.interpolation import zoom > >>> aT = numpy.ones((5000,5000)) > >>> aT2 = numpy.ones((556,463)) > >>> aT3 = zoom(aT2, (float(aT.shape[0])/aT2.shape[0], > float(aT.shape[1])/aT2.shape[1])) > >>> aT3.shape > (4999, 5000) > > Whereas adding a very little incrementation factor produces it right : > >>> aT3 = zoom(aT2, (1.00001*float(aT.shape[0])/aT2.shape[0], > 1.00001*float(aT.shape[1])/aT2.shape[1])) > >>> aT3.shape > (5000, 5000) > > There must be somewhere a problem with the roundings... should I file a > ticket > ? > The zoom function seems to round down when non-integer shapes are requested. This is more a problem with the interface than an actual bug. Your first zoom factor times the input axis size gives: >>> (float(aT.shape[0])/aT2.shape[0]) * aT2.shape[0] 4999.9999999999991 The zoom function can't know whether you want an array of size 4999 or 5000 if you pass in a zoom factor that implies an output shape of 4999.xxx. A patch for zoom to accept an `output_shape` keyword that would override the `zoom` parameter may be useful. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From gustavo.goretkin at gmail.com Tue Aug 2 15:53:46 2011 From: gustavo.goretkin at gmail.com (Gustavo Goretkin) Date: Tue, 2 Aug 2011 15:53:46 -0400 Subject: [SciPy-User] optimize.fmin_cobyla giving nan to objective function In-Reply-To: References: Message-ID: Okay here is a much simpler test case: https://gist.github.com/1121046 The print statements, for me, show that *objective* is being passed nan. Does this warrant a ticket? Should I continue this discussion on the development list? Gustavo On Mon, Aug 1, 2011 at 6:06 PM, Gustavo Goretkin wrote: > The code depends on scikit-learn. I'll post the issue there if you think > the problem is related to that module. My thinking is that fmin_cobyla > shouldn't be feeding nan to the objective function. 
> > The code that causes the exception is gp_error.py . I made a change to one > of the functions in scikit-learn, so I just included that file too. Just > keep both files in the same directory. > > Thanks for the help. > > Gustavo > > > > On Mon, Aug 1, 2011 at 3:49 PM, Christopher Jordan-Squire > wrote: > >> Could you send the code that's causing the problem? >> >> -Chris Jordan-Squire >> >> On Mon, Aug 1, 2011 at 2:23 PM, Gustavo Goretkin < >> gustavo.goretkin at gmail.com> wrote: >> >>> I am using the Gaussian Process module in scikit-learn. It uses >>> optimize.fmin_cobyla to find the best hyper-parameters. It looks like, >>> though, that fmin_cobyla is, after a couple of iterations, feeding nan to >>> the objective function. Any ideas? >>> >>> scipy.__version__ = '0.10.0.dev7180' >>> >>> Thanks, >>> Gustavo >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Tue Aug 2 16:42:25 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 2 Aug 2011 22:42:25 +0200 Subject: [SciPy-User] optimize.fmin_cobyla giving nan to objective function In-Reply-To: References: Message-ID: On Tue, Aug 2, 2011 at 9:53 PM, Gustavo Goretkin wrote: > Okay here is a much simpler test case: https://gist.github.com/1121046 > > The print statements, for me, show that *objective* is being passed nan. > Does this warrant a ticket? Should I continue this discussion on the > development list? > > Your function always returns inf, so it's not very surprising that you get a nan after a few iterations. Could happen for example if the code determines a derivative numerically, resulting in inf / inf = nan. It would be helpful if you had a realistic, self-contained example. Ralf Gustavo > > On Mon, Aug 1, 2011 at 6:06 PM, Gustavo Goretkin < > gustavo.goretkin at gmail.com> wrote: > >> The code depends on scikit-learn. I'll post the issue there if you think >> the problem is related to that module. My thinking is that fmin_cobyla >> shouldn't be feeding nan to the objective function. >> >> The code that causes the exception is gp_error.py . I made a change to one >> of the functions in scikit-learn, so I just included that file too. Just >> keep both files in the same directory. >> >> Thanks for the help. >> >> Gustavo >> >> >> >> On Mon, Aug 1, 2011 at 3:49 PM, Christopher Jordan-Squire < >> cjordan1 at uw.edu> wrote: >> >>> Could you send the code that's causing the problem? >>> >>> -Chris Jordan-Squire >>> >>> On Mon, Aug 1, 2011 at 2:23 PM, Gustavo Goretkin < >>> gustavo.goretkin at gmail.com> wrote: >>> >>>> I am using the Gaussian Process module in scikit-learn. It uses >>>> optimize.fmin_cobyla to find the best hyper-parameters. It looks like, >>>> though, that fmin_cobyla is, after a couple of iterations, feeding nan to >>>> the objective function. Any ideas? 
>>>> >>>> scipy.__version__ = '0.10.0.dev7180' >>>> >>>> Thanks, >>>> Gustavo >>>> >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>>> >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gustavo.goretkin at gmail.com Tue Aug 2 16:55:46 2011 From: gustavo.goretkin at gmail.com (Gustavo Goretkin) Date: Tue, 2 Aug 2011 16:55:46 -0400 Subject: [SciPy-User] optimize.fmin_cobyla giving nan to objective function In-Reply-To: References: Message-ID: > > >> Your function always returns inf, so it's not very surprising that you get > a nan after a few iterations. Could happen for example if the code > determines a derivative numerically, resulting in inf / inf = nan. > > It would be helpful if you had a realistic, self-contained example. > > Raln > In scikit-learn, fmin_cobyla is used to optimize some parameters of a Gaussian Process. The objective function returns inf when the parameters are such that the matrix calculations are unstable and NumPy throws a LinAlg exception. What would be a better way to handle this? My gut feeling is that an optimizer should not pass nan to the objective function, since it cannot possibly be informative. Maybe checking for nan would be inefficient. -------------- next part -------------- An HTML attachment was scrubbed... URL: From fperez.net at gmail.com Wed Aug 3 03:40:46 2011 From: fperez.net at gmail.com (Fernando Perez) Date: Wed, 3 Aug 2011 00:40:46 -0700 Subject: [SciPy-User] [ANN] IPython 0.11 is officially out In-Reply-To: References: Message-ID: On Sun, Jul 31, 2011 at 10:19 AM, Fernando Perez wrote: > Please see our release notes for the full details on everything about > this release: https://github.com/ipython/ipython/zipball/rel-0.11 And embarrassingly, that URL was for a zip download instead (copy/paste error), the detailed release notes are here: http://ipython.org/ipython-doc/rel-0.11/whatsnew/version0.11.html Sorry about the mistake... Cheers, f From ralf.gommers at googlemail.com Wed Aug 3 12:05:33 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 3 Aug 2011 18:05:33 +0200 Subject: [SciPy-User] optimize.fmin_cobyla giving nan to objective function In-Reply-To: References: Message-ID: On Tue, Aug 2, 2011 at 10:55 PM, Gustavo Goretkin < gustavo.goretkin at gmail.com> wrote: > >>> Your function always returns inf, so it's not very surprising that you >> get a nan after a few iterations. Could happen for example if the code >> determines a derivative numerically, resulting in inf / inf = nan. >> >> It would be helpful if you had a realistic, self-contained example. >> >> Raln >> > > In scikit-learn, fmin_cobyla is used to optimize some parameters of a > Gaussian Process. The objective function returns inf when the parameters are > such that the matrix calculations are unstable and NumPy throws a LinAlg > exception. What would be a better way to handle this? > Let the objective function do something sensible? Like figure out what the unstable region is and returning values that steer the optimizer away from it. 
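For instance, a minimal sketch of that idea (the 2x2 matrix below is only a hypothetical stand-in for the actual Gaussian Process computation): catch the LinAlgError inside the objective and return a large but finite penalty, so the optimizer never sees inf or nan.

import numpy as np
from scipy.optimize import fmin_cobyla

def unstable_computation(theta):
    # stand-in for the real likelihood: the Cholesky factorization fails
    # when the matrix is not positive definite (here, when abs(theta[0]) >= 1)
    c = np.linalg.cholesky(np.array([[1.0, theta[0]], [theta[0], 1.0]]))
    return -np.sum(np.log(np.diag(c)))

def objective(theta):
    try:
        return unstable_computation(theta)
    except np.linalg.LinAlgError:
        return 1e10  # large finite penalty instead of inf

xstar = fmin_cobyla(func=objective, x0=[0.5], cons=[lambda x: 2.0 - abs(x[0])])

The penalty still steers the search back toward the stable region, and there is no inf left around to turn into nan through numerical differencing.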
With a slight modification to your last test script I see that fmin_cobyla doesn't choke on receiving a first inf from the objective function (see below). If it receives infs not for a single x, but several or a range, then I'd expect it to fail. from scipy.optimize import fmin_cobyla import numpy as np def objective(x): print 'Input: ', x, ' return value: ', x + 1./x return x + 1./x def constraint1(x): return 0 xstar = fmin_cobyla(func=objective, x0=0, cons=[constraint1]) Cheers, Ralf > My gut feeling is that an optimizer should not pass nan to the objective > function, since it cannot possibly be informative. Maybe checking for nan > would be inefficient. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From keith.hughitt at gmail.com Wed Aug 3 12:19:39 2011 From: keith.hughitt at gmail.com (Keith Hughitt) Date: Wed, 3 Aug 2011 09:19:39 -0700 (PDT) Subject: [SciPy-User] numpy, scipy, and python 3 In-Reply-To: References: <4E374677.7010709@umit.maine.edu> <4F2259D3-F112-4104-8269-33E6DA5F148D@iro.umontreal.ca> Message-ID: <18374965.5317.1312388379274.JavaMail.geo-discussion-forums@yqkc21> Matplotlib actually has a Python 3 branch as well now: https://github.com/matplotlib/matplotlib-py3 I'm sure there is still work to be done, but I've used it for some basic plotting and so far it has worked well. Keith -------------- next part -------------- An HTML attachment was scrubbed... URL: From keith.hughitt at gmail.com Wed Aug 3 12:37:29 2011 From: keith.hughitt at gmail.com (Keith Hughitt) Date: Wed, 3 Aug 2011 09:37:29 -0700 (PDT) Subject: [SciPy-User] Having scipy.ndimage, etc. methods return ndarray subclass instances? Message-ID: <3536413.654.1312389449535.JavaMail.geo-discussion-forums@yqcj24> Hello, I'm currently working on creating a subclass of numpy.ndarray, and would like to ensure that other methods that work with ndarrays (e.g. http://docs.scipy.org/doc/scipy/reference/ndimage.html) return an instance of the subclass instead of an ndarray. After reading a discussion on the topic on Stack Overflow ( http://stackoverflow.com/questions/6190859/some-numpy-functions-return-ndarray-instead-of-my-subclass), I looked into adding/modifying __array_finalize__ and __array_wrap__ but neither of these appear to be called when I call a scipy.ndimage method such as median_filter. Is there a way I can extend my subclass so that I can ensure that a new subclass instance or view is returned instead of an ndarray? Any suggestions would be greatly appreciated. Thanks, Keith -------------- next part -------------- An HTML attachment was scrubbed... URL: From keith.hughitt at gmail.com Wed Aug 3 11:58:51 2011 From: keith.hughitt at gmail.com (Keith Hughitt) Date: Wed, 3 Aug 2011 08:58:51 -0700 (PDT) Subject: [SciPy-User] Having scipy.ndimage, etc. methods return ndarray subclass instances? Message-ID: <27260730.6415.1312387131805.JavaMail.geo-discussion-forums@yqbp37> Hello, I'm currently working on creating a subclass of numpy.ndarray, and would like to ensure that other methods that work with ndarrays (e.g. scipy.ndimage.* ) return an instance of the subclass instead of an ndarray. After reading a discussion on the topicon StackOverflow, I looked into adding/modifying __array_finalize__ and __array_wrap__. Neither of these appear to be called when I call a scipy.ndimage method such as median_filter . 
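For reference, here is a minimal sketch of the kind of subclass I mean (the class name and the extra attribute are only illustrative):

import numpy as np
from scipy import ndimage

class MyArray(np.ndarray):
    def __new__(cls, input_array, header=None):
        obj = np.asarray(input_array).view(cls)
        obj.header = header
        return obj

    def __array_finalize__(self, obj):
        # called for view casting and new-from-template
        if obj is None:
            return
        self.header = getattr(obj, 'header', None)

    def __array_wrap__(self, out_arr, context=None):
        # called at the end of ufuncs, but apparently not by ndimage
        return np.ndarray.__array_wrap__(self, out_arr, context)

a = MyArray(np.random.rand(16, 16), header={'origin': 'test'})
print type(a + 1)                         # MyArray, as expected
print type(ndimage.median_filter(a, 3))   # plain ndarray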
Is there a way I can extend my subclass so that I can ensure that a new subclass instance or view is returned instead of an ndarray? Any suggestions would be greatly appreciated. Thanks, Keith -------------- next part -------------- An HTML attachment was scrubbed... URL: From xavier.gnata at gmail.com Thu Aug 4 10:17:07 2011 From: xavier.gnata at gmail.com (Xavier Gnata) Date: Thu, 4 Aug 2011 16:17:07 +0200 Subject: [SciPy-User] bug in optimize.curve_fit ? Message-ID: Hi, def func(x, a, b, c): return a*np.exp(-b*x) + c x = np.linspace(0,4,50) y = func(x, 2.5, 1.3, 0.5) yn = y + 0.2*np.random.normal(size=len(x)) popt, pcov = curve_fit(func,x, yn) works vey well but if you change it to: def func(x, a, b, c): return a*np.exp(-b*x) + c x = list(np.linspace(0,4,50)) y = func(x, 2.5, 1.3, 0.5) yn = y + 0.2*np.random.normal(size=len(x)) popt, pcov = curve_fit(func, x, yn) then x is a list and we get this error: "TypeError: can't multiply sequence by non-int of type 'float'" However, according to the documentation, xdata : An N-length sequence or an (k,N)-shaped array. I understand this statement as : "either a list, a tuple or an array". Should optimize.curve_fit internally cast xdata to an array? I would think so. Xavier From guziy.sasha at gmail.com Thu Aug 4 10:26:23 2011 From: guziy.sasha at gmail.com (Oleksandr Huziy) Date: Thu, 4 Aug 2011 10:26:23 -0400 Subject: [SciPy-User] bug in optimize.curve_fit ? In-Reply-To: References: Message-ID: Hi, if you did x = list(x) it becomes a simple list, which cannot be multiplied by a number. don't do this. -- Oleksandr Huziy 2011/8/4 Xavier Gnata > Hi, > > def func(x, a, b, c): > return a*np.exp(-b*x) + c > x = np.linspace(0,4,50) > y = func(x, 2.5, 1.3, 0.5) > yn = y + 0.2*np.random.normal(size=len(x)) > popt, pcov = curve_fit(func,x, yn) > > works vey well but if you change it to: > > def func(x, a, b, c): > return a*np.exp(-b*x) + c > x = list(np.linspace(0,4,50)) > y = func(x, 2.5, 1.3, 0.5) > yn = y + 0.2*np.random.normal(size=len(x)) > popt, pcov = curve_fit(func, x, yn) > > then x is a list and we get this error: > "TypeError: can't multiply sequence by non-int of type 'float'" > > However, according to the documentation, xdata : An N-length sequence > or an (k,N)-shaped array. I understand this statement as : "either a > list, a tuple or an array". Should optimize.curve_fit internally cast > xdata to an array? I would think so. > > > > Xavier > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From xavier.gnata at gmail.com Fri Aug 5 03:29:08 2011 From: xavier.gnata at gmail.com (Xavier Gnata) Date: Fri, 05 Aug 2011 09:29:08 +0200 Subject: [SciPy-User] bug in optimize.curve_fit ? In-Reply-To: References: Message-ID: <4E3B9BC4.4000404@gmail.com> Hi, Yes but the doc ( http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html ) claims that xdata can be "An N-length sequence or an (k,N)-shaped array" IRL, I would not cast a array to a list to call optimize.curve_fit. x = list(x) was only an easy way to write a short testcase based on the exemple in the doc. Xavier > Hi, > > if you did x = list(x) it becomes a simple list, which cannot be > multiplied by a number. > don't do this. 
> > -- > Oleksandr Huziy > > > 2011/8/4 Xavier Gnata > > > Hi, > > def func(x, a, b, c): > return a*np.exp(-b*x) + c > x = np.linspace(0,4,50) > y = func(x, 2.5, 1.3, 0.5) > yn = y + 0.2*np.random.normal(size=len(x)) > popt, pcov = curve_fit(func,x, yn) > > works vey well but if you change it to: > > def func(x, a, b, c): > return a*np.exp(-b*x) + c > x = list(np.linspace(0,4,50)) > y = func(x, 2.5, 1.3, 0.5) > yn = y + 0.2*np.random.normal(size=len(x)) > popt, pcov = curve_fit(func, x, yn) > > then x is a list and we get this error: > "TypeError: can't multiply sequence by non-int of type 'float'" > > However, according to the documentation, xdata : An N-length sequence > or an (k,N)-shaped array. I understand this statement as : "either a > list, a tuple or an array". Should optimize.curve_fit internally cast > xdata to an array? I would think so. > > > > Xavier > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Fri Aug 5 04:39:14 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 5 Aug 2011 04:39:14 -0400 Subject: [SciPy-User] bug in optimize.curve_fit ? In-Reply-To: <4E3B9BC4.4000404@gmail.com> References: <4E3B9BC4.4000404@gmail.com> Message-ID: On Fri, Aug 5, 2011 at 3:29 AM, Xavier Gnata wrote: > Hi, > > Yes but the doc ( > http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html > ) claims that xdata can be "An N-length sequence or an (k,N)-shaped array" > > IRL, I would not cast a array to a list to call optimize.curve_fit. > x = list(x) was only an easy way to write a short testcase based on the > exemple in the doc. > > Xavier > >> Hi, >> >> if you did x = list(x) it becomes a simple list, which cannot be >> multiplied by a number. >> don't do this. >> >> -- >> Oleksandr Huziy >> >> >> 2011/8/4 Xavier Gnata > > >> >> ? ? Hi, >> >> ? ? def func(x, a, b, c): >> ? ? ? ?return a*np.exp(-b*x) + c >> ? ? x = np.linspace(0,4,50) >> ? ? y = func(x, 2.5, 1.3, 0.5) >> ? ? yn = y + 0.2*np.random.normal(size=len(x)) >> ? ? popt, pcov = curve_fit(func,x, yn) >> >> ? ? works vey well but if you change it to: >> >> ? ? def func(x, a, b, c): >> ? ? ? ?return a*np.exp(-b*x) + c >> ? ? x = list(np.linspace(0,4,50)) >> ? ? y = func(x, 2.5, 1.3, 0.5) >> ? ? yn = y + 0.2*np.random.normal(size=len(x)) >> ? ? popt, pcov = curve_fit(func, x, yn) >> >> ? ? then x is a list and we get this error: >> ? ? "TypeError: can't multiply sequence by non-int of type 'float'" >> >> ? ? However, according to the documentation, xdata : An N-length sequence >> ? ? or an (k,N)-shaped array. I understand this statement as : "either a >> ? ? list, a tuple or an array". Should optimize.curve_fit internally cast >> ? ? xdata to an array? I would think so. The docstring might be a bit misleading. x (xdata) is just handed through curve_fit and leastsq to your function and could be anything. The interpretation as xdata is just for the specific usecase y=f(x)+noise, but nothing in the implementation requires directly anything about x. (The only requirement is that f(x) returns a (N,) array for an x.) So, I don't think curve_fit should do any changes to x, it's an arg for the user function that the user should take care of, and a user can exploit it's flexibility. Josef >> >> >> >> ? ? 
Xavier >> ? ? _______________________________________________ >> ? ? SciPy-User mailing list >> ? ? SciPy-User at scipy.org >> ? ? http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Fri Aug 5 04:53:29 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 5 Aug 2011 04:53:29 -0400 Subject: [SciPy-User] bug in optimize.curve_fit ? In-Reply-To: References: <4E3B9BC4.4000404@gmail.com> Message-ID: On Fri, Aug 5, 2011 at 4:39 AM, wrote: > On Fri, Aug 5, 2011 at 3:29 AM, Xavier Gnata wrote: >> Hi, >> >> Yes but the doc ( >> http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html >> ) claims that xdata can be "An N-length sequence or an (k,N)-shaped array" >> >> IRL, I would not cast a array to a list to call optimize.curve_fit. >> x = list(x) was only an easy way to write a short testcase based on the >> exemple in the doc. >> >> Xavier >> >>> Hi, >>> >>> if you did x = list(x) it becomes a simple list, which cannot be >>> multiplied by a number. >>> don't do this. >>> >>> -- >>> Oleksandr Huziy >>> >>> >>> 2011/8/4 Xavier Gnata >> > >>> >>> ? ? Hi, >>> >>> ? ? def func(x, a, b, c): >>> ? ? ? ?return a*np.exp(-b*x) + c >>> ? ? x = np.linspace(0,4,50) >>> ? ? y = func(x, 2.5, 1.3, 0.5) >>> ? ? yn = y + 0.2*np.random.normal(size=len(x)) >>> ? ? popt, pcov = curve_fit(func,x, yn) >>> >>> ? ? works vey well but if you change it to: >>> >>> ? ? def func(x, a, b, c): >>> ? ? ? ?return a*np.exp(-b*x) + c >>> ? ? x = list(np.linspace(0,4,50)) >>> ? ? y = func(x, 2.5, 1.3, 0.5) >>> ? ? yn = y + 0.2*np.random.normal(size=len(x)) >>> ? ? popt, pcov = curve_fit(func, x, yn) >>> >>> ? ? then x is a list and we get this error: >>> ? ? "TypeError: can't multiply sequence by non-int of type 'float'" >>> >>> ? ? However, according to the documentation, xdata : An N-length sequence >>> ? ? or an (k,N)-shaped array. I understand this statement as : "either a >>> ? ? list, a tuple or an array". Should optimize.curve_fit internally cast >>> ? ? xdata to an array? I would think so. > > The docstring might be a bit misleading. x (xdata) is just handed > through curve_fit and leastsq to your function and could be anything. > The interpretation as xdata is just for the specific usecase > y=f(x)+noise, but nothing in the implementation requires directly > anything about x. (The only requirement is that f(x) returns a (N,) > array for an x.) for example: import numpy as np from scipy.optimize import curve_fit def func(x, a, b, c): #print b, x, type(x) return a*np.exp(-b*x.x) + c x0 = list(np.linspace(0,4,50)) class Dummy(object): def __init__(self, x): self.x = np.asarray(x) xd = Dummy(x0) y = func(xd, 2.5, 1.3, 0.5) yn = y + 0.2*np.random.normal(size=len(x0)) popt, pcov = curve_fit(func, xd, yn) print popt Josef > > So, I don't think curve_fit should do any changes to x, it's an arg > for the user function that the user should take care of, and a user > can exploit it's flexibility. > > Josef > > >>> >>> >>> >>> ? ? Xavier >>> ? ? _______________________________________________ >>> ? ? SciPy-User mailing list >>> ? ? SciPy-User at scipy.org >>> ? ? 
http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > From pav at iki.fi Fri Aug 5 04:54:08 2011 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 5 Aug 2011 08:54:08 +0000 (UTC) Subject: [SciPy-User] bug in optimize.curve_fit ? References: Message-ID: Thu, 04 Aug 2011 16:17:07 +0200, Xavier Gnata wrote: [clip] > def func(x, a, b, c): > return a*np.exp(-b*x) + c > x = list(np.linspace(0,4,50)) > y = func(x, 2.5, 1.3, 0.5) Your program fails already here -- before curve_fit. From josef.pktd at gmail.com Fri Aug 5 05:14:13 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 5 Aug 2011 05:14:13 -0400 Subject: [SciPy-User] bug in optimize.curve_fit ? In-Reply-To: References: Message-ID: On Fri, Aug 5, 2011 at 4:54 AM, Pauli Virtanen wrote: > Thu, 04 Aug 2011 16:17:07 +0200, Xavier Gnata wrote: > [clip] >> def func(x, a, b, c): >> ? ? return a*np.exp(-b*x) + c >> x = list(np.linspace(0,4,50)) >> y = func(x, 2.5, 1.3, 0.5) > > Your program fails already here -- before curve_fit. Good catch, I had spent 10 minutes looking for the bug in my version without ever checking the line number of the exception. Josef > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From R.Springuel at umit.maine.edu Sat Aug 6 18:25:22 2011 From: R.Springuel at umit.maine.edu (R. Padraic Springuel) Date: Sat, 06 Aug 2011 18:25:22 -0400 Subject: [SciPy-User] numpy, scipy, and python 3 In-Reply-To: References: Message-ID: <4E3DBF52.5040708@umit.maine.edu> Well, I've successfully built both numpy and scipy for python 3.2. I've also run the nose tests and only come up with one failed test, but it's the same test that fails on python 2.7 for me, and doesn't appear to be on a function that I've ever used. For those interested, here's the output on the failed test: > FAIL: test_expon (test_morestats.TestAnderson) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/stats/tests/test_morestats.py", line 72, in test_expon > assert_array_less(crit[:-1], A) > File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/testing/utils.py", line 869, in assert_array_less > header='Arrays are not less-ordered') > File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/testing/utils.py", line 613, in assert_array_compare > chk_same_position(x_id, y_id, hasval='inf') > File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/testing/utils.py", line 588, in chk_same_position > raise AssertionError(msg) > AssertionError: > Arrays are not less-ordered > > x and y inf location mismatch: > x: array([ 0.911, 1.065, 1.325, 1.587]) > y: array(inf) > > ---------------------------------------------------------------------- The results are identical for python 3.2 except "2.7" is replaced by "3.2" everywhere that it occurs. -- R. 
Padraic Springuel, PhD From wesmckinn at gmail.com Sun Aug 7 16:26:28 2011 From: wesmckinn at gmail.com (Wes McKinney) Date: Sun, 7 Aug 2011 16:26:28 -0400 Subject: [SciPy-User] Status of TimeSeries SciKit In-Reply-To: References: <08B2C8B1-DD0B-4D02-82F0-4CBCD304AA31@bilokon.co.uk> <7B9AF0B6-8015-4736-AE31-53725695DE40@gmail.com> <7B0D4803-D6E3-4451-B60E-957966CCC73D@gmail.com> <20110726222843.GB8920@phare.normalesup.org> <20110727141251.GB30024@phare.normalesup.org> Message-ID: On Sat, Jul 30, 2011 at 7:40 AM, Tim Michelsen wrote: >>> Since most of my code for meteorological data evaluations is based on >>> it, I would be happy to receive infomation on the conclusion and how I >>> need to adjust my code to upkeep with new developments. >> >> When it gets to that point I'd be happy to help (including looking at >> some of your existing code and data). Sorry I've been out of commission for the last week or so. > In short my process goes like: > * QC of incoming measurements data > * visualisation and statistics (basics, disribution analysis) > * reporting > * back & forcasting with other (modeled) data > * preparation of result data sets > > When it comes to QC I would need: > * check on missing dates (i.e. failure of aquisitition equipment) > * check on double dates (= failure of data logger) > * data integrity and plausability tests with certain filters/flags > > All these need to be reported on: > * data recovery > * invalid data by filter/flag type > > So far, I have been using the masked arrays. Mainly because it is heaily > ?used in the time series scikit and transfering masks from on array to > another is quite once you learned the basics. > > Would you work these items out in pandas, as well? I would need to look at code and see the concrete use cases. As with anything else, you adapt solutions to your problems based on your available tools. > P.S. Your presentation "Time series analysis in Python with statsmodels" > is really cool and has shown me good aspects about the HP filters > Thanks...still lots to do on the TSA front. The filtering work has all been Skipper's. > Regards, > Timmie > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From wesmckinn at gmail.com Sun Aug 7 16:37:05 2011 From: wesmckinn at gmail.com (Wes McKinney) Date: Sun, 7 Aug 2011 16:37:05 -0400 Subject: [SciPy-User] Status of TimeSeries SciKit In-Reply-To: References: <08B2C8B1-DD0B-4D02-82F0-4CBCD304AA31@bilokon.co.uk> <7B9AF0B6-8015-4736-AE31-53725695DE40@gmail.com> <7B0D4803-D6E3-4451-B60E-957966CCC73D@gmail.com> <20110726222843.GB8920@phare.normalesup.org> <20110727141251.GB30024@phare.normalesup.org> Message-ID: On Tue, Aug 2, 2011 at 3:37 AM, Tim Michelsen wrote: >> >> I agree. I already have 50% or more of the features in >> >> scikits.timeseries, so this gets back to my fragmentation argument >> >> (users being stuck with a confusing choice between multiple >> >> libraries). Let's make it happen! >> > So what needs to be done to move things forward? >> > Do we need to draw up a roadmap? >> > A table with functions that respond to common use cases in natual >> > science, computing, and economics? >> Having a place to collect concrete use cases (like your list from the >> prior e-mail, but with illustrative code snippets) would be good. 
>> You're welcome to start doing it here: >> >> https://github.com/wesm/pandas/wiki > Here goes: > https://github.com/wesm/pandas/wiki/Time-Series-Manipulation > > I will fill it with my stuff. > Shall we file feature request directly as issues? Cool, I will start adding things when I have some time. Feel free to file features requests as issues tagged with "Enhancement". >> A good place to start, which I can do when I have some time, would be >> to start moving the scikits.timeseries code into pandas. There are >> several key components >> >> - Date and DateArray stuff, frequency implementations >> - masked array time series implementations (record array and not) >> - plotting >> - reporting, moving window functions, etc. >> >> We need to evaluate Date/DateArray as they relate to numpy.datetime64 >> and see what can be done. I haven't looked closely but I'm not sure if >> all the convenient attribute access stuff (day, month, day_of_week, >> weekday, etc.) is available in NumPy yet. I suspect it would be >> reasonably straightforward to wrap DateArray so it can be an Index for >> a pandas object. >> >> I won't have much time for this until mid-August, but a couple days' >> hacking should get most of the pieces into place. I guess we can just >> keep around the masked array classes for legacy API support and for >> feature completeness. > I value very much the work of Pierre and Matt. > But my difficulti with the scikit was that the code is too complex. So I was > only able to contribute helper functions for doc fixes. > Please, lets make it happen that this effort is not a on or 3 man show but > results in something whcih can be maintained by the whole community. Yes, I agree. I am painfully aware of being one of the only people consistently working on the data structure front (judging from commit activity at least) but I would like to get more people involved. I'm hopeful that increasing awareness to what we're working on (e.g. I've started blogging about pandas and related things) will draw new people into the projects. > Nevertheless, the timeseries scikit made my work more comfortable and > understadable than I was able to manage with R. > > Regards, > Timmie > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From chris at simplistix.co.uk Sun Aug 7 17:22:19 2011 From: chris at simplistix.co.uk (Chris Withers) Date: Sun, 07 Aug 2011 22:22:19 +0100 Subject: [SciPy-User] getting started with arrays and matplotlib Message-ID: <4E3F020B.1000500@simplistix.co.uk> Hi All, I'm a new user returning to SciPy after quite a long break, so, a few high-level questions first: - Are there any good books or other narrative docs that cover the bulk of the core numpy stuff - Ditto, but for visualisation, particularly with matplotlib or the Enthought visualisation suites. I'm particularly interested in step-by-step docs/books with lots of examples, versus reference docs that basically need the user to know what they're looking for in a chicken and egg fashion, which was my previous experience of scipy docs... Now, the specific problem I'm looking to solve it a stacked bar chart of ticket sales for an event over time. The data I have is basically a log file of ticket sales. 
I was looking to build a 4-dimensonal array as follows, with each cell representing ticket sales for that week at that venue at that event: event: 2011 venue t-3 week t-2 week t-1 week v1 10 20 30 v2 15 30 45 event: 2010 venue t-3 week t-2 week t-1 week v1 1 2 3 v2 15 30 45 ...etc... Now, first question: what's the best way to build this array given that I may only see the arrival of a new venue a fair way through building the data structure? How can I efficiently say "please add a new row to my array", I don't know what the 4th dimension equivalent is ;-) Secondly, once I've populated this, any good examples of how to turn it into a bar chart? (the simple bar chart would be number of sales on the y-axis, weeks before the event on the x-axis, however, what I'd then like to do is split each bar into chunks for each venue's sales, if that makes sense?) Any help gratefully received! cheers, Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk From wardefar at iro.umontreal.ca Mon Aug 8 01:29:44 2011 From: wardefar at iro.umontreal.ca (David Warde-Farley) Date: Mon, 8 Aug 2011 01:29:44 -0400 Subject: [SciPy-User] getting started with arrays and matplotlib In-Reply-To: <4E3F020B.1000500@simplistix.co.uk> References: <4E3F020B.1000500@simplistix.co.uk> Message-ID: On 2011-08-07, at 5:22 PM, Chris Withers wrote: > Now, first question: what's the best way to build this array given that > I may only see the arrival of a new venue a fair way through building > the data structure? How can I efficiently say "please add a new row to > my array", I don't know what the 4th dimension equivalent is ;-) It may be worth thinking about whether an ndarray is necessarily the right way to solve this problem. For one thing, you can't append to ndarrays easily. HDF5 tables (via PyTables, for example) are more forgiving in this respect and play nice with NumPy, but there are certainly other options. > Secondly, once I've populated this, any good examples of how to turn it > into a bar chart? (the simple bar chart would be number of sales on the > y-axis, weeks before the event on the x-axis, however, what I'd then > like to do is split each bar into chunks for each venue's sales, if that > makes sense?) This might give you an example of what you need: http://matplotlib.sourceforge.net/examples/pylab_examples/bar_stacked.html but you'd be better off asking on matplotlib-users. David From josef.pktd at gmail.com Mon Aug 8 03:42:55 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 8 Aug 2011 03:42:55 -0400 Subject: [SciPy-User] rejection sampling Message-ID: I got started a bit with rejection sampling. for example scipy.stats.rdist has a very slow random number generator. for shape parameters>=2 and not very large, rejection sampling works much faster. (for shape parameter <2, the pdf of rdist is unbound and rejection sampling against uniform doesn't work) attached is just the first version to show that it works. Does someone already have a more complete version that could be shared? Josef -------------- next part -------------- A non-text attachment was scrubbed... 
Name: try_sampling_reject.py Type: text/x-python Size: 1382 bytes Desc: not available URL: From rcarpenter at wdtinc.com Tue Aug 9 17:59:26 2011 From: rcarpenter at wdtinc.com (Richard Carpenter) Date: Tue, 9 Aug 2011 21:59:26 +0000 Subject: [SciPy-User] Building from source on RHEL5 Message-ID: <0C46B1FDF194D94CB9E329DFD315DB44252939CA@rain.wdtinc.com> I am able to install ATLAS and build its shared libraries. But I can't figure out how to build the BLAS and LAPACK shared libraries. Following the installation instructions and editing the LAPACK make.inc with -fPIC, it builds a .a file. Is that really a .so file? What about the BLAS library? Thanks in advance for the help. Richard Carpenter -------------- next part -------------- An HTML attachment was scrubbed... URL: From contact at graune.org Wed Aug 10 01:59:58 2011 From: contact at graune.org (Manuel Graune) Date: Wed, 10 Aug 2011 07:59:58 +0200 Subject: [SciPy-User] calculate definite integral of sampled data Message-ID: <20110810055958.GH2924@uriel> Hi everyone, to calculate the definite integral of a function or an array of sampled data scipy provides (among others) the quad and trapz functions. So it is possible to compute e. g. the definite integral of cos(t) over some interval by doing definite_integral= scipy.integrate.quad(cos,lower_limit,upper_limit) or definite_integral= scipy.integrate.trapz(some_array). Now, if I want to plot cos(t) and the integral of cos(t) from 0 to t in a graph, the necessary array can be calculated by: @numpy.vectorize def intfunc(fnc,upper_limit): return scipy.integrate.quad(fnc,0.0,upper_limit) definite_inegral= intfunc(cos,t) which seems (whithout knowing the actual code) a bit wasteful and slow but is relatively concise. Now for my question: scipy provides e. g. the trapz-function to calculate definite integral of a complete array of sampled data. However, I have no idea how to get achieve the same as above for sampled data (apart from manually iterating in a for-loop). Is there a function somewhere which delivers an array of the definite integrals for each of the data-points in an array? Regards, Manuel -- A hundred men did the rational thing. The sum of those rational choices was called panic. Neal Stephenson -- System of the world http://www.graune.org/GnuPG_pubkey.asc Key fingerprint = 1E44 9CBD DEE4 9E07 5E0A 5828 5476 7E92 2DB4 3C99 -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available URL: From gustavo.goretkin at gmail.com Wed Aug 10 02:46:19 2011 From: gustavo.goretkin at gmail.com (Gustavo Goretkin) Date: Wed, 10 Aug 2011 02:46:19 -0400 Subject: [SciPy-User] calculate definite integral of sampled data In-Reply-To: <20110810055958.GH2924@uriel> References: <20110810055958.GH2924@uriel> Message-ID: You could try using the numpy.cumsum (standing for cumulative sum) function to accomplish this. This would give you the equivalent of a Riemann sum (the sum is approximated with rectangles, specifically I think this would be considered the midpoint Riemann sum). You should be able the accomplish the trapezoidal rule by first averaging consecutive samples and then applying the Riemann sum. Here's an example In [2]: sample_points = np.linspace(0,10,1000) In [3]: y = np.cos(sample_points) In [4]: y_midpoint = np.cumsum(y) In [5]: y_smooth = ( y[0:-1] + y[1:] ) * (.5) In [6]: y_trapezoidal = np.cumsum(y_smooth) Note that after trapezoidal integration, the array length is one fewer. 
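A rough sketch of that last step, assuming evenly spaced sample points: to get values that approximate the integral itself, the cumulative sums still have to be scaled by the sample spacing.

dx = sample_points[1] - sample_points[0]          # uniform spacing assumed
integral = np.concatenate(([0.0], dx * y_trapezoidal))
# integral[i] approximates the integral of cos from 0 to sample_points[i],
# so for this example it should closely follow np.sin(sample_points)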
In my opinion, the more elegant way to do the smoothing step is with the numpy.convolution operator. In this same way, you should be able to implement other equally-spaced quadrature rules like Simpson's rule, but I may be incorrect. Gustavo On Wed, Aug 10, 2011 at 1:59 AM, Manuel Graune wrote: > Hi everyone, > > to calculate the definite integral of a function or an array of sampled > data scipy provides (among others) the quad and trapz functions. > So it is possible to compute e. g. the definite integral of cos(t) over > some interval by doing > > definite_integral= scipy.integrate.quad(cos,lower_limit,upper_limit) > > or > > definite_integral= scipy.integrate.trapz(some_array). > > Now, if I want to plot cos(t) and the integral of cos(t) from 0 to t in > a graph, the necessary array can be calculated by: > > @numpy.vectorize > def intfunc(fnc,upper_limit): > return scipy.integrate.quad(fnc,0.0,upper_limit) > > definite_inegral= intfunc(cos,t) > > which seems (whithout knowing the actual code) a bit wasteful and slow > but is relatively concise. > > Now for my question: scipy provides e. g. the trapz-function to > calculate definite integral of a complete array of sampled data. > However, I have no idea how to get achieve the same as above for > sampled data (apart from manually iterating in a for-loop). Is there > a function somewhere which delivers an array of the definite integrals > for each of the data-points in an array? > > > Regards, > > Manuel > > -- > A hundred men did the rational thing. The sum of those rational choices was > called panic. Neal Stephenson -- System of the world > http://www.graune.org/GnuPG_pubkey.asc > Key fingerprint = 1E44 9CBD DEE4 9E07 5E0A 5828 5476 7E92 2DB4 3C99 > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cimrman3 at ntc.zcu.cz Wed Aug 10 08:55:50 2011 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Wed, 10 Aug 2011 14:55:50 +0200 Subject: [SciPy-User] ANN: SfePy 2011.3 Message-ID: <4E427FD6.5000707@ntc.zcu.cz> I am pleased to announce release 2011.3 of SfePy. Description ----------- SfePy (simple finite elements in Python) is a software for solving systems of coupled partial differential equations by the finite element method. The code is based on NumPy and SciPy packages. It is distributed under the new BSD license. Home page: http://sfepy.org Mailing lists, issue tracking: http://code.google.com/p/sfepy/ Git (source) repository: http://github.com/sfepy Documentation: http://docs.sfepy.org/doc Highlights of this release -------------------------- - major update of terms aiming at easier usage and definition while retaining original C functions - overriding problem description items on command line - improved developer guide - Primer tutorial - a step-by-step walk-through of the process to solve a simple mechanics problem For more information on this release, see http://sfepy.googlecode.com/svn/web/releases/2011.3_RELEASE_NOTES.txt (full release notes, rather long and technical). Best regards, Robert Cimrman and Contributors (*) (*) Contributors to this release (alphabetical order): Vladim?r Luke?, Maty?? 
Nov?k, Andre Smit From rajs2010 at gmail.com Wed Aug 10 09:18:28 2011 From: rajs2010 at gmail.com (Rajeev Singh) Date: Wed, 10 Aug 2011 18:48:28 +0530 Subject: [SciPy-User] Speeding up Python Again Message-ID: Hi, I was trying out the codes discussed at http://technicaldiscovery.blogspot.com/2011/07/speeding-up-python-again.html Here is a summary of my results - Computer: Desktop imsc9 aravali annapurna NumPy: 7.651419 4.219105 5.576453 4.858640 Cython: 4.259419 3.477259 3.204909 2.357819 Weave: 4.302778 * 3.298551 2.400000 Looped Fortran: 4.199148 3.414484 3.202963 2.315644 Vectorized Fortran: 3.118410 2.131966 1.512303 1.460251 pure fortran update1: 1.205727 1.964857 2.034688 1.336086 pure fortran update2: 0.600848 0.604649 0.573593 0.721339 imsc9, aravali and annapurna are HPC machines at my institute * for some reason Weave didn't compile on imsc9 Indeed there is about a factor of 7 to 12 difference between pure fortran with update2 (vectorized) and the numpy version. I should mention that I changed N to 150 in laplace_for.f90 Rajeev -------------- next part -------------- An HTML attachment was scrubbed... URL: From davclark at gmail.com Tue Aug 9 16:46:35 2011 From: davclark at gmail.com (Dav Clark) Date: Tue, 9 Aug 2011 13:46:35 -0700 Subject: [SciPy-User] getting started with arrays and matplotlib In-Reply-To: <4E3F020B.1000500@simplistix.co.uk> References: <4E3F020B.1000500@simplistix.co.uk> Message-ID: On Aug 7, 2011, at 2:22 PM, Chris Withers wrote: > Hi All, > > I'm a new user returning to SciPy after quite a long break, so, a few > high-level questions first: > > - Are there any good books or other narrative docs that cover the bulk > of the core numpy stuff > > - Ditto, but for visualisation, particularly with matplotlib or the > Enthought visualisation suites. Well, this is probably more basic than you want, but O'Reilly's "Data Analysis with Open Source Tools" is certainly a nice low-level intro for a beginner: http://oreilly.com/catalog/9780596802363 It's available on Safari Bookshelf, and also talks about using matplotlib (and R and GSL and ...). I'm unaware of any nice Chaco "narratives." > Now, first question: what's the best way to build this array given that > I may only see the arrival of a new venue a fair way through building > the data structure? How can I efficiently say "please add a new row to > my array", I don't know what the 4th dimension equivalent is ;-) You might consider doing what matlab does under the hood and just double the array when you run out of space. You can also keep a view around that restricts to just the portion of the data that's "real." > Secondly, once I've populated this, any good examples of how to turn it > into a bar chart? (the simple bar chart would be number of sales on the > y-axis, weeks before the event on the x-axis, however, what I'd then > like to do is split each bar into chunks for each venue's sales, if that > makes sense?) The book above would do a good job with this. Cheers, Dav From contact at graune.org Wed Aug 10 01:59:58 2011 From: contact at graune.org (Manuel Graune) Date: Wed, 10 Aug 2011 07:59:58 +0200 Subject: [SciPy-User] calculate definite integral of sampled data Message-ID: <20110810055958.GH2924@uriel> Hi everyone, to calculate the definite integral of a function or an array of sampled data scipy provides (among others) the quad and trapz functions. So it is possible to compute e. g. 
the definite integral of cos(t) over some interval by doing definite_integral= scipy.integrate.quad(cos,lower_limit,upper_limit) or definite_integral= scipy.integrate.trapz(some_array). Now, if I want to plot cos(t) and the integral of cos(t) from 0 to t in a graph, the necessary array can be calculated by: @numpy.vectorize def intfunc(fnc,upper_limit): return scipy.integrate.quad(fnc,0.0,upper_limit) definite_inegral= intfunc(cos,t) which seems (whithout knowing the actual code) a bit wasteful and slow but is relatively concise. Now for my question: scipy provides e. g. the trapz-function to calculate definite integral of a complete array of sampled data. However, I have no idea how to get achieve the same as above for sampled data (apart from manually iterating in a for-loop). Is there a function somewhere which delivers an array of the definite integrals for each of the data-points in an array? Regards, Manuel -- A hundred men did the rational thing. The sum of those rational choices was called panic. Neal Stephenson -- System of the world http://www.graune.org/GnuPG_pubkey.asc Key fingerprint = 1E44 9CBD DEE4 9E07 5E0A 5828 5476 7E92 2DB4 3C99 -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available URL: From jeffalstott at gmail.com Wed Aug 10 09:14:59 2011 From: jeffalstott at gmail.com (Jeff Alstott) Date: Wed, 10 Aug 2011 15:14:59 +0200 Subject: [SciPy-User] firwin behavior Message-ID: firwin is producing unreasonable filters for me, and I'm not sure if I'm misusing the code or if there is a bug. Like so: In [5]: from scipy.signal import firwin In [6]: ny = 500 In [7]: f21f80= firwin(21, [1/ny, 80/ny]); plot(f21f80); savefig('FIR21_filter80.png') Produces the attached file. In contrast, Matlab: Trial>> ny = 500 ny = 500 Trial>> [f20f80] = fir1(20, [1/ny, 80/ny]); figure; plot(f20f80) Produces the other attached file. Quite different! The filter produced by the scipy function, if used with lfilter (or if taken to Matlab to use as a filter), produces a nonsense filtering, with many high frequency artifacts. Any thoughts? This is in python3, if that matters. Thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: FIR21_filter80.png Type: image/png Size: 16858 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: matlab_FIR21_filter80.png Type: image/png Size: 8402 bytes Desc: not available URL: From warren.weckesser at enthought.com Wed Aug 10 10:39:11 2011 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Wed, 10 Aug 2011 09:39:11 -0500 Subject: [SciPy-User] firwin behavior In-Reply-To: References: Message-ID: On Wed, Aug 10, 2011 at 8:14 AM, Jeff Alstott wrote: > firwin is producing unreasonable filters for me, and I'm not sure if I'm > misusing the code or if there is a bug. Like so: > > In [5]: from scipy.signal import firwin > > In [6]: ny = 500 > > In [7]: f21f80= firwin(21, [1/ny, 80/ny]); plot(f21f80); > savefig('FIR21_filter80.png') > > Produces the attached file. > > In contrast, Matlab: > > Trial>> ny = 500 > > ny = > > ?? 500 > > Trial>> [f20f80] = fir1(20, [1/ny, 80/ny]); figure; plot(f20f80) > > Produces the other attached file. Quite different! 
The filter produced by > the scipy function, if used with lfilter (or if taken to Matlab to use as a > filter), produces a nonsense filtering, with many high frequency artifacts. > > Any thoughts? This is in python3, if that matters. By default, firwin creates a filter that passes DC (i.e. the zero frequency). To get a filter like the one produced by matlab, add the keyword argument pass_zero=False. Warren > > Thanks! > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From gokhansever at gmail.com Wed Aug 10 13:48:30 2011 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Wed, 10 Aug 2011 11:48:30 -0600 Subject: [SciPy-User] getting started with arrays and matplotlib In-Reply-To: <4E3F020B.1000500@simplistix.co.uk> References: <4E3F020B.1000500@simplistix.co.uk> Message-ID: On Sun, Aug 7, 2011 at 3:22 PM, Chris Withers wrote: > Hi All, > > I'm a new user returning to SciPy after quite a long break, so, a few > high-level questions first: > > - Are there any good books or other narrative docs that cover the bulk > of the core numpy stuff > > - Ditto, but for visualisation, particularly with matplotlib or the > Enthought visualisation suites. > > I'm particularly interested in step-by-step docs/books with lots of > examples, versus reference docs that basically need the user to know > what they're looking for in a chicken and egg fashion, which was my > previous experience of scipy docs... Somewhat an advanced data analysis book, particularly if you are interested in error analysis, and not so surprisingly powered by Python: A Student's Guide to Data and Error Analysis [http://www.cambridge.org/gb/knowledge/isbn/item5731787/] -- G?khan From paul.blelloch at ata-e.com Wed Aug 10 14:48:11 2011 From: paul.blelloch at ata-e.com (Paul Blelloch) Date: Wed, 10 Aug 2011 11:48:11 -0700 Subject: [SciPy-User] getting started with arrays and matplotlib In-Reply-To: Message-ID: <6a7c34c3f511f84f907b45e0a6dc6021@mail> I recently got Hans Petter Langtangen's "A Primer on Scientific Programming with Python." I thought that it was a good choice. It's more of a text book than a reference, but is well written. He has another book called "Python Scripting for Computational Science," which might also serve. -----Original Message----- From: scipy-user-bounces at scipy.org [mailto:scipy-user-bounces at scipy.org] On Behalf Of G?khan Sever Sent: Wednesday, August 10, 2011 10:49 AM To: SciPy Users List Subject: Re: [SciPy-User] getting started with arrays and matplotlib On Sun, Aug 7, 2011 at 3:22 PM, Chris Withers wrote: > Hi All, > > I'm a new user returning to SciPy after quite a long break, so, a few > high-level questions first: > > - Are there any good books or other narrative docs that cover the bulk > of the core numpy stuff > > - Ditto, but for visualisation, particularly with matplotlib or the > Enthought visualisation suites. > > I'm particularly interested in step-by-step docs/books with lots of > examples, versus reference docs that basically need the user to know > what they're looking for in a chicken and egg fashion, which was my > previous experience of scipy docs... 
Somewhat an advanced data analysis book, particularly if you are interested in error analysis, and not so surprisingly powered by Python: A Student's Guide to Data and Error Analysis [http://www.cambridge.org/gb/knowledge/isbn/item5731787/] -- G?khan _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user From gmane at blindgoat.org Wed Aug 10 15:59:23 2011 From: gmane at blindgoat.org (martin smith) Date: Wed, 10 Aug 2011 15:59:23 -0400 Subject: [SciPy-User] getting started with arrays and matplotlib In-Reply-To: <6a7c34c3f511f84f907b45e0a6dc6021@mail> References: <6a7c34c3f511f84f907b45e0a6dc6021@mail> Message-ID: On 8/10/2011 2:48 PM, Paul Blelloch wrote: > I recently got Hans Petter Langtangen's "A Primer on Scientific Programming with Python." I thought that it was a good choice. It's more of a text book than a reference, but is well written. He has another book called "Python Scripting for Computational Science," which might also serve. > I'd like to support the recommendation for Langtangen's book (I haven't seen the second one). I think it's an excellent combination of advanced script usage and scientific applications. - martin smith From wccarithers at lbl.gov Wed Aug 10 16:47:16 2011 From: wccarithers at lbl.gov (Bill Carithers) Date: Wed, 10 Aug 2011 13:47:16 -0700 Subject: [SciPy-User] Trouble installing scipy after upgrading to Mac OS X 10.7 aka Lion Message-ID: Hi all, When I upgraded to Lion, it wiped out my previous Python2.6 site-packages, including scipy-0.7.1. Now I?m trying to re-install the latest scipy from svn in the Python2.7 supplied from Apple. After reading the installation instructions, I followed the recommendation to use the fortran compiler (Gnu Fortran 4.2.4 for Lion) from http://r.research.att.com/tools/ which installed in Xcode 4.1. (Actually, I?m a little confused by this since, even though the installation said it succeeded, I couldn?t find it. Also when I did ?gfortran ?version? from the command line, it returned ?i686-apple-darwin11-gfortran-4.2.1: no input files?.) When I tried to build and install with ?sudo python setup.py install?, it was humming along until it got to ARPACK the exited with exit status 1. It looks to my untrained eye as if it failed when trying to compile scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c: The terminal output for this portion of the build is appended below. Another question... Mac OS X 10.7 has an Accelerator for vecLib that includes a built-in BLAS/ATLAS. Shouldn?t the build be using this? How do I tell it where to find it? Any ideas on how to fix this? Thanks. creating build/temp.macosx-10.7-intel-2.7/scipy/sparse/linalg/eigen creating build/temp.macosx-10.7-intel-2.7/scipy/sparse/linalg/eigen/arpack creating build/temp.macosx-10.7-intel-2.7/scipy/sparse/linalg/eigen/arpack/ARPACK creating build/temp.macosx-10.7-intel-2.7/scipy/sparse/linalg/eigen/arpack/ARPACK/FWR APPERS compile options: '-Iscipy/sparse/linalg/eigen/arpack/ARPACK/SRC -I/Library/Python/2.7/site-packages/numpy-1.6.1-py2.7-macosx-10.7-intel.egg/ numpy/core/include -c' llvm-gcc-4.2: scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:4: warning: type defaults to ?int? in declaration of ?complex? scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:4: error: expected ?;?, ?,? or ?)? before ?float? 
scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:10: warning: type defaults to ?int? in declaration of ?complex? scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:10: error: expected ?;?, ?,? or ?)? before ?float? scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:16: error: expected ?;?, ?,? or ?)? before ?*? token scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:21: error: expected ?;?, ?,? or ?)? before ?*? token scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:4: warning: type defaults to ?int? in declaration of ?complex? scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:4: error: expected ?;?, ?,? or ?)? before ?float? scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:10: warning: type defaults to ?int? in declaration of ?complex? scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:10: error: expected ?;?, ?,? or ?)? before ?float? scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:16: error: expected ?;?, ?,? or ?)? before ?*? token scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:21: error: expected ?;?, ?,? or ?)? before ?*? token lipo: can't open input file: /var/tmp//ccBUPGd0.out (No such file or directory) scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:4: warning: type defaults to ?int? in declaration of ?complex? scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:4: error: expected ?;?, ?,? or ?)? before ?float? scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:10: warning: type defaults to ?int? in declaration of ?complex? scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:10: error: expected ?;?, ?,? or ?)? before ?float? scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:16: error: expected ?;?, ?,? or ?)? before ?*? token scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:21: error: expected ?;?, ?,? or ?)? before ?*? token scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:4: warning: type defaults to ?int? in declaration of ?complex? scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:4: error: expected ?;?, ?,? or ?)? before ?float? scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:10: warning: type defaults to ?int? in declaration of ?complex? scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:10: error: expected ?;?, ?,? or ?)? before ?float? scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:16: error: expected ?;?, ?,? or ?)? before ?*? token scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c:21: error: expected ?;?, ?,? or ?)? before ?*? 
token lipo: can't open input file: /var/tmp//ccBUPGd0.out (No such file or directory) error: Command "llvm-gcc-4.2 -fno-strict-aliasing -fno-common -dynamic -g -Os -pipe -fno-common -fno-strict-aliasing -fwrapv -mno-fused-madd -DENABLE_DTRACE -DMACOSX -DNDEBUG -Wall -Wstrict-prototypes -Wshorten-64-to-32 -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -DENABLE_DTRACE -arch i386 -arch x86_64 -pipe -Iscipy/sparse/linalg/eigen/arpack/ARPACK/SRC -I/Library/Python/2.7/site-packages/numpy-1.6.1-py2.7-macosx-10.7-intel.egg/ numpy/core/include -c scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c -o build/temp.macosx-10.7-intel-2.7/scipy/sparse/linalg/eigen/arpack/ARPACK/FWR APPERS/veclib_cabi_c.o" failed with exit status 1 -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.felton at gmail.com Wed Aug 10 16:16:06 2011 From: chris.felton at gmail.com (Christopher Felton) Date: Wed, 10 Aug 2011 15:16:06 -0500 Subject: [SciPy-User] firwin behavior In-Reply-To: References: Message-ID: On 8/10/2011 8:14 AM, Jeff Alstott wrote: > firwin is producing unreasonable filters for me, and I'm not sure if I'm > misusing the code or if there is a bug. Like so: > > In [5]: from scipy.signal import firwin > > In [6]: ny = 500 > > In [7]: f21f80= firwin(21, [1/ny, 80/ny]); plot(f21f80); Is this a simple error, ny = 500 (integer) and 1/500 = 0, 80/500 = 0. Simply create a float ny = 500. , Note the "." then the divides will be floats. In Matlab everything is by default a double. Python not so. The version I am running encounters an error on the above, if I use floats or not, version 0.8.0. Regards, Chris > savefig('FIR21_filter80.png') > > Produces the attached file. > > In contrast, Matlab: > > Trial>> ny = 500 > > ny = > > 500 > > Trial>> [f20f80] = fir1(20, [1/ny, 80/ny]); figure; plot(f20f80) > > Produces the other attached file. Quite different! The filter produced by > the scipy function, if used with lfilter (or if taken to Matlab to use as a > filter), produces a nonsense filtering, with many high frequency artifacts. > > Any thoughts? This is in python3, if that matters. > > Thanks! > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From ralf.gommers at googlemail.com Wed Aug 10 16:55:24 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 10 Aug 2011 22:55:24 +0200 Subject: [SciPy-User] Trouble installing scipy after upgrading to Mac OS X 10.7 aka Lion In-Reply-To: References: Message-ID: On Wed, Aug 10, 2011 at 10:47 PM, Bill Carithers wrote: > Hi all, > > When I upgraded to Lion, it wiped out my previous Python2.6 site-packages, > including scipy-0.7.1. Now I?m trying to re-install the latest scipy from > svn in the Python2.7 supplied from Apple. After reading the installation > instructions, I followed the recommendation to use the fortran compiler (Gnu > Fortran 4.2.4 for Lion) from http://r.research.att.com/tools/ which > installed in Xcode 4.1. (Actually, I?m a little confused by this since, even > though the installation said it succeeded, I couldn?t find it. Also when I > did ?gfortran ?version? from the command line, it returned > ?i686-apple-darwin11-gfortran-4.2.1: no input files?.) > > Try "gfortran --version" with two dashes to get more sensible output. You have it installed. 
When I tried to build and install with ?sudo python setup.py install?, it > was humming along until it got to ARPACK the exited with exit status 1. It > looks to my untrained eye as if it failed when trying to compile > scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c: The > terminal output for this portion of the build is appended below. > > This should have been fixed by commit effa6f6 about two weeks ago, please check that your checkout is up-to-date. > Another question... Mac OS X 10.7 has an Accelerator for vecLib that > includes a built-in BLAS/ATLAS. Shouldn?t the build be using this? How do I > tell it where to find it? > The build does use this by default. It's called "Accelerate Framework". Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at enthought.com Wed Aug 10 17:02:55 2011 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Wed, 10 Aug 2011 16:02:55 -0500 Subject: [SciPy-User] firwin behavior In-Reply-To: References: Message-ID: On Wed, Aug 10, 2011 at 3:16 PM, Christopher Felton wrote: > On 8/10/2011 8:14 AM, Jeff Alstott wrote: >> firwin is producing unreasonable filters for me, and I'm not sure if I'm >> misusing the code or if there is a bug. Like so: >> >> In [5]: from scipy.signal import firwin >> >> In [6]: ny = 500 >> >> In [7]: f21f80= firwin(21, [1/ny, 80/ny]); plot(f21f80); > > > Is this a simple error, ny = 500 (integer) and 1/500 = 0, 80/500 = 0. Jeff said he is using Python 3, so the results of the divisions will be floats. Warren > > Simply create a float ny = 500. , Note the "." then the divides will be > floats. In Matlab everything is by default a double. ?Python not so. > > The version I am running encounters an error on the above, if I use > floats or not, version 0.8.0. > > Regards, > Chris > >> savefig('FIR21_filter80.png') >> >> Produces the attached file. >> >> In contrast, Matlab: >> >> Trial>> ?ny = 500 >> >> ny = >> >> ? ? 500 >> >> Trial>> ?[f20f80] = fir1(20, [1/ny, 80/ny]); figure; plot(f20f80) >> >> Produces the other attached file. Quite different! The filter produced by >> the scipy function, if used with lfilter (or if taken to Matlab to use as a >> filter), produces a nonsense filtering, with many high frequency artifacts. >> >> Any thoughts? This is in python3, if that matters. >> >> Thanks! >> >> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From wccarithers at lbl.gov Wed Aug 10 17:14:21 2011 From: wccarithers at lbl.gov (Bill Carithers) Date: Wed, 10 Aug 2011 14:14:21 -0700 Subject: [SciPy-User] Trouble installing scipy after upgrading to Mac OS X 10.7 aka Lion In-Reply-To: Message-ID: Hi Ralf, Thanks for the prompt reply. I just checked out from trunk today. The checkout version at the end said: Checked out external at revision 8716. Checked out revision 7183. Should I be using a branch to get the latest code that includes your fix ? Thanks, Bill On 8/10/11 1:55 PM, "Ralf Gommers" wrote: > > > On Wed, Aug 10, 2011 at 10:47 PM, Bill Carithers wrote: >> Hi all, >> >> When I upgraded to Lion, it wiped out my previous Python2.6 site-packages, >> including scipy-0.7.1. Now I?m trying to re-install the latest scipy from svn >> in the Python2.7 supplied from Apple. 
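Regarding the vecLib/Accelerate question answered above, one quick way to confirm which BLAS/LAPACK an installed NumPy or SciPy was actually built against is the auto-generated __config__ module. A minimal check, assuming a normal numpy.distutils-based install:

    import numpy
    import scipy

    numpy.__config__.show()   # on OS X this should mention the Accelerate/vecLib framework
    scipy.__config__.show()   # same information for the scipy build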
After reading the installation >> instructions, I followed the recommendation to use the fortran compiler (Gnu >> Fortran 4.2.4 for Lion) from http://r.research.att.com/tools/ which installed >> in Xcode 4.1. (Actually, I?m a little confused by this since, even though the >> installation said it succeeded, I couldn?t find it. Also when I did ?gfortran >> ?version? from the command line, it returned >> ?i686-apple-darwin11-gfortran-4.2.1: no input files?.) >> > Try "gfortran --version" with two dashes to get more sensible output. You have > it installed. > >> When I tried to build and install with ?sudo python setup.py install?, it was >> humming along until it got to ARPACK the exited with exit status 1. It looks >> to my untrained eye as if it failed when trying to compile >> scipy/sparse/linalg/eigen/arpack/ARPACK/FWRAPPERS/veclib_cabi_c.c: The >> terminal output for this portion of the build is appended below. >> > This should have been fixed by commit effa6f6 about two weeks ago, please > check that your checkout is up-to-date. > > ? >> Another question... Mac OS X 10.7 has an Accelerator for vecLib that includes >> a built-in BLAS/ATLAS. Shouldn?t the build be using this? How do I tell it >> where to find it? > > The build does use this by default. It's called "Accelerate Framework". > > Cheers, > Ralf > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Wed Aug 10 17:16:37 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 10 Aug 2011 23:16:37 +0200 Subject: [SciPy-User] Trouble installing scipy after upgrading to Mac OS X 10.7 aka Lion In-Reply-To: References: Message-ID: On Wed, Aug 10, 2011 at 11:14 PM, Bill Carithers wrote: > Hi Ralf, > > Thanks for the prompt reply. I just checked out from trunk today. The > checkout version at the end said: > Checked out external at revision 8716. > Checked out revision 7183. > > Ah, with "svn" you actually meant svn:) I thought that was supposed to not even work anymore. > Should I be using a branch to get the latest code that includes your fix ? > > You should be using git: https://github.com/scipy/scipy Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From aarchiba at physics.mcgill.ca Wed Aug 10 19:42:48 2011 From: aarchiba at physics.mcgill.ca (Anne Archibald) Date: Wed, 10 Aug 2011 19:42:48 -0400 Subject: [SciPy-User] calculate definite integral of sampled data In-Reply-To: <20110810055958.GH2924@uriel> References: <20110810055958.GH2924@uriel> Message-ID: I believe that scipy.integrate.cumtrapz exists to solve this problem. There might be a cumulative Simpson's rule too. Nobody put too much effort into this because integrating a sampled function is better divided into separate interpolation (e.g. with a spline) and integration (exact for spline interpolants). I'd approach your problem with splrep and splint. Anne On 8/10/11, Manuel Graune wrote: > Hi everyone, > > to calculate the definite integral of a function or an array of sampled > data scipy provides (among others) the quad and trapz functions. > So it is possible to compute e. g. the definite integral of cos(t) over > some interval by doing > > definite_integral= scipy.integrate.quad(cos,lower_limit,upper_limit) > > or > > definite_integral= scipy.integrate.trapz(some_array). 
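As a concrete sketch of the two suggestions in this thread (cumulative trapezoid rule versus spline interpolation followed by exact spline integration), with made-up sample data purely for illustration:

    import numpy as np
    from scipy import integrate, interpolate

    x = np.linspace(0, 10, 201)
    y = np.cos(x)

    # Cumulative trapezoid rule: one definite integral per sample point.
    # The result has length len(x) - 1; entry i is the integral from x[0] to x[i+1].
    Y_trapz = integrate.cumtrapz(y, x)

    # Spline route: fit once, then evaluate the exact integral of the spline
    # from x[0] up to each sample point.
    tck = interpolate.splrep(x, y, s=0)
    Y_spline = np.array([interpolate.splint(x[0], xi, tck) for xi in x])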
> > Now, if I want to plot cos(t) and the integral of cos(t) from 0 to t in > a graph, the necessary array can be calculated by: > > @numpy.vectorize > def intfunc(fnc,upper_limit): > return scipy.integrate.quad(fnc,0.0,upper_limit) > > definite_inegral= intfunc(cos,t) > > which seems (whithout knowing the actual code) a bit wasteful and slow > but is relatively concise. > > Now for my question: scipy provides e. g. the trapz-function to > calculate definite integral of a complete array of sampled data. > However, I have no idea how to get achieve the same as above for > sampled data (apart from manually iterating in a for-loop). Is there > a function somewhere which delivers an array of the definite integrals > for each of the data-points in an array? > > > Regards, > > Manuel > > -- > A hundred men did the rational thing. The sum of those rational choices was > called panic. Neal Stephenson -- System of the world > http://www.graune.org/GnuPG_pubkey.asc > Key fingerprint = 1E44 9CBD DEE4 9E07 5E0A 5828 5476 7E92 2DB4 3C99 > -- Sent from my mobile device From chris.felton at gmail.com Wed Aug 10 19:49:34 2011 From: chris.felton at gmail.com (Christopher Felton) Date: Wed, 10 Aug 2011 18:49:34 -0500 Subject: [SciPy-User] firwin behavior In-Reply-To: References: Message-ID: On 8/10/11 4:02 PM, Warren Weckesser wrote: > On Wed, Aug 10, 2011 at 3:16 PM, Christopher Felton > wrote: >> On 8/10/2011 8:14 AM, Jeff Alstott wrote: >>> firwin is producing unreasonable filters for me, and I'm not sure if I'm >>> misusing the code or if there is a bug. Like so: >>> >>> In [5]: from scipy.signal import firwin >>> >>> In [6]: ny = 500 >>> >>> In [7]: f21f80= firwin(21, [1/ny, 80/ny]); plot(f21f80); >> >> >> Is this a simple error, ny = 500 (integer) and 1/500 = 0, 80/500 = 0. > > > Jeff said he is using Python 3, so the results of the divisions will be floats. > > Warren Thanks for the correction Warren, I know very little about Python3. Is a float the default number type or is the result of the division a float? Thanks, Chris From warren.weckesser at enthought.com Wed Aug 10 19:56:53 2011 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Wed, 10 Aug 2011 18:56:53 -0500 Subject: [SciPy-User] firwin behavior In-Reply-To: References: Message-ID: On Wed, Aug 10, 2011 at 6:49 PM, Christopher Felton wrote: > On 8/10/11 4:02 PM, Warren Weckesser wrote: > > On Wed, Aug 10, 2011 at 3:16 PM, Christopher Felton > > wrote: > >> On 8/10/2011 8:14 AM, Jeff Alstott wrote: > >>> firwin is producing unreasonable filters for me, and I'm not sure if > I'm > >>> misusing the code or if there is a bug. Like so: > >>> > >>> In [5]: from scipy.signal import firwin > >>> > >>> In [6]: ny = 500 > >>> > >>> In [7]: f21f80= firwin(21, [1/ny, 80/ny]); plot(f21f80); > >> > >> > >> Is this a simple error, ny = 500 (integer) and 1/500 = 0, 80/500 = 0. > > > > > > Jeff said he is using Python 3, so the results of the divisions will be > floats. > > > > Warren > > Thanks for the correction Warren, > > I know very little about Python3. Is a float the default number type or > is the result of the division a float? > The result of division is a float. Take a look here: http://docs.python.org/release/3.0.1/whatsnew/3.0.html#integers and click on the "PEP 0238" link for all the details. 
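To make the division point concrete: on Python 2 the cutoffs above silently truncate to integer zero, while on Python 3 (or with the future import) they become the intended normalized frequencies. A small sketch, not tied to anyone's exact session:

    from __future__ import division  # no-op on Python 3; gives Python 2 the same behaviour

    ny = 500
    print(1 / ny, 80 / ny)    # 0.002 0.16 with true division (0 0 under plain Python 2)

    ny = 500.0                # forcing a float works the same way on either version
    print(1 / ny, 80 / ny)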
Warren > > Thanks, > Chris > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Aug 10 21:26:32 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 10 Aug 2011 19:26:32 -0600 Subject: [SciPy-User] calculate definite integral of sampled data In-Reply-To: <20110810055958.GH2924@uriel> References: <20110810055958.GH2924@uriel> Message-ID: On Tue, Aug 9, 2011 at 11:59 PM, Manuel Graune wrote: > Hi everyone, > > to calculate the definite integral of a function or an array of sampled > data scipy provides (among others) the quad and trapz functions. > So it is possible to compute e. g. the definite integral of cos(t) over > some interval by doing > > definite_integral= scipy.integrate.quad(cos,lower_limit,upper_limit) > > or > > definite_integral= scipy.integrate.trapz(some_array). > > Now, if I want to plot cos(t) and the integral of cos(t) from 0 to t in > a graph, the necessary array can be calculated by: > > @numpy.vectorize > def intfunc(fnc,upper_limit): > return scipy.integrate.quad(fnc,0.0,upper_limit) > > definite_inegral= intfunc(cos,t) > > which seems (whithout knowing the actual code) a bit wasteful and slow > but is relatively concise. > > Now for my question: scipy provides e. g. the trapz-function to > calculate definite integral of a complete array of sampled data. > However, I have no idea how to get achieve the same as above for > sampled data (apart from manually iterating in a for-loop). Is there > a function somewhere which delivers an array of the definite integrals > for each of the data-points in an array? > > > Regards, > > Manuel > > How many data points do you have? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From rashed.golam at gmail.com Wed Aug 10 10:26:19 2011 From: rashed.golam at gmail.com (Md. Golam Rashed) Date: Wed, 10 Aug 2011 07:26:19 -0700 (PDT) Subject: [SciPy-User] ANN: SfePy 2011.3 In-Reply-To: <4E427FD6.5000707@ntc.zcu.cz> References: <4E427FD6.5000707@ntc.zcu.cz> Message-ID: <29614221.73.1312986379136.JavaMail.geo-discussion-forums@yqfn40> GREAT! -------------- next part -------------- An HTML attachment was scrubbed... URL: From rashed.golam at gmail.com Wed Aug 10 12:19:08 2011 From: rashed.golam at gmail.com (Md. Golam Rashed) Date: Wed, 10 Aug 2011 09:19:08 -0700 (PDT) Subject: [SciPy-User] ANN: SfePy 2011.3 In-Reply-To: <4E427FD6.5000707@ntc.zcu.cz> References: <4E427FD6.5000707@ntc.zcu.cz> Message-ID: <5166843.67.1312993148868.JavaMail.geo-discussion-forums@prec11> 58 test file(s) executed in 561.13 s, 0 failure(s) of 88 test(s) tested on win7, Intel Atom dual core. simple installation on windows followed while installing sfepy. ** I'm busy with my MS, so being irregular sometime, but concentrate fully on sfepy when i'm free. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeffalstott at gmail.com Thu Aug 11 07:36:33 2011 From: jeffalstott at gmail.com (Jeff Alstott) Date: Thu, 11 Aug 2011 13:36:33 +0200 Subject: [SciPy-User] firwin behavior In-Reply-To: References: Message-ID: Wow. The passing of the DC frequency is exactly the issue, and that default behavior is clearly shown in the documentation. I see now that given a band, the default behavior is band-stop, whereas I would expect it to be band-pass. So, that fixed it. 
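For reference, a short sketch of the behaviour being discussed (the keyword involved, pass_zero, is named in the reply quoted further down): with a two-element cutoff list firwin defaults to a design that passes DC, i.e. a band-stop filter, and pass_zero=False requests the band-pass design that matlab's fir1 produces.

    import numpy as np
    from scipy.signal import firwin

    ny = 500.0
    cutoffs = [1 / ny, 80 / ny]

    h_bandstop = firwin(21, cutoffs)                   # default: DC is passed (band-stop)
    h_bandpass = firwin(21, cutoffs, pass_zero=False)  # band-pass, like fir1(20, ...)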
What I don't understand, however, is *why* that would be default behavior. More importantly, even if that is the default behavior, the name of the pass_zero flag does not readily help a dumb user like me grok the functionality. Has there been any thought to renaming it? Thanks! On Wed, Aug 10, 2011 at 4:39 PM, Warren Weckesser < warren.weckesser at enthought.com> wrote: > On Wed, Aug 10, 2011 at 8:14 AM, Jeff Alstott > wrote: > > firwin is producing unreasonable filters for me, and I'm not sure if I'm > > misusing the code or if there is a bug. Like so: > > > > In [5]: from scipy.signal import firwin > > > > In [6]: ny = 500 > > > > In [7]: f21f80= firwin(21, [1/ny, 80/ny]); plot(f21f80); > > savefig('FIR21_filter80.png') > > > > Produces the attached file. > > > > In contrast, Matlab: > > > > Trial>> ny = 500 > > > > ny = > > > > 500 > > > > Trial>> [f20f80] = fir1(20, [1/ny, 80/ny]); figure; plot(f20f80) > > > > Produces the other attached file. Quite different! The filter produced by > > the scipy function, if used with lfilter (or if taken to Matlab to use as > a > > filter), produces a nonsense filtering, with many high frequency > artifacts. > > > > Any thoughts? This is in python3, if that matters. > > > By default, firwin creates a filter that passes DC (i.e. the zero > frequency). To get a filter like the one produced by matlab, add the > keyword argument pass_zero=False. > > Warren > > > > > > Thanks! > > > > > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guyer at nist.gov Thu Aug 11 14:19:37 2011 From: guyer at nist.gov (Jonathan Guyer) Date: Thu, 11 Aug 2011 14:19:37 -0400 Subject: [SciPy-User] Trouble installing scipy after upgrading to Mac OS X 10.7 aka Lion In-Reply-To: References: Message-ID: On Aug 10, 2011, at 5:16 PM, Ralf Gommers wrote: > Ah, with "svn" you actually meant svn:) I thought that was supposed to not even work anymore. It does work and it's confusing. I had not been following the transition closely and so was under the impression that the svn repository was being mirrored from git. It's not. It's just old. From yyc at solvcon.net Thu Aug 11 23:44:48 2011 From: yyc at solvcon.net (Yung-Yu Chen) Date: Thu, 11 Aug 2011 23:44:48 -0400 Subject: [SciPy-User] ANN: SOLVCON 0.1 Message-ID: Hello, I am pleased to announce version 0.1 of SOLVCON. SOLVCON is a Python-based, multi-physics software framework for solving first-order hyperbolic PDEs. The source tarball can be downloaded at http://bitbucket.org/yungyuc/solvcon/downloads . More information can be found at http://solvcon.net/ . This release marks a milestone of SOLVCON. Future development of SOLVCON will focus on production use. The planned directions include (i) the high-order CESE method, (ii) improving the scalability by consolidating the distributed-memory parallel code, (iii) expanding the capabilities of the existing solver kernels, and (iv) incorporating more physical processes. New features: - Glue BCs are added. A pair of collocated BCs can now be glued together to work as an internal interface. The glued BCs helps to dynamically turn on or off the BC pair. 
- ``solvcon.kerpak.cuse`` series solver kernels are changed to use OpenMP for multi-threaded computing. They were using a thread pool built-in SOLVCON for multi-threading. OpenMP makes multi-threaded functions more flexible in argument specification. - Add the ``soil/`` directory for providing building helpers for GCC 4.6.1. Note, the name ``gcc/`` is deliberately avoided for the directory, because of a bug in gcc itself (bug id 48306 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48306 ). - Add ``-j`` command line option for building dependencies in the ``ground/`` directory and the ``soil/`` directory. Note that ATLAS doesn't work with ``make -j N``. Bug-fix: - METIS changes its download URL. Modify SConstruct accordingly. -- Yung-Yu Chen http://solvcon.net/yyc/ +1 (614) 859 2436 From chris at simplistix.co.uk Fri Aug 12 01:38:15 2011 From: chris at simplistix.co.uk (Chris Withers) Date: Fri, 12 Aug 2011 06:38:15 +0100 Subject: [SciPy-User] getting started with arrays and matplotlib In-Reply-To: References: <4E3F020B.1000500@simplistix.co.uk> Message-ID: <4E44BC47.7060400@simplistix.co.uk> On 09/08/2011 21:46, Dav Clark wrote: > Well, this is probably more basic than you want, but O'Reilly's "Data Analysis with Open Source Tools" is certainly a nice low-level intro for a beginner: > > http://oreilly.com/catalog/9780596802363 > > It's available on Safari Bookshelf, and also talks about using matplotlib (and R and GSL and ...). I'm unaware of any nice Chaco "narratives." Thanks to you and everyone else for the great suggestions :-) >> Now, first question: what's the best way to build this array given that >> I may only see the arrival of a new venue a fair way through building >> the data structure? How can I efficiently say "please add a new row to >> my array", I don't know what the 4th dimension equivalent is ;-) > > You might consider doing what matlab does under the hood and just double the array when you run out of space. What's the best way to do this? cheers, Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk From klonuo at gmail.com Fri Aug 12 01:49:57 2011 From: klonuo at gmail.com (Klonuo Umom) Date: Fri, 12 Aug 2011 07:49:57 +0200 Subject: [SciPy-User] ANN: SOLVCON 0.1 In-Reply-To: References: Message-ID: Interesting package. Congratulations on making milestone release I installed it and on a first look I can't see it workflow. On web portal I found tip to follow examples that come with this package, but those aren't trivial at all; I mean lot of classes and functions comes from nowhare and it's like no walkthrough provided If I may explain myself, I got summer seminar assigment, starting from shallow water eqs to derive eq of absolute vorticity for nondivergent flow in linearized form using perturbation method. I've done it by hand, but would like to understand the process with some of Python packages if feasible, and as staring eqs are hyperbolic PDEs maybe I could use this package although it uses different method for solving. Is it good idea to try to use this package, and if answer is yes, can you maybe provide some starting point for this simple task? Thanks On Fri, Aug 12, 2011 at 5:44 AM, Yung-Yu Chen wrote: > Hello, > > I am pleased to announce version 0.1 of SOLVCON. ?SOLVCON is a Python-based, > multi-physics software framework for solving first-order hyperbolic PDEs. > > The source tarball can be downloaded at > http://bitbucket.org/yungyuc/solvcon/downloads . 
?More information can be > found at http://solvcon.net/ . > > This release marks a milestone of SOLVCON. ?Future development of SOLVCON will > focus on production use. ?The planned directions include (i) the high-order > CESE method, (ii) improving the scalability by consolidating the > distributed-memory parallel code, (iii) expanding the capabilities of the > existing solver kernels, and (iv) incorporating more physical processes. > > New features: > > - Glue BCs are added. ?A pair of collocated BCs can now be glued together to > ?work as an internal interface. ?The glued BCs helps to dynamically turn on or > ?off the BC pair. > - ``solvcon.kerpak.cuse`` series solver kernels are changed to use OpenMP for > ?multi-threaded computing. ?They were using a thread pool built-in SOLVCON for > ?multi-threading. ?OpenMP makes multi-threaded functions more flexible in > ?argument specification. > - Add the ``soil/`` directory for providing building helpers for GCC 4.6.1. > ?Note, the name ``gcc/`` is deliberately avoided for the directory, because of > ?a bug in gcc itself (bug id 48306 > ?http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48306 ). > - Add ``-j`` command line option for building dependencies in the ``ground/`` > ?directory and the ``soil/`` directory. ?Note that ATLAS doesn't work with > ?``make -j N``. > > Bug-fix: > > - METIS changes its download URL. ?Modify SConstruct accordingly. > > -- > Yung-Yu Chen > http://solvcon.net/yyc/ > +1 (614) 859 2436 > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From jrocher at enthought.com Fri Aug 12 03:00:41 2011 From: jrocher at enthought.com (Jonathan Rocher) Date: Fri, 12 Aug 2011 09:00:41 +0200 Subject: [SciPy-User] getting started with arrays and matplotlib In-Reply-To: <4E44BC47.7060400@simplistix.co.uk> References: <4E3F020B.1000500@simplistix.co.uk> <4E44BC47.7060400@simplistix.co.uk> Message-ID: Dear Chris, for documentation about the Enthought Tools Suite (open source, BSD-like licence), let me point you to http://code.enthought.com/ Specifically about 2D visualization, Chaco is good at building interactive plotting tools efficiently even with large datasets. Its entire documentation can be found at http://github.enthought.com/chaco/ The documentation is not perfect but you can definitely find lots of examples to follow in the Tutorialssection as well as in the gallery . Hope this helps. Jonathan On Fri, Aug 12, 2011 at 7:38 AM, Chris Withers wrote: > On 09/08/2011 21:46, Dav Clark wrote: > > Well, this is probably more basic than you want, but O'Reilly's "Data > Analysis with Open Source Tools" is certainly a nice low-level intro for a > beginner: > > > > http://oreilly.com/catalog/9780596802363 > > > > It's available on Safari Bookshelf, and also talks about using matplotlib > (and R and GSL and ...). I'm unaware of any nice Chaco "narratives." > > Thanks to you and everyone else for the great suggestions :-) > > >> Now, first question: what's the best way to build this array given that > >> I may only see the arrival of a new venue a fair way through building > >> the data structure? How can I efficiently say "please add a new row to > >> my array", I don't know what the 4th dimension equivalent is ;-) > > > > You might consider doing what matlab does under the hood and just double > the array when you run out of space. > > What's the best way to do this? 
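On the "double the array when you run out of space" suggestion quoted above, a bare-bones sketch; the class and names are invented purely for illustration:

    import numpy as np

    class GrowableRows(object):
        """Collect rows into a 2-D array, doubling capacity whenever it fills up."""
        def __init__(self, ncols, capacity=16):
            self.data = np.empty((capacity, ncols))
            self.n = 0

        def append(self, row):
            if self.n == self.data.shape[0]:
                # Out of space: allocate twice as much and copy the existing rows over.
                bigger = np.empty((2 * self.data.shape[0], self.data.shape[1]))
                bigger[:self.n] = self.data
                self.data = bigger
            self.data[self.n] = row
            self.n += 1

        def result(self):
            return self.data[:self.n]

    rows = GrowableRows(ncols=3)
    for i in range(100):
        rows.append([i, i ** 2, i ** 3])
    table = rows.result()   # shape (100, 3); amortized O(1) cost per append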
> > cheers, > > Chris > > -- > Simplistix - Content Management, Batch Processing & Python Consulting > - http://www.simplistix.co.uk > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Jonathan Rocher, PhD Scientific software developer Enthought, Inc. jrocher at enthought.com 1-512-536-1057 http://www.enthought.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooth29 at hotmail.com Fri Aug 12 06:02:35 2011 From: smooth29 at hotmail.com (Pawel Zmarz) Date: Fri, 12 Aug 2011 10:02:35 +0000 Subject: [SciPy-User] Data Acquisition with NIDAQmx error Message-ID: Hello scipy user community, I hope I am emailing the correct list... I'm a newbie, and I've been trying to implement the 'Data Acquisition with NIDAQmx' code from the SciPy Cookbook to use python to generate an analog signal out of my NI USB-6008 card. However, when I run the code I get the following error: RuntimeError: nidaq call failed with error -200077: 'Requested value is not a supported value for this property.' Any ideas on what this is and hot to solve it? I don't really understand the code, and I am not really sure where I specify the value to be generated... Link to the code (under Analog Generation):http://www.scipy.org/Cookbook/Data_Acquisition_with_NIDAQmx Thank you for help!!BW,paw -------------- next part -------------- An HTML attachment was scrubbed... URL: From yyc at solvcon.net Fri Aug 12 07:20:44 2011 From: yyc at solvcon.net (Yung-Yu Chen) Date: Fri, 12 Aug 2011 07:20:44 -0400 Subject: [SciPy-User] ANN: SOLVCON 0.1 In-Reply-To: References: Message-ID: Hello, On Fri, Aug 12, 2011 at 01:49, Klonuo Umom wrote: > Interesting package. Congratulations on making milestone release > > I installed it and on a first look I can't see it workflow. On web > portal I found tip to follow examples that come with this package, but > those aren't trivial at all; I mean lot of classes and functions comes > from nowhare and it's like no walkthrough provided > We have not put efforts to make the package user friendly. SOLVCON began with the idea to provide a framework to collect important supportive functionalities needed by CFD codes, to enhance the robustness and coding efficiency. The proof of concept turned out to be successful, and we realized that SOLVCON has great potentials to facilitate a new category of practices of building high-performance conservation-law solvers for high-fidelity solutions. In the time being, SOLVCON is made for experts in computational science. In the foreseeable future, our collaborators and we will make it more accurate, more scalable, and more versatile. You can find the plan for the forthcoming development at http://solvcon.net/yyc/writing/2011/solvcon_0.1.html . We hope SOLVCON can be used to renew the technology used in high-end calculations of PDEs, e.g., CFD, computational electromagnetism, etc. > If I may explain myself, I got summer seminar assigment, starting from > shallow water eqs to derive eq of absolute vorticity for nondivergent > flow in linearized form using perturbation method. I've done it by > hand, but would like to understand the process with some of Python > packages if feasible, and as staring eqs are hyperbolic PDEs maybe I > could use this package although it uses different method for solving. > Is it good idea to try to use this package, and if answer is yes, can > you maybe provide some starting point for this simple task? 
> > The package was geared up for large-scale, complex calculations. Using SOLVCON for very simple calculations would be an overkill. Unfortunately, the short answer to your question could be no. When dealing with multi-physics, we take the approach to model the underlying numerical algorithm, mathematics and physics as much as possible. The approach looks complicated at the first glance, but is actually concise from theories to implementations. We believe the conciseness or compactness is critically important for scaling SOLVCON from thousands of CPUs to hundreds of thousand of CPUs. The price for this approach is to prolong the path to generic representation of PDEs. We do hope to provide the capability to compile the PDEs written by users in a symbolic form for SOLVCON to execute automatically. But this won't happen in foreseeable future. If you want to know more about SOLVCON and its theoretical background, you can check up with my dissertation at http://solvcon.net/yyc/publications.html . with regards, Yung-Yu Chen > Thanks > > On Fri, Aug 12, 2011 at 5:44 AM, Yung-Yu Chen wrote: >> Hello, >> >> I am pleased to announce version 0.1 of SOLVCON. ?SOLVCON is a Python-based, >> multi-physics software framework for solving first-order hyperbolic PDEs. >> >> The source tarball can be downloaded at >> http://bitbucket.org/yungyuc/solvcon/downloads . ?More information can be >> found at http://solvcon.net/ . >> >> This release marks a milestone of SOLVCON. ?Future development of SOLVCON will >> focus on production use. ?The planned directions include (i) the high-order >> CESE method, (ii) improving the scalability by consolidating the >> distributed-memory parallel code, (iii) expanding the capabilities of the >> existing solver kernels, and (iv) incorporating more physical processes. >> >> New features: >> >> - Glue BCs are added. ?A pair of collocated BCs can now be glued together to >> ?work as an internal interface. ?The glued BCs helps to dynamically turn on or >> ?off the BC pair. >> - ``solvcon.kerpak.cuse`` series solver kernels are changed to use OpenMP for >> ?multi-threaded computing. ?They were using a thread pool built-in SOLVCON for >> ?multi-threading. ?OpenMP makes multi-threaded functions more flexible in >> ?argument specification. >> - Add the ``soil/`` directory for providing building helpers for GCC 4.6.1. >> ?Note, the name ``gcc/`` is deliberately avoided for the directory, because of >> ?a bug in gcc itself (bug id 48306 >> ?http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48306 ). >> - Add ``-j`` command line option for building dependencies in the ``ground/`` >> ?directory and the ``soil/`` directory. ?Note that ATLAS doesn't work with >> ?``make -j N``. >> >> Bug-fix: >> >> - METIS changes its download URL. ?Modify SConstruct accordingly. 
>> >> -- >> Yung-Yu Chen >> http://solvcon.net/yyc/ >> +1 (614) 859 2436 >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Yung-Yu Chen http://solvcon.net/yyc/ +1 (614) 859 2436 From hasslerjc at comcast.net Fri Aug 12 09:37:17 2011 From: hasslerjc at comcast.net (John Hassler) Date: Fri, 12 Aug 2011 09:37:17 -0400 Subject: [SciPy-User] Data Acquisition with NIDAQmx error In-Reply-To: References: Message-ID: <4E452C8D.2090401@comcast.net> An HTML attachment was scrubbed... URL: From pjabardo at yahoo.com.br Fri Aug 12 09:43:41 2011 From: pjabardo at yahoo.com.br (Paulo Jabardo) Date: Fri, 12 Aug 2011 06:43:41 -0700 (PDT) Subject: [SciPy-User] Data Acquisition with NIDAQmx error In-Reply-To: References: Message-ID: <1313156621.72956.YahooMailNeo@web30004.mail.mud.yahoo.com> Try using pydaqtools, a more generic data acquisition interface. http://pydaqtools.org/ ________________________________ De: Pawel Zmarz Para: scipy-user at scipy.org Enviadas: Sexta-feira, 12 de Agosto de 2011 7:02 Assunto: [SciPy-User] Data Acquisition with NIDAQmx error Hello scipy user community, I hope I am emailing the correct list... I'm a newbie, and I've been trying to implement the 'Data?Acquisition?with NIDAQmx' code from the SciPy Cookbook to use python to generate an analog signal out of my NI USB-6008 card. However, when I run the code I get the following error: RuntimeError: nidaq call failed with error -200077: 'Requested value is not a supported value for this property.' Any ideas on what this is and hot to solve it? I don't really understand the code, and I am not really sure where I specify the value to be generated... Link to the code (under Analog Generation): http://www.scipy.org/Cookbook/Data_Acquisition_with_NIDAQmx Thank you for help!! BW, paw _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From hasslerjc at comcast.net Fri Aug 12 09:52:37 2011 From: hasslerjc at comcast.net (John Hassler) Date: Fri, 12 Aug 2011 09:52:37 -0400 Subject: [SciPy-User] Data Acquisition with NIDAQmx error In-Reply-To: <1313156621.72956.YahooMailNeo@web30004.mail.mud.yahoo.com> References: <1313156621.72956.YahooMailNeo@web30004.mail.mud.yahoo.com> Message-ID: <4E453025.9070100@comcast.net> An HTML attachment was scrubbed... URL: From pearu.peterson at gmail.com Fri Aug 12 09:53:52 2011 From: pearu.peterson at gmail.com (Pearu Peterson) Date: Fri, 12 Aug 2011 16:53:52 +0300 Subject: [SciPy-User] ANN: iocbio.microscope - a Python deconvolution software Message-ID: <4E453070.5000801@cens.ioc.ee> We are proud to release a new package for deconvolving 3D microscope images iocbio.microscope. It is a part of an open-source software project iocbio from the Laboratory of Systems Biology in the Institute of Cybernetics at Tallinn Technical University (http://sysbio.ioc.ee). Iocbio.microscope software package allows to deconvolve microscope images. In addition to the deconvolution program, the package includes the set of tools that is required for processing images, estimation of point spread function (PSF) and visualizing the results. 
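As a rough illustration of the family of algorithms the announcement refers to, here is a toy, unregularized Richardson-Lucy iteration for a 1-D signal. It is only a sketch under simplified assumptions (known PSF, float data, no noise regularization) and is not the iocbio implementation, which adds the regularization described in the paper and operates on 3-D images.

    import numpy as np

    def richardson_lucy_1d(observed, psf, iterations=50, eps=1e-12):
        """Toy unregularized Richardson-Lucy deconvolution for 1-D float data."""
        psf = psf / psf.sum()
        psf_mirror = psf[::-1]
        estimate = np.ones_like(observed) * observed.mean()   # flat starting guess
        for _ in range(iterations):
            blurred = np.convolve(estimate, psf, mode='same')
            ratio = observed / (blurred + eps)                 # eps avoids division by zero
            estimate = estimate * np.convolve(ratio, psf_mirror, mode='same')
        return estimate

    # Two artificial "point sources" blurred by a Gaussian PSF, then deconvolved.
    x = np.linspace(0.0, 1.0, 200)
    truth = 1.0 * (np.abs(x - 0.3) < 0.02) + 0.5 * (np.abs(x - 0.7) < 0.02)
    psf = np.exp(-np.linspace(-3, 3, 25) ** 2)
    data = np.convolve(truth, psf / psf.sum(), mode='same')
    recovered = richardson_lucy_1d(data, psf, iterations=200)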
This software is written in Python and is released with an open-source license (BSD). Homepage: http://code.google.com/p/iocbio/wiki/IOCBioMicroscope Tutorial: http://code.google.com/p/iocbio/wiki/DeconvolutionTutorial Iocbio API documentation: http://sysbio.ioc.ee/download/software/iocbio/index.html Sources and download: http://iocbio.googlecode.com Iocbio is developed under Linux (ubuntu) but will also run under Windows (we provide installer for Windows users to ease the process of setting up the iocbio software as well as its prerequisites). Mathematical background of implemented deconvolution algorithm, notes and guidelines on selection of parameters for deconvolution and application to real-life images are described in a recent paper Laasmaa, M, Vendelin, M, Peterson, P (2011). Application of regularized Richardson-Lucy algorithm for deconvolution of confocal microscopy images. J. Microscopy. Volume 243, Issue 2 , pages 124?140, August 2011: http://onlinelibrary.wiley.com/doi/10.1111/j.1365-2818.2011.03486.x/full Pearu Peterson From contact at graune.org Fri Aug 12 12:00:17 2011 From: contact at graune.org (Manuel Graune) Date: Fri, 12 Aug 2011 18:00:17 +0200 Subject: [SciPy-User] calculate definite integral of sampled data In-Reply-To: References: <20110810055958.GH2924@uriel> Message-ID: <20110812160017.GB3741@uriel> On Wed, Aug 10, 2011 at 07:26:32PM -0600, Charles R Harris wrote: > > How many data points do you have? > Chuck > depending on the use-case about 10-20000. The solution suggested by Gustavo works pretty well for me. Just as scipy.integrate.cumtrapz does. I obiously had not read the documentation quite enough to understand what cumtrapz good for. Thanks to all. Manuel > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available URL: From klonuo at gmail.com Fri Aug 12 12:56:55 2011 From: klonuo at gmail.com (Klonuo Umom) Date: Fri, 12 Aug 2011 18:56:55 +0200 Subject: [SciPy-User] ANN: SOLVCON 0.1 In-Reply-To: References: Message-ID: Thanks for your explanation I thought it would be something like this On Fri, Aug 12, 2011 at 1:20 PM, Yung-Yu Chen wrote: > Hello, > > On Fri, Aug 12, 2011 at 01:49, Klonuo Umom wrote: >> Interesting package. Congratulations on making milestone release >> >> I installed it and on a first look I can't see it workflow. On web >> portal I found tip to follow examples that come with this package, but >> those aren't trivial at all; I mean lot of classes and functions comes >> from nowhare and it's like no walkthrough provided >> > > We have not put efforts to make the package user friendly. ?SOLVCON > began with the idea to provide a framework to collect important > supportive functionalities needed by CFD codes, to enhance the > robustness and coding efficiency. ?The proof of concept turned out to > be successful, and we realized that SOLVCON has great potentials to > facilitate a new category of practices of building high-performance > conservation-law solvers for high-fidelity solutions. > > In the time being, SOLVCON is made for experts in computational > science. ?In the foreseeable future, our collaborators and we will > make it more accurate, more scalable, and more versatile. 
?You can > find the plan for the forthcoming development at > http://solvcon.net/yyc/writing/2011/solvcon_0.1.html . ?We hope > SOLVCON can be used to renew the technology used in high-end > calculations of PDEs, e.g., CFD, computational electromagnetism, etc. > >> If I may explain myself, I got summer seminar assigment, starting from >> shallow water eqs to derive eq of absolute vorticity for nondivergent >> flow in linearized form using perturbation method. I've done it by >> hand, but would like to understand the process with some of Python >> packages if feasible, and as staring eqs are hyperbolic PDEs maybe I >> could use this package although it uses different method for solving. >> Is it good idea to try to use this package, and if answer is yes, can >> you maybe provide some starting point for this simple task? >> >> > > The package was geared up for large-scale, complex calculations. > Using SOLVCON for very simple calculations would be an overkill. > Unfortunately, the short answer to your question could be no. > > When dealing with multi-physics, we take the approach to model the > underlying numerical algorithm, mathematics and physics as much as > possible. ?The approach looks complicated at the first glance, but is > actually concise from theories to implementations. ?We believe the > conciseness or compactness is critically important for scaling SOLVCON > from thousands of CPUs to hundreds of thousand of CPUs. > > The price for this approach is to prolong the path to generic > representation of PDEs. ?We do hope to provide the capability to > compile the PDEs written by users in a symbolic form for SOLVCON to > execute automatically. ?But this won't happen in foreseeable future. > > If you want to know more about SOLVCON and its theoretical background, > you can check up with my dissertation at > http://solvcon.net/yyc/publications.html . > > with regards, > Yung-Yu Chen > >> Thanks >> >> On Fri, Aug 12, 2011 at 5:44 AM, Yung-Yu Chen wrote: >>> Hello, >>> >>> I am pleased to announce version 0.1 of SOLVCON. ?SOLVCON is a Python-based, >>> multi-physics software framework for solving first-order hyperbolic PDEs. >>> >>> The source tarball can be downloaded at >>> http://bitbucket.org/yungyuc/solvcon/downloads . ?More information can be >>> found at http://solvcon.net/ . >>> >>> This release marks a milestone of SOLVCON. ?Future development of SOLVCON will >>> focus on production use. ?The planned directions include (i) the high-order >>> CESE method, (ii) improving the scalability by consolidating the >>> distributed-memory parallel code, (iii) expanding the capabilities of the >>> existing solver kernels, and (iv) incorporating more physical processes. >>> >>> New features: >>> >>> - Glue BCs are added. ?A pair of collocated BCs can now be glued together to >>> ?work as an internal interface. ?The glued BCs helps to dynamically turn on or >>> ?off the BC pair. >>> - ``solvcon.kerpak.cuse`` series solver kernels are changed to use OpenMP for >>> ?multi-threaded computing. ?They were using a thread pool built-in SOLVCON for >>> ?multi-threading. ?OpenMP makes multi-threaded functions more flexible in >>> ?argument specification. >>> - Add the ``soil/`` directory for providing building helpers for GCC 4.6.1. >>> ?Note, the name ``gcc/`` is deliberately avoided for the directory, because of >>> ?a bug in gcc itself (bug id 48306 >>> ?http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48306 ). 
>>> - Add ``-j`` command line option for building dependencies in the ``ground/`` >>> ?directory and the ``soil/`` directory. ?Note that ATLAS doesn't work with >>> ?``make -j N``. >>> >>> Bug-fix: >>> >>> - METIS changes its download URL. ?Modify SConstruct accordingly. >>> >>> -- >>> Yung-Yu Chen >>> http://solvcon.net/yyc/ >>> +1 (614) 859 2436 >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > > > -- > Yung-Yu Chen > http://solvcon.net/yyc/ > +1 (614) 859 2436 > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From nouiz at nouiz.org Fri Aug 12 16:06:49 2011 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Fri, 12 Aug 2011 16:06:49 -0400 Subject: [SciPy-User] Theano 0.4.1 released Message-ID: =========================== Announcing Theano 0.4.1 =========================== This is an important release, with lots of new features, bug fixes and some deprecation warning. The upgrade is recommended for everybody. For those using the bleeding edge version in the mercurial repository, we encourage you to update to the `0.4.1` tag. What's New ---------- New features: * `R_op `_ macro like theano.tensor.grad * Not all tests are done yet (TODO) * Added alias theano.tensor.bitwise_{and,or,xor,not}. They are the numpy names. * Updates returned by Scan (you need to pass them to the theano.function) are now a new Updates class. That allow more check and easier work with them. The Updates class is a subclass of dict * Scan can now work in a "do while" loop style. * We scan until a condition is met. * There is a minimum of 1 iteration(can't do "while do" style loop) * The "Interactive Debugger" (compute_test_value theano flags) * Now should work with all ops (even the one with only C code) * In the past some errors were caught and re-raised as unrelated errors (ShapeMismatch replaced with NotImplemented). We don't do that anymore. * The new Op.make_thunk function(introduced in 0.4.0) is now used by constant_folding and DebugMode * Added A_TENSOR_VARIABLE.astype() as a way to cast. NumPy allows this syntax. * New BLAS GER implementation. * Insert GEMV more frequently. * Added new ifelse(scalar condition, rval_if_true, rval_if_false) Op. * This is a subset of the elemwise switch (tensor condition, rval_if_true, rval_if_false). * With the new feature in the sandbox, only one of rval_if_true or rval_if_false will be evaluated. Optimizations: * Subtensor has C code * {Inc,Set}Subtensor has C code * ScalarFromTensor has C code * dot(zeros,x) and dot(x,zeros) * IncSubtensor(x, zeros, idx) -> x * SetSubtensor(x, x[idx], idx) -> x (when x is a constant) * subtensor(alloc,...) -> alloc * Many new scan optimization * Lower scan execution overhead with a Cython implementation * Removed scan double compilation (by using the new Op.make_thunk mechanism) * Certain computations from the inner graph are now Pushed out into the outer graph. This means they are not re-comptued at every step of scan. * Different scan ops get merged now into a single op (if possible), reducing the overhead and sharing computations between the two instances GPU: * PyCUDA/CUDAMat/Gnumpy/Theano bridge and `documentation `_. 
* New function to easily convert pycuda GPUArray object to and from CudaNdarray object * Fixed a bug if you crated a view of a manually created CudaNdarray that are view of GPUArray. * Removed a warning when nvcc is not available and the user did not requested it. * renamed config option cuda.nvccflags -> nvcc.flags * Allow GpuSoftmax and GpuSoftmaxWithBias to work with bigger input. Bugs fixed: * In one case an AdvancedSubtensor1 could be converted to a GpuAdvancedIncSubtensor1 insted of GpuAdvancedSubtensor1. It probably didn't happen due to the order of optimizations, but that order is not guaranteed to be the same on all computers. * Derivative of set_subtensor was wrong. * Derivative of Alloc was wrong. Crash fixed: * On an unusual Python 2.4.4 on Windows * When using a C cache copied from another location * On Windows 32 bits when setting a complex64 to 0. * Compilation crash with CUDA 4 * When wanting to copy the compilation cache from a computer to another * This can be useful for using Theano on a computer without a compiler. * GPU: * Compilation crash fixed under Ubuntu 11.04 * Compilation crash fixed with CUDA 4.0 Know bug: * CAReduce with nan in inputs don't return the good output (`Ticket `_). * This is used in tensor.{max,mean,prod,sum} and in the grad of PermuteRowElements. * This is not a new bug, just a bug discovered since the last release that we didn't had time to fix. Deprecation (will be removed in Theano 0.5, warning generated if you use them): * The string mode (accepted only by theano.function()) FAST_RUN_NOGC. Use Mode(linker='c|py_nogc') instead. * The string mode (accepted only by theano.function()) STABILIZE. Use Mode(optimizer='stabilize') instead. * scan interface change: * The use of `return_steps` for specifying how many entries of the output scan has been depricated * The same thing can be done by applying a subtensor on the output return by scan to select a certain slice * The inner function (that scan receives) should return its outputs and updates following this order: [outputs], [updates], [condition]. One can skip any of the three if not used, but the order has to stay unchanged. * tensor.grad(cost, wrt) will return an object of the "same type" as wrt (list/tuple/TensorVariable). * Currently tensor.grad return a type list when the wrt is a list/tuple of more then 1 element. Sandbox: * MRG random generator now implements the same casting behavior as the regular random generator. Sandbox New features(not enabled by default): * New Linkers (theano flags linker={vm,cvm}) * The new linker allows lazy evaluation of the new ifelse op, meaning we compute only the true or false branch depending of the condition. This can speed up some types of computation. * Uses a new profiling system (that currently tracks less stuff) * The cvm is implemented in C, so it lowers Theano's overhead. * The vm is implemented in python. So it can help debugging in some cases. * In the future, the default will be the cvm. * Some new not yet well tested sparse ops: theano.sparse.sandbox.{SpSum, Diag, SquareDiagonal, ColScaleCSC, RowScaleCSC, Remove0, EnsureSortedIndices, ConvolutionIndices} Documentation: * How to compute the `Jacobian, Hessian, Jacobian times a vector, Hessian times a vector `_. * Slide for a 3 hours class with exercises that was done at the HPCS2011 Conference in Montreal. Others: * Logger name renamed to be consistent. * Logger function simplified and made more consistent. 
* Fixed transformation of error by other not related error with the compute_test_value Theano flag. * Compilation cache enhancements. * Made compatible with NumPy 1.6 and SciPy 0.9 * Fix tests when there was new dtype in NumPy that is not supported by Theano. * Fixed some tests when SciPy is not available. * Don't compile anything when Theano is imported. Compile support code when we compile the first C code. * Python 2.4 fix: * Fix the file theano/misc/check_blas.py * For python 2.4.4 on Windows, replaced float("inf") with numpy.inf. * Removes useless inputs to a scan node * Beautification mostly, making the graph more visible. Such inputs would appear as a consequence of other optimizations Core: * there is a new mechanism that lets an Op permit that one of its inputs to be aliased to another destroyed input. This will generally result in incorrect calculation, so it should be used with care! The right way to use it is when the caller can guarantee that even if these two inputs look aliased, they actually will never overlap. This mechanism can be used, for example, by a new alternative approach to implementing Scan. If an op has an attribute called "destroyhandler_tolerate_aliased" then this is what's going on. IncSubtensor is thus far the only Op to use this mechanism.Mechanism Download -------- You can download Theano from http://pypi.python.org/pypi/Theano. Description ----------- Theano is a Python library that allows you to define, optimize, and efficiently evaluate mathematical expressions involving multi-dimensional arrays. It is built on top of NumPy. Theano features: * tight integration with NumPy: a similar interface to NumPy's. numpy.ndarrays are also used internally in Theano-compiled functions. * transparent use of a GPU: perform data-intensive computations up to 140x faster than on a CPU (support for float32 only). * efficient symbolic differentiation: Theano can compute derivatives for functions of one or many inputs. * speed and stability optimizations: avoid nasty bugs when computing expressions such as log(1+ exp(x)) for large values of x. * dynamic C code generation: evaluate expressions faster. * extensive unit-testing and self-verification: includes tools for detecting and diagnosing bugs and/or potential problems. Theano has been powering large-scale computationally intensive scientific research since 2007, but it is also approachable enough to be used in the classroom (IFT6266 at the University of Montreal). Resources --------- About Theano: http://deeplearning.net/software/theano/ About NumPy: http://numpy.scipy.org/ About SciPy: http://www.scipy.org/ Machine Learning Tutorial with Theano on Deep Architectures: http://deeplearning.net/tutorial/ Acknowledgments --------------- I would like to thank all contributors of Theano. For this particular release, here is the people that contributed code and/or documentation: (in alphabetical order) Frederic Bastien, James Bergstra, Olivier Delalleau, Xavier Glorot, Ian Goodfellow, Pascal Lamblin, Gr?goire Mesnil, Razvan Pascanu, Ilya Sutskever and David Warde-Farley Also, thank you to all NumPy and Scipy developers as Theano builds on its strength. 
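A minimal usage sketch of the symbolic-differentiation workflow described above (plain core API of this Theano generation, nothing specific to the 0.4.1 additions):

    import theano
    import theano.tensor as T

    x = T.dvector('x')
    cost = (x ** 2).sum()                    # a symbolic expression
    grad = T.grad(cost, x)                   # symbolic gradient of cost w.r.t. x
    f = theano.function([x], [cost, grad])   # compiled, optimized callable

    print(f([1.0, 2.0, 3.0]))                # cost 14.0, gradient [2. 4. 6.]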
All questions/comments are always welcome on the Theano mailing-lists ( http://deeplearning.net/software/theano/ ) From stef.mientki at gmail.com Sat Aug 13 09:54:03 2011 From: stef.mientki at gmail.com (Stef Mientki) Date: Sat, 13 Aug 2011 15:54:03 +0200 Subject: [SciPy-User] [ANN] Bottleneck 0.5.0 released In-Reply-To: References: Message-ID: <4E4681FA.4020905@gmail.com> hello, is it possible to create a windows executable (a lot of windows users can't compile C-code). I tried the prebuild versions from: http://www.lfd.uci.edu/~gohlke/pythonlibs/#bottleneck but the fast routines are all missing there. thanks, Stef On 13-06-2011 23:35, Keith Goodman wrote: > Bottleneck is a collection of fast NumPy array functions written in > Cython. It contains functions like median, nanmedian, nanargmax, > move_max, rankdata. > > The fifth release of bottleneck adds four new functions, comes in a > single source distribution instead of separate 32 and 64 bit versions, > and contains bug fixes. > > J. David Lee wrote the C-code implementation of the double heap moving > window median. > > New functions: > - move_median(), moving window median > - partsort(), partial sort > - argpartsort() > - ss(), sum of squares, faster version of scipy.stats.ss > > Changes: > - Single source distribution instead of separate 32 and 64 bit versions > - nanmax and nanmin now follow Numpy 1.6 (not 1.5.1) when input is all NaN > > Bug fixes: > - #14 Support python 2.5 by importing `with` statement > - #22 nanmedian wrong for particular ordering of NaN and non-NaN elements > - #26 argpartsort, nanargmin, nanargmax returned wrong dtype on 64-bit Windows > - #29 rankdata and nanrankdata crashed on 64-bit Windows > > download > http://pypi.python.org/pypi/Bottleneck > docs > http://berkeleyanalytics.com/bottleneck > code > http://github.com/kwgoodman/bottleneck > mailing list > http://groups.google.com/group/bottle-neck > mailing list 2 > http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From cgohlke at uci.edu Sat Aug 13 11:09:55 2011 From: cgohlke at uci.edu (Christoph Gohlke) Date: Sat, 13 Aug 2011 08:09:55 -0700 Subject: [SciPy-User] [ANN] Bottleneck 0.5.0 released In-Reply-To: <4E4681FA.4020905@gmail.com> References: <4E4681FA.4020905@gmail.com> Message-ID: <4E4693C3.5010103@uci.edu> On 8/13/2011 6:54 AM, Stef Mientki wrote: > hello, > > is it possible to create a windows executable > (a lot of windows users can't compile C-code). > > I tried the prebuild versions from: > http://www.lfd.uci.edu/~gohlke/pythonlibs/#bottleneck > > but the fast routines are all missing there. I don't see anything missing. Tests and benchmarks yield expected results using numpy 1.6.1. What's the output of `import bottleneck as bn;bn.test()` (requires nose 1.x)? Christoph > > thanks, > Stef > > On 13-06-2011 23:35, Keith Goodman wrote: >> Bottleneck is a collection of fast NumPy array functions written in >> Cython. It contains functions like median, nanmedian, nanargmax, >> move_max, rankdata. >> >> The fifth release of bottleneck adds four new functions, comes in a >> single source distribution instead of separate 32 and 64 bit versions, >> and contains bug fixes. >> >> J. David Lee wrote the C-code implementation of the double heap moving >> window median. 
>> >> New functions: >> - move_median(), moving window median >> - partsort(), partial sort >> - argpartsort() >> - ss(), sum of squares, faster version of scipy.stats.ss >> >> Changes: >> - Single source distribution instead of separate 32 and 64 bit versions >> - nanmax and nanmin now follow Numpy 1.6 (not 1.5.1) when input is all NaN >> >> Bug fixes: >> - #14 Support python 2.5 by importing `with` statement >> - #22 nanmedian wrong for particular ordering of NaN and non-NaN elements >> - #26 argpartsort, nanargmin, nanargmax returned wrong dtype on 64-bit Windows >> - #29 rankdata and nanrankdata crashed on 64-bit Windows >> >> download >> http://pypi.python.org/pypi/Bottleneck >> docs >> http://berkeleyanalytics.com/bottleneck >> code >> http://github.com/kwgoodman/bottleneck >> mailing list >> http://groups.google.com/group/bottle-neck >> mailing list 2 >> http://mail.scipy.org/mailman/listinfo/scipy-user >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From ralf.gommers at googlemail.com Sat Aug 13 11:58:41 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sat, 13 Aug 2011 17:58:41 +0200 Subject: [SciPy-User] disabling SVN (was: Trouble installing scipy after upgrading to Mac OS X 10.7 aka Lion) Message-ID: On Thu, Aug 11, 2011 at 8:19 PM, Jonathan Guyer wrote: > > On Aug 10, 2011, at 5:16 PM, Ralf Gommers wrote: > > > Ah, with "svn" you actually meant svn:) I thought that was supposed to > not even work anymore. > > It does work and it's confusing. I had not been following the transition > closely and so was under the impression that the svn repository was being > mirrored from git. It's not. It's just old. > > Who can disable SVN access for numpy and scipy? There are still plenty of links to http://svn.scipy.org/svn/numpy/trunk/ floating around that can confuse users. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ognen at enthought.com Sat Aug 13 12:00:49 2011 From: ognen at enthought.com (Ognen Duzlevski) Date: Sat, 13 Aug 2011 12:00:49 -0400 Subject: [SciPy-User] disabling SVN (was: Trouble installing scipy after upgrading to Mac OS X 10.7 aka Lion) In-Reply-To: References: Message-ID: On Sat, Aug 13, 2011 at 11:58 AM, Ralf Gommers wrote: > > > On Thu, Aug 11, 2011 at 8:19 PM, Jonathan Guyer wrote: > >> >> On Aug 10, 2011, at 5:16 PM, Ralf Gommers wrote: >> >> > Ah, with "svn" you actually meant svn:) I thought that was supposed to >> not even work anymore. >> >> It does work and it's confusing. I had not been following the transition >> closely and so was under the impression that the svn repository was being >> mirrored from git. It's not. It's just old. >> >> Who can disable SVN access for numpy and scipy? There are still plenty of > links to http://svn.scipy.org/svn/numpy/trunk/ floating around that can > confuse users. > > Ralf > Ralf, I am the new Enthought sys admin. Is there anything I can do to help? Thanks, Ognen -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ralf.gommers at googlemail.com Sat Aug 13 12:14:11 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sat, 13 Aug 2011 18:14:11 +0200 Subject: [SciPy-User] disabling SVN (was: Trouble installing scipy after upgrading to Mac OS X 10.7 aka Lion) In-Reply-To: References: Message-ID: On Sat, Aug 13, 2011 at 6:00 PM, Ognen Duzlevski wrote: > On Sat, Aug 13, 2011 at 11:58 AM, Ralf Gommers < > ralf.gommers at googlemail.com> wrote: > >> >> >> On Thu, Aug 11, 2011 at 8:19 PM, Jonathan Guyer wrote: >> >>> >>> On Aug 10, 2011, at 5:16 PM, Ralf Gommers wrote: >>> >>> > Ah, with "svn" you actually meant svn:) I thought that was supposed to >>> not even work anymore. >>> >>> It does work and it's confusing. I had not been following the transition >>> closely and so was under the impression that the svn repository was being >>> mirrored from git. It's not. It's just old. >>> >>> Who can disable SVN access for numpy and scipy? There are still plenty of >> links to http://svn.scipy.org/svn/numpy/trunk/ floating around that can >> confuse users. >> >> Ralf >> > > Hi Ognen, > Ralf, > > I am the new Enthought sys admin. Is there anything I can do to help? > > We should check if there's still any code in SVN branches that is useful. If so the people who are interested in it should move it somewhere else. Anyone? After that I think you can pull the plug on http://svn.scipy.org/svn/numpy/and http://svn.scipy.org/svn/scipy/. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ognen at enthought.com Sat Aug 13 12:57:55 2011 From: ognen at enthought.com (Ognen Duzlevski) Date: Sat, 13 Aug 2011 12:57:55 -0400 Subject: [SciPy-User] disabling SVN (was: Trouble installing scipy after upgrading to Mac OS X 10.7 aka Lion) In-Reply-To: References: Message-ID: On Sat, Aug 13, 2011 at 12:14 PM, Ralf Gommers wrote: > On Sat, Aug 13, 2011 at 6:00 PM, Ognen Duzlevski wrote: > >> On Sat, Aug 13, 2011 at 11:58 AM, Ralf Gommers < >> ralf.gommers at googlemail.com> wrote: >> >>> >>> >>> On Thu, Aug 11, 2011 at 8:19 PM, Jonathan Guyer wrote: >>> >>>> >>>> On Aug 10, 2011, at 5:16 PM, Ralf Gommers wrote: >>>> >>>> > Ah, with "svn" you actually meant svn:) I thought that was supposed to >>>> not even work anymore. >>>> >>>> It does work and it's confusing. I had not been following the transition >>>> closely and so was under the impression that the svn repository was being >>>> mirrored from git. It's not. It's just old. >>>> >>>> Who can disable SVN access for numpy and scipy? There are still plenty >>> of links to http://svn.scipy.org/svn/numpy/trunk/ floating around that >>> can confuse users. >>> >>> Ralf >>> >> >> Hi Ognen, > > >> Ralf, >> >> I am the new Enthought sys admin. Is there anything I can do to help? >> >> We should check if there's still any code in SVN branches that is useful. > If so the people who are interested in it should move it somewhere else. > Anyone? > > After that I think you can pull the plug on > http://svn.scipy.org/svn/numpy/ and http://svn.scipy.org/svn/scipy/. > > Ralf > OK - let me know. Ognen -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at enthought.com Sat Aug 13 14:12:00 2011 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Sat, 13 Aug 2011 13:12:00 -0500 Subject: [SciPy-User] firwin behavior In-Reply-To: References: Message-ID: On Thu, Aug 11, 2011 at 6:36 AM, Jeff Alstott wrote: > Wow. 
The passing of the DC frequency is exactly the issue, and that default > behavior is clearly shown in the documentation. I see now that given a band, > the default behavior is band-stop, whereas I would expect it to be > band-pass. So, that fixed it. > > What I don't understand, however, is *why* that would be default behavior. > More importantly, even if that is the default behavior, the name of the > pass_zero flag does not readily help a dumb user like me grok the > functionality. Has there been any thought to renaming it? > See here for the evolution of the firwin API: http://projects.scipy.org/scipy/ticket/902 Warren > > Thanks! > > > On Wed, Aug 10, 2011 at 4:39 PM, Warren Weckesser < > warren.weckesser at enthought.com> wrote: > >> On Wed, Aug 10, 2011 at 8:14 AM, Jeff Alstott >> wrote: >> > firwin is producing unreasonable filters for me, and I'm not sure if I'm >> > misusing the code or if there is a bug. Like so: >> > >> > In [5]: from scipy.signal import firwin >> > >> > In [6]: ny = 500 >> > >> > In [7]: f21f80= firwin(21, [1/ny, 80/ny]); plot(f21f80); >> > savefig('FIR21_filter80.png') >> > >> > Produces the attached file. >> > >> > In contrast, Matlab: >> > >> > Trial>> ny = 500 >> > >> > ny = >> > >> > 500 >> > >> > Trial>> [f20f80] = fir1(20, [1/ny, 80/ny]); figure; plot(f20f80) >> > >> > Produces the other attached file. Quite different! The filter produced >> by >> > the scipy function, if used with lfilter (or if taken to Matlab to use >> as a >> > filter), produces a nonsense filtering, with many high frequency >> artifacts. >> > >> > Any thoughts? This is in python3, if that matters. >> >> >> By default, firwin creates a filter that passes DC (i.e. the zero >> frequency). To get a filter like the one produced by matlab, add the >> keyword argument pass_zero=False. >> >> Warren >> >> >> > >> > Thanks! >> > >> > >> > >> > _______________________________________________ >> > SciPy-User mailing list >> > SciPy-User at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> > >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stef.mientki at gmail.com Sat Aug 13 15:05:10 2011 From: stef.mientki at gmail.com (Stef Mientki) Date: Sat, 13 Aug 2011 21:05:10 +0200 Subject: [SciPy-User] [ANN] Bottleneck 0.5.0 released In-Reply-To: <4E4693C3.5010103@uci.edu> References: <4E4681FA.4020905@gmail.com> <4E4693C3.5010103@uci.edu> Message-ID: <4E46CAE6.8000403@gmail.com> thanks Cristoph, found the problem, I used a too old version of numpy, ( wouldn't it be an idea to replace line 17 in __init__.py with "print 'requires at least numpy 1.5.1" now the fast routines are working ... ... at least sometimes ... at least on some computers What I've seen until now: Computer 1: numpy 1.4, so it uses slow routines: functional ok Computer 2: exactly the same python + libs: screen starts to "blink" to black a few times (for about half a second, with an interval about 2 seconds), after 10 times, the screen is filled with a repeating part of the screen and computer hangs totally. Computer 2: numpy 1.6.1 : first program run, screen "blinks" black once, the fast bottleneck routines are use, and they function ok. 
Second run of the same program: screen blinks blank once, after a few seconds, the screen is again filled with a smaal repating part of the screen and the computer hangs totally. Any ideas ? Is the GPU used with these routines ? cheers, Stef On 13-08-2011 17:09, Christoph Gohlke wrote: > > On 8/13/2011 6:54 AM, Stef Mientki wrote: >> hello, >> >> is it possible to create a windows executable >> (a lot of windows users can't compile C-code). >> >> I tried the prebuild versions from: >> http://www.lfd.uci.edu/~gohlke/pythonlibs/#bottleneck >> >> but the fast routines are all missing there. > I don't see anything missing. Tests and benchmarks yield expected > results using numpy 1.6.1. > > What's the output of `import bottleneck as bn;bn.test()` (requires nose > 1.x)? > > Christoph > >> thanks, >> Stef >> >> On 13-06-2011 23:35, Keith Goodman wrote: >>> Bottleneck is a collection of fast NumPy array functions written in >>> Cython. It contains functions like median, nanmedian, nanargmax, >>> move_max, rankdata. >>> >>> The fifth release of bottleneck adds four new functions, comes in a >>> single source distribution instead of separate 32 and 64 bit versions, >>> and contains bug fixes. >>> >>> J. David Lee wrote the C-code implementation of the double heap moving >>> window median. >>> >>> New functions: >>> - move_median(), moving window median >>> - partsort(), partial sort >>> - argpartsort() >>> - ss(), sum of squares, faster version of scipy.stats.ss >>> >>> Changes: >>> - Single source distribution instead of separate 32 and 64 bit versions >>> - nanmax and nanmin now follow Numpy 1.6 (not 1.5.1) when input is all NaN >>> >>> Bug fixes: >>> - #14 Support python 2.5 by importing `with` statement >>> - #22 nanmedian wrong for particular ordering of NaN and non-NaN elements >>> - #26 argpartsort, nanargmin, nanargmax returned wrong dtype on 64-bit Windows >>> - #29 rankdata and nanrankdata crashed on 64-bit Windows >>> >>> download >>> http://pypi.python.org/pypi/Bottleneck >>> docs >>> http://berkeleyanalytics.com/bottleneck >>> code >>> http://github.com/kwgoodman/bottleneck >>> mailing list >>> http://groups.google.com/group/bottle-neck >>> mailing list 2 >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From kwgoodman at gmail.com Sat Aug 13 20:39:57 2011 From: kwgoodman at gmail.com (Keith Goodman) Date: Sat, 13 Aug 2011 17:39:57 -0700 Subject: [SciPy-User] [ANN] Bottleneck 0.5.0 released In-Reply-To: <4E46CAE6.8000403@gmail.com> References: <4E4681FA.4020905@gmail.com> <4E4693C3.5010103@uci.edu> <4E46CAE6.8000403@gmail.com> Message-ID: On Sat, Aug 13, 2011 at 12:05 PM, Stef Mientki wrote: > thanks Cristoph, > > found the problem, > I used a too old version of numpy, > ( wouldn't it be an idea to replace line 17 in __init__.py with "print 'requires at least numpy 1.5.1" Sounds like a good idea but unfortunately there are other reasons, besides an old version of numpy, why the cython functions might fail to load. For example, the compilation may have failed. > now the fast routines are working ... Yay! > ... 
at least sometimes Oh :( > ... at least on some computers > > What I've seen until now: > Computer 1: numpy 1.4, so it uses slow routines: functional ok > Computer 2: exactly the same python + libs: screen starts to "blink" to black a few times (for about > half a second, with an interval about 2 seconds), > after 10 times, the screen is filled with a repeating part of the screen and computer hangs totally. Oh, my goodness! That is odd. > Computer 2: numpy 1.6.1 : first program run, screen "blinks" black once, the fast bottleneck > routines are use, and they function ok. > Second run of the same program: screen blinks blank once, after a few seconds, the screen is again > filled with a smaal repating part of the screen and the computer hangs totally. > Any ideas ? That is terrible. I have no clue as to the cause. > Is the GPU used with these routines ? No. > cheers, > Stef > > On 13-08-2011 17:09, Christoph Gohlke wrote: >> >> On 8/13/2011 6:54 AM, Stef Mientki wrote: >>> hello, >>> >>> is it possible to create a windows executable >>> (a lot of windows users can't compile C-code). >>> >>> I tried the prebuild versions from: >>> http://www.lfd.uci.edu/~gohlke/pythonlibs/#bottleneck >>> >>> but the fast routines are all missing there. >> I don't see anything missing. Tests and benchmarks yield expected >> results using numpy 1.6.1. >> >> What's the output of `import bottleneck as bn;bn.test()` (requires nose >> 1.x)? >> >> Christoph >> >>> thanks, >>> Stef >>> >>> On 13-06-2011 23:35, Keith Goodman wrote: >>>> Bottleneck is a collection of fast NumPy array functions written in >>>> Cython. It contains functions like median, nanmedian, nanargmax, >>>> move_max, rankdata. >>>> >>>> The fifth release of bottleneck adds four new functions, comes in a >>>> single source distribution instead of separate 32 and 64 bit versions, >>>> and contains bug fixes. >>>> >>>> J. David Lee wrote the C-code implementation of the double heap moving >>>> window median. >>>> >>>> New functions: >>>> - move_median(), moving window median >>>> - partsort(), partial sort >>>> - argpartsort() >>>> - ss(), sum of squares, faster version of scipy.stats.ss >>>> >>>> Changes: >>>> - Single source distribution instead of separate 32 and 64 bit versions >>>> - nanmax and nanmin now follow Numpy 1.6 (not 1.5.1) when input is all NaN >>>> >>>> Bug fixes: >>>> - #14 Support python 2.5 by importing `with` statement >>>> - #22 nanmedian wrong for particular ordering of NaN and non-NaN elements >>>> - #26 argpartsort, nanargmin, nanargmax returned wrong dtype on 64-bit Windows >>>> - #29 rankdata and nanrankdata crashed on 64-bit Windows >>>> >>>> download >>>> ? ? ?http://pypi.python.org/pypi/Bottleneck >>>> docs >>>> ? ? ?http://berkeleyanalytics.com/bottleneck >>>> code >>>> ? ? ?http://github.com/kwgoodman/bottleneck >>>> mailing list >>>> ? ? ?http://groups.google.com/group/bottle-neck >>>> mailing list 2 >>>> ? ? 
?http://mail.scipy.org/mailman/listinfo/scipy-user >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From paul.anton.letnes at gmail.com Sun Aug 14 16:45:30 2011 From: paul.anton.letnes at gmail.com (Paul Anton Letnes) Date: Sun, 14 Aug 2011 21:45:30 +0100 Subject: [SciPy-User] segfault Message-ID: Hi! (I am cross posting this to the This code (see bottom of email) crashed with a segfault at the scipy.linalg.eigvals line: % time python iterative-test.py File read Eigvals: zsh: segmentation fault python iterative-test.py python iterative-test.py 536.82s user 2.90s system 96% cpu 9:18.75 total Numpy version: 1.6.1 Scipy version: 0.9.0 Python version: 2.7.2 (default, Jun 25 2011, 09:29:54) [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] Mac OS X 10.6.8 I have 4 GB of memory (and I believe Mac OS X allows you to use as much hard drive space as theoretically available as swap space), and the matrix A is of dtype numpy.complex64 and has shape (4608, 4608). In 'activity monitor' the process claims to use just shy of 370 MB of memory, and does not increase with time. By my calculations the A matrix should be about 162 MB (not worrying about 'object overhead', which should be small). Anything I can do to help? I'd be happy to upload my matrix on my webpage, if someone wants to use it as test data. The hdf5 file is 162 MB so too big for the mailing list I suppose. Cheers, Paul *************** def main(): f = h5py.File('A2.h5', 'r') A = f['A'][:] b = numpy.loadtxt('RHS_p_formatted_copy', dtype=numpy.float32) b = b[:, 0] + 1.0j * b[:, 1] print 'File read' t0 = time.time() print 'Eigvals:' -> w = scipy.linalg.eigvals(A, overwrite_a=True) w = numpy.sort(w) t_eig = time.time() print 'eig time:', t_eig - t0 print w From paul.anton.letnes at gmail.com Sun Aug 14 16:56:34 2011 From: paul.anton.letnes at gmail.com (Paul Anton Letnes) Date: Sun, 14 Aug 2011 21:56:34 +0100 Subject: [SciPy-User] Fwd: segfault References: Message-ID: Replying to myself with a bit more information. I tried installing the most recent scipy from the git repository into a virtualenv, but I ran into problems with umfpack. 
% python setup.py build blas_opt_info: FOUND: extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] define_macros = [('NO_ATLAS_INFO', 3)] extra_compile_args = ['-msse3', '-I/System/Library/Frameworks/vecLib.framework/Headers'] non-existing path in 'scipy/io': 'docs' lapack_opt_info: FOUND: extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] define_macros = [('NO_ATLAS_INFO', 3)] extra_compile_args = ['-msse3'] umfpack_info: libraries umfpack not found in /Users/paulanto/Desktop/Rayleigh2D-debug/dev-scipy/bin/../lib libraries umfpack not found in /usr/local/lib libraries umfpack not found in /usr/lib amd_info: libraries amd not found in /Users/paulanto/Desktop/Rayleigh2D-debug/dev-scipy/bin/../lib libraries amd not found in /usr/local/lib libraries amd not found in /usr/lib FOUND: libraries = ['amd'] library_dirs = ['/opt/local/lib'] FOUND: libraries = ['umfpack', 'amd'] library_dirs = ['/opt/local/lib'] running build running config_cc unifing config_cc, config, build_clib, build_ext, build commands --compiler options running config_fc unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options running build_src build_src building py_modules sources building library "dfftpack" sources building library "fftpack" sources building library "linpack_lite" sources building library "mach" sources building library "quadpack" sources building library "odepack" sources building library "dop" sources building library "fitpack" sources building library "odrpack" sources building library "minpack" sources building library "rootfind" sources building library "superlu_src" sources building library "arpack_scipy" sources building library "qhull" sources building library "sc_c_misc" sources building library "sc_cephes" sources building library "sc_mach" sources building library "sc_toms" sources building library "sc_amos" sources building library "sc_cdf" sources building library "sc_specfun" sources building library "statlib" sources building extension "scipy.cluster._vq" sources building extension "scipy.cluster._hierarchy_wrap" sources building extension "scipy.fftpack._fftpack" sources f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.fftpack.convolve" sources f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.integrate._quadpack" sources building extension "scipy.integrate._odepack" sources building extension "scipy.integrate.vode" sources f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.integrate._dop" sources f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.interpolate.interpnd" sources building extension "scipy.interpolate._fitpack" sources building extension "scipy.interpolate.dfitpack" sources f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. adding 'build/src.macosx-10.6-x86_64-2.7/scipy/interpolate/src/dfitpack-f2pywrappers.f' to sources. 
building extension "scipy.interpolate._interpolate" sources building extension "scipy.io.matlab.streams" sources building extension "scipy.io.matlab.mio_utils" sources building extension "scipy.io.matlab.mio5_utils" sources building extension "scipy.lib.blas.fblas" sources f2py options: ['skip:', ':'] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. adding 'build/src.macosx-10.6-x86_64-2.7/build/src.macosx-10.6-x86_64-2.7/scipy/lib/blas/fblas-f2pywrappers.f' to sources. building extension "scipy.lib.blas.cblas" sources adding 'build/src.macosx-10.6-x86_64-2.7/scipy/lib/blas/cblas.pyf' to sources. f2py options: ['skip:', ':'] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.lib.lapack.flapack" sources f2py options: ['skip:', ':'] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.lib.lapack.clapack" sources adding 'build/src.macosx-10.6-x86_64-2.7/scipy/lib/lapack/clapack.pyf' to sources. f2py options: ['skip:', ':'] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.lib.lapack.calc_lwork" sources f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.lib.lapack.atlas_version" sources building extension "scipy.linalg.fblas" sources f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. adding 'build/src.macosx-10.6-x86_64-2.7/build/src.macosx-10.6-x86_64-2.7/scipy/linalg/fblas-f2pywrappers.f' to sources. building extension "scipy.linalg.cblas" sources adding 'build/src.macosx-10.6-x86_64-2.7/scipy/linalg/cblas.pyf' to sources. f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.linalg.flapack" sources adding 'build/src.macosx-10.6-x86_64-2.7/scipy/linalg/flapack.pyf' to sources. f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. adding 'build/src.macosx-10.6-x86_64-2.7/build/src.macosx-10.6-x86_64-2.7/scipy/linalg/flapack-f2pywrappers.f' to sources. building extension "scipy.linalg.clapack" sources adding 'build/src.macosx-10.6-x86_64-2.7/scipy/linalg/clapack.pyf' to sources. f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.linalg._flinalg" sources f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.linalg.calc_lwork" sources f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. 
building extension "scipy.linalg.atlas_version" sources building extension "scipy.odr.__odrpack" sources building extension "scipy.optimize._minpack" sources building extension "scipy.optimize._zeros" sources building extension "scipy.optimize._lbfgsb" sources f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.optimize.moduleTNC" sources building extension "scipy.optimize._cobyla" sources f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.optimize.minpack2" sources f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.optimize._slsqp" sources f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.optimize._nnls" sources f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.signal.sigtools" sources building extension "scipy.signal.spectral" sources building extension "scipy.signal.spline" sources building extension "scipy.sparse.linalg.isolve._iterative" sources f2py options: [] adding 'build/src.macosx-10.6-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-x86_64-2.7' to include_dirs. building extension "scipy.sparse.linalg.dsolve._superlu" sources building extension "scipy.sparse.linalg.dsolve.umfpack.__umfpack" sources adding 'scipy/sparse/linalg/dsolve/umfpack/umfpack.i' to sources. 
swig: scipy/sparse/linalg/dsolve/umfpack/umfpack.i swig -python -o build/src.macosx-10.6-x86_64-2.7/scipy/sparse/linalg/dsolve/umfpack/_umfpack_wrap.c -outdir build/src.macosx-10.6-x86_64-2.7/scipy/sparse/linalg/dsolve/umfpack scipy/sparse/linalg/dsolve/umfpack/umfpack.i scipy/sparse/linalg/dsolve/umfpack/umfpack.i:192: Error: Unable to find 'umfpack.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:193: Error: Unable to find 'umfpack_solve.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:194: Error: Unable to find 'umfpack_defaults.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:195: Error: Unable to find 'umfpack_triplet_to_col.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:196: Error: Unable to find 'umfpack_col_to_triplet.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:197: Error: Unable to find 'umfpack_transpose.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:198: Error: Unable to find 'umfpack_scale.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:200: Error: Unable to find 'umfpack_report_symbolic.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:201: Error: Unable to find 'umfpack_report_numeric.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:202: Error: Unable to find 'umfpack_report_info.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:203: Error: Unable to find 'umfpack_report_control.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:215: Error: Unable to find 'umfpack_symbolic.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:216: Error: Unable to find 'umfpack_numeric.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:225: Error: Unable to find 'umfpack_free_symbolic.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:226: Error: Unable to find 'umfpack_free_numeric.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:248: Error: Unable to find 'umfpack_get_lunz.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:272: Error: Unable to find 'umfpack_get_numeric.h' error: command 'swig' failed with exit status 1 From paul.anton.letnes at gmail.com Sun Aug 14 17:11:00 2011 From: paul.anton.letnes at gmail.com (Paul Anton Letnes) Date: Sun, 14 Aug 2011 22:11:00 +0100 Subject: [SciPy-User] Fwd: segfault References: Message-ID: <52DAC708-8150-4FFA-9D31-00D6B722CCD8@gmail.com> Replying to myself with a bit more information - again. % time python iterative-test.py File read Eigvals: zsh: segmentation fault python iterative-test.py python iterative-test.py 537.58s user 2.66s system 97% cpu 9:13.79 total Memory use this time was approx. 550 MB (don't recall the exact number). The input matrix was the same. The code was modified to: def main(): f = h5py.File('A2.h5', 'r') A = f['A'][:] b = numpy.loadtxt('RHS_p_formatted_copy', dtype=numpy.float32) b = b[:, 0] + 1.0j * b[:, 1] print 'File read' t0 = time.time() print 'Eigvals:' -> w, vl, vr = scipy.linalg.eig(A, overwrite_a=True) From scipydevwikiaccount Sat Aug 13 19:20:36 2011 From: scipydevwikiaccount (scipydevwikiaccount) Date: Sun, 14 Aug 2011 03:20:36 +0400 Subject: [SciPy-User] fmin_bfgs stuck in infinite loop (but only with new version) Message-ID: ------------------------------------------------------------------------ -------------------------------- This email was sent via Anonymous email service for free. YOU CAN REMOVE THIS TEXT MESSAGE BY BEING A PAID MEMBER FOR $19/year. Message ID= 111663 ------------------------------------------------------------------------ -------------------------------- I have run into a bug where scipy's fmin_bfgs will get stuck in an infinite loop. 
I have submitted a bug report, but I would also like to bring this issue up in the mailing list to see if anybody has any suggestions. http://projects.scipy.org/scipy/ticket/1494 Thanks, Joshua -------------- next part -------------- An HTML attachment was scrubbed... URL: From st4s3a1l at gmail.com Sat Aug 13 19:38:46 2011 From: st4s3a1l at gmail.com (b9o2jnbm tsd71eam) Date: Sat, 13 Aug 2011 16:38:46 -0700 Subject: [SciPy-User] fmin_bfgs stuck in infinite loop Message-ID: I have run into a frustrating problem where scipy.optimize.fmin_bfgs will get stuck in an infinite loop. I have submitted a bug report: http://projects.scipy.org/scipy/ticket/1494 but would also like to see if anybody on this list has any suggestions or feedback. Thanks, -------------- next part -------------- An HTML attachment was scrubbed... URL: From newville at cars.uchicago.edu Mon Aug 15 09:05:50 2011 From: newville at cars.uchicago.edu (Matt Newville) Date: Mon, 15 Aug 2011 08:05:50 -0500 Subject: [SciPy-User] lmfit-py -- simple least squares minimization Message-ID: Hi, Having used on numpy and scipy for many years and being very pleased with them, I've found an area which I think might benefit from a modest improvement, and have tried to implement this. The scipy.optimize routines are robust, but seem a little unfriendly to people coming from proprietary environments or Numerical Recipes-level tools. Specifically, the Levenberg-Marquardt algorithm is used heavily in many domains (including the x-ray spectroscopy fields I am most familiar with), but the MINPACK and scipy.optimize.leastsq implementation lack convenient ways to: - turn on/off parameters for fitting, that is, to "fix" certain parameters. - place simple min/max bounds on parameters - place simple mathematical constraints on parameters. While these limitations can be worked around, doing so requires putting many options into the function to be minimized, which is somewhat inconvenient. On the other hand, these features do exist in less robust fitting code that is not based on directly on MINPACK or as well-supported as scipy. I've written a module to do this so that the least-squares minimization from scipy.optimize.leastsq can take bounded and constrained parameters, and tried to make it of general use. This code (BSD-licensed, somewhat documented) is at http://github.com/newville/lmfit-py The constraint mechanism is a bit involved (using the ast module instead of 'eval'), but the rest of the code is quite straightforward and simple. Currently, this supports minimization with scipy.optimize.leastsq, scipy.optimize.fmin_l_bfgs_b, and scipy.optimize.anneal. Supporting other algorithms could be possible. If you find this interesting or useful, I'd appreciate any feedback you might have. For example, this is not currently organized as a scikit -- would that be preferable? Cheers, --Matt Newville From tmp50 at ukr.net Mon Aug 15 15:21:30 2011 From: tmp50 at ukr.net (Dmitrey) Date: Mon, 15 Aug 2011 22:21:30 +0300 Subject: [SciPy-User] [ANN] Constrained optimization solver with guaranteed precision Message-ID: Hi all, I'm glad to inform you that general constraints handling for interalg (free solver with guaranteed user-defined precision) now is available. Despite it is very premature and requires lots of improvements, it is already capable of outperforming commercial BARON (example: http://openopt.org/interalg_bench#Test_4) and thus you could be interested in trying it right now (next OpenOpt release will be no sooner than 1 month). 
interalg can be especially effective, compared with BARON (and some other competitors), on problems with a huge or absent Lipschitz constant, for example on functions like sqrt(x), log(x), 1/x, and x**alpha with alpha < 1, when the domain of x is something like [small_positive_value, another_value].

Let me also remind you that interalg can search for all solutions of nonlinear equations / systems of equations where local solvers like scipy.optimize.fsolve cannot find any, and can compute single/multiple integrals with guaranteed user-defined precision (the speed of integration is intended to be improved in the future). However, only FuncDesigner models are handled (read the interalg webpage for more details).

Regards, D.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From bdeb at willmore.eu Tue Aug 16 08:06:14 2011
From: bdeb at willmore.eu (Ben Willmore)
Date: Tue, 16 Aug 2011 13:06:14 +0100
Subject: [SciPy-User] Trouble installing scipy after upgrading to Mac OS X 10.7 aka Lion
Message-ID:

Hi Bill,

As Ralf mentioned, the veclib_cabi_c.c (etc) problem you have has been fixed. But, even with new checkouts of scipy, I found that simply running 'python setup.py install' did not result in a version of scipy that worked correctly on Mac OSX 10.7 (for example, using scipy.fftpack, ifft(fft(signal)) != signal, problems with single precision vector arithmetic, failures and a crash in ARPACK tests). I was able to fix these by setting:
>>> from numpy import loadtxt, uint8 >>> from StringIO import StringIO >>> from numpy.version import version >>> print version 2.0.0.dev-5cf0a07 >>> loadtxt(StringIO("0 1 2 3"), dtype=[("a", uint8, 2), ("b", uint8, 2)]) Traceback (most recent call last): File "", line 1, in File "/storage4/home/gerrit/.local/lib/python2.6/site-packages/numpy/lib/npyio.py", line 806, in loadtxt X = np.array(X, dtype) ValueError: setting an array element with a sequence. >>> loadtxt(StringIO("0 1 2 3"), dtype=[("a", uint8, 4)]) Traceback (most recent call last): File "", line 1, in File "/storage4/home/gerrit/.local/lib/python2.6/site-packages/numpy/lib/npyio.py", line 806, in loadtxt X = np.array(X, dtype) ValueError: setting an array element with a sequence. Why does this not work? I have filed a bug-report. http://projects.scipy.org/numpy/ticket/1936 Alright then, so I can try it in a different way. In my real case I have a 2-D array M with shape (5000, 839). I have my complicated dtype: [('temp', , 91), ('hum', , 91), ..., ('gpoint', , 1), ('ind', , 1)] ] whose numbers add up to 839. How do I turn this into an array of size (5000,) with my requested dtype? - .view(dtype) does not do what I mean, because this interprets the actual bytes, and my new array will have a different number of bytes compared to the old one - array(M, dtype) does not do what I mean, because this will try to expand every element of M according to the requested dtype, does making the array much larger (and throwing a MemoryError). I want this, because it's a very convenient way to access fields of my data. It's more convenient to say M["ciw"] than to say M[:, 455:546]. If someone can suggest another way to achieve this convenience, I'm open for suggestions. kind regards, Gerrit Holl. -- Gerrit Holl PhD student at Division of Space Technology, Lule? University of Technology, Kiruna, Sweden http://www.sat.ltu.se/members/gerrit/ From wccarithers at lbl.gov Tue Aug 16 13:18:03 2011 From: wccarithers at lbl.gov (Bill Carithers) Date: Tue, 16 Aug 2011 10:18:03 -0700 Subject: [SciPy-User] Trouble installing scipy after upgrading to Mac OS X 10.7 aka Lion In-Reply-To: References: Message-ID: <57C07E12-F6D4-45E8-8B7E-72960963E378@lbl.gov> Hi Ben, thanks for the info. I was happy just to get it to compile and I didn't run a full slate of tests. Cheers, Bill On Aug 16, 2011, at 5:06 AM, Ben Willmore wrote: > Hi Bill, > > As Ralf mentioned, the veclib_cabi_c.c (etc) problem you have has been fixed. But, even with new checkouts of scipy, I found that simply running 'python setup.py install' did not result in a version of scipy that worked correctly on Mac OSX 10.7 (for example, using scipy.fftpack, ifft(fft(signal)) != signal, problems with single precision vector arithmetic, failures and a crash in ARPACK tests). I was able to fix these by setting: > > export CC=gcc-4.2 > export CXX=g++-4.2 > export FFLAGS=-ff2c > > before running setup.py. Complete details at the links below. > > Ben > > > [1] http://article.gmane.org/gmane.comp.python.scientific.devel/15349 > [2] http://willmore.eu/blog/?p=5 > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From ciampagg at usi.ch Tue Aug 16 13:26:47 2011 From: ciampagg at usi.ch (Giovanni Luca Ciampaglia) Date: Tue, 16 Aug 2011 10:26:47 -0700 Subject: [SciPy-User] loadtxt and complicated dtype In-Reply-To: References: Message-ID: <4E4AA857.1070508@usi.ch> Il 16. 08. 
11 08:50, Gerrit Holl ha scritto: > In my real case I > have a 2-D array M with shape (5000, 839). I have my complicated > dtype: > > [('temp',, 91), > ('hum',, 91), > ..., > ('gpoint',, 1), > ('ind',, 1)] > ] > > whose numbers add up to 839. How do I turn this into an array of size > (5000,) with my requested dtype? > - .view(dtype) does not do what I mean, because this interprets the > actual bytes, and my new array will have a different number of bytes > compared to the old one > - array(M, dtype) does not do what I mean, because this will try to > expand every element of M according to the requested dtype, does > making the array much larger (and throwing a MemoryError). > > I want this, because it's a very convenient way to access fields of my > data. It's more convenient to say M["ciw"] than to say M[:, 455:546]. > If someone can suggest another way to achieve this convenience, I'm > open for suggestions. Hi Gerrit, you could use numpy.empty: # x has shape (5000, 184) ty = dtype([('temp', float64, 91), ('hum', float64, 91), ('gpoint', int32, 1), ('ind', int16, 1)]) data = empty((5000,), ty) # copy the individual columns data['temp'] = x[:,:91] data['hum'] = x[:,91:182] data['gpoint'] = x[:,182] data['ind'] = x[:,183] you can probably do the assignments in a for loop using the shape information from the individual fields cheers -- Giovanni Luca Ciampaglia Ph.D. Candidate Faculty of Informatics University of Lugano Web: http://www.inf.usi.ch/phd/ciampaglia/ Bertastra?e 36 ? 8003 Z?rich ? Switzerland From questions.anon at gmail.com Tue Aug 16 18:50:35 2011 From: questions.anon at gmail.com (questions anon) Date: Wed, 17 Aug 2011 08:50:35 +1000 Subject: [SciPy-User] numpy array append Message-ID: I would like to loop through a bunch of netcdf files in separate folders and select a particular time and then calculate the mean and plot this. I have been told to use append and make the selected times into a big array and then use numpy.mean but I can't seem to get the numpy array to work. The loop keeps calculating over the top of the last entry, if that makes sense? from netCDF4 import Dataset import matplotlib.pyplot as plt import numpy as N from mpl_toolkits.basemap import Basemap import os MainFolder=r"E:/temp_samples/" for (path, dirs, files) in os.walk(MainFolder): for dir in dirs: print dir for ncfile in files: if ncfile[-3:]=='.nc': ncfile=os.path.join(path,ncfile) ncfile=Dataset(ncfile, 'r+', 'NETCDF4') TSFC=ncfile.variables['T_SFC'][4::24] LAT=ncfile.variables['latitude'][:] LON=ncfile.variables['longitude'][:] TIME=ncfile.variables['time'][:] fillvalue=ncfile.variables['T_SFC']._FillValue ncfile.close() #calculate summary stats big_array=[] for i in TSFC: big_array.append(i) big_array=N.array(big_array) Mean=N.mean(big_array, axis=0) #plot output summary stats map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33, llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i') map.drawcoastlines() map.drawstates() x,y=map(*N.meshgrid(LON,LAT)) plt.title('Total Mean at 3pm') ticks=[-5,0,5,10,15,20,25,30,35,40,45,50] CS = map.contourf(x,y,Mean,ticks, cmap=plt.cm.jet) l,b,w,h =0.1,0.1,0.8,0.8 cax = plt.axes([l+w+0.025, b, 0.025, h]) plt.colorbar(CS,cax=cax, drawedges=True) plt.show() -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tsupinie at gmail.com Tue Aug 16 19:13:49 2011 From: tsupinie at gmail.com (Tim Supinie) Date: Tue, 16 Aug 2011 18:13:49 -0500 Subject: [SciPy-User] numpy array append In-Reply-To: References: Message-ID: Ah, TSFC is being overwritten every time you go through the "for ncfile in files" loop. So if you want to keep around all of them, you should declare a list before that loop called all_TSFC or something. Then instead of saying TSFC = ncfile... you would say all_TSFC.append(ncfile...) After that you could remove the second loop (for i in TSFC) and simply say # Convert all_TSFC to a numpy array. Not sure what happens # if some lists in all_TSFC are of different sizes, so you # should probably make sure they're all the same size. big_array = N.array(all_TSFC) # Take the mean of big_array along axis 0 (returns a # 1-dimensional numpy array the size of one of the lists in # all_TSFC). Mean = N.mean(big_array, axis=0) Hope that helps. Tim On Tue, Aug 16, 2011 at 5:50 PM, questions anon wrote: > I would like to loop through a bunch of netcdf files in separate folders > and select a particular time and then calculate the mean and plot this. I > have been told to use append and make the selected times into a big array > and then use numpy.mean but I can't seem to get the numpy array to work. The > loop keeps calculating over the top of the last entry, if that makes sense? > > from netCDF4 import Dataset > import matplotlib.pyplot as plt > import numpy as N > from mpl_toolkits.basemap import Basemap > import os > > MainFolder=r"E:/temp_samples/" > for (path, dirs, files) in os.walk(MainFolder): > for dir in dirs: > print dir > for ncfile in files: > if ncfile[-3:]=='.nc': > ncfile=os.path.join(path,ncfile) > ncfile=Dataset(ncfile, 'r+', 'NETCDF4') > TSFC=ncfile.variables['T_SFC'][4::24] > LAT=ncfile.variables['latitude'][:] > LON=ncfile.variables['longitude'][:] > TIME=ncfile.variables['time'][:] > fillvalue=ncfile.variables['T_SFC']._FillValue > ncfile.close() > > #calculate summary stats > big_array=[] > for i in TSFC: > big_array.append(i) > big_array=N.array(big_array) > Mean=N.mean(big_array, axis=0) > > #plot output summary stats > map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33, > > llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i') > map.drawcoastlines() > map.drawstates() > x,y=map(*N.meshgrid(LON,LAT)) > plt.title('Total Mean at 3pm') > ticks=[-5,0,5,10,15,20,25,30,35,40,45,50] > CS = map.contourf(x,y,Mean,ticks, cmap=plt.cm.jet) > l,b,w,h =0.1,0.1,0.8,0.8 > cax = plt.axes([l+w+0.025, b, 0.025, h]) > plt.colorbar(CS,cax=cax, drawedges=True) > plt.show() > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From questions.anon at gmail.com Tue Aug 16 20:42:59 2011 From: questions.anon at gmail.com (questions anon) Date: Wed, 17 Aug 2011 10:42:59 +1000 Subject: [SciPy-User] numpy array append In-Reply-To: References: Message-ID: Thanks Tim, that worked although I did run into a problem with different the sizes of each file. Each netcdf file contains a month of hourly data and some of those months are 31 days and some are 28 or 30. Is there a way to get this to work? 
here is the error I receive: *Traceback (most recent call last):* * File "d:\documents and settings\SLBurns\Work\My Dropbox\Python_code\calculate_the_mean_across_multiple_netcdf_files_in_multiple_folders.py", line 39, in * * Mean=N.mean(big_array, axis=0)* * File "C:\Python27\lib\site-packages\numpy\core\fromnumeric.py", line 2374, in mean* * return mean(axis, dtype, out)* *ValueError: operands could not be broadcast together with shapes (31,106,193) (28,106,193)* from netCDF4 import Dataset import numpy as N import os MainFolder=r"E:/temp_samples/" all_TSFC=[] for (path, dirs, files) in os.walk(MainFolder): for dir in dirs: print dir path=path+'/' for ncfile in files: if ncfile[-3:]=='.nc': ncfile=os.path.join(path,ncfile) ncfile=Dataset(ncfile, 'r+', 'NETCDF4') TSFC=ncfile.variables['T_SFC'][4::24,:,:] all_TSFC.append(TSFC) big_array=N.array(all_TSFC) Mean=N.mean(big_array, axis=0) print "the mean for three months at 3pm is", Mean On Wed, Aug 17, 2011 at 9:13 AM, Tim Supinie wrote: > Ah, TSFC is being overwritten every time you go through the "for ncfile in > files" loop. So if you want to keep around all of them, you should declare > a list before that loop called all_TSFC or something. Then instead of > saying > > TSFC = ncfile... > > you would say > > all_TSFC.append(ncfile...) > > After that you could remove the second loop (for i in TSFC) and simply say > > # Convert all_TSFC to a numpy array. Not sure what happens > # if some lists in all_TSFC are of different sizes, so you > # should probably make sure they're all the same size. > big_array = N.array(all_TSFC) > > # Take the mean of big_array along axis 0 (returns a > # 1-dimensional numpy array the size of one of the lists in > # all_TSFC). > > Mean = N.mean(big_array, axis=0) > > Hope that helps. > > Tim > > On Tue, Aug 16, 2011 at 5:50 PM, questions anon wrote: > >> I would like to loop through a bunch of netcdf files in separate folders >> and select a particular time and then calculate the mean and plot this. I >> have been told to use append and make the selected times into a big array >> and then use numpy.mean but I can't seem to get the numpy array to work. The >> loop keeps calculating over the top of the last entry, if that makes sense? 
>> >> from netCDF4 import Dataset >> import matplotlib.pyplot as plt >> import numpy as N >> from mpl_toolkits.basemap import Basemap >> import os >> >> MainFolder=r"E:/temp_samples/" >> for (path, dirs, files) in os.walk(MainFolder): >> for dir in dirs: >> print dir >> for ncfile in files: >> if ncfile[-3:]=='.nc': >> ncfile=os.path.join(path,ncfile) >> ncfile=Dataset(ncfile, 'r+', 'NETCDF4') >> TSFC=ncfile.variables['T_SFC'][4::24] >> LAT=ncfile.variables['latitude'][:] >> LON=ncfile.variables['longitude'][:] >> TIME=ncfile.variables['time'][:] >> fillvalue=ncfile.variables['T_SFC']._FillValue >> ncfile.close() >> >> #calculate summary stats >> big_array=[] >> for i in TSFC: >> big_array.append(i) >> big_array=N.array(big_array) >> Mean=N.mean(big_array, axis=0) >> >> #plot output summary stats >> map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33, >> >> llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i') >> map.drawcoastlines() >> map.drawstates() >> x,y=map(*N.meshgrid(LON,LAT)) >> plt.title('Total Mean at 3pm') >> ticks=[-5,0,5,10,15,20,25,30,35,40,45,50] >> CS = map.contourf(x,y,Mean,ticks, cmap=plt.cm.jet) >> l,b,w,h =0.1,0.1,0.8,0.8 >> cax = plt.axes([l+w+0.025, b, 0.025, h]) >> plt.colorbar(CS,cax=cax, drawedges=True) >> plt.show() >> >> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Aug 16 22:27:37 2011 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 16 Aug 2011 21:27:37 -0500 Subject: [SciPy-User] numpy array append In-Reply-To: References: Message-ID: On Tue, Aug 16, 2011 at 19:42, questions anon wrote: > Thanks Tim, that worked although I did run into a problem with different the > sizes of each file. > Each netcdf file contains a month of hourly data and some of those months > are 31 days and some are 28 or 30. Is there a way to get this to work? You probably want the following: big_array = N.concatenate(all_TSFC) Mean = big_array.mean(axis=0) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From questions.anon at gmail.com Tue Aug 16 23:49:09 2011 From: questions.anon at gmail.com (questions anon) Date: Wed, 17 Aug 2011 13:49:09 +1000 Subject: [SciPy-User] numpy array append In-Reply-To: References: Message-ID: Excellent, thank you, that worked!! On Wed, Aug 17, 2011 at 12:27 PM, Robert Kern wrote: > On Tue, Aug 16, 2011 at 19:42, questions anon > wrote: > > Thanks Tim, that worked although I did run into a problem with different > the > > sizes of each file. > > Each netcdf file contains a month of hourly data and some of those months > > are 31 days and some are 28 or 30. Is there a way to get this to work? > > You probably want the following: > > big_array = N.concatenate(all_TSFC) > Mean = big_array.mean(axis=0) > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." 
> -- Umberto Eco > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From questions.anon at gmail.com Wed Aug 17 02:17:26 2011 From: questions.anon at gmail.com (questions anon) Date: Wed, 17 Aug 2011 16:17:26 +1000 Subject: [SciPy-User] How to ignore NaN values and -32767 in numpy array Message-ID: I am trying to run simple stats on a bunch of monthly netcdfs files with hourly temperature data. With help from this list I am able to loop through a calculate the mean, but in doing this I have discovered that there are a some hours that have no values or -32767. I am sure there are some cases where I could slice out the section (if I know where they are) but is there a way I could just ignore these hours and calculate the mean? I have found something called "numpy.isnan" but this does not seem to work. from netCDF4 import Dataset import matplotlib.pyplot as plt import numpy as N from mpl_toolkits.basemap import Basemap import os MainFolder=r"E:/temp_samples/" all_TSFC=[] for (path, dirs, files) in os.walk(MainFolder): for dir in dirs: print dir path=path+'/' for ncfile in files: if ncfile[-3:]=='.nc': ncfile=os.path.join(path,ncfile) ncfile=Dataset(ncfile, 'r+', 'NETCDF4') TSFC=ncfile.variables['T_SFC'][:] LAT=ncfile.variables['latitude'][:] LON=ncfile.variables['longitude'][:] TIME=ncfile.variables['time'][:] fillvalue=ncfile.variables['T_SFC']._FillValue ncfile.close() #combine all TSFC to make one array for analyses all_TSFC.append(TSFC) big_array=N.concatenate(all_TSFC) Mean=big_array.mean(axis=0) print "the mean is", Mean #plot output summary stats map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33, llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i') x,y=map(*N.meshgrid(LON,LAT)) CS = map.contourf(x,y,Mean, cmap=plt.cm.jet) l,b,w,h =0.1,0.1,0.8,0.8 cax = plt.axes([l+w+0.025, b, 0.025, h]) plt.colorbar(CS,cax=cax, drawedges=True) plt.show() -------------- next part -------------- An HTML attachment was scrubbed... URL: From jrocher at enthought.com Wed Aug 17 04:11:00 2011 From: jrocher at enthought.com (Jonathan Rocher) Date: Wed, 17 Aug 2011 10:11:00 +0200 Subject: [SciPy-User] How to ignore NaN values and -32767 in numpy array In-Reply-To: References: Message-ID: Hi, you can create a mask cutting out all the values you don't want to consider in your mean and compute the mean of the "masked array". To illustrate the concept, look at: In [1]: a = array([1,2,3,NaN,5]) In [4]: isnan(a) Out[4]: array([False, False, False, True, False], dtype=bool) In [5]: ~isnan(a) Out[5]: array([ True, True, True, False, True], dtype=bool) In [11]: mask = (~isnan(a)) & (a != 3) In [12]: mask Out[12]: array([ True, True, False, False, True], dtype=bool) In [13]: a[mask] Out[13]: array([ 1., 2., 5.]) In [14]: a[mask].mean() Out[14]: 2.6666666666666665 In you code, you need to use something similar before you compute the mean. Hope this helps, Jonathan On Wed, Aug 17, 2011 at 8:17 AM, questions anon wrote: > I am trying to run simple stats on a bunch of monthly netcdfs files with > hourly temperature data. With help from this list I am able to loop through > a calculate the mean, but in doing this I have discovered that there are a > some hours that have no values or -32767. 
I am sure there are some cases > where I could slice out the section (if I know where they are) but is there > a way I could just ignore these hours and calculate the mean? > I have found something called "numpy.isnan" but this does not seem to work. > > from netCDF4 import Dataset > import matplotlib.pyplot as plt > import numpy as N > from mpl_toolkits.basemap import Basemap > import os > > MainFolder=r"E:/temp_samples/" > > all_TSFC=[] > for (path, dirs, files) in os.walk(MainFolder): > for dir in dirs: > print dir > path=path+'/' > for ncfile in files: > if ncfile[-3:]=='.nc': > ncfile=os.path.join(path,ncfile) > ncfile=Dataset(ncfile, 'r+', 'NETCDF4') > TSFC=ncfile.variables['T_SFC'][:] > LAT=ncfile.variables['latitude'][:] > LON=ncfile.variables['longitude'][:] > TIME=ncfile.variables['time'][:] > fillvalue=ncfile.variables['T_SFC']._FillValue > ncfile.close() > > #combine all TSFC to make one array for analyses > all_TSFC.append(TSFC) > > big_array=N.concatenate(all_TSFC) > Mean=big_array.mean(axis=0) > print "the mean is", Mean > > #plot output summary stats > map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33, > > llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i') > x,y=map(*N.meshgrid(LON,LAT)) > CS = map.contourf(x,y,Mean, cmap=plt.cm.jet) > l,b,w,h =0.1,0.1,0.8,0.8 > cax = plt.axes([l+w+0.025, b, 0.025, h]) > plt.colorbar(CS,cax=cax, drawedges=True) > > plt.show() > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -- Jonathan Rocher, PhD Scientific software developer Enthought, Inc. jrocher at enthought.com 1-512-536-1057 http://www.enthought.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From klonuo at gmail.com Wed Aug 17 10:28:11 2011 From: klonuo at gmail.com (Klonuo Umom) Date: Wed, 17 Aug 2011 16:28:11 +0200 Subject: [SciPy-User] Building SciPy on Debian with Intel compilers In-Reply-To: References: Message-ID: I got new Ubuntu 11.04 PC, and actually I accidentally found older blog post which helped me pass this old sparsetools problems. 
Blog is here: http://marklodato.github.com/ In brief, solution was to first run custom: `python setup.py config` and then link sparsetools with icpc by hand: ======================================================================== for x in csr csc coo bsr dia; do icpc -xHost -O3 -fPIC -shared \ build/temp.linux-x86_64-2.6/scipy/sparse/sparsetools/${x}_wrap.o \ -o build/lib.linux-x86_64-2.6/scipy/sparse/sparsetools/_${x}.so done icpc -xHost -O3 -fPIC -openmp -shared \ build/temp.linux-x86_64-2.6/scipy/interpolate/src/_interpolate.o \ -o build/lib.linux-x86_64-2.6/scipy/interpolate/_interpolate.so ------------------------------------------------------------------------ Paths are dependant on Intel tools and Scipy version, and it's trivial to correct them So I build Numpy and Scipy with latest Intel Parallel Studio XE 2011 update 2 + SparseSuite (AMD and UMFPACK), but then running test I got this: ======================================================================== *** libmkl_p4p.so *** failed with error : /opt/intel/composerxe-2011.4.191/mkl/lib/ia32/libmkl_p4p.so: undefined symbol: i_malloc *** libmkl_def.so *** failed with error : /opt/intel/composerxe-2011.4.191/mkl/lib/ia32/libmkl_def.so: undefined symbol: i_malloc MKL FATAL ERROR: Cannot load neither libmkl_p4p.so nor libmkl_def.so ------------------------------------------------------------------------ Workaround is this: ======================================================================== export LD_PRELOAD=/opt/intel/mkl/lib/ia32/libmkl_core.so:/opt/intel/mkl/lib/ia32/libmkl_sequential.so ------------------------------------------------------------------------ Now I run tests again Numpy: FAILED (KNOWNFAIL=3, SKIP=4, failures=4) more info: http://pastebin.com/raw.php?i=m3sns5xU Scipy: FAILED (KNOWNFAIL=12, SKIP=35, errors=1, failures=3) more info: http://pastebin.com/raw.php?i=tvqg8PJ1 I wish someone reply about this 'export LD_PRELOAD' workaround, and also maybe correct online documentation about building Numpy/Scipy with Intel compilers - at least with earlier David's corrections in this thread about 'scipy/spatial/qhull/src/qhull_a.h', as if user does not understand what are C++ templates, he/she could hardly figure what to do. About sparsetools I'm happy I got it working, and this issue is open as of Numpy 1.3.0 at least it seems Cheers From gorkypl at gmail.com Wed Aug 17 17:13:52 2011 From: gorkypl at gmail.com (=?UTF-8?B?UGF3ZcWC?=) Date: Wed, 17 Aug 2011 23:13:52 +0200 Subject: [SciPy-User] My attempt to fix an issue with separate scales for left and right axis in scikits.timeseries - correct? Message-ID: Hello, I've tried to solve an issue with scikits.timeseries doesn't allowing to use separate scales for left and right axis with recent versions of matplotlib, like in this example: http://pytseries.sourceforge.net/lib.plotting.examples.html#separate-scales-for-left-and-right-axis The issue was raised twice: http://mail.scipy.org/pipermail/scipy-user/2011-April/029046.html http://permalink.gmane.org/gmane.comp.python.scientific.devel/14645 The example (and my code) works after changing single line 1196: - fsp_alt_args = (fsp._rows, fsp._cols, fsp._num + 1) + fsp_alt_args = fsp.get_geometry() I've done this after examining the file https://github.com/matplotlib/matplotlib/blob/master/lib/matplotlib/axes.py (method get_geometry in line 8369). Can anyone take a look at the code and say if it makes sense? I'm certainly not an expert in Python and I'm not sure if this can be so simple and yet correct. 
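[For anyone checking the suggestion: on matplotlib of this vintage, get_geometry() on a subplot axes returns the same 1-based (rows, cols, num) triple that add_subplot() accepts, which is why it can stand in for the private _rows/_cols/_num attributes. A quick throwaway sketch:]

import matplotlib.pyplot as plt

fig = plt.figure()
fsp = fig.add_subplot(2, 3, 4)               # 2 rows, 3 cols, 1-based slot 4
print(fsp.get_geometry())                    # -> (2, 3, 4)
alt = fig.add_subplot(*fsp.get_geometry())   # same grid position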
Side note: I know scikits.timeseries may be abandoned for a while now (I've traced the recent discussion of its status), but I use it heavily in climate analysis and need to keep my code alive for some time. greetings, Pawe? Rumian From rmorgan466 at gmail.com Thu Aug 18 06:20:51 2011 From: rmorgan466 at gmail.com (Rita) Date: Thu, 18 Aug 2011 06:20:51 -0400 Subject: [SciPy-User] scipy.stats Message-ID: I am trying to import scipy.stats but I keep getting an import Error, ...scipy/special/_cephes.so: undefined symbol: __libm_sse2_sincos I compiled Numpy with Intel C compiler and Scipy compiled ok but just cant get this working. Any advise? -- --- Get your facts first, then you can distort them as you please.-- -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris at simplistix.co.uk Thu Aug 18 10:26:09 2011 From: chris at simplistix.co.uk (Chris Withers) Date: Thu, 18 Aug 2011 07:26:09 -0700 Subject: [SciPy-User] Enthought Python Distribution questions Message-ID: <4E4D2101.7060505@simplistix.co.uk> Hi All, A couple of questions about EPD, if this is the wrong list, please point me at the right one: - How can I install EPD in such a way that it leaves my system python completely alone? I installed it on my Mac and suddenly I have Python 2.7 with all the libraries everywhere, which isn't what I want :-S I'm now too petrified to try and install EPD on any of my Debian, Red Hat or Ubuntu servers in case the same thing is done there, which would have much more catastrophic consequences. I'm looking for something akin to 'make altinstall' for CPython, I'd love to be able to get a python-epd-x.y.z in the same way that gives me just a pythonx.y. - Once I have EPD installed, where can I find all the documentation for the included packages? I spend a lot of time working on trains and planes, and having the docs available offline would be extremely useful :-) cheers, Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk From jrocher at enthought.com Thu Aug 18 10:45:14 2011 From: jrocher at enthought.com (Jonathan Rocher) Date: Thu, 18 Aug 2011 16:45:14 +0200 Subject: [SciPy-User] Enthought Python Distribution questions In-Reply-To: <4E4D2101.7060505@simplistix.co.uk> References: <4E4D2101.7060505@simplistix.co.uk> Message-ID: Hi Chris, Correct, this isn't the appropriate mailing list. To request information about EPD, you should contact info at enthought.com or epd-support at enthought.com once you are a subsciber. For your information, 1. EPD install its own python executable and libraries and doesn't interfere with any existing installed instances of python. The PATH environment variable allows you to select which one will be launched. 2. EPD comes with a large number of code samples/examples for most of the packages included, in particular the Enthought Tool Suite. The documentation is not included though, to save download time. It also comes with a DocLinks folder with links to each package home page for you to download the appropriate material. Best regards, Jonathan Rocher On Thu, Aug 18, 2011 at 4:26 PM, Chris Withers wrote: > Hi All, > > A couple of questions about EPD, if this is the wrong list, please point > me at the right one: > > - How can I install EPD in such a way that it leaves my system python > completely alone? 
I installed it on my Mac and suddenly I have Python > 2.7 with all the libraries everywhere, which isn't what I want :-S > I'm now too petrified to try and install EPD on any of my Debian, Red > Hat or Ubuntu servers in case the same thing is done there, which would > have much more catastrophic consequences. > > I'm looking for something akin to 'make altinstall' for CPython, I'd > love to be able to get a python-epd-x.y.z in the same way that gives me > just a pythonx.y. > > - Once I have EPD installed, where can I find all the documentation for > the included packages? I spend a lot of time working on trains and > planes, and having the docs available offline would be extremely useful :-) > > cheers, > > Chris > > -- > Simplistix - Content Management, Batch Processing & Python Consulting > - http://www.simplistix.co.uk > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Jonathan Rocher, PhD Scientific software developer Enthought, Inc. jrocher at enthought.com 1-512-536-1057 http://www.enthought.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From lists at hilboll.de Thu Aug 18 10:45:27 2011 From: lists at hilboll.de (Andreas) Date: Thu, 18 Aug 2011 16:45:27 +0200 Subject: [SciPy-User] Enthought Python Distribution questions In-Reply-To: <4E4D2101.7060505@simplistix.co.uk> References: <4E4D2101.7060505@simplistix.co.uk> Message-ID: <4E4D2587.3040403@hilboll.de> Hi Chris, I installed EPD 7.1 in my home directory, in ~/lib/epd-7.1-1-x86_64/. Then I installed virtualenv and virtualenvwrapper in the EPD directory, by using $ ~/lib/epd-7.1-1-x86_64/bin/pip install virtualenv $ ~/lib/epd-7.1-1-x86_64/bin/pip install virtualenvwrapper Now, I can just do something like $ source ~/lib/epd-7.1-1-x86_64/bin/virtualenvwrapper.sh $ mkvirtualenv myepd In the virtualenv, I need to make sure that PATH and PYTHONPATH are set correctly. For this, I create a postactivate script: $ cat ~/.virtualenvs/myepd/bin/postactivate #!/bin/bash # This hook is run after this virtualenv is activated. export PATH=~/.virtualenvs/myepd/bin:~/lib/epd-7.1-1-x86_64/bin:$PATH export PYTHONPATH=~/lib/epd-7.1-1-x86_64/lib/python2.7/site-packages export LD_LIBRARY_PATH=~/lib/epd-7.1-1-x86_64/lib Now, you can just switch to your EPD environment using the ``workon`` command: $ python -V && which python Python 2.6.5 /usr/bin/python $ source ~/lib/epd-7.1-1-x86_64/bin/virtualenvwrapper.sh $ workon myepd (myepd)$ python -V && which python Python 2.7.2 -- CUSTOM /home/USERNAME/.virtualenvs/myepd/bin/python Hope this helps! Cheers, Andreas. On 2011-08-18 16:26, Chris Withers wrote: > Hi All, > > A couple of questions about EPD, if this is the wrong list, please point > me at the right one: > > - How can I install EPD in such a way that it leaves my system python > completely alone? I installed it on my Mac and suddenly I have Python > 2.7 with all the libraries everywhere, which isn't what I want :-S > I'm now too petrified to try and install EPD on any of my Debian, Red > Hat or Ubuntu servers in case the same thing is done there, which would > have much more catastrophic consequences. > > I'm looking for something akin to 'make altinstall' for CPython, I'd > love to be able to get a python-epd-x.y.z in the same way that gives me > just a pythonx.y. > > - Once I have EPD installed, where can I find all the documentation for > the included packages? 
I spend a lot of time working on trains and > planes, and having the docs available offline would be extremely useful :-) > > cheers, > > Chris > From kwatford at gmail.com Thu Aug 18 10:45:26 2011 From: kwatford at gmail.com (Ken Watford) Date: Thu, 18 Aug 2011 10:45:26 -0400 Subject: [SciPy-User] Enthought Python Distribution questions In-Reply-To: <4E4D2101.7060505@simplistix.co.uk> References: <4E4D2101.7060505@simplistix.co.uk> Message-ID: On Thu, Aug 18, 2011 at 10:26 AM, Chris Withers wrote: > - How can I install EPD in such a way that it leaves my system python > completely alone? I installed it on my Mac and suddenly I have Python > 2.7 with all the libraries everywhere, which isn't what I want :-S > I'm now too petrified to try and install EPD on any of my Debian, Red > Hat or Ubuntu servers in case the same thing is done there, which would > have much more catastrophic consequences. The situation is better on Linux - the installer asks where you want it, and it only touches that directory. You can install it as a normal user in a private directory. You can then add its bin directory to your path or not. From chris at simplistix.co.uk Thu Aug 18 10:51:57 2011 From: chris at simplistix.co.uk (Chris Withers) Date: Thu, 18 Aug 2011 07:51:57 -0700 Subject: [SciPy-User] Enthought Python Distribution questions In-Reply-To: References: <4E4D2101.7060505@simplistix.co.uk> Message-ID: <4E4D270D.9090100@simplistix.co.uk> On 18/08/2011 07:45, Jonathan Rocher wrote: > Correct, this isn't the appropriate mailing list. To request information > about EPD, you should contact info at enthought.com > or epd-support at enthought.com > once you are a subsciber. I am a subscriber, so I just forwarded these questions there too :-) I have EPD 7.0-2 (32-bit) installed. > 1. EPD install its own python executable and libraries and doesn't > interfere with any existing installed instances of python. The PATH > environment variable allows you to select which one will be launched. Please can you verify that this is the case on MacOS X? > 2. EPD comes with a large number of code samples/examples for most of > the packages included, in particular the Enthought Tool Suite. Where would I find these on MacOS X? > The > documentation is not included though, to save download time. Where can I find a bulk download of all the docs for offline use? > It also > comes with a DocLinks folder with links to each package home page for > you to download the appropriate material. Again on Mac OS X, where would I find this DocLinks folder? cheers, Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk From aronne.merrelli at gmail.com Thu Aug 18 12:57:37 2011 From: aronne.merrelli at gmail.com (Aronne Merrelli) Date: Thu, 18 Aug 2011 11:57:37 -0500 Subject: [SciPy-User] Enthought Python Distribution questions In-Reply-To: <4E4D270D.9090100@simplistix.co.uk> References: <4E4D2101.7060505@simplistix.co.uk> <4E4D270D.9090100@simplistix.co.uk> Message-ID: On Thu, Aug 18, 2011 at 9:51 AM, Chris Withers wrote: > On 18/08/2011 07:45, Jonathan Rocher wrote: > > Correct, this isn't the appropriate mailing list. To request information > > about EPD, you should contact info at enthought.com > > or epd-support at enthought.com > > once you are a subsciber. > > I am a subscriber, so I just forwarded these questions there too :-) > I have EPD 7.0-2 (32-bit) installed. > > > 1. 
EPD install its own python executable and libraries and doesn't > > interfere with any existing installed instances of python. The PATH > > environment variable allows you to select which one will be launched. > > Please can you verify that this is the case on MacOS X? > > > 2. EPD comes with a large number of code samples/examples for most of > > the packages included, in particular the Enthought Tool Suite. > > Where would I find these on MacOS X? > > > The > > documentation is not included though, to save download time. > > Where can I find a bulk download of all the docs for offline use? > > > It also > > comes with a DocLinks folder with links to each package home page for > > you to download the appropriate material. > > Again on Mac OS X, where would I find this DocLinks folder? > > On my Mac (OS X 10.6.8), the EPD stuff is installed here: (there are DocLinks and Example subdirectories): /Library/Frameworks/Python.framework/Versions/Current/ It looks like the "standard" python installations that come with MacOS are here: /System/Library/Frameworks/Python.framework/Versions/ I also have several python versions installed by macports into /opt/local/. I have not yet had any problems with different installations conflicting with each other, they each seem to have their own files (so I'm wasting a lot of disk space). My path is set to the EPD version which I almost always use, but it is easy to change if needed; right now the few times I need another version I just run it directly. (e.g. type /usr/bin/python) Aronne -------------- next part -------------- An HTML attachment was scrubbed... URL: From dasneutron at gmail.com Thu Aug 18 14:21:30 2011 From: dasneutron at gmail.com (Piotr Zolnierczuk) Date: Thu, 18 Aug 2011 14:21:30 -0400 Subject: [SciPy-User] rotation matrices, Euler angles and all that jazz Message-ID: Hi, The question has probably been asked here before ... Is there a scipy/numpy module which facilitates computation of rotation matrices, Euler angles, etc.? They are ever present in many branches of physics and re-inventing them again seems like a waste of time. I've been using a module written by Chrisoph Gohlke (UCI) http://www.lfd.uci.edu/~gohlke/code/transformations.py.html, but it would be nice if I could use something that is already included in SciPy. Piotr From cgohlke at uci.edu Thu Aug 18 15:05:43 2011 From: cgohlke at uci.edu (Christoph Gohlke) Date: Thu, 18 Aug 2011 12:05:43 -0700 Subject: [SciPy-User] rotation matrices, Euler angles and all that jazz In-Reply-To: References: Message-ID: <4E4D6287.6010101@uci.edu> On 8/18/2011 11:21 AM, Piotr Zolnierczuk wrote: > Hi, > The question has probably been asked here before ... > > Is there a scipy/numpy module which facilitates computation of > rotation matrices, Euler angles, etc.? > They are ever present in many branches of physics and re-inventing > them again seems like a waste of time. > > I've been using a module written by Chrisoph Gohlke (UCI) > http://www.lfd.uci.edu/~gohlke/code/transformations.py.html, but it > would be nice if I could use something that is already included in > SciPy. > > Piotr A quaternion dtype will probably make into the next version of numpy . That will be able to replace ~1/3 of the transformations.py module. The transformations.py module was shortly discussed on the numpy list in 2009 . I had an off list discussion on how to integrate some of the functions in numpy/scipy. 
Consent was that if such a module makes it into numpy/scipy it should : 1) support any float dtypes 2) support 2D, 3D, and 3D homogeneous coordinates. 3) support both "column vectors on the right" and "row vectors on the left" conventions 4) Christoph From dperlman at wisc.edu Thu Aug 18 23:16:28 2011 From: dperlman at wisc.edu (David Perlman) Date: Thu, 18 Aug 2011 22:16:28 -0500 Subject: [SciPy-User] interpolate.interp1d: can't figure out what I'm doing wrong Message-ID: <3350F2F7-B71F-4055-8F3D-7B80EEA8B1D8@wisc.edu> I am absolutely sure that my x_new range doesn't go outside my original x, and yet it is giving me an error saying that it is: old time range: [0.0, 200.00000298023224, 400.00000596046448, 600.00000894069672] new time range: [0, 100.0, 200.0, 300.0, 400.0, 500.0, 600.0] Traceback (most recent call last): File "/home/perlman/bin/pretty_fmri.py", line 840, in main() File "/home/perlman/bin/pretty_fmri.py", line 130, in main dataProcessor.interpolate() File "/home/perlman/bin/pretty_fmri.py", line 249, in interpolate self.interpolateddata=f(xnew) File "/usr/local/Python/Versions/2.6.5/lib/python2.6/site-packages/scipy/interpolate/interpolate.py", line 333, in __call__ out_of_bounds = self._check_bounds(x_new) File "/usr/local/Python/Versions/2.6.5/lib/python2.6/site-packages/scipy/interpolate/interpolate.py", line 391, in _check_bounds raise ValueError("A value in x_new is above the interpolation " ValueError: A value in x_new is above the interpolation range. Here is the snippet of code where this is going wrong: oldNum=numpy.shape(self.data)[0] endTime=(oldNum-1)*oldTR x=numpy.linspace(0, endTime, oldNum) if self.opts.verbose: print "old time range:", list(x) f=scipy.interpolate.interp1d(x, self.data, self.opts.interp, 0) # make the new time points xnew=self.crange(0, endTime, newTR) if self.opts.verbose: print "new time range:", list(xnew) self.interpolateddata=f(xnew) You can see from that, that there is no code between the displayed ranges and the calling of the interpolator. So I am at a loss for how to figure out what's going on here! Any help would be greatly appreciated. I have been looking into this for a while, even to the point of looking at the source code for the interp1d function. :-/ -- -dave---------------------------------------------------------------- "Let us work without theorizing... 'tis the only way to make life endurable." - Voltaire, Candide, Chapter 30 From dperlman at wisc.edu Thu Aug 18 23:32:08 2011 From: dperlman at wisc.edu (David Perlman) Date: Thu, 18 Aug 2011 22:32:08 -0500 Subject: [SciPy-User] interpolate.interp1d: can't figure out what I'm doing wrong In-Reply-To: <3350F2F7-B71F-4055-8F3D-7B80EEA8B1D8@wisc.edu> References: <3350F2F7-B71F-4055-8F3D-7B80EEA8B1D8@wisc.edu> Message-ID: Well at least I got it to give me a different error message. I thought it might not like the generator instead of list, so I converted to list first. 
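[For reference, a self-contained sketch of the interp1d call pattern under discussion: 2-D data interpolated along axis 0, with toy shapes and invented values. With the default bounds_error=True, any xnew value even marginally outside [x[0], x[-1]] raises the "above the interpolation range" ValueError; passing bounds_error=False together with a fill_value turns that into filling instead.]

import numpy as np
from scipy import interpolate

x = np.linspace(0.0, 600.0, 4)       # 4 original time points
data = np.random.rand(4, 5)          # (time, series) toy data
f = interpolate.interp1d(x, data, kind='linear', axis=0)

xnew = np.linspace(0.0, 600.0, 7)    # stays inside [x[0], x[-1]]
print(f(xnew).shape)                 # (7, 5)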
Now I get this error: old time range: [0.0, 200.00000298023224, 400.00000596046448, 600.00000894069672] new time range: [0, 100.0, 200.0, 300.0, 400.0, 500.0, 600.0] Traceback (most recent call last): File "/home/perlman/bin/pretty_fmri.py", line 840, in main() File "/home/perlman/bin/pretty_fmri.py", line 130, in main dataProcessor.interpolate() File "/home/perlman/bin/pretty_fmri.py", line 249, in interpolate self.interpolateddata=f(list(xnew)) File "/usr/local/Python/Versions/2.6.5/lib/python2.6/site-packages/scipy/interpolate/interpolate.py", line 362, in __call__ y_new[out_of_bounds] = self.fill_value IndexError: invalid index On Aug 18, 2011, at 10:16 PM, David Perlman wrote: > I am absolutely sure that my x_new range doesn't go outside my original x, and yet it is giving me an error saying that it is: > old time range: [0.0, 200.00000298023224, 400.00000596046448, 600.00000894069672] > new time range: [0, 100.0, 200.0, 300.0, 400.0, 500.0, 600.0] > Traceback (most recent call last): > File "/home/perlman/bin/pretty_fmri.py", line 840, in > main() > File "/home/perlman/bin/pretty_fmri.py", line 130, in main > dataProcessor.interpolate() > File "/home/perlman/bin/pretty_fmri.py", line 249, in interpolate > self.interpolateddata=f(xnew) > File "/usr/local/Python/Versions/2.6.5/lib/python2.6/site-packages/scipy/interpolate/interpolate.py", line 333, in __call__ > out_of_bounds = self._check_bounds(x_new) > File "/usr/local/Python/Versions/2.6.5/lib/python2.6/site-packages/scipy/interpolate/interpolate.py", line 391, in _check_bounds > raise ValueError("A value in x_new is above the interpolation " > ValueError: A value in x_new is above the interpolation range. > > > Here is the snippet of code where this is going wrong: > oldNum=numpy.shape(self.data)[0] > endTime=(oldNum-1)*oldTR > x=numpy.linspace(0, endTime, oldNum) > if self.opts.verbose: print "old time range:", list(x) > f=scipy.interpolate.interp1d(x, self.data, self.opts.interp, 0) > # make the new time points > xnew=self.crange(0, endTime, newTR) > if self.opts.verbose: print "new time range:", list(xnew) > self.interpolateddata=f(xnew) > > > You can see from that, that there is no code between the displayed ranges and the calling of the interpolator. So I am at a loss for how to figure out what's going on here! > > Any help would be greatly appreciated. I have been looking into this for a while, even to the point of looking at the source code for the interp1d function. :-/ > > -- > -dave---------------------------------------------------------------- > "Let us work without theorizing... 'tis the only way to make life endurable." > - Voltaire, Candide, Chapter 30 > -- -dave---------------------------------------------------------------- "Let us work without theorizing... 'tis the only way to make life endurable." - Voltaire, Candide, Chapter 30 From questions.anon at gmail.com Fri Aug 19 01:01:57 2011 From: questions.anon at gmail.com (questions anon) Date: Fri, 19 Aug 2011 15:01:57 +1000 Subject: [SciPy-User] How to ignore NaN values and -32767 in numpy array In-Reply-To: References: Message-ID: Thank you, what you suggested worked but now I don't think that is my problem. 
Within the dataset I am trying to calculate the mean from it appears there are some hours with no data, the output is: [[[-- -- -- ..., -- -- --] [-- -- -- ..., -- -- --] [-- -- -- ..., -- -- --] ..., [-- -- -- ..., -- -- --] [-- -- -- ..., -- -- --] [-- -- -- ..., -- -- --]]] So I would assume these would be ignored when I calculate the mean but when I make all my files/times into one big array these blanks turn into -32767. Is there some way to avoid this? Thanks On Wed, Aug 17, 2011 at 4:17 PM, questions anon wrote: > I am trying to run simple stats on a bunch of monthly netcdfs files with > hourly temperature data. With help from this list I am able to loop through > a calculate the mean, but in doing this I have discovered that there are a > some hours that have no values or -32767. I am sure there are some cases > where I could slice out the section (if I know where they are) but is there > a way I could just ignore these hours and calculate the mean? > I have found something called "numpy.isnan" but this does not seem to work. > > from netCDF4 import Dataset > import matplotlib.pyplot as plt > import numpy as N > from mpl_toolkits.basemap import Basemap > import os > > MainFolder=r"E:/temp_samples/" > > all_TSFC=[] > for (path, dirs, files) in os.walk(MainFolder): > for dir in dirs: > print dir > path=path+'/' > for ncfile in files: > if ncfile[-3:]=='.nc': > ncfile=os.path.join(path,ncfile) > ncfile=Dataset(ncfile, 'r+', 'NETCDF4') > TSFC=ncfile.variables['T_SFC'][:] > LAT=ncfile.variables['latitude'][:] > LON=ncfile.variables['longitude'][:] > TIME=ncfile.variables['time'][:] > fillvalue=ncfile.variables['T_SFC']._FillValue > ncfile.close() > > #combine all TSFC to make one array for analyses > all_TSFC.append(TSFC) > > big_array=N.concatenate(all_TSFC) > Mean=big_array.mean(axis=0) > print "the mean is", Mean > > #plot output summary stats > map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33, > > llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i') > x,y=map(*N.meshgrid(LON,LAT)) > CS = map.contourf(x,y,Mean, cmap=plt.cm.jet) > l,b,w,h =0.1,0.1,0.8,0.8 > cax = plt.axes([l+w+0.025, b, 0.025, h]) > plt.colorbar(CS,cax=cax, drawedges=True) > > plt.show() > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Fri Aug 19 04:47:20 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Fri, 19 Aug 2011 10:47:20 +0200 Subject: [SciPy-User] interpolate.interp1d: can't figure out what I'm doing wrong In-Reply-To: References: <3350F2F7-B71F-4055-8F3D-7B80EEA8B1D8@wisc.edu> Message-ID: On Fri, Aug 19, 2011 at 5:32 AM, David Perlman wrote: > Well at least I got it to give me a different error message. I thought it > might not like the generator instead of list, so I converted to list first. 
> Now I get this error: > > old time range: [0.0, 200.00000298023224, 400.00000596046448, > 600.00000894069672] > new time range: [0, 100.0, 200.0, 300.0, 400.0, 500.0, 600.0] > Traceback (most recent call last): > File "/home/perlman/bin/pretty_fmri.py", line 840, in > main() > File "/home/perlman/bin/pretty_fmri.py", line 130, in main > dataProcessor.interpolate() > File "/home/perlman/bin/pretty_fmri.py", line 249, in interpolate > self.interpolateddata=f(list(xnew)) > File > "/usr/local/Python/Versions/2.6.5/lib/python2.6/site-packages/scipy/interpolate/interpolate.py", > line 362, in __call__ > y_new[out_of_bounds] = self.fill_value > IndexError: invalid index > > > > On Aug 18, 2011, at 10:16 PM, David Perlman wrote: > > > I am absolutely sure that my x_new range doesn't go outside my original > x, and yet it is giving me an error saying that it is: > > old time range: [0.0, 200.00000298023224, 400.00000596046448, > 600.00000894069672] > > new time range: [0, 100.0, 200.0, 300.0, 400.0, 500.0, 600.0] > > Traceback (most recent call last): > > File "/home/perlman/bin/pretty_fmri.py", line 840, in > > main() > > File "/home/perlman/bin/pretty_fmri.py", line 130, in main > > dataProcessor.interpolate() > > File "/home/perlman/bin/pretty_fmri.py", line 249, in interpolate > > self.interpolateddata=f(xnew) > > File > "/usr/local/Python/Versions/2.6.5/lib/python2.6/site-packages/scipy/interpolate/interpolate.py", > line 333, in __call__ > > out_of_bounds = self._check_bounds(x_new) > > File > "/usr/local/Python/Versions/2.6.5/lib/python2.6/site-packages/scipy/interpolate/interpolate.py", > line 391, in _check_bounds > > raise ValueError("A value in x_new is above the interpolation " > > ValueError: A value in x_new is above the interpolation range. > > > > > > Here is the snippet of code where this is going wrong: > > oldNum=numpy.shape(self.data)[0] > > endTime=(oldNum-1)*oldTR > > x=numpy.linspace(0, endTime, oldNum) > > if self.opts.verbose: print "old time range:", list(x) > > f=scipy.interpolate.interp1d(x, self.data, self.opts.interp, 0) > > # make the new time points > > xnew=self.crange(0, endTime, newTR) > > if self.opts.verbose: print "new time range:", list(xnew) > > self.interpolateddata=f(xnew) > > > Can you create a self-contained example that illustrates the problem? Ralf > > > You can see from that, that there is no code between the displayed ranges > and the calling of the interpolator. So I am at a loss for how to figure > out what's going on here! > > > > Any help would be greatly appreciated. I have been looking into this for > a while, even to the point of looking at the source code for the interp1d > function. :-/ > > > > -- > > -dave---------------------------------------------------------------- > > "Let us work without theorizing... 'tis the only way to make life > endurable." > > - Voltaire, Candide, Chapter 30 > > > > -- > -dave---------------------------------------------------------------- > "Let us work without theorizing... 'tis the only way to make life > endurable." > - Voltaire, Candide, Chapter 30 > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dave.hirschfeld at gmail.com Fri Aug 19 06:35:19 2011 From: dave.hirschfeld at gmail.com (Dave Hirschfeld) Date: Fri, 19 Aug 2011 10:35:19 +0000 (UTC) Subject: [SciPy-User] My attempt to fix an issue with separate scales for left and right axis in scikits.timeseries - correct? References: Message-ID: Pawe? gmail.com> writes: > > The example (and my code) works after changing single line 1196: > - fsp_alt_args = (fsp._rows, fsp._cols, fsp._num + 1) > + fsp_alt_args = fsp.get_geometry() > > > Can anyone take a look at the code and say if it makes sense? I'm > certainly not an expert in Python and I'm not sure if this can be so > simple and yet correct. > > greetings, > Pawe? Rumian FWIW I can confirm that the fix works for me - thanks! Unfortuantely I'm not an expert in the internals of either matplotlib or scikits.timeseries so I don't feel qualified to say whether it's the right fix :/ I'm running 32bit Python 2.6.6 (r266:84297, Aug 24 2010, 18:46:32) [MSC v.1500 32 bit (Intel)] on a Win7 x64 box. -Dave From gorkypl at gmail.com Fri Aug 19 08:13:04 2011 From: gorkypl at gmail.com (=?UTF-8?B?UGF3ZcWC?=) Date: Fri, 19 Aug 2011 14:13:04 +0200 Subject: [SciPy-User] My attempt to fix an issue with separate scales for left and right axis in scikits.timeseries - correct? In-Reply-To: References: Message-ID: 2011/8/19 Dave Hirschfeld : > Pawe? gmail.com> writes: >> >> The example (and my code) works after changing single line 1196: >> - ? ?fsp_alt_args = (fsp._rows, fsp._cols, fsp._num + 1) >> + ? ?fsp_alt_args = fsp.get_geometry() > > FWIW I can confirm that the fix works for me - thanks! Thanks for confirming :) I've just noticed I havent stated it clearly - the change has to be done in scikits/timeseries/lib/plotlib.py of course. greetings, Pawe? Rumian From chris at simplistix.co.uk Fri Aug 19 10:48:26 2011 From: chris at simplistix.co.uk (Chris Withers) Date: Fri, 19 Aug 2011 07:48:26 -0700 Subject: [SciPy-User] Enthought Python Distribution questions In-Reply-To: References: <4E4D2101.7060505@simplistix.co.uk> <4E4D270D.9090100@simplistix.co.uk> Message-ID: <4E4E77BA.1000508@simplistix.co.uk> On 18/08/2011 09:57, Aronne Merrelli wrote: > > Again on Mac OS X, where would I find this DocLinks folder? > > > On my Mac (OS X 10.6.8), the EPD stuff is installed here: (there are > DocLinks and Example subdirectories): > > /Library/Frameworks/Python.framework/Versions/Current/ This is a symlink, which I really don't want EPD to touch. I guess I'm OK with using 7.0 in Python.framework, on the assumption that Python will never make it to 7.0, but really, it should be in EPD.framework, no? > It looks like the "standard" python installations that come with MacOS > are here: > > /System/Library/Frameworks/Python.framework/Versions/ Yes, but what about installs of "normal python"? Why has EPD stomped on my /Current without even asking me?! > own files (so I'm wasting a lot of disk space). My path is set to the > EPD version Which path? Aside from stomping on the /Current symlink, I'm curious about why EPD now appears to be the default python on my system... 
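[A quick way to see which interpreter is actually being picked up first on $PATH, independent of the Mac-specific framework details, is to ask the running Python itself:]

import sys
print(sys.executable)   # full path of the interpreter currently running
print(sys.prefix)       # its install prefix (an EPD path, /usr, /System/..., etc.)
print(sys.version)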
cheers, Chris PS: Thanks for pointing me at the DocLinks and Examples folders :-) -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk From ralf.gommers at googlemail.com Fri Aug 19 11:00:40 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Fri, 19 Aug 2011 17:00:40 +0200 Subject: [SciPy-User] scipy.stats In-Reply-To: References: Message-ID: On Thu, Aug 18, 2011 at 12:20 PM, Rita wrote: > I am trying to import scipy.stats but I keep getting an import Error, > ...scipy/special/_cephes.so: undefined symbol: __libm_sse2_sincos > > I compiled Numpy with Intel C compiler and Scipy compiled ok but just cant > get this working. > > Any advise? > > The symbol is defined in an Intel math library. You'll need to give us more details in order to say more than that. What exact compilers and MKL did you use, what OS? Build command and build log? Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From arokem at gmail.com Thu Aug 18 12:58:19 2011 From: arokem at gmail.com (Ariel Rokem) Date: Thu, 18 Aug 2011 09:58:19 -0700 Subject: [SciPy-User] [ANN] Nitime version 0.3 In-Reply-To: References: Message-ID: I am happy to announce the release of version 0.3 of nitime. Nitime, a member of the nipy family (http://nipy.org), is a software library for the analysis of time-series from neuroscience experiments. To read the online documentation, visit: http://nipy.org/nitime/ To download the source code, visit: http://pypi.python.org/pypi/nitime Version 0.3 of nitime includes several additions and improvements, including new analysis methods (MAR process estimation, Granger 'causality', seed correlation analysis, filtering), improvements to the API (slicing with epochs), many bug fixes, a dramatic increase in test coverage and many new examples (http://nipy.org/nitime/examples/index.html) To read the full release notes and see the list of contributors to this release, visit: http://nipy.org/nitime/whatsnew/version0.3.html On behalf of the nitime developers, Ariel Rokem From robert.kern at gmail.com Fri Aug 19 13:18:50 2011 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 19 Aug 2011 12:18:50 -0500 Subject: [SciPy-User] How to ignore NaN values and -32767 in numpy array In-Reply-To: References: Message-ID: On Fri, Aug 19, 2011 at 00:01, questions anon wrote: > Thank you, what you suggested worked but now I don't think that is my > problem. > Within the dataset I am trying to calculate the mean from it appears there > are some hours with no data, the output is: > [[[-- -- -- ..., -- -- --] > ? [-- -- -- ..., -- -- --] > ? [-- -- -- ..., -- -- --] > ? ..., > ? [-- -- -- ..., -- -- --] > ? [-- -- -- ..., -- -- --] > ? [-- -- -- ..., -- -- --]]] > So I would assume these would be ignored when I calculate the mean but when > I make all my files/times into one big array these blanks turn into -32767. > Is there some way to avoid this? This is just how large arrays get summarized when printed. The data is all there. You can control how this summarization happens using the threshold parameter to numpy.set_printoptions(): http://docs.scipy.org/doc/numpy/reference/generated/numpy.set_printoptions.html -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? 
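[A tiny illustration of the summarisation being described; the array size is arbitrary, anything longer than the threshold gets elided when printed.]

import numpy as N

a = N.arange(2000)
print(a)                              # summarised with '...' in the middle
N.set_printoptions(threshold=5000)
print(a)                              # now printed in full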
-- Umberto Eco From dasneutron at gmail.com Fri Aug 19 13:31:53 2011 From: dasneutron at gmail.com (Piotr Zolnierczuk) Date: Fri, 19 Aug 2011 13:31:53 -0400 Subject: [SciPy-User] rotation matrices, Euler angles and all that jazz Message-ID: Christoph, thanks for the answers and references. I will keep using your very useful module. What I would like to add is that one often needs only the 3x3 (pure rotation) part of it so it would be nice to provide versions for this case too. One obviously can "wrap" (that's what I do) the routines, for example: def rotation_matrix3(....): m = rotation_matrix(...) return m[:3,:3] Cheers Piotr From robert.kern at gmail.com Fri Aug 19 13:34:45 2011 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 19 Aug 2011 12:34:45 -0500 Subject: [SciPy-User] Enthought Python Distribution questions In-Reply-To: <4E4E77BA.1000508@simplistix.co.uk> References: <4E4D2101.7060505@simplistix.co.uk> <4E4D270D.9090100@simplistix.co.uk> <4E4E77BA.1000508@simplistix.co.uk> Message-ID: On Fri, Aug 19, 2011 at 09:48, Chris Withers wrote: > On 18/08/2011 09:57, Aronne Merrelli wrote: >> >> ? ? Again on Mac OS X, where would I find this DocLinks folder? >> >> >> On my Mac (OS X 10.6.8), the EPD stuff is installed here: (there are >> DocLinks and Example subdirectories): >> >> /Library/Frameworks/Python.framework/Versions/Current/ > > This is a symlink, which I really don't want EPD to touch. Well, the installer will adjust it to point to the version that it installs, which is the usual thing to do. You can adjust it to whichever version you want to be considered "Current" afterwards. > I guess I'm OK with using 7.0 in Python.framework, on the assumption > that Python will never make it to 7.0, but really, it should be in > EPD.framework, no? Ideally, yes. Unfortunately, a number of tools rely on the framework being named "Python.framework" and are difficult to configure to deal with a different framework name. So we use Python.framework and a version number equal to EPD's version. That still gets us in trouble with a few tools that try to infer the Python version number from the framework version number, but I've only encountered one or two, and those were easy to patch to look up the version number robustly. >> It looks like the "standard" python installations that come with MacOS >> are here: >> >> /System/Library/Frameworks/Python.framework/Versions/ > > Yes, but what about installs of "normal python"? Why has EPD stomped on > my /Current without even asking me?! Because it's what the "normal python" installer would do too. If I had a www.python.org installation of Python 2.6 that Current pointed to, and then I installed the www.python.org Python 2.7, Current would get updated to 2.7 without asking me. EPD's installer just does the same thing. It's what every installer of frameworks does that I've ever seen. They expect that if you explicitly install a framework, that you want it to be Current. >> own files (so I'm wasting a lot of disk space). My path is set to the >> EPD version > > Which path? Aside from stomping on the /Current symlink, I'm curious > about why EPD now appears to be the default python on my system... $PATH -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? 
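[On the 3x3-only point: for simple cases a pure-rotation helper can also be written directly against numpy, independent of transformations.py. A sketch for rotation about the z axis (angle in radians, right-handed convention):]

import numpy as np

def rotation_matrix3_z(theta):
    # 3x3 rotation matrix about the z axis
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

v = np.array([1.0, 0.0, 0.0])
print(np.dot(rotation_matrix3_z(np.pi / 2.0), v))   # ~ [0., 1., 0.]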
-- Umberto Eco From rmorgan466 at gmail.com Fri Aug 19 18:53:40 2011 From: rmorgan466 at gmail.com (Rita) Date: Fri, 19 Aug 2011 18:53:40 -0400 Subject: [SciPy-User] scipy.stats In-Reply-To: References: Message-ID: I apologize for the vague question. OS: Linux Compiler: Intel compiler suite. Version 11 (this also includes fortran compiler) MKL: 10.3 Numpy version: 1.6.1 When I do numpy.config() I see it properly compiled against Intel's BLAS and LAPACK Where are the build logs located? Do you need to build log for Numpy also? On Fri, Aug 19, 2011 at 11:00 AM, Ralf Gommers wrote: > > > On Thu, Aug 18, 2011 at 12:20 PM, Rita wrote: > >> I am trying to import scipy.stats but I keep getting an import Error, >> ...scipy/special/_cephes.so: undefined symbol: __libm_sse2_sincos >> >> I compiled Numpy with Intel C compiler and Scipy compiled ok but just cant >> get this working. >> >> Any advise? >> >> The symbol is defined in an Intel math library. You'll need to give us > more details in order to say more than that. What exact compilers and MKL > did you use, what OS? Build command and build log? > > Cheers, > Ralf > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -- --- Get your facts first, then you can distort them as you please.-- -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsouthey at gmail.com Fri Aug 19 20:00:39 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 19 Aug 2011 19:00:39 -0500 Subject: [SciPy-User] scipy.stats In-Reply-To: References: Message-ID: On Fri, Aug 19, 2011 at 5:53 PM, Rita wrote: > I apologize for the vague question. > OS: Linux > Compiler: Intel compiler suite. Version 11 (this also includes fortran > compiler) > MKL: 10.3 > Numpy version: 1.6.1 > When I do numpy.config() I see it properly compiled against Intel's BLAS and > LAPACK > > Where are the build logs located? Do you need to build log for Numpy also? > > On Fri, Aug 19, 2011 at 11:00 AM, Ralf Gommers > wrote: >> >> >> On Thu, Aug 18, 2011 at 12:20 PM, Rita wrote: >>> >>> I am trying to import scipy.stats but I keep getting an import Error, >>> ...scipy/special/_cephes.so: undefined symbol: __libm_sse2_sincos >>> I compiled Numpy with Intel C compiler and Scipy compiled ok but just >>> cant get this working. >>> Any advise? >> >> The symbol is defined in an Intel math library. You'll need to give us >> more details in order to say more than that. What exact compilers and MKL >> did you use, what OS? Build command and build log? >> >> Cheers, >> Ralf >> >> >> A quick google indicates that you need to ensure that you link to the appropriate Intel Math library: http://software.intel.com/en-us/articles/unresolved-external-symbol-libm-sse2/ Also what is the cpu type? Bruce From rmorgan466 at gmail.com Sat Aug 20 07:38:57 2011 From: rmorgan466 at gmail.com (Rita) Date: Sat, 20 Aug 2011 06:38:57 -0500 Subject: [SciPy-User] scipy.stats In-Reply-To: References: Message-ID: Thanks Bruce. I have already seen this Here are more details of my build. 
My Intel compiler exists here, /opt/intel/ self.cc_exe = 'icc -L /opt/intel/lib/intel64 -L /opt/intel/ipp/em64t/lib -L /opt/intel/mkl/lib/em64t -L /usr/lib64 -L /usr/lib -I /opt/intel/ipp/em64t/in clude -I /etg/source/Linux/include -I /opt/intel/mkl/include -I /opt/intel/include -fPIC -O3 -openmp -limf -lmkl_core -lmkl_intel_lp64 -lmkl_intel_thread -lstdc++ -DMKL_ILP64' Here is how I am doing the compilation CC=icc CXX=icpc AR=xiar /opt/python-2.7.2/bin/python setup.py config --compiler=intel --fcompiler=intelem build_clib --compiler=intel --fcompiler=intelem build_ext --compiler=intel install /opt/intel/ipp is what I was using for the math library. This compiles but I keep getting that problem I use the same compile statement to compile scipy On Fri, Aug 19, 2011 at 8:00 PM, Bruce Southey wrote: > On Fri, Aug 19, 2011 at 5:53 PM, Rita wrote: > > I apologize for the vague question. > > OS: Linux > > Compiler: Intel compiler suite. Version 11 (this also includes fortran > > compiler) > > MKL: 10.3 > > Numpy version: 1.6.1 > > When I do numpy.config() I see it properly compiled against Intel's BLAS > and > > LAPACK > > > > Where are the build logs located? Do you need to build log for Numpy > also? > > > > On Fri, Aug 19, 2011 at 11:00 AM, Ralf Gommers < > ralf.gommers at googlemail.com> > > wrote: > >> > >> > >> On Thu, Aug 18, 2011 at 12:20 PM, Rita wrote: > >>> > >>> I am trying to import scipy.stats but I keep getting an import Error, > >>> ...scipy/special/_cephes.so: undefined symbol: __libm_sse2_sincos > >>> I compiled Numpy with Intel C compiler and Scipy compiled ok but just > >>> cant get this working. > >>> Any advise? > >> > >> The symbol is defined in an Intel math library. You'll need to give us > >> more details in order to say more than that. What exact compilers and > MKL > >> did you use, what OS? Build command and build log? > >> > >> Cheers, > >> Ralf > >> > >> > >> > > A quick google indicates that you need to ensure that you link to the > appropriate Intel Math library: > > http://software.intel.com/en-us/articles/unresolved-external-symbol-libm-sse2/ > > Also what is the cpu type? > > Bruce > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- --- Get your facts first, then you can distort them as you please.-- -------------- next part -------------- An HTML attachment was scrubbed... URL: From rmorgan466 at gmail.com Sat Aug 20 07:40:42 2011 From: rmorgan466 at gmail.com (Rita) Date: Sat, 20 Aug 2011 06:40:42 -0500 Subject: [SciPy-User] scipy.stats In-Reply-To: References: Message-ID: It should be 'icc -L /opt/intel/lib/intel64 -L /opt/intel/ipp/em64t/lib -L /opt/intel/mkl/lib/em64t -L /usr/lib64 -L /usr/lib -I/opt/intel/mkl/include -I /opt/intel/include -fPIC -O3 -openmp -limf -lmkl_core -lmkl_intel_lp64 -lmkl_intel_thread -lstdc++ -DMKL_ILP64' Here is how I am doing the compilation On Sat, Aug 20, 2011 at 6:38 AM, Rita wrote: > Thanks Bruce. I have already seen this > > Here are more details of my build. 
> > My Intel compiler exists here, /opt/intel/ > > self.cc_exe = 'icc -L /opt/intel/lib/intel64 -L /opt/intel/ipp/em64t/lib -L > /opt/intel/mkl/lib/em64t -L /usr/lib64 -L /usr/lib -I > /opt/intel/ipp/em64t/in clude -I /etg/source/Linux/include -I > /opt/intel/mkl/include -I /opt/intel/include -fPIC -O3 -openmp -limf > -lmkl_core -lmkl_intel_lp64 -lmkl_intel_thread -lstdc++ -DMKL_ILP64' > Here is how I am doing the compilation > > CC=icc CXX=icpc AR=xiar /opt/python-2.7.2/bin/python setup.py config > --compiler=intel --fcompiler=intelem build_clib --compiler=intel > --fcompiler=intelem build_ext --compiler=intel install > > /opt/intel/ipp is what I was using for the math library. This compiles but > I keep getting that problem > > I use the same compile statement to compile scipy > > > > On Fri, Aug 19, 2011 at 8:00 PM, Bruce Southey wrote: > >> On Fri, Aug 19, 2011 at 5:53 PM, Rita wrote: >> > I apologize for the vague question. >> > OS: Linux >> > Compiler: Intel compiler suite. Version 11 (this also includes fortran >> > compiler) >> > MKL: 10.3 >> > Numpy version: 1.6.1 >> > When I do numpy.config() I see it properly compiled against Intel's BLAS >> and >> > LAPACK >> > >> > Where are the build logs located? Do you need to build log for Numpy >> also? >> > >> > On Fri, Aug 19, 2011 at 11:00 AM, Ralf Gommers < >> ralf.gommers at googlemail.com> >> > wrote: >> >> >> >> >> >> On Thu, Aug 18, 2011 at 12:20 PM, Rita wrote: >> >>> >> >>> I am trying to import scipy.stats but I keep getting an import Error, >> >>> ...scipy/special/_cephes.so: undefined symbol: __libm_sse2_sincos >> >>> I compiled Numpy with Intel C compiler and Scipy compiled ok but just >> >>> cant get this working. >> >>> Any advise? >> >> >> >> The symbol is defined in an Intel math library. You'll need to give us >> >> more details in order to say more than that. What exact compilers and >> MKL >> >> did you use, what OS? Build command and build log? >> >> >> >> Cheers, >> >> Ralf >> >> >> >> >> >> >> >> A quick google indicates that you need to ensure that you link to the >> appropriate Intel Math library: >> >> http://software.intel.com/en-us/articles/unresolved-external-symbol-libm-sse2/ >> >> Also what is the cpu type? >> >> Bruce >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > > > -- > --- Get your facts first, then you can distort them as you please.-- > -- --- Get your facts first, then you can distort them as you please.-- -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris at simplistix.co.uk Sat Aug 20 19:12:22 2011 From: chris at simplistix.co.uk (Chris Withers) Date: Sat, 20 Aug 2011 16:12:22 -0700 Subject: [SciPy-User] IPython inline plots of stacked bars graphs Message-ID: <4E503F56.1090908@simplistix.co.uk> Hi All, If I do the following in an IPython 0.11 Qt shell: import matplotlib.pyplot as plt menMeans = (20, 35, 30, 35, 27) womenMeans = (25, 32, 34, 20, 25) plt.bar(ind, menMeans, color='r') plt.bar(ind, womenMeans, color='y', bottom=menMeans) I get, as I'd expect, a stacked bar graph. However, if I do: plt.bar(ind, menMeans, color='r') ...hit enter, and then do: plt.bar(ind, womenMeans, color='y', bottom=menMeans) ...I get two separate plots. How can I add to an existing inline plot? Also, and I guess this might be more of a matplotlib question, how do I "reach inside" an existing plot to, for example, adjust the width of the bars used? 
cheers, Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk From charlesr.harris at gmail.com Sat Aug 20 19:46:37 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 20 Aug 2011 17:46:37 -0600 Subject: [SciPy-User] IPython inline plots of stacked bars graphs In-Reply-To: <4E503F56.1090908@simplistix.co.uk> References: <4E503F56.1090908@simplistix.co.uk> Message-ID: On Sat, Aug 20, 2011 at 5:12 PM, Chris Withers wrote: > Hi All, > > If I do the following in an IPython 0.11 Qt shell: > > import matplotlib.pyplot as plt > menMeans = (20, 35, 30, 35, 27) > womenMeans = (25, 32, 34, 20, 25) > plt.bar(ind, menMeans, color='r') > plt.bar(ind, womenMeans, color='y', bottom=menMeans) > > I get, as I'd expect, a stacked bar graph. > > However, if I do: > > plt.bar(ind, menMeans, color='r') > > ...hit enter, and then do: > > plt.bar(ind, womenMeans, color='y', bottom=menMeans) > > ...I get two separate plots. > > How can I add to an existing inline plot? > > Also, and I guess this might be more of a matplotlib question, how do I > "reach inside" an existing plot to, for example, adjust the width of the > bars used? > > cheers, > > I think it is more of an ipython question, possibly a matplotlib question ;) You might try the hold(True) command. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris at simplistix.co.uk Sat Aug 20 19:52:01 2011 From: chris at simplistix.co.uk (Chris Withers) Date: Sat, 20 Aug 2011 16:52:01 -0700 Subject: [SciPy-User] getting started with arrays and matplotlib In-Reply-To: References: <4E3F020B.1000500@simplistix.co.uk> Message-ID: <4E5048A1.5090105@simplistix.co.uk> On 07/08/2011 22:29, David Warde-Farley wrote: >> Secondly, once I've populated this, any good examples of how to turn it >> into a bar chart? (the simple bar chart would be number of sales on the >> y-axis, weeks before the event on the x-axis, however, what I'd then >> like to do is split each bar into chunks for each venue's sales, if that >> makes sense?) > > This might give you an example of what you need: > > http://matplotlib.sourceforge.net/examples/pylab_examples/bar_stacked.html > > but you'd be better off asking on matplotlib-users. Thanks, that was a good start. One question: How can I automatically get a list of colours for each bar? I don't know how many bars I'm going to have so I can't manually pick them... This feels like a common enough problem that I'm guessing there's a solution somewhere in matplotlib? cheers, Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk From dragonmagi at gmail.com Sun Aug 21 04:49:04 2011 From: dragonmagi at gmail.com (Chris Thorne) Date: Sun, 21 Aug 2011 16:49:04 +0800 Subject: [SciPy-User] Scipy 0.9.0 can't be installed on this disk ..... Message-ID: When I run the installer for scipy (or numpy) on OSX 10.6.7 it will refuse to do the install saying: "Scipy 0.9.0 can't be installed on this disk. scipy requires python.orgPython 2.6 to install." version of python installed with the OS is 2.6.1. Installing the latest version does not help. I'm guessing the error message is misleading and he issue is something else?? Note: I recently installed this on OSX 10.6.8 on another machine without problems. One difference that perhaps matters is that one had macports on it. 
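[On the question above about getting a list of colours for each bar automatically: one common approach is to sample a colormap, which works for any number of bars. A sketch, with values taken from the earlier example and an arbitrary colormap choice:]

import numpy as np
import matplotlib.pyplot as plt

values = [20, 35, 30, 35, 27]
ind = np.arange(len(values))
colors = plt.cm.jet(np.linspace(0.0, 1.0, len(values)))   # one RGBA row per bar

plt.bar(ind, values, color=colors)
plt.show()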
thanks, chris -- http://www.vrshed.com http://www.floatingorigin.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sun Aug 21 12:30:42 2011 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 21 Aug 2011 11:30:42 -0500 Subject: [SciPy-User] Scipy 0.9.0 can't be installed on this disk ..... In-Reply-To: References: Message-ID: On Sun, Aug 21, 2011 at 03:49, Chris Thorne wrote: > When I run the installer for scipy (or numpy) on OSX 10.6.7 > it will refuse to do the install saying: > > "Scipy 0.9.0 can't be installed on this disk. scipy requires python.org > Python 2.6 to install." > > version of python installed with the OS is 2.6.1. > Installing the latest version does not help. > > I'm guessing the error message is misleading and he issue is something > else?? Note this part: "python.org Python 2.6" It means that you need to install Python 2.6 from the installers on www.python.org, *not* the Python 2.6.1 that is included with the OS. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From jeremy at jeremysanders.net Mon Aug 22 04:45:11 2011 From: jeremy at jeremysanders.net (Jeremy Sanders) Date: Mon, 22 Aug 2011 09:45:11 +0100 Subject: [SciPy-User] ANN: Veusz 1.13 Message-ID: Veusz 1.13 ---------- Velvet Ember Under Sky Zenith ----------------------------- http://home.gna.org/veusz/ Copyright (C) 2003-2011 Jeremy Sanders and contributors. Licenced under the GPL (version 2 or greater). Veusz is a Qt4 based scientific plotting package. It is written in Python, using PyQt4 for display and user-interfaces, and numpy for handling the numeric data. Veusz is designed to produce publication-ready Postscript/PDF/SVG output. The user interface aims to be simple, consistent and powerful. Veusz provides a GUI, command line, embedding and scripting interface (based on Python) to its plotting facilities. It also allows for manipulation and editing of datasets. Data can be captured from external sources such as internet sockets or other programs. 
Changes in 1.13: * Graphs are rendered in separate threads for speed and a responsive user interface * A changed Graph is rendered immediately on document modification, improving latency * A new ternary plot widget is included * Size of pages can be modified individually in a document * Binary data import added * NPY/NPZ numpy data import added * Axis and tick labels on axes can be rotated at 45 deg intervals * Labels can be plotted next to points on non-orthogonal plots * Add an option for DPI of output EPS and PDF files Minor improvements: * Import dialog detects filename extension to show correct tab * Polygon fill mode for non orthogonal plotting * --plugin command line option added, for loading and testing plugins * Plugin for swapping two colors in a plot * Dataset navigator is moved to right of window by default * Mac OS X binary release updated to Python 2.7.2 * Import plugins can say which file extensions they support * Import plugins can be "promoted" to their own tab on the import dialog * ForceUpdate command added to embedding API, to force an update of the displayed plot (useful if SetUpdateInterval is set to 0) * X or Y dataset can be left blank in plotter to plot by row number Bugs fixed: * Images plotted when axes are inverted are inverted too * Fixed crash when selecting datasets for plotting in the popup menu * Picker crashes with a constant function * 2D dataset creation using expressions fixed * CSV reader treated dataset names ending in + or - incorrectly * unique1d function no longer available in numpy Features of package: * X-Y plots (with errorbars) * Line and function plots * Contour plots * Images (with colour mappings and colorbars) * Stepped plots (for histograms) * Bar graphs * Vector field plots * Box plots * Polar plots * Ternary plots * Plotting dates * Fitting functions to data * Stacked plots and arrays of plots * Plot keys * Plot labels * Shapes and arrows on plots * LaTeX-like formatting for text * EPS/PDF/PNG/SVG/EMF export * Scripting interface * Dataset creation/manipulation * Embed Veusz within other programs * Text, CSV, FITS, NPY/NPZ, QDP, binary and user-plugin importing * Data can be captured from external sources * User defined functions, constants and can import external Python functions * Plugin interface to allow user to write or load code to - import data using new formats - make new datasets, optionally linked to existing datasets - arbitrarily manipulate the document * Data picker * Multithreaded rendering Requirements for source install: Python (2.4 or greater required) http://www.python.org/ Qt >= 4.3 (free edition) http://www.trolltech.com/products/qt/ PyQt >= 4.3 (SIP is required to be installed first) http://www.riverbankcomputing.co.uk/pyqt/ http://www.riverbankcomputing.co.uk/sip/ numpy >= 1.0 http://numpy.scipy.org/ Optional: Microsoft Core Fonts (recommended for nice output) http://corefonts.sourceforge.net/ PyFITS >= 1.1 (optional for FITS import) http://www.stsci.edu/resources/software_hardware/pyfits pyemf >= 2.0.0 (optional for EMF export) http://pyemf.sourceforge.net/ PyMinuit >= 1.1.2 (optional improved fitting) http://code.google.com/p/pyminuit/ For EMF and better SVG export, PyQt >= 4.6 or better is required, to fix a bug in the C++ wrapping For documentation on using Veusz, see the "Documents" directory. The manual is in PDF, HTML and text format (generated from docbook). The examples are also useful documentation. 
Please also see and contribute to the Veusz wiki: http://barmag.net/veusz-wiki/ Issues with the current version: * Some recent versions of PyQt/SIP will causes crashes when exporting SVG files. Update to 4.7.4 (if released) or a recent snapshot to solve this problem. If you enjoy using Veusz, we would love to hear from you. Please join the mailing lists at https://gna.org/mail/?group=veusz to discuss new features or if you'd like to contribute code. The latest code can always be found in the Git repository at https://github.com/jeremysanders/veusz.git. From WDyk at nobleenergyinc.com Mon Aug 22 15:23:03 2011 From: WDyk at nobleenergyinc.com (WDyk at nobleenergyinc.com) Date: Mon, 22 Aug 2011 13:23:03 -0600 Subject: [SciPy-User] IPython inline plots of stacked bars graphs In-Reply-To: References: Message-ID: In ipython 0.11, use Ctrl-Enter to enter multi-line edit mode. You can then send multiple commands to change your plot. Hit Enter on a blank line to send all commands at once. Wes Dyk, Production Systems Admin Noble Energy, Inc. From: scipy-user-request at scipy.org To: scipy-user at scipy.org Date: 08/21/2011 11:00 AM Subject: SciPy-User Digest, Vol 96, Issue 31 Sent by: scipy-user-bounces at scipy.org Message: 1 Date: Sat, 20 Aug 2011 16:12:22 -0700 From: Chris Withers Subject: [SciPy-User] IPython inline plots of stacked bars graphs To: SciPy Users List Message-ID: <4E503F56.1090908 at simplistix.co.uk> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Hi All, If I do the following in an IPython 0.11 Qt shell: import matplotlib.pyplot as plt menMeans = (20, 35, 30, 35, 27) womenMeans = (25, 32, 34, 20, 25) plt.bar(ind, menMeans, color='r') plt.bar(ind, womenMeans, color='y', bottom=menMeans) I get, as I'd expect, a stacked bar graph. However, if I do: plt.bar(ind, menMeans, color='r') ...hit enter, and then do: plt.bar(ind, womenMeans, color='y', bottom=menMeans) ...I get two separate plots. How can I add to an existing inline plot? Also, and I guess this might be more of a matplotlib question, how do I "reach inside" an existing plot to, for example, adjust the width of the bars used? cheers, Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk ------------------------------ Message: 2 Date: Sat, 20 Aug 2011 17:46:37 -0600 From: Charles R Harris Subject: Re: [SciPy-User] IPython inline plots of stacked bars graphs To: SciPy Users List Message-ID: Content-Type: text/plain; charset="iso-8859-1" On Sat, Aug 20, 2011 at 5:12 PM, Chris Withers wrote: > Hi All, > > If I do the following in an IPython 0.11 Qt shell: > > import matplotlib.pyplot as plt > menMeans = (20, 35, 30, 35, 27) > womenMeans = (25, 32, 34, 20, 25) > plt.bar(ind, menMeans, color='r') > plt.bar(ind, womenMeans, color='y', bottom=menMeans) > > I get, as I'd expect, a stacked bar graph. > > However, if I do: > > plt.bar(ind, menMeans, color='r') > > ...hit enter, and then do: > > plt.bar(ind, womenMeans, color='y', bottom=menMeans) > > ...I get two separate plots. > > How can I add to an existing inline plot? > > Also, and I guess this might be more of a matplotlib question, how do I > "reach inside" an existing plot to, for example, adjust the width of the > bars used? > > cheers, > > I think it is more of an ipython question, possibly a matplotlib question ;) You might try the hold(True) command. 
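For completeness, a minimal sketch that combines the two suggestions in this thread: issuing the bar calls together (the Ctrl-Enter advice above) with hold switched on, plus one way of "reaching inside" the finished plot to change the bar widths. Note that ind is assumed to be np.arange(5); it was never defined in the original snippet.

import numpy as np
import matplotlib.pyplot as plt

ind = np.arange(5)                      # assumed; not defined in the original post
menMeans = (20, 35, 30, 35, 27)
womenMeans = (25, 32, 34, 20, 25)

plt.hold(True)                          # keep drawing into the same axes
plt.bar(ind, menMeans, color='r')
plt.bar(ind, womenMeans, color='y', bottom=menMeans)

# every bar is a Rectangle patch on the current axes, so widths can be
# adjusted after the fact
for rect in plt.gca().patches:
    rect.set_width(0.5)
plt.draw()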
Chuck ------------------------------ _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user End of SciPy-User Digest, Vol 96, Issue 31 ****************************************** The information contained in this e-mail and any attachments may be confidential. If you are not the intended recipient, please understand that dissemination, copying, or using such information is prohibited. If you have received this e-mail in error, please immediately advise the sender by reply e-mail and delete this e-mail and its attachments from your system. -------------- next part -------------- An HTML attachment was scrubbed... URL: From brockp at umich.edu Mon Aug 22 16:48:11 2011 From: brockp at umich.edu (Brock Palen) Date: Mon, 22 Aug 2011 16:48:11 -0400 Subject: [SciPy-User] Building Numpy/Scipy with MKL serial 10.3 Message-ID: <6D37C6AC-9941-4207-8E95-DE69DD124444@umich.edu> We need to force users to use serial MKL so using libmkl_rt.so is out of the questions, We built using the 'builder' wrapper that can make a custom MKL library with all the needed bits included: #copy makefile and function list from $MKLROOT/tools cp $MKLROOT/tools/builder/makefile /tmp/ cat $MKLROOT/tools/builder/blas_list >> /tmp/user_list cat $MKLROOT/tools/builder/lapack_list >> /tmp/user_list #build seqential version cd /tmp make libintel64 interface=lp64 name=libmkl_10.3_serial threading=sequential This creates a file libmkl_10.3_serial.so I then set: #create site.cfg with [mkl] library_dirs = /tmp mkl_libs = mkl_10.3_serial lapack_libs = If you ahve any questions let me know. Brock Palen www.umich.edu/~brockp Center for Advanced Computing brockp at umich.edu (734)936-1985 From ciampagg at usi.ch Mon Aug 22 17:48:29 2011 From: ciampagg at usi.ch (Giovanni Luca Ciampaglia) Date: Mon, 22 Aug 2011 14:48:29 -0700 Subject: [SciPy-User] Bootstrapping confidence interval of the maximum of a smoothing spline Message-ID: <4E52CEAD.4050702@usi.ch> Hi all, I have data on editing activity from an online community and I am trying to estimate the day of peak activity using smoothing splines. I determine the smoothing factor for scipy.interpolate.UnivariateSpline by leave-1-out crossvalidation, and then use scipy.optimize.fmin_tnc to evaluate the maximum from the resulting spline. This works pretty well and seems robust enough (e.g. http://tinypic.com/r/a3m739/7). Now I would like to compute the confidence intervals for this estimate, but I am not exactly sure on how to proceed, since I cannot sample data from my non-parametric model and generate a distribution for this estimator. I was thinking at applying some noise to the smoothing factor, but I am not sure whether this approach has any theoretical basis. Any idea? Cheers, -- Giovanni Luca Ciampaglia Ph.D. Candidate Faculty of Informatics University of Lugano Web: http://www.inf.usi.ch/phd/ciampaglia/ Bertastra?e 36 ? 8003 Z?rich ? Switzerland -------------- next part -------------- A non-text attachment was scrubbed... 
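One concrete way to act on the residual-resampling idea in the reply that follows: refit the spline on synthetic data built from the fitted values plus resampled residuals, record the peak location each time, and take percentiles of those peaks. This is only a rough sketch; x and y stand for the observed activity data, s_opt for the cross-validated smoothing factor (kept fixed across replications for simplicity), and plain fmin stands in for fmin_tnc.

import numpy as np
from scipy import interpolate, optimize

def spline_peak(x, y, s):
    spl = interpolate.UnivariateSpline(x, y, s=s)
    # maximise the spline by minimising its negative, starting near the data maximum
    xmax = optimize.fmin(lambda t: -spl(t)[0], x[np.argmax(y)], disp=0)
    return xmax[0]

def bootstrap_peak_ci(x, y, s_opt, n_boot=1000):
    spl = interpolate.UnivariateSpline(x, y, s=s_opt)
    fitted = spl(x)
    resid = y - fitted
    peaks = np.empty(n_boot)
    for i in range(n_boot):
        # resample residuals with replacement, refit, and find the new peak
        y_star = fitted + resid[np.random.randint(0, len(resid), len(resid))]
        peaks[i] = spline_peak(x, y_star, s_opt)
    return np.percentile(peaks, [2.5, 97.5])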
Name: image.png Type: image/png Size: 77261 bytes Desc: not available URL: From josef.pktd at gmail.com Mon Aug 22 18:06:26 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 22 Aug 2011 18:06:26 -0400 Subject: [SciPy-User] Bootstrapping confidence interval of the maximum of a smoothing spline In-Reply-To: <4E52CEAD.4050702@usi.ch> References: <4E52CEAD.4050702@usi.ch> Message-ID: On Mon, Aug 22, 2011 at 5:48 PM, Giovanni Luca Ciampaglia wrote: > Hi all, I have data on editing activity from an online community and I am > trying to estimate the day of peak activity using smoothing splines. > > I determine the smoothing factor for scipy.interpolate.UnivariateSpline by > leave-1-out crossvalidation, and then use scipy.optimize.fmin_tnc to > evaluate the maximum from the resulting spline. This works pretty well and > seems robust enough (e.g. http://tinypic.com/r/a3m739/7). Now I would like > to compute the confidence intervals for this estimate, but I am not exactly > sure on how to proceed, since I cannot sample data from my non-parametric > model and generate a distribution for this estimator. My first idea would be to sample the residuals, the deviation from the actual observations and the spline, add them to the spline, and estimate the new spline on the generated data. And repeat for a number of bootstrap samples. Josef > > I was thinking at applying some noise to the smoothing factor, but I am not > sure whether this approach has any theoretical basis. Any idea? > > Cheers, > > -- > Giovanni Luca Ciampaglia > > Ph.D. Candidate > Faculty of Informatics > University of Lugano > Web: http://www.inf.usi.ch/phd/ciampaglia/ > > Bertastra?e 36 * 8003 Z?rich * Switzerland > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From questions.anon at gmail.com Mon Aug 22 19:00:31 2011 From: questions.anon at gmail.com (questions anon) Date: Tue, 23 Aug 2011 09:00:31 +1000 Subject: [SciPy-User] How to ignore NaN values and -32767 in numpy array In-Reply-To: References: Message-ID: Thank you, that is good to know, but that is not the case for this. I know I have blank data or something in a couple of sections and when I choose to print around those figures I still end up with what happens below (shown again). [[-- -- -- ..., -- -- --] [-- -- -- ..., -- -- --] [-- -- -- ..., -- -- --] ..., And then when I make this into one big array these turn into [ -3.27670000e+04 -3.27670000e+04 -3.27670000e+04 ..., -3.27670000e+04 -3.27670000e+04 -3.27670000e+04] Is there a way to identify these blanks and ignore them from the analyses? On Sat, Aug 20, 2011 at 3:18 AM, Robert Kern wrote: > On Fri, Aug 19, 2011 at 00:01, questions anon > wrote: > > Thank you, what you suggested worked but now I don't think that is my > > problem. > > Within the dataset I am trying to calculate the mean from it appears > there > > are some hours with no data, the output is: > > [[[-- -- -- ..., -- -- --] > > [-- -- -- ..., -- -- --] > > [-- -- -- ..., -- -- --] > > ..., > > [-- -- -- ..., -- -- --] > > [-- -- -- ..., -- -- --] > > [-- -- -- ..., -- -- --]]] > > So I would assume these would be ignored when I calculate the mean but > when > > I make all my files/times into one big array these blanks turn into > -32767. > > Is there some way to avoid this? > > This is just how large arrays get summarized when printed. The data is > all there. 
You can control how this summarization happens using the > threshold parameter to numpy.set_printoptions(): > > > http://docs.scipy.org/doc/numpy/reference/generated/numpy.set_printoptions.html > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon Aug 22 19:12:33 2011 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 22 Aug 2011 18:12:33 -0500 Subject: [SciPy-User] How to ignore NaN values and -32767 in numpy array In-Reply-To: References: Message-ID: On Mon, Aug 22, 2011 at 18:00, questions anon wrote: > Thank you, that is good to know, but that is not the case for this. I know I > have blank data or something in a couple of sections and when I choose to > print around those figures I still end up with what happens below (shown > again). > ?[[-- -- -- ..., -- -- --] > ? [-- -- -- ..., -- -- --] > ? [-- -- -- ..., -- -- --] > ? ..., > And then when I make this into one big array these turn into > ? [ -3.27670000e+04 ?-3.27670000e+04 ?-3.27670000e+04 ..., ?-3.27670000e+04 > ? ? -3.27670000e+04 ?-3.27670000e+04] > Is there a way to identify these blanks and ignore them from the analyses? Or, right, sorry. The -- indeed are masked values. Somehow, you are using masked_arrays. I don't know if the netCDF4 module is doing that for you automatically or if you are using different code than what you showed. numpy.concatenate() will ignore that the array is a masked_array and just treat it as if it were a regular numpy ndarray, and lose the mask information. You will need to use numpy.ma.concatenate() instead. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From questions.anon at gmail.com Mon Aug 22 19:25:54 2011 From: questions.anon at gmail.com (questions anon) Date: Tue, 23 Aug 2011 09:25:54 +1000 Subject: [SciPy-User] How to ignore NaN values and -32767 in numpy array In-Reply-To: References: Message-ID: yahhh that worked! thank you. I was showing all the code so maybe it is something with NETCDF4. Thanks again!! On Tue, Aug 23, 2011 at 9:12 AM, Robert Kern wrote: > On Mon, Aug 22, 2011 at 18:00, questions anon > wrote: > > Thank you, that is good to know, but that is not the case for this. I > know I > > have blank data or something in a couple of sections and when I choose to > > print around those figures I still end up with what happens below (shown > > again). > > [[-- -- -- ..., -- -- --] > > [-- -- -- ..., -- -- --] > > [-- -- -- ..., -- -- --] > > ..., > > And then when I make this into one big array these turn into > > [ -3.27670000e+04 -3.27670000e+04 -3.27670000e+04 ..., > -3.27670000e+04 > > -3.27670000e+04 -3.27670000e+04] > > Is there a way to identify these blanks and ignore them from the > analyses? > > Or, right, sorry. The -- indeed are masked values. Somehow, you are > using masked_arrays. I don't know if the netCDF4 module is doing that > for you automatically or if you are using different code than what you > showed. 
> > numpy.concatenate() will ignore that the array is a masked_array and > just treat it as if it were a regular numpy ndarray, and lose the mask > information. You will need to use numpy.ma.concatenate() instead. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rajs2010 at gmail.com Tue Aug 23 05:11:41 2011 From: rajs2010 at gmail.com (Rajeev Singh) Date: Tue, 23 Aug 2011 14:41:41 +0530 Subject: [SciPy-User] Speeding up Python Again In-Reply-To: References: Message-ID: On Wed, Aug 10, 2011 at 6:48 PM, Rajeev Singh wrote: > Hi, > I was trying out the codes discussed > at http://technicaldiscovery.blogspot.com/2011/07/speeding-up-python-again.html > Here is a summary of my results - > Computer: Desktop imsc9 aravali annapurna > NumPy: 7.651419 4.219105 5.576453 4.858640 > Cython: 4.259419 3.477259 3.204909 2.357819 > Weave: 4.302778 * 3.298551 2.400000 > Looped Fortran: 4.199148 3.414484 3.202963 2.315644 > Vectorized Fortran: 3.118410 2.131966 1.512303 1.460251 > pure fortran update1: 1.205727 1.964857 2.034688 1.336086 > pure fortran update2: 0.600848 0.604649 0.573593 0.721339 > imsc9, aravali and annapurna are HPC machines at my institute > * for some reason Weave didn't compile on imsc9 > > Indeed there is about a factor of 7 to 12 difference between pure fortran > with update2 (vectorized) and the numpy version. > I should mention that I changed N to 150 in laplace_for.f90 > Rajeev Hi, Continuing the comparison of various ways of implementing solving laplace equation, following result might interest you - Desktop imsc9 aravali annapurna Octave (0): 20.7866 * 21.6179 * Vectorized Fortran (pure) (1): 0.7487 0.6501 0.7507 1.1619 Vectorized Fortran (f2py) (2): 0.7190 0.6089 0.6243 1.0312 NumPy (3): 4.1343 2.5844 2.6565 3.7445 Cython (4): 1.7273 1.9927 2.0471 1.3525 Cython with C (5): 1.7248 1.9665 2.0354 1.3367 Weave (6): 1.9818 * 2.1326 1.4003 Looped Fortran (f2py) (7): 1.6996 1.9657 2.0429 1.3354 Looped Fortran (pure) (8): 1.7189 2.0145 2.0917 1.5086 C (pure) (9): 1.2820 1.9948 2.0527 1.4259 imsc9, aravali and annapurna are HPC machines at my institute * for some reason Weave didn't compile on imsc9 * octave isn't installed on imsc9 and annapurna The difference between numpy and fortran performance seems significant. However f2py does as well as pure fortran now. The difference from earlier case is that earlier there was a division inside the loop which I have replaced by multiplication by reciprocal. This does not affect the result but makes the execution faster in all cases except pure fortran (I guess fortran compiler was already doing it). I would be happy to give all the codes if someone is interested. Should we update the performance python page at scipy with these codes? Rajeev -------------- next part -------------- An HTML attachment was scrubbed... 
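For readers who do not have the benchmark scripts at hand, the NumPy entry in Rajeev's table is presumably close to the vectorized Jacobi update from the PerformancePython page; a sketch of that kernel, with the division replaced by multiplication by a precomputed reciprocal as described above. The grid size follows the N=150 mentioned in the thread, but the boundary condition and iteration count below are made up.

import numpy as np

def numpy_step(u, dx2, dy2):
    # one Jacobi-style sweep over the interior points of the grid
    inv = 1.0 / (2.0 * (dx2 + dy2))
    u[1:-1, 1:-1] = ((u[:-2, 1:-1] + u[2:, 1:-1]) * dy2 +
                     (u[1:-1, :-2] + u[1:-1, 2:]) * dx2) * inv
    return u

n = 150                        # grid size mentioned in the thread
dx = dy = 1.0 / (n - 1)
u = np.zeros((n, n))
u[0, :] = 1.0                  # hypothetical boundary condition
for _ in range(100):
    u = numpy_step(u, dx * dx, dy * dy)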
URL: From senthipa at in.ibm.com Thu Aug 18 02:03:34 2011 From: senthipa at in.ibm.com (Senthil Palanisamy) Date: Thu, 18 Aug 2011 11:33:34 +0530 Subject: [SciPy-User] scipy install issue Message-ID: Hi, i am trying to install scipy on my aix 5.3 machine, i am getting following error., compile options: '-DNO_ATLAS_INFO=1 -I/gpfs1/utils/python/Python-2.7.2/lib/python2.7/site-packages/numpy/core/include -I/gpfs1/utils/python/Python-2.7.2/include/python2.7 -c' xlc_r: scipy/integrate/_odepackmodule.c /gpfs1/utils/python/Python-2.7.2/lib/python2.7/config/ld_so_aix /usr/bin/xlf95 -q64 -bI:/gpfs1/utils/python/Python-2.7.2/lib/python2.7/config/python.exp -bshared -F/tmp/tmpzC87nY/Sm7AEs_xlf.cfg build/temp.aix-5.3-2.7/scipy/integrate/_odepackmodule.o -L/usr/lib -Lbuild/temp.aix-5.3-2.7 -lodepack -llinpack_lite -lmach -lblas -o build/lib.aix-5.3-2.7/scipy/integrate/_odepack.so ld: 0711-317 ERROR: Undefined symbol: .idamax_ ld: 0711-317 ERROR: Undefined symbol: .dscal_ ld: 0711-317 ERROR: Undefined symbol: .daxpy_ ld: 0711-317 ERROR: Undefined symbol: .ddot_ ld: 0711-345 Use the -bloadmap or -bnoquiet option to obtain more information. ld: 0711-317 ERROR: Undefined symbol: .idamax_ ld: 0711-317 ERROR: Undefined symbol: .dscal_ ld: 0711-317 ERROR: Undefined symbol: .daxpy_ ld: 0711-317 ERROR: Undefined symbol: .ddot_ ld: 0711-345 Use the -bloadmap or -bnoquiet option to obtain more information. error: Command "/gpfs1/utils/python/Python-2.7.2/lib/python2.7/config/ld_so_aix /usr/bin/xlf95 -q64 -bI:/gpfs1/utils/python/Python-2.7.2/lib/python2.7/config/python.exp -bshared -F/tmp/tmpzC87nY/Sm7AEs_xlf.cfg build/temp.aix-5.3-2.7/scipy/integrate/_odepackmodule.o -L/usr/lib -Lbuild/temp.aix-5.3-2.7 -lodepack -llinpack_lite -lmach -lblas -o build/lib.aix-5.3-2.7/scipy/integrate/_odepack.so" failed with exit status 8 details -scipy-0.9.0 -numpy, BLAS , LAPACK are intalled already, - xlf and xlc compilersare using. please get back to me with solution, how can i edit the install script? --- --- SenthilRaja Palanisamy | HPC Team India Systems & Technology Lab, EGL-D, Bangalore, KA-560071 India Email : senthipa at in.ibm.com IBM India Pvt Ltd. From rmorgan466 at gmail.com Tue Aug 23 08:14:34 2011 From: rmorgan466 at gmail.com (Rita) Date: Tue, 23 Aug 2011 08:14:34 -0400 Subject: [SciPy-User] scipy.stats In-Reply-To: References: Message-ID: Any ideas? On Sat, Aug 20, 2011 at 7:40 AM, Rita wrote: > It should be > > > 'icc -L /opt/intel/lib/intel64 -L /opt/intel/ipp/em64t/lib -L > /opt/intel/mkl/lib/em64t -L /usr/lib64 -L /usr/lib -I/opt/intel/mkl/include > -I /opt/intel/include -fPIC -O3 -openmp -limf -lmkl_core -lmkl_intel_lp64 > -lmkl_intel_thread -lstdc++ -DMKL_ILP64' > > Here is how I am doing the compilation > > > > > > On Sat, Aug 20, 2011 at 6:38 AM, Rita wrote: > >> Thanks Bruce. I have already seen this >> >> Here are more details of my build. 
>> >> My Intel compiler exists here, /opt/intel/ >> >> self.cc_exe = 'icc -L /opt/intel/lib/intel64 -L /opt/intel/ipp/em64t/lib >> -L /opt/intel/mkl/lib/em64t -L /usr/lib64 -L /usr/lib -I >> /opt/intel/ipp/em64t/in clude -I /etg/source/Linux/include -I >> /opt/intel/mkl/include -I /opt/intel/include -fPIC -O3 -openmp -limf >> -lmkl_core -lmkl_intel_lp64 -lmkl_intel_thread -lstdc++ -DMKL_ILP64' >> Here is how I am doing the compilation >> >> CC=icc CXX=icpc AR=xiar /opt/python-2.7.2/bin/python setup.py config >> --compiler=intel --fcompiler=intelem build_clib --compiler=intel >> --fcompiler=intelem build_ext --compiler=intel install >> >> /opt/intel/ipp is what I was using for the math library. This compiles but >> I keep getting that problem >> >> I use the same compile statement to compile scipy >> >> >> >> On Fri, Aug 19, 2011 at 8:00 PM, Bruce Southey wrote: >> >>> On Fri, Aug 19, 2011 at 5:53 PM, Rita wrote: >>> > I apologize for the vague question. >>> > OS: Linux >>> > Compiler: Intel compiler suite. Version 11 (this also includes fortran >>> > compiler) >>> > MKL: 10.3 >>> > Numpy version: 1.6.1 >>> > When I do numpy.config() I see it properly compiled against Intel's >>> BLAS and >>> > LAPACK >>> > >>> > Where are the build logs located? Do you need to build log for Numpy >>> also? >>> > >>> > On Fri, Aug 19, 2011 at 11:00 AM, Ralf Gommers < >>> ralf.gommers at googlemail.com> >>> > wrote: >>> >> >>> >> >>> >> On Thu, Aug 18, 2011 at 12:20 PM, Rita wrote: >>> >>> >>> >>> I am trying to import scipy.stats but I keep getting an import Error, >>> >>> ...scipy/special/_cephes.so: undefined symbol: __libm_sse2_sincos >>> >>> I compiled Numpy with Intel C compiler and Scipy compiled ok but just >>> >>> cant get this working. >>> >>> Any advise? >>> >> >>> >> The symbol is defined in an Intel math library. You'll need to give us >>> >> more details in order to say more than that. What exact compilers and >>> MKL >>> >> did you use, what OS? Build command and build log? >>> >> >>> >> Cheers, >>> >> Ralf >>> >> >>> >> >>> >> >>> >>> A quick google indicates that you need to ensure that you link to the >>> appropriate Intel Math library: >>> >>> http://software.intel.com/en-us/articles/unresolved-external-symbol-libm-sse2/ >>> >>> Also what is the cpu type? >>> >>> Bruce >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> >> >> >> -- >> --- Get your facts first, then you can distort them as you please.-- >> > > > > -- > --- Get your facts first, then you can distort them as you please.-- > -- --- Get your facts first, then you can distort them as you please.-- -------------- next part -------------- An HTML attachment was scrubbed... URL: From Jerome.Kieffer at esrf.fr Tue Aug 23 10:25:13 2011 From: Jerome.Kieffer at esrf.fr (Jerome Kieffer) Date: Tue, 23 Aug 2011 16:25:13 +0200 Subject: [SciPy-User] Is this a bug in scipy.ndimage.interpolation.shift ??? Message-ID: <20110823162513.5dc792a1.Jerome.Kieffer@esrf.fr> Hello, I was using scipy.ndimage.interpolation.shift with order=0 and "wrap" mode because I did not want to swap the 4 blocs of memory myself ... but I got strange results. 
for shifting by hand one can do: def shift(input, shift): """ Shift an array like scipy.ndimage.interpolation.shift(input, shift, mode="wrap", order=0) but faster @param in: 2d numpy array @param d: 2-tuple of integers @return: shifted image """ re = numpy.zeros_like(input) s0, s1 = input.shape d0 = shift[0] % s0 d1 = shift[0] % s1 r0 = (-d0) % s0 r1 = (-d1) % s1 re[d0:, d1:] = input[:r0, :r1] re[:d0, d1:] = input[r0:, :r1] re[d0:, :d1] = input[:r0, r1:] re[:d0, :d1] = input[r0:, r1:] return re In [327]: a=np.random.random((5,5)) In [328]: scipy.ndimage.interpolation.shift(a,(2,3),order=0,mode="wrap")-shift(a,(2,3)) Out[328]: array([[-0.13484701, 0.43450823, 0.4920127 , -0.04826882, -0.40258904], [ 0.48403199, 0.02161651, -0.35774838, 0.73954376, 0.42218297], [-0.23808862, 0.4799521 , -0.39548832, 0. , 0. ], [-0.04105354, 0.06934301, -0.18976602, 0. , 0. ], [-0.38430434, 0.04591371, -0.33502248, 0. , 0. ]]) SHOULD BE 0 everywhere and it is only in the lower right corner ... Do you agree this is an error (or did I misinterpret scipy.ndimage.interpolation.shift since the begining ?) Shall I open a bug ? I am using an ubuntu 10.04 (LTS) Cheers, -- J?r?me Kieffer On-Line Data analysis / Software Group ISDD / ESRF tel +33 476 882 445 From bsouthey at gmail.com Tue Aug 23 14:04:30 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Tue, 23 Aug 2011 13:04:30 -0500 Subject: [SciPy-User] scipy.stats In-Reply-To: References: Message-ID: On Tue, Aug 23, 2011 at 7:14 AM, Rita wrote: > Any ideas? > > On Sat, Aug 20, 2011 at 7:40 AM, Rita wrote: >> >> It should be >> >> >> 'icc -L /opt/intel/lib/intel64 -L /opt/intel/ipp/em64t/lib -L >> /opt/intel/mkl/lib/em64t -L /usr/lib64 -L /usr/lib -I/opt/intel/mkl/include >> -I /opt/intel/include -fPIC -O3 -openmp -limf -lmkl_core -lmkl_intel_lp64 >> -lmkl_intel_thread???? -lstdc++ -DMKL_ILP64' >> >> Here is how I am doing the compilation >> >> >> >> >> >> On Sat, Aug 20, 2011 at 6:38 AM, Rita wrote: >>> >>> Thanks Bruce. I have already seen this >>> Here are more details of my build. >>> >>> My Intel compiler exists here, /opt/intel/ >>> >>> self.cc_exe = 'icc -L /opt/intel/lib/intel64 -L /opt/intel/ipp/em64t/lib >>> -L /opt/intel/mkl/lib/em64t -L /usr/lib64 -L /usr/lib -I >>> /opt/intel/ipp/em64t/in??? clude -I /etg/source/Linux/include -I >>> /opt/intel/mkl/include -I /opt/intel/include -fPIC -O3 -openmp -limf >>> -lmkl_core -lmkl_intel_lp64 -lmkl_intel_thread???? -lstdc++ -DMKL_ILP64' >>> Here is how I am doing the compilation >>> >>> CC=icc CXX=icpc AR=xiar /opt/python-2.7.2/bin/python setup.py config >>> --compiler=intel? --fcompiler=intelem build_clib --compiler=intel >>> --fcompiler=intelem build_ext --compiler=intel install >>> >>> /opt/intel/ipp is what I was using for the math library. This compiles >>> but I keep getting that problem >>> >>> I use the same compile statement to compile scipy >>> >>> >>> On Fri, Aug 19, 2011 at 8:00 PM, Bruce Southey >>> wrote: >>>> >>>> On Fri, Aug 19, 2011 at 5:53 PM, Rita wrote: >>>> > I apologize for the vague question. >>>> > OS: Linux >>>> > Compiler: Intel compiler suite. Version 11 (this also includes fortran >>>> > compiler) >>>> > MKL: 10.3 >>>> > Numpy version: 1.6.1 >>>> > When I do numpy.config() I see it properly compiled against Intel's >>>> > BLAS and >>>> > LAPACK >>>> > >>>> > Where are the build logs located? Do you need to build log for Numpy >>>> > also? 
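A side note on the hand-written shift() a little further up: as posted it computes d1 from shift[0] (d1 = shift[0] % s1), where shift[1] was presumably intended, so the column shift it applies differs from the one requested from ndimage, and that alone would produce non-zero differences in the comparison. With that fixed, the same wrap-around behaviour can also be obtained from numpy.roll. A sketch for comparison only; it says nothing about what ndimage.interpolation.shift itself does in "wrap" mode.

import numpy as np

def shift_wrap(a, d):
    # wrap-around shift of a 2-D array by d = (rows, cols); matches the
    # hand-written version above once d1 is computed from shift[1]
    return np.roll(np.roll(a, d[0], axis=0), d[1], axis=1)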
>>>> > >>>> > On Fri, Aug 19, 2011 at 11:00 AM, Ralf Gommers >>>> > >>>> > wrote: >>>> >> >>>> >> >>>> >> On Thu, Aug 18, 2011 at 12:20 PM, Rita wrote: >>>> >>> >>>> >>> I am trying to import scipy.stats but I keep getting an import >>>> >>> Error, >>>> >>> ...scipy/special/_cephes.so: undefined symbol: __libm_sse2_sincos >>>> >>> I compiled Numpy with Intel C compiler and Scipy compiled ok but >>>> >>> just >>>> >>> cant get this working. >>>> >>> Any advise? >>>> >> >>>> >> The symbol is defined in an Intel math library. You'll need to give >>>> >> us >>>> >> more details in order to say more than that. What exact compilers and >>>> >> MKL >>>> >> did you use, what OS? Build command and build log? >>>> >> >>>> >> Cheers, >>>> >> Ralf >>>> >> >>>> >> >>>> >> >>>> >>>> A quick google indicates that you need to ensure that you link to the >>>> appropriate Intel Math library: >>>> >>>> http://software.intel.com/en-us/articles/unresolved-external-symbol-libm-sse2/ >>>> >>>> Also what is the cpu type? >>>> >>>> Bruce >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >>> >>> -- >>> --- Get your facts first, then you can distort them as you please.-- >> >> >> >> -- >> --- Get your facts first, then you can distort them as you please.-- > > > > -- > --- Get your facts first, then you can distort them as you please.-- > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > Just use Enthought's version: http://www.enthought.com/products/epd.php I do not use Intel's compiler so without more details it just appears that you have not given icc the correct paths to the libraries it needs when linking. Bruce From ralf.gommers at googlemail.com Tue Aug 23 15:36:52 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 23 Aug 2011 21:36:52 +0200 Subject: [SciPy-User] scipy install issue In-Reply-To: References: Message-ID: On Thu, Aug 18, 2011 at 8:03 AM, Senthil Palanisamy wrote: > > Hi, i am trying to install scipy on my aix 5.3 machine, > > i am getting following error., > This seems to be a common problem with BLAS on AIX. This was the most helpful message I found: https://stat.ethz.ch/pipermail/r-help/2000-July/007486.html. Ralf > > > > compile options: '-DNO_ATLAS_INFO=1 > > -I/gpfs1/utils/python/Python-2.7.2/lib/python2.7/site-packages/numpy/core/include > -I/gpfs1/utils/python/Python-2.7.2/include/python2.7 -c' > xlc_r: scipy/integrate/_odepackmodule.c > /gpfs1/utils/python/Python-2.7.2/lib/python2.7/config/ld_so_aix > /usr/bin/xlf95 > -q64 -bI:/gpfs1/utils/python/Python-2.7.2/lib/python2.7/config/python.exp > -bshared -F/tmp/tmpzC87nY/Sm7AEs_xlf.cfg > build/temp.aix-5.3-2.7/scipy/integrate/_odepackmodule.o -L/usr/lib > -Lbuild/temp.aix-5.3-2.7 -lodepack -llinpack_lite -lmach -lblas -o > build/lib.aix-5.3-2.7/scipy/integrate/_odepack.so > ld: 0711-317 ERROR: Undefined symbol: .idamax_ > ld: 0711-317 ERROR: Undefined symbol: .dscal_ > ld: 0711-317 ERROR: Undefined symbol: .daxpy_ > ld: 0711-317 ERROR: Undefined symbol: .ddot_ > ld: 0711-345 Use the -bloadmap or -bnoquiet option to obtain more > information. 
> ld: 0711-317 ERROR: Undefined symbol: .idamax_ > ld: 0711-317 ERROR: Undefined symbol: .dscal_ > ld: 0711-317 ERROR: Undefined symbol: .daxpy_ > ld: 0711-317 ERROR: Undefined symbol: .ddot_ > ld: 0711-345 Use the -bloadmap or -bnoquiet option to obtain more > information. > error: Command > "/gpfs1/utils/python/Python-2.7.2/lib/python2.7/config/ld_so_aix > /usr/bin/xlf95 > -q64 -bI:/gpfs1/utils/python/Python-2.7.2/lib/python2.7/config/python.exp > -bshared -F/tmp/tmpzC87nY/Sm7AEs_xlf.cfg > build/temp.aix-5.3-2.7/scipy/integrate/_odepackmodule.o -L/usr/lib > -Lbuild/temp.aix-5.3-2.7 -lodepack -llinpack_lite -lmach -lblas -o > build/lib.aix-5.3-2.7/scipy/integrate/_odepack.so" failed with exit status > 8 > > > details > > -scipy-0.9.0 > -numpy, BLAS , LAPACK are intalled already, > - xlf and xlc compilersare using. > > > please get back to me with solution, how can i edit the install script? > > > --- --- > SenthilRaja Palanisamy | HPC Team > India Systems & Technology Lab, > EGL-D, Bangalore, KA-560071 India > Email : senthipa at in.ibm.com > IBM India Pvt Ltd. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From questions.anon at gmail.com Tue Aug 23 19:00:36 2011 From: questions.anon at gmail.com (questions anon) Date: Wed, 24 Aug 2011 09:00:36 +1000 Subject: [SciPy-User] memory error - numpy mean - netcdf4 Message-ID: Hi All, I am receiving a memory error when I try to calculate the Numpy mean across many NetCDF files. Is there a way to fix this? The code I am using is below. Any feedback will be greatly appreciated. from netCDF4 import Dataset import matplotlib.pyplot as plt import numpy as N from mpl_toolkits.basemap import Basemap from netcdftime import utime from datetime import datetime import os MainFolder=r"E:/GriddedData/T_SFC/" all_TSFC=[] for (path, dirs, files) in os.walk(MainFolder): for dir in dirs: print dir path=path+'/' for ncfile in files: if ncfile[-3:]=='.nc': #print "dealing with ncfiles:", ncfile ncfile=os.path.join(path,ncfile) ncfile=Dataset(ncfile, 'r+', 'NETCDF4') TSFC=ncfile.variables['T_SFC'][4::24,:,:] LAT=ncfile.variables['latitude'][:] LON=ncfile.variables['longitude'][:] TIME=ncfile.variables['time'][:] fillvalue=ncfile.variables['T_SFC']._FillValue ncfile.close() #combine all TSFC to make one array for analyses all_TSFC.append(TSFC) big_array=N.ma.concatenate(all_TSFC) #calculate the mean of the combined array Mean=big_array.mean(axis=0) print "the mean is", Mean #plot output summary stats map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33, llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i') map.drawcoastlines() map.drawstates() x,y=map(*N.meshgrid(LON,LAT)) plt.title('TSFC Mean at 3pm') ticks=[-5,0,5,10,15,20,25,30,35,40,45,50] CS = map.contourf(x,y,Mean, cmap=plt.cm.jet) l,b,w,h =0.1,0.1,0.8,0.8 cax = plt.axes([l+w+0.025, b, 0.025, h]) plt.colorbar(CS,cax=cax, drawedges=True) plt.savefig((os.path.join(MainFolder, 'Mean.png'))) plt.show() plt.close() print "end processing" -------------- next part -------------- An HTML attachment was scrubbed... URL: From tsupinie at gmail.com Tue Aug 23 22:54:22 2011 From: tsupinie at gmail.com (Tim Supinie) Date: Tue, 23 Aug 2011 21:54:22 -0500 Subject: [SciPy-User] memory error - numpy mean - netcdf4 In-Reply-To: References: Message-ID: At what point in the program are you getting the error? Is there a stack trace? 
Pending the answers to those to questions, my first thought is to ask how much data you're loading into memory? How many files are there? It's possible that you're loading a whole bunch of data that you don't need, and it's not getting cleared out by the garbage collector, which can generate memory errors when you run out of memory. Try removing as much data loading as you can. (Are you using TIME? How big is each array you load in?) Also, if the lats and lons in all the different files are the same, only load the lats and lons from one file. All these will not only help your program use less memory, but help it run faster. Finally, if that doesn't work, use the gc module and run gc.collect() after every loop iteration to make sure Python's cleaning up after itself like it should. I think the garbage collector might not always run during loops, which can create problems when you're loading a whole bunch of unused data. Tim On Tue, Aug 23, 2011 at 6:00 PM, questions anon wrote: > Hi All, > I am receiving a memory error when I try to calculate the Numpy mean across > many NetCDF files. > Is there a way to fix this? The code I am using is below. > Any feedback will be greatly appreciated. > > > from netCDF4 import Dataset > import matplotlib.pyplot as plt > import numpy as N > from mpl_toolkits.basemap import Basemap > from netcdftime import utime > from datetime import datetime > import os > > MainFolder=r"E:/GriddedData/T_SFC/" > > all_TSFC=[] > for (path, dirs, files) in os.walk(MainFolder): > for dir in dirs: > print dir > path=path+'/' > for ncfile in files: > if ncfile[-3:]=='.nc': > #print "dealing with ncfiles:", ncfile > ncfile=os.path.join(path,ncfile) > ncfile=Dataset(ncfile, 'r+', 'NETCDF4') > TSFC=ncfile.variables['T_SFC'][4::24,:,:] > LAT=ncfile.variables['latitude'][:] > LON=ncfile.variables['longitude'][:] > TIME=ncfile.variables['time'][:] > fillvalue=ncfile.variables['T_SFC']._FillValue > ncfile.close() > > #combine all TSFC to make one array for analyses > all_TSFC.append(TSFC) > > big_array=N.ma.concatenate(all_TSFC) > #calculate the mean of the combined array > Mean=big_array.mean(axis=0) > print "the mean is", Mean > > > #plot output summary stats > map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33, > llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i') > map.drawcoastlines() > map.drawstates() > x,y=map(*N.meshgrid(LON,LAT)) > plt.title('TSFC Mean at 3pm') > ticks=[-5,0,5,10,15,20,25,30,35,40,45,50] > CS = map.contourf(x,y,Mean, cmap=plt.cm.jet) > l,b,w,h =0.1,0.1,0.8,0.8 > cax = plt.axes([l+w+0.025, b, 0.025, h]) > plt.colorbar(CS,cax=cax, drawedges=True) > > plt.savefig((os.path.join(MainFolder, 'Mean.png'))) > plt.show() > plt.close() > > print "end processing" > > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexandre.fayolle at logilab.fr Wed Aug 24 07:38:50 2011 From: alexandre.fayolle at logilab.fr (Alexandre Fayolle) Date: Wed, 24 Aug 2011 13:38:50 +0200 Subject: [SciPy-User] 3d convex hull Message-ID: <201108241338.50940.alexandre.fayolle@logilab.fr> Hello, Is there any implementation of 3d convex hull computation algorithms in scipy? 
Thanks -- Alexandre Fayolle LOGILAB, Paris (France) Formations Python, CubicWeb, Debian : http://www.logilab.fr/formations D?veloppement logiciel sur mesure : http://www.logilab.fr/services Informatique scientifique: http://www.logilab.fr/science From keith.hughitt at gmail.com Wed Aug 24 09:00:59 2011 From: keith.hughitt at gmail.com (Keith Hughitt) Date: Wed, 24 Aug 2011 09:00:59 -0400 Subject: [SciPy-User] 3d convex hull In-Reply-To: <201108241338.50940.alexandre.fayolle@logilab.fr> References: <201108241338.50940.alexandre.fayolle@logilab.fr> Message-ID: How about http://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.Delaunay.convex_hull.html ? On Wed, Aug 24, 2011 at 7:38 AM, Alexandre Fayolle < alexandre.fayolle at logilab.fr> wrote: > Hello, > > Is there any implementation of 3d convex hull computation algorithms in > scipy? > > Thanks > > -- > Alexandre Fayolle LOGILAB, Paris (France) > Formations Python, CubicWeb, Debian : http://www.logilab.fr/formations > D?veloppement logiciel sur mesure : http://www.logilab.fr/services > Informatique scientifique: http://www.logilab.fr/science > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abraham.zamudio at gmail.com Tue Aug 23 11:57:05 2011 From: abraham.zamudio at gmail.com (Abraham Zamudio) Date: Tue, 23 Aug 2011 08:57:05 -0700 (PDT) Subject: [SciPy-User] How to print the jacobian in the output of leastsq function Message-ID: <3c621117-68fe-4be3-8fd4-f8d398b240a3@t29g2000vby.googlegroups.com> Hi All, i use the leastsq function from module scipy ... but what I want now is the jacobian matrix of the algorithm ... How should I use this function to print the Jacobian ??? . Thx . From josef.pktd at gmail.com Wed Aug 24 10:23:56 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 24 Aug 2011 10:23:56 -0400 Subject: [SciPy-User] multivariate empirical distribution function, avoid double loop ? Message-ID: Does anyone know whether there is an algorithm that avoids the double loop to get a multivariate empirical distribution function? for point in data: count how many points in data are smaller or equal to point with 1d data it's just argsort(argsort(data)) double loop version with some test cases is attached. I didn't see a way that sorting would help. Thanks, Josef -------------- next part -------------- A non-text attachment was scrubbed... Name: try_mvecdf.py Type: text/x-python Size: 1022 bytes Desc: not available URL: From hhh.guo at gmail.com Wed Aug 24 11:56:36 2011 From: hhh.guo at gmail.com (Ning Guo) Date: Wed, 24 Aug 2011 23:56:36 +0800 Subject: [SciPy-User] 3d convex hull In-Reply-To: References: <201108241338.50940.alexandre.fayolle@logilab.fr> Message-ID: <4E551F34.4@gmail.com> On Wednesday, August 24, 2011 09:00 PM, Keith Hughitt wrote: Hi, I also want to try Delaunay function. But I cannot get enough info from the documentation. I want to output the Delaunay tetrahedral in 3D and need the vertex indices, facet areas and normals. How can I use the function in scipy.spatial? Now I have all the points with id and position. Thanks! > How about > http://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.Delaunay.convex_hull.html ? > > On Wed, Aug 24, 2011 at 7:38 AM, Alexandre Fayolle > > > wrote: > > Hello, > > Is there any implementation of 3d convex hull computation > algorithms in scipy? 
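On Ning's question above about vertex indices, facet areas and normals: for 3-D input, Delaunay.convex_hull (the attribute Keith's link points to) is an (nfacets, 3) array of indices into the input points, and areas and normals then follow from a cross product. A small sketch with a random point cloud; note that the resulting unit normals are not guaranteed to point outward.

import numpy as np
from scipy.spatial import Delaunay

pts = np.random.rand(30, 3)            # hypothetical point cloud
tri = Delaunay(pts)
hull = tri.convex_hull                 # (nfacets, 3) indices into pts
# the tetrahedra themselves are in tri.vertices, an (ntetra, 4) index array

v0, v1, v2 = pts[hull[:, 0]], pts[hull[:, 1]], pts[hull[:, 2]]
cross = np.cross(v1 - v0, v2 - v0)
areas = 0.5 * np.sqrt((cross ** 2).sum(axis=1))
normals = cross / (2.0 * areas[:, np.newaxis])   # unit normals, orientation not fixed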
> > Thanks > > -- > Alexandre Fayolle LOGILAB, Paris (France) > Formations Python, CubicWeb, Debian : http://www.logilab.fr/formations > D?veloppement logiciel sur mesure : http://www.logilab.fr/services > Informatique scientifique: http://www.logilab.fr/science > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- Geotechnical Group Department of Civil and Environmental Engineering Hong Kong University of Science and Technology Clear Water Bay, Kowloon, Hong Kong -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.isaac at gmail.com Wed Aug 24 14:27:15 2011 From: alan.isaac at gmail.com (Alan G Isaac) Date: Wed, 24 Aug 2011 14:27:15 -0400 Subject: [SciPy-User] multivariate empirical distribution function, avoid double loop ? In-Reply-To: References: Message-ID: <4E554283.9040606@gmail.com> On 8/24/2011 10:23 AM, josef.pktd at gmail.com wrote: > Does anyone know whether there is an algorithm that avoids the double > loop to get a multivariate empirical distribution function? I think that is pretty standard. I'll attach something posted awhile ago. It seemed right at the time, but I did not test it. Once upon a time it was at http://svn.scipy.org/svn/scipy/trunk/scipy/sandbox/dhuard/stats.py Cheers, Alan def empiricalcdf(data, method='Hazen'): """Return the empirical cdf. Methods available (here i goes from 1 to N) Hazen: (i-0.5)/N Weibull: i/(N+1) Chegodayev: (i-.3)/(N+.4) Cunnane: (i-.4)/(N+.2) Gringorten: (i-.44)/(N+.12) California: (i-1)/N :author: David Huard """ i = np.argsort(np.argsort(data)) + 1. nobs = len(data) method = method.lower() if method == 'hazen': cdf = (i-0.5)/nobs elif method == 'weibull': cdf = i/(nobs+1.) elif method == 'california': cdf = (i-1.)/nobs elif method == 'chegodayev': cdf = (i-.3)/(nobs+.4) elif method == 'cunnane': cdf = (i-.4)/(nobs+.2) elif method == 'gringorten': cdf = (i-.44)/(nobs+.12) else: raise 'Unknown method. Choose among Weibull, Hazen, Chegodayev, Cunnane, Gringorten and California.' return cdf From robert.kern at gmail.com Wed Aug 24 14:34:03 2011 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 24 Aug 2011 13:34:03 -0500 Subject: [SciPy-User] multivariate empirical distribution function, avoid double loop ? In-Reply-To: <4E554283.9040606@gmail.com> References: <4E554283.9040606@gmail.com> Message-ID: On Wed, Aug 24, 2011 at 13:27, Alan G Isaac wrote: > On 8/24/2011 10:23 AM, josef.pktd at gmail.com wrote: >> Does anyone know whether there is an algorithm that avoids the double >> loop to get a multivariate empirical distribution function? > > I think that is pretty standard. > I'll attach something posted awhile ago. > It seemed right at the time, but I did > not test it. ?Once upon a time it was at > http://svn.scipy.org/svn/scipy/trunk/scipy/sandbox/dhuard/stats.py That's *univariate*. He's asking for the multivariate case. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? 
-- Umberto Eco From josef.pktd at gmail.com Wed Aug 24 14:59:09 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 24 Aug 2011 14:59:09 -0400 Subject: [SciPy-User] multivariate empirical distribution function, avoid double loop ? In-Reply-To: <4E554283.9040606@gmail.com> References: <4E554283.9040606@gmail.com> Message-ID: On Wed, Aug 24, 2011 at 2:27 PM, Alan G Isaac wrote: > On 8/24/2011 10:23 AM, josef.pktd at gmail.com wrote: >> Does anyone know whether there is an algorithm that avoids the double >> loop to get a multivariate empirical distribution function? > > I think that is pretty standard. > I'll attach something posted awhile ago. > It seemed right at the time, but I did > not test it. ?Once upon a time it was at > http://svn.scipy.org/svn/scipy/trunk/scipy/sandbox/dhuard/stats.py > > Cheers, > Alan > > > def empiricalcdf(data, method='Hazen'): > ? ? """Return the empirical cdf. > > ? ? Methods available (here i goes from 1 to N) > ? ? ? ? Hazen: ? ? ? (i-0.5)/N > ? ? ? ? Weibull: ? ? i/(N+1) > ? ? ? ? Chegodayev: ?(i-.3)/(N+.4) > ? ? ? ? Cunnane: ? ? (i-.4)/(N+.2) > ? ? ? ? Gringorten: ?(i-.44)/(N+.12) > ? ? ? ? California: ?(i-1)/N > > ? ? :author: David Huard > ? ? """ > ? ? i = np.argsort(np.argsort(data)) + 1. > ? ? nobs = len(data) > ? ? method = method.lower() > ? ? if method == 'hazen': > ? ? ? ? cdf = (i-0.5)/nobs > ? ? elif method == 'weibull': > ? ? ? ? cdf = i/(nobs+1.) > ? ? elif method == 'california': > ? ? ? ? cdf = (i-1.)/nobs > ? ? elif method == 'chegodayev': > ? ? ? ? cdf = (i-.3)/(nobs+.4) > ? ? elif method == 'cunnane': > ? ? ? ? cdf = (i-.4)/(nobs+.2) > ? ? elif method == 'gringorten': > ? ? ? ? cdf = (i-.44)/(nobs+.12) > ? ? else: > ? ? ? ? raise 'Unknown method. Choose among Weibull, Hazen, Chegodayev, Cunnane, Gringorten and California.' > ? ? return cdf Unfortunately it's 1d only, and I am working on multivariate, at least bivariate. Pierre has a 1d version similar to this in scipy.stats.mstats and a, so far unused, copy is in statsmodels. Thanks, Josef > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From david_baddeley at yahoo.com.au Wed Aug 24 16:02:12 2011 From: david_baddeley at yahoo.com.au (David Baddeley) Date: Wed, 24 Aug 2011 13:02:12 -0700 (PDT) Subject: [SciPy-User] multivariate empirical distribution function, avoid double loop ? Message-ID: <1314216132.86532.yint-ygo-j2me@web113416.mail.gq1.yahoo.com> Sounds like it could be a case for scipy.spatial.kdtree. Cheers, David On Thu, 25 Aug 2011 06:59 NZST josef.pktd at gmail.com wrote: >On Wed, Aug 24, 2011 at 2:27 PM, Alan G Isaac wrote: >> On 8/24/2011 10:23 AM, josef.pktd at gmail.com wrote: >>> Does anyone know whether there is an algorithm that avoids the double >>> loop to get a multivariate empirical distribution function? >> >> I think that is pretty standard. >> I'll attach something posted awhile ago. >> It seemed right at the time, but I did >> not test it. ?Once upon a time it was at >> http://svn.scipy.org/svn/scipy/trunk/scipy/sandbox/dhuard/stats.py >> >> Cheers, >> Alan >> >> >> def empiricalcdf(data, method='Hazen'): >> ? ? """Return the empirical cdf. >> >> ? ? Methods available (here i goes from 1 to N) >> ? ? ? ? Hazen: ? ? ? (i-0.5)/N >> ? ? ? ? Weibull: ? ? i/(N+1) >> ? ? ? ? Chegodayev: ?(i-.3)/(N+.4) >> ? ? ? ? Cunnane: ? ? (i-.4)/(N+.2) >> ? ? ? ? Gringorten: ?(i-.44)/(N+.12) >> ? ? ? ? California: ?(i-1)/N >> >> ? ? 
:author: David Huard >> ? ? """ >> ? ? i = np.argsort(np.argsort(data)) + 1. >> ? ? nobs = len(data) >> ? ? method = method.lower() >> ? ? if method == 'hazen': >> ? ? ? ? cdf = (i-0.5)/nobs >> ? ? elif method == 'weibull': >> ? ? ? ? cdf = i/(nobs+1.) >> ? ? elif method == 'california': >> ? ? ? ? cdf = (i-1.)/nobs >> ? ? elif method == 'chegodayev': >> ? ? ? ? cdf = (i-.3)/(nobs+.4) >> ? ? elif method == 'cunnane': >> ? ? ? ? cdf = (i-.4)/(nobs+.2) >> ? ? elif method == 'gringorten': >> ? ? ? ? cdf = (i-.44)/(nobs+.12) >> ? ? else: >> ? ? ? ? raise 'Unknown method. Choose among Weibull, Hazen, Chegodayev, Cunnane, Gringorten and California.' >> ? ? return cdf > > >Unfortunately it's 1d only, and I am working on multivariate, at least >bivariate. > >Pierre has a 1d version similar to this in scipy.stats.mstats and a, >so far unused, copy is in statsmodels. > >Thanks, >Josef > > >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >_______________________________________________ >SciPy-User mailing list >SciPy-User at scipy.org >http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Wed Aug 24 17:52:48 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 24 Aug 2011 17:52:48 -0400 Subject: [SciPy-User] multivariate empirical distribution function, avoid double loop ? In-Reply-To: <1314216132.86532.yint-ygo-j2me@web113416.mail.gq1.yahoo.com> References: <1314216132.86532.yint-ygo-j2me@web113416.mail.gq1.yahoo.com> Message-ID: On Wed, Aug 24, 2011 at 4:02 PM, David Baddeley wrote: > Sounds like it could be a case for scipy.spatial.kdtree. I don't see a way. Any suggestions? "find all smaller points" doesn't induce a complete order, so I don't see a way to define a distance. The only way to reduce comparisons is graphical (?) rule out cases that we know cannot be smaller. I got a little bit further def mvecdfvalues_noties(data): #use sort on first column ev = np.empty(data.shape[0], int) sortind0 = np.argsort(data[:,0]) datas = data[sortind0, ...] for i,x in enumerate(datas): ev[i] = mvecdf(datas[:i, 1:], x[1:]) + 1 #it should be possible to make this recursive ev2 = np.empty(data.shape[0], int) ev2[sortind0] = ev return ev2 poor man's timing import time t0 = time.time() x4 = np.random.randn(5000,2) mvecdfvalues(x4), t1 = time.time() mvecdfvalues_noties(x4) t2 = time.time() print t1-t0, t2-t1 >>> 5.492000103 1.59099984169 >>> 5.492000103 / 1.59099984169 3.4519174415292584 much better, but still pretty expensive in a Monte Carlo or Bootstrap. Cheers, Josef > > Cheers, David > > On Thu, 25 Aug 2011 06:59 NZST josef.pktd at gmail.com wrote: > >>On Wed, Aug 24, 2011 at 2:27 PM, Alan G Isaac wrote: >>> On 8/24/2011 10:23 AM, josef.pktd at gmail.com wrote: >>>> Does anyone know whether there is an algorithm that avoids the double >>>> loop to get a multivariate empirical distribution function? >>> >>> I think that is pretty standard. >>> I'll attach something posted awhile ago. >>> It seemed right at the time, but I did >>> not test it. ?Once upon a time it was at >>> http://svn.scipy.org/svn/scipy/trunk/scipy/sandbox/dhuard/stats.py >>> >>> Cheers, >>> Alan >>> >>> >>> def empiricalcdf(data, method='Hazen'): >>> ? ? """Return the empirical cdf. >>> >>> ? ? Methods available (here i goes from 1 to N) >>> ? ? ? ? Hazen: ? ? ? (i-0.5)/N >>> ? ? ? ? Weibull: ? ? i/(N+1) >>> ? ? ? ? Chegodayev: ?(i-.3)/(N+.4) >>> ? ? ? ? Cunnane: ? ? 
(i-.4)/(N+.2) >>> ? ? ? ? Gringorten: ?(i-.44)/(N+.12) >>> ? ? ? ? California: ?(i-1)/N >>> >>> ? ? :author: David Huard >>> ? ? """ >>> ? ? i = np.argsort(np.argsort(data)) + 1. >>> ? ? nobs = len(data) >>> ? ? method = method.lower() >>> ? ? if method == 'hazen': >>> ? ? ? ? cdf = (i-0.5)/nobs >>> ? ? elif method == 'weibull': >>> ? ? ? ? cdf = i/(nobs+1.) >>> ? ? elif method == 'california': >>> ? ? ? ? cdf = (i-1.)/nobs >>> ? ? elif method == 'chegodayev': >>> ? ? ? ? cdf = (i-.3)/(nobs+.4) >>> ? ? elif method == 'cunnane': >>> ? ? ? ? cdf = (i-.4)/(nobs+.2) >>> ? ? elif method == 'gringorten': >>> ? ? ? ? cdf = (i-.44)/(nobs+.12) >>> ? ? else: >>> ? ? ? ? raise 'Unknown method. Choose among Weibull, Hazen, Chegodayev, Cunnane, Gringorten and California.' >>> ? ? return cdf >> >> >>Unfortunately it's 1d only, and I am working on multivariate, at least >>bivariate. >> >>Pierre has a 1d version similar to this in scipy.stats.mstats and a, >>so far unused, copy is in statsmodels. >> >>Thanks, >>Josef >> >> >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>_______________________________________________ >>SciPy-User mailing list >>SciPy-User at scipy.org >>http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From robert.kern at gmail.com Wed Aug 24 19:25:12 2011 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 24 Aug 2011 18:25:12 -0500 Subject: [SciPy-User] multivariate empirical distribution function, avoid double loop ? In-Reply-To: References: Message-ID: On Wed, Aug 24, 2011 at 09:23, wrote: > Does anyone know whether there is an algorithm that avoids the double > loop to get a multivariate empirical distribution function? > > for point in data: > ? ? count how many points in data are smaller or equal to point > > with 1d data it's just argsort(argsort(data)) > > double loop version with some test cases is attached. > > I didn't see a way that sorting would help. If you can bear to make a few (nobs, nobs) bool arrays, you can do just a kvars-sized loop in Python: dominates = np.ones((len(data), len(data)), dtype=bool) for x in data.T: dominates &= x[:,np.newaxis] > x sorta_ranks = dominates.sum(axis=1) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From josef.pktd at gmail.com Wed Aug 24 21:23:09 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 24 Aug 2011 21:23:09 -0400 Subject: [SciPy-User] multivariate empirical distribution function, avoid double loop ? In-Reply-To: References: Message-ID: On Wed, Aug 24, 2011 at 7:25 PM, Robert Kern wrote: > On Wed, Aug 24, 2011 at 09:23, ? wrote: >> Does anyone know whether there is an algorithm that avoids the double >> loop to get a multivariate empirical distribution function? >> >> for point in data: >> ? ? count how many points in data are smaller or equal to point >> >> with 1d data it's just argsort(argsort(data)) >> >> double loop version with some test cases is attached. >> >> I didn't see a way that sorting would help. 
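For anyone following without the attachment, here is a small self-contained illustration (not the attached script) of the double loop next to the boolean-matrix idea from Robert's reply above, written with <= so that it matches the "smaller or equal" definition:

import numpy as np

data = np.random.randn(6, 2)

# plain double loop: for each point, count points that are <= in every column
ecdf_loop = np.array([np.all(data <= p, axis=1).sum() for p in data])

# boolean-matrix version: one pass per column instead of one per observation
dominated = np.ones((len(data), len(data)), dtype=bool)
for col in data.T:
    dominated &= (col <= col[:, np.newaxis])
ecdf_vec = dominated.sum(axis=1)

assert (ecdf_loop == ecdf_vec).all()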
> > If you can bear to make a few (nobs, nobs) bool arrays, you can do > just a kvars-sized loop in Python: > > dominates = np.ones((len(data), len(data)), dtype=bool) > for x in data.T: > ? ?dominates &= x[:,np.newaxis] > x > sorta_ranks = dominates.sum(axis=1) Thanks, quite a bit better, 14 times faster for (5000,2) and still 2.5 times faster for (5000,20), 12 times for (10000,3) compared to my original. Josef > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > ? -- Umberto Eco > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From charlesr.harris at gmail.com Thu Aug 25 00:11:17 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 24 Aug 2011 22:11:17 -0600 Subject: [SciPy-User] How to print the jacobian in the output of leastsq function In-Reply-To: <3c621117-68fe-4be3-8fd4-f8d398b240a3@t29g2000vby.googlegroups.com> References: <3c621117-68fe-4be3-8fd4-f8d398b240a3@t29g2000vby.googlegroups.com> Message-ID: On Tue, Aug 23, 2011 at 9:57 AM, Abraham Zamudio wrote: > Hi All, > > i use the leastsq function from module scipy ... but what I want now > is the jacobian matrix of the algorithm ... How should I use this > function to print the Jacobian ??? . > > That's tricky, as what is returned is the qr factorization of the Jacobian stored in condensed form containing the r part and what I think are the vectors of the Householder reflections that can be used to generate q. In addition, the columns are pivoted. So I think it would take a bit of research and work to recover the Jacobian. Perhaps someone here knows a bit more about the specific function used in this function. If you just want the covariance of the coefficients, that is easier to get, but note the documentation is incorrect, you need to multiply by the variance, not the standard deviation. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From questions.anon at gmail.com Thu Aug 25 00:39:47 2011 From: questions.anon at gmail.com (questions anon) Date: Thu, 25 Aug 2011 14:39:47 +1000 Subject: [SciPy-User] memory error - numpy mean - netcdf4 In-Reply-To: References: Message-ID: Thanks for your response. The error I am receiving is: * * *Traceback (most recent call last):* * File "d:\documents and settings\SLBurns\Work\My Dropbox\Python_code\calculate_the_mean_across_multiple_netcdf_files_in_multiple_folders_add_shp_select_dirs.py", line 50, in * * big_array=N.ma.concatenate(all_TSFC)* * File "C:\Python27\lib\site-packages\numpy\ma\core.py", line 6155, in concatenate* * d = np.concatenate([getdata(a) for a in arrays], axis)* *MemoryError* I have tried ignoring TIME and only using one slice of lat and long (because they are the same for every file). I also tried entering the gc.collect() in the loop but nothing seemed to help. Anything else I could try? I am dealing with hundreds of files so maybe I need a whole different method to calculate the mean? On Wed, Aug 24, 2011 at 12:54 PM, Tim Supinie wrote: > At what point in the program are you getting the error? Is there a stack > trace? > > Pending the answers to those to questions, my first thought is to ask how > much data you're loading into memory? How many files are there? 
It's > possible that you're loading a whole bunch of data that you don't need, and > it's not getting cleared out by the garbage collector, which can generate > memory errors when you run out of memory. Try removing as much data loading > as you can. (Are you using TIME? How big is each array you load in?) > Also, if the lats and lons in all the different files are the same, only > load the lats and lons from one file. All these will not only help your > program use less memory, but help it run faster. > > Finally, if that doesn't work, use the gc module and run gc.collect() after > every loop iteration to make sure Python's cleaning up after itself like it > should. I think the garbage collector might not always run during loops, > which can create problems when you're loading a whole bunch of unused data. > > Tim > > On Tue, Aug 23, 2011 at 6:00 PM, questions anon wrote: > >> Hi All, >> I am receiving a memory error when I try to calculate the Numpy mean >> across many NetCDF files. >> Is there a way to fix this? The code I am using is below. >> Any feedback will be greatly appreciated. >> >> >> from netCDF4 import Dataset >> import matplotlib.pyplot as plt >> import numpy as N >> from mpl_toolkits.basemap import Basemap >> from netcdftime import utime >> from datetime import datetime >> import os >> >> MainFolder=r"E:/GriddedData/T_SFC/" >> >> all_TSFC=[] >> for (path, dirs, files) in os.walk(MainFolder): >> for dir in dirs: >> print dir >> path=path+'/' >> for ncfile in files: >> if ncfile[-3:]=='.nc': >> #print "dealing with ncfiles:", ncfile >> ncfile=os.path.join(path,ncfile) >> ncfile=Dataset(ncfile, 'r+', 'NETCDF4') >> TSFC=ncfile.variables['T_SFC'][4::24,:,:] >> LAT=ncfile.variables['latitude'][:] >> LON=ncfile.variables['longitude'][:] >> TIME=ncfile.variables['time'][:] >> fillvalue=ncfile.variables['T_SFC']._FillValue >> ncfile.close() >> >> #combine all TSFC to make one array for analyses >> all_TSFC.append(TSFC) >> >> big_array=N.ma.concatenate(all_TSFC) >> #calculate the mean of the combined array >> Mean=big_array.mean(axis=0) >> print "the mean is", Mean >> >> >> #plot output summary stats >> map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33, >> llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i') >> map.drawcoastlines() >> map.drawstates() >> x,y=map(*N.meshgrid(LON,LAT)) >> plt.title('TSFC Mean at 3pm') >> ticks=[-5,0,5,10,15,20,25,30,35,40,45,50] >> CS = map.contourf(x,y,Mean, cmap=plt.cm.jet) >> l,b,w,h =0.1,0.1,0.8,0.8 >> cax = plt.axes([l+w+0.025, b, 0.025, h]) >> plt.colorbar(CS,cax=cax, drawedges=True) >> >> plt.savefig((os.path.join(MainFolder, 'Mean.png'))) >> plt.show() >> plt.close() >> >> print "end processing" >> >> >> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... 
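One way to avoid building the big concatenated array at all is to keep a running sum and a counter while walking the files, as the replies in this thread suggest. The sketch below reuses MainFolder, the directory walk and the T_SFC variable from the script quoted above; the [4::24, :, :] slice is kept only as a placeholder and the loop is simplified, so treat it as an outline under those assumptions rather than the poster's actual code.

from netCDF4 import Dataset
import os

MainFolder = r"E:/GriddedData/T_SFC/"

total = None   # running sum of all time slices seen so far
count = 0      # number of time slices accumulated

for (path, dirs, files) in os.walk(MainFolder):
    for ncfile in files:
        if ncfile[-3:] != '.nc':
            continue
        f = Dataset(os.path.join(path, ncfile), 'r')
        TSFC = f.variables['T_SFC'][4::24, :, :]   # read only the slice that is needed
        f.close()
        # accumulate a sum and a counter instead of keeping every array around
        if total is None:
            total = TSFC.sum(axis=0)
        else:
            total = total + TSFC.sum(axis=0)
        count += TSFC.shape[0]

Mean = total / float(count)

If the files contain masked (fill) values, a single scalar counter is not enough, because the number of valid slices then varies per grid cell; in that case one would accumulate TSFC.count(axis=0) as a second array and divide by that instead.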
URL: From josef.pktd at gmail.com Thu Aug 25 00:50:04 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 25 Aug 2011 00:50:04 -0400 Subject: [SciPy-User] How to print the jacobian in the output of leastsq function In-Reply-To: References: <3c621117-68fe-4be3-8fd4-f8d398b240a3@t29g2000vby.googlegroups.com> Message-ID: On Thu, Aug 25, 2011 at 12:11 AM, Charles R Harris wrote: > > > On Tue, Aug 23, 2011 at 9:57 AM, Abraham Zamudio > wrote: >> >> Hi All, >> >> i use ?the leastsq function from module scipy ... but what I want now >> is the jacobian matrix of the algorithm ... How should I use this >> function to print the Jacobian ??? . >> > > That's tricky, as what is returned is the qr factorization of the Jacobian > stored in condensed form containing the r part and what I think are the > vectors of the Householder reflections that can be used to generate q. In > addition, the columns are pivoted. So I think it would take a bit of > research and work to recover the Jacobian. Perhaps someone here knows a bit > more about the specific function used in this function. If you just want the > covariance of the coefficients, that is easier to get, but note the > documentation is incorrect, you need to multiply by the variance, not the > standard deviation. Is it even possible to recover the Jacobian from this? I never found a way (but I'm not an expert). I gave up and just calculate a numerical derivative at the solution. Since this question shows up regularly it would also be good to have an answer if it is a definite NO (or at least not with what the underlying function returns). Josef > > Chuck > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From charlesr.harris at gmail.com Thu Aug 25 01:04:06 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 24 Aug 2011 23:04:06 -0600 Subject: [SciPy-User] How to print the jacobian in the output of leastsq function In-Reply-To: References: <3c621117-68fe-4be3-8fd4-f8d398b240a3@t29g2000vby.googlegroups.com> Message-ID: On Wed, Aug 24, 2011 at 10:50 PM, wrote: > On Thu, Aug 25, 2011 at 12:11 AM, Charles R Harris > wrote: > > > > > > On Tue, Aug 23, 2011 at 9:57 AM, Abraham Zamudio < > abraham.zamudio at gmail.com> > > wrote: > >> > >> Hi All, > >> > >> i use the leastsq function from module scipy ... but what I want now > >> is the jacobian matrix of the algorithm ... How should I use this > >> function to print the Jacobian ??? . > >> > > > > That's tricky, as what is returned is the qr factorization of the > Jacobian > > stored in condensed form containing the r part and what I think are the > > vectors of the Householder reflections that can be used to generate q. In > > addition, the columns are pivoted. So I think it would take a bit of > > research and work to recover the Jacobian. Perhaps someone here knows a > bit > > more about the specific function used in this function. If you just want > the > > covariance of the coefficients, that is easier to get, but note the > > documentation is incorrect, you need to multiply by the variance, not the > > standard deviation. > > Is it even possible to recover the Jacobian from this? > > I never found a way (but I'm not an expert). I gave up and just > calculate a numerical derivative at the solution. 
> > Since this question shows up regularly it would also be good to have > an answer if it is a definite NO (or at least not with what the > underlying function returns). > > I'm pretty sure the Jacobian can be recovered but I'd have to read through the code to see how. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From srean.list at gmail.com Thu Aug 25 01:53:59 2011 From: srean.list at gmail.com (srean) Date: Thu, 25 Aug 2011 00:53:59 -0500 Subject: [SciPy-User] memory error - numpy mean - netcdf4 In-Reply-To: References: Message-ID: Since you are processing so many files, wouldnt it be better to update the mean from every file and close/unload that netcdf, i.e. do it one at a time ? You would not need to load the entore data set into memory and neither will you need to maintain the sum (which might risk an overflow in some cases) On Wed, Aug 24, 2011 at 11:39 PM, questions anon wrote: > Thanks for your response. > The error I am receiving is: > * > * > *Traceback (most recent call last):* > * File "d:\documents and settings\SLBurns\Work\My > Dropbox\Python_code\calculate_the_mean_across_multiple_netcdf_files_in_multiple_folders_add_shp_select_dirs.py", > line 50, in * > * big_array=N.ma.concatenate(all_TSFC)* > * File "C:\Python27\lib\site-packages\numpy\ma\core.py", line 6155, in > concatenate* > * d = np.concatenate([getdata(a) for a in arrays], axis)* > *MemoryError* > > > I have tried ignoring TIME and only using one slice of lat and long > (because they are the same for every file). I also tried entering > the gc.collect() in the loop but nothing seemed to help. > Anything else I could try? I am dealing with hundreds of files so maybe I > need a whole different method to calculate the mean? > > > > On Wed, Aug 24, 2011 at 12:54 PM, Tim Supinie wrote: > >> At what point in the program are you getting the error? Is there a stack >> trace? >> >> Pending the answers to those to questions, my first thought is to ask how >> much data you're loading into memory? How many files are there? It's >> possible that you're loading a whole bunch of data that you don't need, and >> it's not getting cleared out by the garbage collector, which can generate >> memory errors when you run out of memory. Try removing as much data loading >> as you can. (Are you using TIME? How big is each array you load in?) >> Also, if the lats and lons in all the different files are the same, only >> load the lats and lons from one file. All these will not only help your >> program use less memory, but help it run faster. >> >> Finally, if that doesn't work, use the gc module and run gc.collect() >> after every loop iteration to make sure Python's cleaning up after itself >> like it should. I think the garbage collector might not always run during >> loops, which can create problems when you're loading a whole bunch of unused >> data. >> >> Tim >> >> On Tue, Aug 23, 2011 at 6:00 PM, questions anon > > wrote: >> >>> Hi All, >>> I am receiving a memory error when I try to calculate the Numpy mean >>> across many NetCDF files. >>> Is there a way to fix this? The code I am using is below. >>> Any feedback will be greatly appreciated. 
>>> >>> >>> from netCDF4 import Dataset >>> import matplotlib.pyplot as plt >>> import numpy as N >>> from mpl_toolkits.basemap import Basemap >>> from netcdftime import utime >>> from datetime import datetime >>> import os >>> >>> MainFolder=r"E:/GriddedData/T_SFC/" >>> >>> all_TSFC=[] >>> for (path, dirs, files) in os.walk(MainFolder): >>> for dir in dirs: >>> print dir >>> path=path+'/' >>> for ncfile in files: >>> if ncfile[-3:]=='.nc': >>> #print "dealing with ncfiles:", ncfile >>> ncfile=os.path.join(path,ncfile) >>> ncfile=Dataset(ncfile, 'r+', 'NETCDF4') >>> TSFC=ncfile.variables['T_SFC'][4::24,:,:] >>> LAT=ncfile.variables['latitude'][:] >>> LON=ncfile.variables['longitude'][:] >>> TIME=ncfile.variables['time'][:] >>> fillvalue=ncfile.variables['T_SFC']._FillValue >>> ncfile.close() >>> >>> #combine all TSFC to make one array for analyses >>> all_TSFC.append(TSFC) >>> >>> big_array=N.ma.concatenate(all_TSFC) >>> #calculate the mean of the combined array >>> Mean=big_array.mean(axis=0) >>> print "the mean is", Mean >>> >>> >>> #plot output summary stats >>> map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33, >>> llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i') >>> map.drawcoastlines() >>> map.drawstates() >>> x,y=map(*N.meshgrid(LON,LAT)) >>> plt.title('TSFC Mean at 3pm') >>> ticks=[-5,0,5,10,15,20,25,30,35,40,45,50] >>> CS = map.contourf(x,y,Mean, cmap=plt.cm.jet) >>> l,b,w,h =0.1,0.1,0.8,0.8 >>> cax = plt.axes([l+w+0.025, b, 0.025, h]) >>> plt.colorbar(CS,cax=cax, drawedges=True) >>> >>> plt.savefig((os.path.join(MainFolder, 'Mean.png'))) >>> plt.show() >>> plt.close() >>> >>> print "end processing" >>> >>> >>> >>> >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Thu Aug 25 03:05:03 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Thu, 25 Aug 2011 09:05:03 +0200 Subject: [SciPy-User] Speeding up Python Again In-Reply-To: References: Message-ID: On Tue, Aug 23, 2011 at 11:11 AM, Rajeev Singh wrote: > > > On Wed, Aug 10, 2011 at 6:48 PM, Rajeev Singh wrote: > > Hi, > > I was trying out the codes discussed > > at > http://technicaldiscovery.blogspot.com/2011/07/speeding-up-python-again.html > > Here is a summary of my results - > > Computer: Desktop imsc9 aravali annapurna > > NumPy: 7.651419 4.219105 5.576453 4.858640 > > Cython: 4.259419 3.477259 3.204909 2.357819 > > Weave: 4.302778 * 3.298551 2.400000 > > Looped Fortran: 4.199148 3.414484 3.202963 2.315644 > > Vectorized Fortran: 3.118410 2.131966 1.512303 1.460251 > > pure fortran update1: 1.205727 1.964857 2.034688 1.336086 > > pure fortran update2: 0.600848 0.604649 0.573593 0.721339 > > imsc9, aravali and annapurna are HPC machines at my institute > > * for some reason Weave didn't compile on imsc9 > > > > Indeed there is about a factor of 7 to 12 difference between pure fortran > > with update2 (vectorized) and the numpy version. 
> > I should mention that I changed N to 150 in laplace_for.f90 > > Rajeev > > Hi, > > Continuing the comparison of various ways of implementing solving laplace > equation, following result might interest you - > > Desktop imsc9 aravali annapurna > Octave (0): 20.7866 * 21.6179 * > Vectorized Fortran (pure) (1): 0.7487 0.6501 0.7507 1.1619 > Vectorized Fortran (f2py) (2): 0.7190 0.6089 0.6243 1.0312 > NumPy (3): 4.1343 2.5844 2.6565 3.7445 > Cython (4): 1.7273 1.9927 2.0471 1.3525 > Cython with C (5): 1.7248 1.9665 2.0354 1.3367 > Weave (6): 1.9818 * 2.1326 1.4003 > Looped Fortran (f2py) (7): 1.6996 1.9657 2.0429 1.3354 > Looped Fortran (pure) (8): 1.7189 2.0145 2.0917 1.5086 > C (pure) (9): 1.2820 1.9948 2.0527 1.4259 > > imsc9, aravali and annapurna are HPC machines at my institute > * for some reason Weave didn't compile on imsc9 > * octave isn't installed on imsc9 and annapurna > > The difference between numpy and fortran performance seems significant. > However f2py does as well as pure fortran now. The difference from earlier > case is that earlier there was a division inside the loop which I have > replaced by multiplication by reciprocal. This does not affect the result > but makes the execution faster in all cases except pure fortran (I guess > fortran compiler was already doing it). > > I would be happy to give all the codes if someone is interested. Should we > update the performance python page at scipy with these codes? > > It would be nice to this to http://www.scipy.org/PerformancePython. That page currently has only one problem, to see a few different ones compared with the same method gives a better impression of speed differences. It's a wiki page, so you should be able to add your code, problem description and results yourself. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From brockp at umich.edu Thu Aug 25 12:43:02 2011 From: brockp at umich.edu (Brock Palen) Date: Thu, 25 Aug 2011 12:43:02 -0400 Subject: [SciPy-User] SciPy on HPC Podcast Message-ID: I host an HPC podcast www.rce-cast.com we have had numpy featured on the show before and would like now to include SciPy into the show. Would a Scipy dev or two be willing to take an hour to speak to us for the show? Feel free to contact me off list. Brock Palen www.umich.edu/~brockp Center for Advanced Computing brockp at umich.edu (734)936-1985 From jrgray at gmail.com Fri Aug 26 12:06:19 2011 From: jrgray at gmail.com (Jeremy Gray) Date: Fri, 26 Aug 2011 12:06:19 -0400 Subject: [SciPy-User] obtaining residuals from linear regression? Message-ID: Hi, I'm new to scipy, and hope the answer is not trivial. I've searched the archives, googled, and looked at documentation for linalg.lstsq() numpy.polyfit(), and scipy.stats.linregress(), but it has not answered my question. my goal is to linearly adjust a set of observations (Y) for nuisance variables (X1 ... Xn), so I can use the adjusted Y values in further computations. One way to achieve what I want is to do a linear regression, regressing out the nuisance variables, and saving the residuals (being the part of Y that's not explained by X). I see the option full=True returns residuals, but its the sum of the residuals, whereas I am after the actual residuals on a case by case basis. is there an option to get the raw residuals? it would save me computing them again. --Jeremy -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From josef.pktd at gmail.com Fri Aug 26 12:29:28 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 26 Aug 2011 12:29:28 -0400 Subject: [SciPy-User] obtaining residuals from linear regression? In-Reply-To: References: Message-ID: On Fri, Aug 26, 2011 at 12:06 PM, Jeremy Gray wrote: > Hi, > > I'm new to scipy, and hope the answer is not trivial. I've searched the > archives, googled, and looked at documentation for linalg.lstsq() > numpy.polyfit(), and scipy.stats.linregress(), but it has not answered my > question. > > my goal is to linearly adjust a set of observations (Y) for nuisance > variables (X1 ... Xn), so I can use the adjusted Y values in further > computations. One way to achieve what I want is to do a linear regression, > regressing out the nuisance variables, and saving the residuals (being the > part of Y that's not explained by X). > > I see the option full=True returns residuals, but its the sum of the > residuals, whereas I am after the actual residuals on a case by case basis. > > is there an option to get the raw residuals? it would save me computing them > again. for a full answer for the linear (in parameter) case http://statsmodels.sourceforge.net/generated/scikits.statsmodels.regression.linear_model.OLS.html http://statsmodels.sourceforge.net/generated/scikits.statsmodels.regression.linear_model.RegressionResults.html Josef > > --Jeremy > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From Dharhas.Pothina at twdb.state.tx.us Fri Aug 26 12:52:27 2011 From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina) Date: Fri, 26 Aug 2011 11:52:27 -0500 Subject: [SciPy-User] Calculate surface area & volumes from delaunay triangulation Message-ID: <4E5788FB0200009B0003D940@GWWEB.twdb.state.tx.us> Hi, We have an old ArcGIS aml script that we are trying to replace. The original script takes the input from an ArcGIS TIN model (basically a delaunay triangulation of irregular xyz data points defining the bottom surface of the lake) and calculates the surface area and volume of the lake at different elevations (i.e. z cut planes) >From my googling it looks like I have options for the delaunay triangulation using scipy, matplotlib, cgal or mayavi. I'm not sure how to do the surface area and volume calculations once I have the triangulation. I would appreciate any pointers. thanks, - dharhas -------------- next part -------------- An HTML attachment was scrubbed... URL: From philmorefield at yahoo.com Fri Aug 26 13:31:00 2011 From: philmorefield at yahoo.com (Phil Morefield) Date: Fri, 26 Aug 2011 10:31:00 -0700 (PDT) Subject: [SciPy-User] memory error - numpy mean - netcdf4 In-Reply-To: References: Message-ID: <1314379860.63640.YahooMailNeo@web161317.mail.bf1.yahoo.com> First off, the netCDF4 module has a multi-file class that concatonates multiple netCDF files for you: http://netcdf4-python.googlecode.com/svn/trunk/docs/netCDF4.MFDataset-class.html. That will simplify things a bit. ? Second, usually the "TIME" dimension is axis=2. Axes 0 and 1 usually correspond to the X and Y dimensions. ? Finally, you're getting the MemoryError because you're trying to?put an ginormous array into memory all at once. Your OS can't handle it.?Just loop through each?time step and keep a running total and?counter. Then divide your total?(which is an array) by?your counter (which is an integer or float) and presto: you have your average. 
It's plenty fast, don't worry. ? ? ? From: questions anon To: scipy-user at scipy.org Sent: Tuesday, August 23, 2011 7:00 PM Subject: [SciPy-User] memory error - numpy mean - netcdf4 Hi All, I am receiving a memory error when I try to calculate the Numpy mean across many NetCDF files. Is there a way to fix this? The code I am using is below. Any feedback will be greatly appreciated. from netCDF4 import Dataset import matplotlib.pyplot as plt import numpy as N from mpl_toolkits.basemap import Basemap from netcdftime import utime from datetime import datetime import os MainFolder=r"E:/GriddedData/T_SFC/" all_TSFC=[]? for (path, dirs, files) in os.walk(MainFolder): ? ? for dir in dirs: ? ? ? ? print dir ? ? path=path+'/' ? ? for ncfile in files: ? ? ? ? if ncfile[-3:]=='.nc': ? ? ? ? ? ? #print "dealing with ncfiles:", ncfile ? ? ? ? ? ? ncfile=os.path.join(path,ncfile) ? ? ? ? ? ? ncfile=Dataset(ncfile, 'r+', 'NETCDF4') ? ? ? ? ? ? TSFC=ncfile.variables['T_SFC'][4::24,:,:] ? ? ? ? ? ? LAT=ncfile.variables['latitude'][:] ? ? ? ? ? ? LON=ncfile.variables['longitude'][:] ? ? ? ? ? ? TIME=ncfile.variables['time'][:] ? ? ? ? ? ? fillvalue=ncfile.variables['T_SFC']._FillValue ? ? ? ? ? ? ncfile.close() ? ? ? ? ? ? #combine all TSFC to make one array for analyses ? ? ? ? ? ? all_TSFC.append(TSFC) ? ? ? ? ? ? big_array=N.ma.concatenate(all_TSFC) #calculate the mean of the combined array Mean=big_array.mean(axis=0) print "the mean is", Mean #plot output summary stats map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33, ? ? ? ? ? ? ? llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i') map.drawcoastlines() map.drawstates() x,y=map(*N.meshgrid(LON,LAT)) plt.title('TSFC Mean at 3pm') ticks=[-5,0,5,10,15,20,25,30,35,40,45,50] CS = map.contourf(x,y,Mean, cmap=plt.cm.jet) l,b,w,h =0.1,0.1,0.8,0.8 cax = plt.axes([l+w+0.025, b, 0.025, h]) plt.colorbar(CS,cax=cax, drawedges=True) plt.savefig((os.path.join(MainFolder, 'Mean.png'))) plt.show() plt.close() print "end processing" ? ? ? ? ? _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From srean.list at gmail.com Fri Aug 26 14:00:18 2011 From: srean.list at gmail.com (srean) Date: Fri, 26 Aug 2011 13:00:18 -0500 Subject: [SciPy-User] memory error - numpy mean - netcdf4 In-Reply-To: <1314379860.63640.YahooMailNeo@web161317.mail.bf1.yahoo.com> References: <1314379860.63640.YahooMailNeo@web161317.mail.bf1.yahoo.com> Message-ID: > Finally, you're getting the MemoryError because you're trying to put an > ginormous array into memory all at once. Your OS can't handle it. Just loop > through each time step and keep a running total and counter. Then divide > your total (which is an array) by your counter (which is an integer or > float) and presto: you have your average. It's plenty fast, don't worry. > In fact one can even avoid keeping the running total. If the values are integers then the running total may overflow. Say you have the mean \mu computed from N points and you have a new collection of m points whose mean is t. Then the mean on the N + m points is: \mu_{new} = \mu + (m)/(N+m) ( t - \mu) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From philmorefield at yahoo.com Fri Aug 26 15:33:53 2011 From: philmorefield at yahoo.com (Phil Morefield) Date: Fri, 26 Aug 2011 12:33:53 -0700 (PDT) Subject: [SciPy-User] memory error - numpy mean - netcdf4 In-Reply-To: References: <1314379860.63640.YahooMailNeo@web161317.mail.bf1.yahoo.com> Message-ID: <1314387233.14273.YahooMailNeo@web161303.mail.bf1.yahoo.com> "If the values are integers then the running total may overflow." That's a good point. Though you could just do this: ? ################################### import numpy as np ? array = netcdf_variable[0] ? for i in xrange(1, len(netcdf_variable) - 1, 1): ????array = np.true_divide(np.add(array,?array[i]), 2.0) ################################### ? The formula you have written looks like you're collapsing everything into a single value. I think he's trying to average a bunch of 2D arrays into a single 2D array. ? ? ? ? From: srean To: Phil Morefield ; SciPy Users List Sent: Friday, August 26, 2011 2:00 PM Subject: Re: [SciPy-User] memory error - numpy mean - netcdf4 Finally, you're getting the MemoryError because you're trying to?put an ginormous array into memory all at once. Your OS can't handle it.?Just loop through each?time step and keep a running total and?counter. Then divide your total?(which is an array) by?your counter (which is an integer or float) and presto: you have your average. It's plenty fast, don't worry. > In fact one can even avoid keeping the running total. If the values are integers then the running total may overflow. Say you have the mean \mu computed from N points and you have a new collection of m points whose mean is t. Then the mean on the N + m points is: ? \mu_{new} = \mu + (m)/(N+m) ( t - \mu) -------------- next part -------------- An HTML attachment was scrubbed... URL: From philmorefield at yahoo.com Fri Aug 26 17:58:17 2011 From: philmorefield at yahoo.com (Phil Morefield) Date: Fri, 26 Aug 2011 14:58:17 -0700 (PDT) Subject: [SciPy-User] Fw: memory error - numpy mean - netcdf4 In-Reply-To: <1314387233.14273.YahooMailNeo@web161303.mail.bf1.yahoo.com> References: <1314379860.63640.YahooMailNeo@web161317.mail.bf1.yahoo.com> <1314387233.14273.YahooMailNeo@web161303.mail.bf1.yahoo.com> Message-ID: <1314395897.41715.YahooMailNeo@web161309.mail.bf1.yahoo.com> import numpy as np ? array = netcdf_variable[0] ? for i in xrange(1, len(netcdf_variable) - 1, 1): ????array = np.true_divide(np.add(array,?array[i]), 2.0) ? ? Oops. That's not right. That's what I get for being hasty. Something like this maybe: ? ######################################### import numpy as np ? array = np.true_divide(netcdf_variable[0], len(netcdf_variable)) ? for i in xrange(1, len(netcdf_variable) - 1, 1): ????array = np.add(array, np.true_divide(array[i], len(netcdf_variable))) ######################################### ? ----- Forwarded Message ----- From: Phil Morefield To: srean ; SciPy Users List Sent: Friday, August 26, 2011 3:33 PM Subject: Re: [SciPy-User] memory error - numpy mean - netcdf4 "If the values are integers then the running total may overflow." That's a good point. Though you could just do this: ################################### import numpy as np array = netcdf_variable[0] for i in xrange(1, len(netcdf_variable) - 1, 1): ????array = np.true_divide(np.add(array,?array[i]), 2.0) ################################### ? The formula you have written looks like you're collapsing everything into a single value. I think he's trying to average a bunch of 2D arrays into a single 2D array. ? ? ? ? 
From: srean To: Phil Morefield ; SciPy Users List Sent: Friday, August 26, 2011 2:00 PM Subject: Re: [SciPy-User] memory error - numpy mean - netcdf4 Finally, you're getting the MemoryError because you're trying to?put an ginormous array into memory all at once. Your OS can't handle it.?Just loop through each?time step and keep a running total and?counter. Then divide your total?(which is an array) by?your counter (which is an integer or float) and presto: you have your average. It's plenty fast, don't worry. > In fact one can even avoid keeping the running total. If the values are integers then the running total may overflow. Say you have the mean \mu computed from N points and you have a new collection of m points whose mean is t. Then the mean on the N + m points is: ? \mu_{new} = \mu + (m)/(N+m) ( t - \mu) -------------- next part -------------- An HTML attachment was scrubbed... URL: From srean.list at gmail.com Fri Aug 26 20:54:41 2011 From: srean.list at gmail.com (srean) Date: Fri, 26 Aug 2011 19:54:41 -0500 Subject: [SciPy-User] memory error - numpy mean - netcdf4 In-Reply-To: <1314387233.14273.YahooMailNeo@web161303.mail.bf1.yahoo.com> References: <1314379860.63640.YahooMailNeo@web161317.mail.bf1.yahoo.com> <1314387233.14273.YahooMailNeo@web161303.mail.bf1.yahoo.com> Message-ID: On Fri, Aug 26, 2011 at 2:33 PM, Phil Morefield wrote: > > The formula you have written looks like you're collapsing everything into a > single value. I think he's trying to average a bunch of 2D arrays into a > single 2D array. > You are correct, the form that I posted can be read as if it is for updating single mean vector \mu, but you can use the same for an nd-array trivially. Just have \mu and t as nd-arrays. m can be one too. Numpy broadcasting will take care of the rest. One advantage is that it requires only a constant amount of memory for the computation, you can even read the data in from an infinite pipe or generator that yields a single vector or a matrix at a time (or bundles them up m at a time). It will always be uptodate with the current estimate of the means. In fact will work for any moment too. --srean -------------- next part -------------- An HTML attachment was scrubbed... URL: From lutz.maibaum at gmail.com Fri Aug 26 21:39:54 2011 From: lutz.maibaum at gmail.com (Lutz Maibaum) Date: Fri, 26 Aug 2011 18:39:54 -0700 Subject: [SciPy-User] 0 * inf = 0 for some sparse matrix operations Message-ID: Hello, most of the time, trying to multiply 0 by infinity results in NaNs. However, in some sparse matrix operations such as matrix multiplication, such terms seem to be silently ignored (or interpreted as zeros). Is this the intended behavior? Is there a way to perform matrix multiplications such that 0*inf = NaN? Any help would be much appreciated. 
Best wishes, Lutz In [2]: import numpy as np In [3]: from scipy.sparse import csr_matrix In [4]: 0 * np.inf Out[4]: nan In [5]: np.array([0]) * np.array([np.inf]) /opt/local/bin/ipython-2.6:1: RuntimeWarning: invalid value encountered in multiply #!/opt/local/Library/Frameworks/Python.framework/Versions/2.6/Resources/Python.app/Contents/MacOS/Python Out[5]: array([ nan]) In [6]: np.array([0]).dot(np.array([np.inf])) Out[6]: nan In [7]: (csr_matrix([0]) * csr_matrix([np.inf])).toarray() Out[7]: array([[ 0.]]) In [8]: (csr_matrix([0]).dot(csr_matrix([np.inf]))).toarray() Out[8]: array([[ 0.]]) In [9]: (csr_matrix([0]).multiply(csr_matrix([np.inf]))).toarray() Out[9]: array([[ nan]]) From dbigbear at gmail.com Sat Aug 27 01:19:09 2011 From: dbigbear at gmail.com (Xiong Deng) Date: Sat, 27 Aug 2011 13:19:09 +0800 Subject: [SciPy-User] Install Scipy Errors: ImportError: /path_to/liblapack.so: undefined symbol: ztbsv_ Message-ID: Hi all, I am installing lapack, atlas, numpy, scipy on my LINUX for TEN times, but always encountering the problem: [work at XXX]$ python -c 'import scipy.optimize; scipy.optimize.test()' Traceback (most recent call last): File "", line 1, in File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/scipy/optimize/__init__.py", line 11, in from lbfgsb import fmin_l_bfgs_b File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/scipy/optimize/lbfgsb.py", line 28, in import _lbfgsb ImportError: /home/work/local/lib/liblapack.so: undefined symbol: ztbsv_ I can pass some other tests like: [work at XXX:~/local]$ python -c 'import scipy.ndimage; scipy.ndimage.test()' Running unit tests for scipy.ndimage NumPy version 1.6.1 NumPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy SciPy version 0.9.0 SciPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site-packages/scipy Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 20051201 (Red Hat 3.4.5-2)] nose version 1.1.2 .........S................................................................................................................................................................................................................................................................................................................................................................................................................. ---------------------------------------------------------------------- Ran 411 tests in 1.247s OK (SKIP=1) The problem seems due to the lib of Lapack. So I tried the solutions posted on the internet before. 1) The liblapack.so may be not complete...SO I tried this: # integrate lapack with atlas: cd lib/ mkdir tmp cd tmp/ ar x ../liblapack.a cp ~/path_to/lapack-3.1.1/lapack_LINUX.a ../liblapack.a ar r ../liblapack.a *.o cd ../.. make check make ptcheck cp include/* ~/include/ cp lib/*.a ~/lib/ That is, after installing atlas, there is another liblapack.a (in addition to the lapack_LINUX.a after Lapack) in its lib, but it is about 500k, so I integrate it with the lapack_LINUX.a from installing Lapack. The final liblapack.a is about 9.3m, The liblapack.so is about 5m 2) re-install Lapack and atlas many times....No use 3) I found there is a lapack.so under scipy/lib, and it is about 500K, but I think it may be not the problem, becaues the failure is "ImportError: /home/work/local/lib/liblapack.so: undefined symbol: ztbsv_". Scipy seemed to import liblapack.so in my general lib directory... 
4) One thing I am not sure is that I used gcc 4.7 and gfortran to compile lapack and atlas, but my python 2.7 was built using gcc 3.4.5.....Is this a problem? Anyone can help? _______________________________________________________________ My configuration of the installation: * ATLAS 3.8.4 * lapack 3.3.1 * numpy 1.6.1 * SciPy version 0.9.0 * dateutil 1.5 * Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 20051201 (Red Hat 3.4.5-2)] * nose version 1.1.2 * gcc (GCC) 4.7.0 20110820 (experimental) * LINUX: Linux XXX 2.6.9_5-9-0-0 #1 SMP Wed Jun 23 14:03:19 CST 2010 x86_64 x86_64 x86_64 GNU/Linux site.cfg of Scipy: [DEFAULT] library_dirs = /home/work/local/lib include_dirs = /home/work/local/include [lapack_opt] libraries = lapack, f77blas, cblas, atlas site.cfg of Numpy: [DEFAULT] library_dirs = /home/work/local/lib include_dirs = /home/work/local/include [lapack_opt] libraries = lapack, f77blas, cblas, atlas In addition, there are failures as well when test Numpy: >>> import numpy >>> numpy.test('1') Running unit tests for numpy NumPy version 1.6.1 NumPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 20051201 (Red Hat 3.4.5-2)] nose version 1.1.2 ====================================================================== FAIL: Test basic arithmetic function errors ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/testing/decorators.py", line 215, in knownfailer return f(*args, **kwargs) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/core/tests/test_numeric.py", line 367, in test_floating_exceptions_power np.power, ftype(2), ftype(2**fi.nexp)) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/core/tests/test_numeric.py", line 271, in assert_raises_fpe "Type %s did not raise fpe error '%s'." % (ftype, fpeerr)) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/testing/utils.py", line 34, in assert_ raise AssertionError(msg) AssertionError: Type did not raise fpe error 'overflow'. ====================================================================== FAIL: Test generic loops. 
---------------------------------------------------------------------- Traceback (most recent call last): File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/core/tests/test_ufunc.py", line 86, in test_generic_loops assert_almost_equal(fone(x), fone_val, err_msg=msg) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/testing/utils.py", line 448, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal to 7 decimals PyUFunc_F_F ACTUAL: array([ 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j], dtype=complex64) DESIRED: 1 ====================================================================== FAIL: test_umath.TestComplexFunctions.test_loss_of_precision(,) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/core/tests/test_umath.py", line 931, in check_loss_of_precision check(x_basic, 2*eps/1e-3) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/core/tests/test_umath.py", line 901, in check 'arcsinh') AssertionError: (0, 0.0010023052, 0.9987238, 'arcsinh') ====================================================================== FAIL: test_umath.TestComplexFunctions.test_precisions_consistent ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/core/tests/test_umath.py", line 812, in test_precisions_consistent assert_almost_equal(fcf, fcd, decimal=6, err_msg='fch-fcd %s'%f) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/testing/utils.py", line 448, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal to 6 decimals fch-fcd ACTUAL: 2.3561945j DESIRED: (0.66623943249251527+1.0612750619050355j) ====================================================================== FAIL: test_kind.TestKind.test_all ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/f2py/tests/test_kind.py", line 30, in test_all 'selectedrealkind(%s): expected %r but got %r' % (i, selected_real_kind(i), selectedrealkind(i))) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/testing/utils.py", line 34, in assert_ raise AssertionError(msg) AssertionError: selectedrealkind(19): expected -1 but got 16 ---------------------------------------------------------------------- Ran 3552 tests in 29.977s FAILED (KNOWNFAIL=3, failures=5) -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Sat Aug 27 03:53:56 2011 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 27 Aug 2011 07:53:56 +0000 (UTC) Subject: [SciPy-User] 3d convex hull References: <201108241338.50940.alexandre.fayolle@logilab.fr> <4E551F34.4@gmail.com> Message-ID: Wed, 24 Aug 2011 23:56:36 +0800, Ning Guo wrote: > I also want to try Delaunay function. But I cannot get enough info from > the documentation. 
I want to output the Delaunay tetrahedral in 3D and > need the vertex indices, facet areas and normals. How can I use the > function in scipy.spatial? Now I have all the points with id and > position. Suppose you have 20 points in 3-D: import numpy as np import scipy.spatial points = np.random.rand(20, 3) tri = scipy.spatial.Delaunay(points) The indices of the vertices of tetrahedron number `j` are in `tri.vertices[j]`. The facet areas and normals can be computed for each tetrahedron via vector cross products: tetra_points = tri.points[tri.vertices] # (N, 4, 3) array face_normals = np.empty_like(tetra_points) face_normals[:,0] = np.cross(tetra_points[:,0], tetra_points[:,1]) face_normals[:,1] = np.cross(tetra_points[:,0], tetra_points[:,2]) face_normals[:,2] = np.cross(tetra_points[:,0], tetra_points[:,3]) face_normals[:,3] = np.cross(tetra_points[:,1], tetra_points[:,2]) face_normal_lengths = np.sqrt(np.sum(face_normals**2, axis=2)) face_normals /= face_normal_lengths[:,:,np.newaxis] face_areas = 0.5 * face_normal_lengths One important point I don't know at the moment is if those normals actually point away from the center of the tetrahedra. You'd have to check the Qhull documentation to check whether they have a winding convention that guarantees certain ordering of the vertices of the simplices. (Note: the above is untested code, so check it works first :) -- Pauli Virtanen From ralf.gommers at googlemail.com Sat Aug 27 07:59:47 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sat, 27 Aug 2011 13:59:47 +0200 Subject: [SciPy-User] lmfit-py -- simple least squares minimization In-Reply-To: References: Message-ID: Hi Matt, On Mon, Aug 15, 2011 at 3:05 PM, Matt Newville wrote: > Hi, > > Having used on numpy and scipy for many years and being very pleased > with them, I've found an area which I think might benefit from a > modest improvement, and have tried to implement this. > > The scipy.optimize routines are robust, but seem a little unfriendly > to people coming from proprietary environments or Numerical > Recipes-level tools. Specifically, the Levenberg-Marquardt algorithm > is used heavily in many domains (including the x-ray spectroscopy > fields I am most familiar with), but the MINPACK and > scipy.optimize.leastsq implementation lack convenient ways to: > - turn on/off parameters for fitting, that is, to "fix" > certain parameters. > - place simple min/max bounds on parameters > - place simple mathematical constraints on parameters. > > While these limitations can be worked around, doing so requires > putting many options into the function to be minimized, which is > somewhat inconvenient. On the other hand, these features do exist > in less robust fitting code that is not based on directly on MINPACK > or as well-supported as scipy. > > I've written a module to do this so that the least-squares > minimization from scipy.optimize.leastsq can take bounded and > constrained parameters, and tried to make it of general use. This > code (BSD-licensed, somewhat documented) is at > http://github.com/newville/lmfit-py > > The constraint mechanism is a bit involved (using the ast module > instead of 'eval'), but the rest of the code is quite straightforward > and simple. Currently, this supports minimization with > scipy.optimize.leastsq, scipy.optimize.fmin_l_bfgs_b, and > scipy.optimize.anneal. Supporting other algorithms could be possible. > > If you find this interesting or useful, I'd appreciate any feedback > you might have. 
For example, this is not currently organized as a > scikit -- would that be preferable? > > This will probably be useful to me at some point. Whether or not you organize it as a scikit, it may be good to list your package at http://scipy.org/Topical_Software. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From otrov at hush.ai Fri Aug 26 13:25:10 2011 From: otrov at hush.ai (Kliment) Date: Fri, 26 Aug 2011 19:25:10 +0200 Subject: [SciPy-User] Return variable value by function value Message-ID: <20110826172510.2EC696F446@smtp.hushmail.com> Hello, this will be very simple to any of you I guess, but I don't know well numpy. I declare "x = arange(1,100)" and "y = sqrt(1 - x**2/10E+4)" How can I return x value when y = 0.95 for example? I hope I don't have to transform y equation by x as I know that and expect some more strait approach Thanks for your time From josef.pktd at gmail.com Sat Aug 27 10:35:44 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 27 Aug 2011 10:35:44 -0400 Subject: [SciPy-User] Return variable value by function value In-Reply-To: <20110826172510.2EC696F446@smtp.hushmail.com> References: <20110826172510.2EC696F446@smtp.hushmail.com> Message-ID: On Fri, Aug 26, 2011 at 1:25 PM, Kliment wrote: > Hello, > > this will be very simple to any of you I guess, but I don't know > well numpy. > > I declare "x = arange(1,100)" and "y = sqrt(1 - x**2/10E+4)" > How can I return x value when y = 0.95 for example? > > I hope I don't have to transform y equation by x as I know that and > expect some more strait approach numerically: scipy.optimize rootfinding, fsolve, brentq, ... see docs Josef > > > Thanks for your time > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From hhh.guo at gmail.com Sat Aug 27 11:12:24 2011 From: hhh.guo at gmail.com (Ning Guo) Date: Sat, 27 Aug 2011 23:12:24 +0800 Subject: [SciPy-User] 3d convex hull In-Reply-To: References: <201108241338.50940.alexandre.fayolle@logilab.fr> <4E551F34.4@gmail.com> Message-ID: <4E590958.30103@gmail.com> On Saturday, August 27, 2011 03:53 PM, Pauli Virtanen wrote: Thanks Pauli! You pointed out how to calculate the normals and areas using cross product. It's really smart and I will use this method if the Delaunay function cannot provide results directly. Also, the formula to calculate normal may be like this: face_normals[:,0] = np.cross(tetra_points[:,0]-tetra_points[:,2],tetra_points[:,1]-tetra_points[:,2]) // facet 0-1-2 face_normals[:,1] = np.cross(tetra_points[:,0]-tetra_points[:,3],tetra_points[:,2]-tetra_points[:,3]) // facet 0-2-3 face_normals[:,2] = np.cross(tetra_points[:,0]-tetra_points[:,1],tetra_points[:,3]-tetra_points[:,1]) // facet 0-3-1 face_normals[:,3] = np.cross(tetra_points[:,1]-tetra_points[:,3],tetra_points[:,2]-tetra_points[:,3]) // facet 1-2-3 Regarding to the order of the vertices, I'm also not sure about their convention. I'm trying to figure it out. Best regards! Ning > Wed, 24 Aug 2011 23:56:36 +0800, Ning Guo wrote: >> I also want to try Delaunay function. But I cannot get enough info from >> the documentation. I want to output the Delaunay tetrahedral in 3D and >> need the vertex indices, facet areas and normals. How can I use the >> function in scipy.spatial? Now I have all the points with id and >> position. 
> Suppose you have 20 points in 3-D: > > import numpy as np > import scipy.spatial > > points = np.random.rand(20, 3) > tri = scipy.spatial.Delaunay(points) > > The indices of the vertices of tetrahedron number `j` are > in `tri.vertices[j]`. The facet areas and normals can be computed > for each tetrahedron via vector cross products: > > tetra_points = tri.points[tri.vertices] # (N, 4, 3) array > > face_normals = np.empty_like(tetra_points) > face_normals[:,0] = np.cross(tetra_points[:,0], tetra_points[:,1]) > face_normals[:,1] = np.cross(tetra_points[:,0], tetra_points[:,2]) > face_normals[:,2] = np.cross(tetra_points[:,0], tetra_points[:,3]) > face_normals[:,3] = np.cross(tetra_points[:,1], tetra_points[:,2]) > > face_normal_lengths = np.sqrt(np.sum(face_normals**2, axis=2)) > > face_normals /= face_normal_lengths[:,:,np.newaxis] > face_areas = 0.5 * face_normal_lengths > > One important point I don't know at the moment is if those normals > actually point away from the center of the tetrahedra. You'd have > to check the Qhull documentation to check whether they have a winding > convention that guarantees certain ordering of the vertices of > the simplices. > > (Note: the above is untested code, so check it works first :) > -- Geotechnical Group Department of Civil and Environmental Engineering Hong Kong University of Science and Technology Clear Water Bay, Kowloon, Hong Kong From prabhu at aero.iitb.ac.in Sat Aug 27 12:48:33 2011 From: prabhu at aero.iitb.ac.in (Prabhu Ramachandran) Date: Sat, 27 Aug 2011 22:18:33 +0530 Subject: [SciPy-User] PySPH 0.9beta release Message-ID: <4E591FE1.2030304@aero.iitb.ac.in> Hi, We are pleased to announce a 0.9beta release of PySPH. This is our first public release. PySPH (http://pysph.googlecode.com) is an open source framework for Smoothed Particle Hydrodynamics (SPH) simulations. It is implemented in Python and the performance critical parts are implemented in Cython. The framework provides for load balanced, parallel execution of solvers. It is designed to be easy to extend. Check our homepage for more details. Quick Installation ------------------- The major prerequisite is NumPy (http://numpy.scipy.org) and a C++ compiler. To use the built-in viewer you will need to have Mayavi installed. If you need parallel support you must have mpi4py installed but this is optional. To install a released version do: $ easy_install pysph More information ----------------- Project home: http://pysph.googlecode.com Documentation: http://packages.python.org/PySPH PyPI: http://pypi.python.org/pypi/PySPH Cheers, PySPH developers From cjordan1 at uw.edu Sat Aug 27 14:19:24 2011 From: cjordan1 at uw.edu (Christopher Jordan-Squire) Date: Sat, 27 Aug 2011 14:19:24 -0400 Subject: [SciPy-User] R vs Python for simple interactive data analysis Message-ID: Hi--I've been a moderately heavy R user for the past two years, so about a month ago I took an (abbreviated) version of a simple data analysis I did in R and tried to rewrite as much of it as possible, line by line, into python using numpy and statsmodels. I didn't use pandas, and I can't comment on how much it might have simplified things. This comparison might be useful to some people, so I stuck it up on a github repo. My overall impression is that R is much stronger for interactive data analysis. Click on the link for more details why, which are summarized in the README file. 
https://github.com/chrisjordansquire/r_vs_py The code examples should run out of the box with no downloads (other than R, Python, numpy, scipy, and statsmodels) required. -Chris Jordan-Squire From matthew.brett at gmail.com Sat Aug 27 14:27:39 2011 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 27 Aug 2011 11:27:39 -0700 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: Message-ID: Hi, On Sat, Aug 27, 2011 at 11:19 AM, Christopher Jordan-Squire wrote: > Hi--I've been a moderately heavy R user for the past two years, so > about a month ago I took an (abbreviated) version of a simple data > analysis I did in R and tried to rewrite as much of it as possible, > line by line, into python using numpy and statsmodels. I didn't use > pandas, and I can't comment on how much it might have simplified > things. > > This comparison might be useful to some people, so I stuck it up on a > github repo. My overall impression is that R is much stronger for > interactive data analysis. Click on the link for more details why, > which are summarized in the README file. > > https://github.com/chrisjordansquire/r_vs_py > > The code examples should run out of the box with no downloads (other > than R, Python, numpy, scipy, and statsmodels) required. Thank you very much for doing that - it's a very useful exercise. I hope we can make use of it to discuss how to get better, in the true spirit of: Confront the Brutal Facts http://en.wikipedia.org/wiki/Good_to_Great See you, Matthew From cjordan1 at uw.edu Sat Aug 27 14:44:12 2011 From: cjordan1 at uw.edu (Christopher Jordan-Squire) Date: Sat, 27 Aug 2011 14:44:12 -0400 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: Message-ID: On Sat, Aug 27, 2011 at 2:27 PM, Matthew Brett wrote: > Hi, > > On Sat, Aug 27, 2011 at 11:19 AM, Christopher Jordan-Squire > wrote: >> Hi--I've been a moderately heavy R user for the past two years, so >> about a month ago I took an (abbreviated) version of a simple data >> analysis I did in R and tried to rewrite as much of it as possible, >> line by line, into python using numpy and statsmodels. I didn't use >> pandas, and I can't comment on how much it might have simplified >> things. >> >> This comparison might be useful to some people, so I stuck it up on a >> github repo. My overall impression is that R is much stronger for >> interactive data analysis. Click on the link for more details why, >> which are summarized in the README file. >> >> https://github.com/chrisjordansquire/r_vs_py >> >> The code examples should run out of the box with no downloads (other >> than R, Python, numpy, scipy, and statsmodels) required. > > Thank you very much for doing that - it's a very useful exercise. ?I > hope we can make use of it to discuss how to get better, in the true Hopefully. I suppose I should also mention, for those that don't want to click on the link, that the two largest reasons R was much simpler to use were because it was easier to construct models and easier to view entries I'd stuck into matrices. R's graphing capabilities seemed slightly more friendly, but that might have just been my familiarity with them. (As an aside, numpy arrays' print method don't make them friendly for interactive viewing. Even ipython couldn't make a few of the matrices I made very intelligible, and it's easy to construct examples that make numpy arrays hideous to behold. 
For example, x = np.arange(5).reshape(5,1) y = np.ones(5).reshape(1,5) z = x*y z[0,0] += 0.0001 print z [[ 1.00000000e-04 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00] [ 1.00000000e+00 1.00000000e+00 1.00000000e+00 1.00000000e+00 1.00000000e+00] [ 2.00000000e+00 2.00000000e+00 2.00000000e+00 2.00000000e+00 2.00000000e+00] [ 3.00000000e+00 3.00000000e+00 3.00000000e+00 3.00000000e+00 3.00000000e+00] [ 4.00000000e+00 4.00000000e+00 4.00000000e+00 4.00000000e+00 4.00000000e+00]] (Strangely, it looks much more tolerable if x = np.arange(1,6).reshape(5,1) instead.) If you do the same thing in R, x = rep(0:4,5) x = matrix(x,ncol=5) x[1,1] = 0.000001 x you get [,1] [,2] [,3] [,4] [,5] [1,] 1e-06 0 0 0 0 [2,] 1e+00 1 1 1 1 [3,] 2e+00 2 2 2 2 [4,] 3e+00 3 3 3 3 [5,] 4e+00 4 4 4 4 much more readable.) As a simple metric, my .r file was about 1/2 the size of the .py file, even though I couldn't do everything in python that I could in R. (These commands were meant to be entered interactively, so the length of the length of the file is, perhaps, a more valid metric then usual to be concerned about.) -Chris Jordan-Squire > spirit of: > > Confront the Brutal Facts > http://en.wikipedia.org/wiki/Good_to_Great > > See you, > > Matthew > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From dbigbear at gmail.com Sat Aug 27 15:16:16 2011 From: dbigbear at gmail.com (Xiong Deng) Date: Sun, 28 Aug 2011 03:16:16 +0800 Subject: [SciPy-User] How can I solve a equation like solve a function containint expressions like sqrt(log(x) - 1) = 2 and exp((log(x) - 1.5)**2 - 3) = 5 Message-ID: HI, Hi, I am trying to solve an equation containing both exp, log, erfc, and they may be embedded into each other....But sympy cannot handle this, as shown below: >>> from sympy import solve, exp, log, pi >>>from sympy.mpmath import * >>>from sympy import Symbol >>>x=Symbol('x') >>>sigma = 4 >>>mu = 1.5 >>>solve(x * ((1.0 / sqrt(2 * pi) * x * sigma) * exp(-0.5 * (log(x) - mu)**2 / sigma**2)) + 0.5 * erfc((mu - log(x)) / (sigma * sqrt(2))) - 1, x) Traceback (most recent call last): File "", line 1, in File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/sympy/mpmath/functions/functions.py", line 287, in log return ctx.ln(x) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/sympy/mpmath/ctx_mp_python.py", line 984, in f x = ctx.convert(x) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/sympy/mpmath/ctx_mp_python.py", line 662, in convert return ctx._convert_fallback(x, strings) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/sympy/mpmath/ctx_mp.py", line 556, in _convert_fallback raise TypeError("cannot create mpf from " + repr(x)) TypeError: cannot create mpf from x But sqrt, log, exp, itself is OK, as shown as below: >>> solve((1.0 / sqrt(2 * pi) * x * sigma) - 1, x) [0.626657068657750] SO, How can I solve an equation containint expressions like sqrt(log(x) - 1)=0 or exp((log(x) - mu)**2 - 3) = 0??? Thanks -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From josef.pktd at gmail.com Sat Aug 27 15:55:29 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 27 Aug 2011 15:55:29 -0400 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: Message-ID: On Sat, Aug 27, 2011 at 2:44 PM, Christopher Jordan-Squire wrote: > On Sat, Aug 27, 2011 at 2:27 PM, Matthew Brett wrote: >> Hi, >> >> On Sat, Aug 27, 2011 at 11:19 AM, Christopher Jordan-Squire >> wrote: >>> Hi--I've been a moderately heavy R user for the past two years, so >>> about a month ago I took an (abbreviated) version of a simple data >>> analysis I did in R and tried to rewrite as much of it as possible, >>> line by line, into python using numpy and statsmodels. I didn't use >>> pandas, and I can't comment on how much it might have simplified >>> things. >>> >>> This comparison might be useful to some people, so I stuck it up on a >>> github repo. My overall impression is that R is much stronger for >>> interactive data analysis. Click on the link for more details why, >>> which are summarized in the README file. >>> >>> https://github.com/chrisjordansquire/r_vs_py >>> >>> The code examples should run out of the box with no downloads (other >>> than R, Python, numpy, scipy, and statsmodels) required. >> >> Thank you very much for doing that - it's a very useful exercise. ?I >> hope we can make use of it to discuss how to get better, in the true > > Hopefully. I suppose I should also mention, for those that don't want > to click on the link, that the two largest reasons R was much simpler > to use were because it was easier to construct models and easier to > view entries I'd stuck into matrices. R's graphing capabilities seemed > slightly more friendly, but that might have just been my familiarity > with them. > > (As an aside, numpy arrays' print method don't make them friendly for > interactive viewing. Even ipython couldn't make a few of the matrices > I made very intelligible, and it's easy to construct examples that > make numpy arrays hideous to behold. For example, for interactive viewing spyder has an array viewer (variable explorer) similar to matlab > > x = np.arange(5).reshape(5,1) > y = np.ones(5).reshape(1,5) > z = x*y > z[0,0] += 0.0001 > print z > > [[ ?1.00000000e-04 ? 0.00000000e+00 ? 0.00000000e+00 ? 0.00000000e+00 > ? ?0.00000000e+00] > ?[ ?1.00000000e+00 ? 1.00000000e+00 ? 1.00000000e+00 ? 1.00000000e+00 > ? ?1.00000000e+00] > ?[ ?2.00000000e+00 ? 2.00000000e+00 ? 2.00000000e+00 ? 2.00000000e+00 > ? ?2.00000000e+00] > ?[ ?3.00000000e+00 ? 3.00000000e+00 ? 3.00000000e+00 ? 3.00000000e+00 > ? ?3.00000000e+00] > ?[ ?4.00000000e+00 ? 4.00000000e+00 ? 4.00000000e+00 ? 4.00000000e+00 > ? ?4.00000000e+00]] >>> from scikits.statsmodels.iolib import SimpleTable >>> print SimpleTable(z) ====================== 0.0001 0.0 0.0 0.0 0.0 1.0 1.0 1.0 1.0 1.0 2.0 2.0 2.0 2.0 2.0 3.0 3.0 3.0 3.0 3.0 4.0 4.0 4.0 4.0 4.0 ---------------------- >>> z[0,0] = 1e-6 >>> print SimpleTable(z) ===================== 1e-06 0.0 0.0 0.0 0.0 1.0 1.0 1.0 1.0 1.0 2.0 2.0 2.0 2.0 2.0 3.0 3.0 3.0 3.0 3.0 4.0 4.0 4.0 4.0 4.0 --------------------- > > (Strangely, it looks much more tolerable if x ?= > np.arange(1,6).reshape(5,1) instead.) > > If you do the same thing in R, > > x = rep(0:4,5) > x = matrix(x,ncol=5) > x[1,1] = 0.000001 > x > > you get > > ? ? ?[,1] [,2] [,3] [,4] [,5] > [1,] 1e-06 ? ?0 ? ?0 ? ?0 ? ?0 > [2,] 1e+00 ? ?1 ? ?1 ? ?1 ? ?1 > [3,] 2e+00 ? ?2 ? ?2 ? ?2 ? ?2 > [4,] 3e+00 ? ?3 ? ?3 ? ?3 ? ?3 > [5,] 4e+00 ? ?4 ? ?4 ? ?4 ? 
?4 > > much more readable.) > > > As a simple metric, my .r file was about 1/2 the size of the .py file, > even though I couldn't do everything in python that I could in R. > (These commands were meant to be entered interactively, so the length > of the length of the file is, perhaps, a more valid metric then usual > to be concerned about.) predefining your categorical variables would save quite a few lines. Josef > > -Chris Jordan-Squire > > >> spirit of: >> >> Confront the Brutal Facts >> http://en.wikipedia.org/wiki/Good_to_Great >> >> See you, >> >> Matthew >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From jason-sage at creativetrax.com Sat Aug 27 17:03:43 2011 From: jason-sage at creativetrax.com (Jason Grout) Date: Sat, 27 Aug 2011 16:03:43 -0500 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: Message-ID: <4E595BAF.1080509@creativetrax.com> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: > This comparison might be useful to some people, so I stuck it up on a > github repo. My overall impression is that R is much stronger for > interactive data analysis. Click on the link for more details why, > which are summarized in the README file. From the README: "In fact, using Python without the IPython qtconsole is practically impossible for this sort of cut and paste, interactive analysis. The shell IPython doesn't allow it because it automatically adds whitespace on multiline bits of code, breaking pre-formatted code's alignment. Cutting and pasting works for the standard python shell, but then you lose all the advantages of IPython." You might use %cpaste in the ipython normal shell to paste without it automatically inserting spaces: In [5]: %cpaste Pasting code; enter '--' alone on the line to stop. :if 1>0: : print 'hi' :-- hi Thanks, Jason From robert.kern at gmail.com Sat Aug 27 18:02:28 2011 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 27 Aug 2011 17:02:28 -0500 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: <4E595BAF.1080509@creativetrax.com> References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Sat, Aug 27, 2011 at 16:03, Jason Grout wrote: > On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >> This comparison might be useful to some people, so I stuck it up on a >> github repo. My overall impression is that R is much stronger for >> interactive data analysis. Click on the link for more details why, >> which are summarized in the README file. > > ?From the README: > > "In fact, using Python without the IPython qtconsole is practically > impossible for this sort of cut and paste, interactive analysis. > The shell IPython doesn't allow it because it automatically adds > whitespace on multiline bits of code, breaking pre-formatted code's > alignment. Cutting and pasting works for the standard python shell, > but then you lose all the advantages of IPython." > > > > You might use %cpaste in the ipython normal shell to paste without it > automatically inserting spaces: > > In [5]: %cpaste > Pasting code; enter '--' alone on the line to stop. > :if 1>0: > : ? ?print 'hi' > :-- > hi Or even just %paste! |1> %paste? 
Type: Magic function Base Class: String Form:> Namespace: IPython internal File: /Users/rkern/git/ipython/IPython/frontend/terminal/interactiveshell.py Definition: %paste(self, parameter_s='') Docstring: Paste & execute a pre-formatted code block from clipboard. The text is pulled directly from the clipboard without user intervention and printed back on the screen before execution (unless the -q flag is given to force quiet mode). The block is dedented prior to execution to enable execution of method definitions. '>' and '+' characters at the beginning of a line are ignored, to allow pasting directly from e-mails, diff files and doctests (the '...' continuation prompt is also stripped). The executed block is also assigned to variable named 'pasted_block' for later editing with '%edit pasted_block'. You can also pass a variable name as an argument, e.g. '%paste foo'. This assigns the pasted block to variable 'foo' as string, without dedenting or executing it (preceding >>> and + is still stripped) Options ------- -r: re-executes the block previously entered by cpaste. -q: quiet mode: do not echo the pasted text back to the terminal. IPython statements (magics, shell escapes) are not supported (yet). See also -------- cpaste: manually paste code into terminal until you mark its end. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From wesmckinn at gmail.com Sat Aug 27 18:06:47 2011 From: wesmckinn at gmail.com (Wes McKinney) Date: Sat, 27 Aug 2011 18:06:47 -0400 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: <4E595BAF.1080509@creativetrax.com> References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout wrote: > On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >> This comparison might be useful to some people, so I stuck it up on a >> github repo. My overall impression is that R is much stronger for >> interactive data analysis. Click on the link for more details why, >> which are summarized in the README file. > > ?From the README: > > "In fact, using Python without the IPython qtconsole is practically > impossible for this sort of cut and paste, interactive analysis. > The shell IPython doesn't allow it because it automatically adds > whitespace on multiline bits of code, breaking pre-formatted code's > alignment. Cutting and pasting works for the standard python shell, > but then you lose all the advantages of IPython." > > > > You might use %cpaste in the ipython normal shell to paste without it > automatically inserting spaces: > > In [5]: %cpaste > Pasting code; enter '--' alone on the line to stop. > :if 1>0: > : ? ?print 'hi' > :-- > hi > > Thanks, > > Jason > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > This strikes me as a textbook example of why we need an integrated formula framework in statsmodels. I'll make a pass through when I get a chance and see if there are some places where pandas would really help out. For example, the weighted average by sex and occupation is what groupby is all about: hrdf = DataFrame(hrdat) # note DataFrame allows you to change the dtype of a column! 
hrdf['sex'] = np.where(hrdf['sex'] == 1, 'male', 'female') def compute_stats(group): sum_weight = group['A_ERNLWT'].sum() wave_hrwage = (group['hrwage'] * group['A_ERNLWT']).sum() / sum_weight return Series({'sum_weight' : sum_weight, 'wave_hrwage' : wave_hrwage}) wocc = hrdf.groupby(['sex', 'occ']).apply(compute_stats) In [39]: wocc Out[39]: sum_weight wave_hrwage female 1 7.669e+05 23.41 2 1.541e+06 24.39 3 1.082e+06 10.02 4 6.996e+05 13.49 5 1.325e+06 16.28 8 5.796e+04 20.44 9 1.277e+05 12.27 10 1.12e+05 12.44 male 1 7.325e+05 34.96 2 1.198e+06 29.06 3 8.283e+05 13.45 4 5.013e+05 20.48 5 4.367e+05 14.96 7 6.484e+05 17.78 8 4.424e+05 20.39 9 6.064e+05 17.64 10 5.256e+05 17.76 (Of course I'm showing up some swank new pandas 0.4 stuff, i.e. hierarchical indexing and multi-key groupby) From pav at iki.fi Sat Aug 27 18:08:18 2011 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 27 Aug 2011 22:08:18 +0000 (UTC) Subject: [SciPy-User] 3d convex hull References: <201108241338.50940.alexandre.fayolle@logilab.fr> <4E551F34.4@gmail.com> <4E590958.30103@gmail.com> Message-ID: Sat, 27 Aug 2011 23:12:24 +0800, Ning Guo wrote: > On Saturday, August 27, 2011 03:53 PM, Pauli Virtanen wrote: [clip] > Also, the formula to calculate normal may be like this: > > face_normals[:,0] = > np.cross(tetra_points[:,0]-tetra_points[:,2],tetra_points[:,1]-tetra_points[:,2]) [clip] Ah yes, exactly like that, my brain apparently wasn't working properly. > Regarding to the order of the vertices, I'm also not sure about their > convention. I'm trying to figure it out. If you find it out, please let us know, as this would be an useful thing to mention in the documentation. However, I'm not sure at the moment whether Qhull provides such ordering guarantees. -- Pauli Virtanen From matthew.brett at gmail.com Sat Aug 27 18:56:29 2011 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 27 Aug 2011 15:56:29 -0700 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: Hi, On Sat, Aug 27, 2011 at 3:06 PM, Wes McKinney wrote: > On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout > wrote: >> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>> This comparison might be useful to some people, so I stuck it up on a >>> github repo. My overall impression is that R is much stronger for >>> interactive data analysis. Click on the link for more details why, >>> which are summarized in the README file. >> >> ?From the README: >> >> "In fact, using Python without the IPython qtconsole is practically >> impossible for this sort of cut and paste, interactive analysis. >> The shell IPython doesn't allow it because it automatically adds >> whitespace on multiline bits of code, breaking pre-formatted code's >> alignment. Cutting and pasting works for the standard python shell, >> but then you lose all the advantages of IPython." >> >> >> >> You might use %cpaste in the ipython normal shell to paste without it >> automatically inserting spaces: >> >> In [5]: %cpaste >> Pasting code; enter '--' alone on the line to stop. >> :if 1>0: >> : ? ?print 'hi' >> :-- >> hi >> >> Thanks, >> >> Jason >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > This strikes me as a textbook example of why we need an integrated > formula framework in statsmodels. 
Yes, at a superficial glance Chris' document sounded like an ideal use-case tester for the battle of the formulas, or, in a less martial mode, for defining what we want formulas to do. I got sidetracked on my document on that - and am still sidetracked, but should get to it soon, by which I mean, in the next month, unless someone prompts me earlier... See you, Matthew From cjordan1 at uw.edu Sat Aug 27 19:30:46 2011 From: cjordan1 at uw.edu (Christopher Jordan-Squire) Date: Sat, 27 Aug 2011 19:30:46 -0400 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Sat, Aug 27, 2011 at 6:06 PM, Wes McKinney wrote: > On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout > wrote: >> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>> This comparison might be useful to some people, so I stuck it up on a >>> github repo. My overall impression is that R is much stronger for >>> interactive data analysis. Click on the link for more details why, >>> which are summarized in the README file. >> >> ?From the README: >> >> "In fact, using Python without the IPython qtconsole is practically >> impossible for this sort of cut and paste, interactive analysis. >> The shell IPython doesn't allow it because it automatically adds >> whitespace on multiline bits of code, breaking pre-formatted code's >> alignment. Cutting and pasting works for the standard python shell, >> but then you lose all the advantages of IPython." >> >> >> >> You might use %cpaste in the ipython normal shell to paste without it >> automatically inserting spaces: >> >> In [5]: %cpaste >> Pasting code; enter '--' alone on the line to stop. >> :if 1>0: >> : ? ?print 'hi' >> :-- >> hi >> >> Thanks, >> >> Jason >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > This strikes me as a textbook example of why we need an integrated > formula framework in statsmodels. I'll make a pass through when I get > a chance and see if there are some places where pandas would really > help out. For example, the weighted average by sex and occupation is > what groupby is all about: > > hrdf = DataFrame(hrdat) > > # note DataFrame allows you to change the dtype of a column! > hrdf['sex'] = np.where(hrdf['sex'] == 1, 'male', 'female') > > def compute_stats(group): > ?sum_weight = group['A_ERNLWT'].sum() > ?wave_hrwage = (group['hrwage'] * group['A_ERNLWT']).sum() / sum_weight > ?return Series({'sum_weight' : sum_weight, > ? ? ? ? ? ? ? ? 'wave_hrwage' : wave_hrwage}) > > wocc = hrdf.groupby(['sex', 'occ']).apply(compute_stats) > > In [39]: wocc > Out[39]: > ? ? ? ? ? ?sum_weight ?wave_hrwage > female ?1 ? 7.669e+05 ? 23.41 > ? ? ? ?2 ? 1.541e+06 ? 24.39 > ? ? ? ?3 ? 1.082e+06 ? 10.02 > ? ? ? ?4 ? 6.996e+05 ? 13.49 > ? ? ? ?5 ? 1.325e+06 ? 16.28 > ? ? ? ?8 ? 5.796e+04 ? 20.44 > ? ? ? ?9 ? 1.277e+05 ? 12.27 > ? ? ? ?10 ?1.12e+05 ? ?12.44 > male ? ?1 ? 7.325e+05 ? 34.96 > ? ? ? ?2 ? 1.198e+06 ? 29.06 > ? ? ? ?3 ? 8.283e+05 ? 13.45 > ? ? ? ?4 ? 5.013e+05 ? 20.48 > ? ? ? ?5 ? 4.367e+05 ? 14.96 > ? ? ? ?7 ? 6.484e+05 ? 17.78 > ? ? ? ?8 ? 4.424e+05 ? 20.39 > ? ? ? ?9 ? 6.064e+05 ? 17.64 > ? ? ? ?10 ?5.256e+05 ? 17.76 > > (Of course I'm showing up some swank new pandas 0.4 stuff, i.e. > hierarchical indexing and multi-key groupby) > Nifty! I will have to look at these parts of pandas closer. 
-Chris JS _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From wesmckinn at gmail.com Sat Aug 27 19:32:22 2011 From: wesmckinn at gmail.com (Wes McKinney) Date: Sat, 27 Aug 2011 19:32:22 -0400 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Sat, Aug 27, 2011 at 7:30 PM, Christopher Jordan-Squire wrote: > On Sat, Aug 27, 2011 at 6:06 PM, Wes McKinney wrote: >> On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout >> wrote: >>> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>>> This comparison might be useful to some people, so I stuck it up on a >>>> github repo. My overall impression is that R is much stronger for >>>> interactive data analysis. Click on the link for more details why, >>>> which are summarized in the README file. >>> >>> ?From the README: >>> >>> "In fact, using Python without the IPython qtconsole is practically >>> impossible for this sort of cut and paste, interactive analysis. >>> The shell IPython doesn't allow it because it automatically adds >>> whitespace on multiline bits of code, breaking pre-formatted code's >>> alignment. Cutting and pasting works for the standard python shell, >>> but then you lose all the advantages of IPython." >>> >>> >>> >>> You might use %cpaste in the ipython normal shell to paste without it >>> automatically inserting spaces: >>> >>> In [5]: %cpaste >>> Pasting code; enter '--' alone on the line to stop. >>> :if 1>0: >>> : ? ?print 'hi' >>> :-- >>> hi >>> >>> Thanks, >>> >>> Jason >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> >> This strikes me as a textbook example of why we need an integrated >> formula framework in statsmodels. I'll make a pass through when I get >> a chance and see if there are some places where pandas would really >> help out. For example, the weighted average by sex and occupation is >> what groupby is all about: >> >> hrdf = DataFrame(hrdat) >> >> # note DataFrame allows you to change the dtype of a column! >> hrdf['sex'] = np.where(hrdf['sex'] == 1, 'male', 'female') >> >> def compute_stats(group): >> ?sum_weight = group['A_ERNLWT'].sum() >> ?wave_hrwage = (group['hrwage'] * group['A_ERNLWT']).sum() / sum_weight >> ?return Series({'sum_weight' : sum_weight, >> ? ? ? ? ? ? ? ? 'wave_hrwage' : wave_hrwage}) >> >> wocc = hrdf.groupby(['sex', 'occ']).apply(compute_stats) >> >> In [39]: wocc >> Out[39]: >> ? ? ? ? ? ?sum_weight ?wave_hrwage >> female ?1 ? 7.669e+05 ? 23.41 >> ? ? ? ?2 ? 1.541e+06 ? 24.39 >> ? ? ? ?3 ? 1.082e+06 ? 10.02 >> ? ? ? ?4 ? 6.996e+05 ? 13.49 >> ? ? ? ?5 ? 1.325e+06 ? 16.28 >> ? ? ? ?8 ? 5.796e+04 ? 20.44 >> ? ? ? ?9 ? 1.277e+05 ? 12.27 >> ? ? ? ?10 ?1.12e+05 ? ?12.44 >> male ? ?1 ? 7.325e+05 ? 34.96 >> ? ? ? ?2 ? 1.198e+06 ? 29.06 >> ? ? ? ?3 ? 8.283e+05 ? 13.45 >> ? ? ? ?4 ? 5.013e+05 ? 20.48 >> ? ? ? ?5 ? 4.367e+05 ? 14.96 >> ? ? ? ?7 ? 6.484e+05 ? 17.78 >> ? ? ? ?8 ? 4.424e+05 ? 20.39 >> ? ? ? ?9 ? 6.064e+05 ? 17.64 >> ? ? ? ?10 ?5.256e+05 ? 17.76 >> >> (Of course I'm showing up some swank new pandas 0.4 stuff, i.e. >> hierarchical indexing and multi-key groupby) >> > > Nifty! I will have to look at these parts of pandas closer. 
> > -Chris JS > > > > ?_______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > I am working hard on documentation for all the new stuff, but I am but one person :) I hope to have the docs (pandas.sourceforge.net , under heavy construction at the moment) in more complete shape within a week. - Wes From josef.pktd at gmail.com Sat Aug 27 20:00:29 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 27 Aug 2011 20:00:29 -0400 Subject: [SciPy-User] multivariate empirical distribution function, avoid double loop ? In-Reply-To: References: Message-ID: On Wed, Aug 24, 2011 at 9:23 PM, wrote: > On Wed, Aug 24, 2011 at 7:25 PM, Robert Kern wrote: >> On Wed, Aug 24, 2011 at 09:23, ? wrote: >>> Does anyone know whether there is an algorithm that avoids the double >>> loop to get a multivariate empirical distribution function? >>> >>> for point in data: >>> ? ? count how many points in data are smaller or equal to point >>> >>> with 1d data it's just argsort(argsort(data)) >>> >>> double loop version with some test cases is attached. >>> >>> I didn't see a way that sorting would help. >> >> If you can bear to make a few (nobs, nobs) bool arrays, you can do >> just a kvars-sized loop in Python: >> >> dominates = np.ones((len(data), len(data)), dtype=bool) >> for x in data.T: >> ? ?dominates &= x[:,np.newaxis] > x >> sorta_ranks = dominates.sum(axis=1) > > Thanks, quite a bit better, 14 times faster for (5000,2) and still 2.5 > times faster for (5000,20), > 12 times for (10000,3) compared to my original. attached a first draft of what I'm after Josef > > Josef > >> >> -- >> Robert Kern >> >> "I have come to believe that the whole world is an enigma, a harmless >> enigma that is made terrible by our own mad attempt to interpret it as >> though it had an underlying truth." >> ? -- Umberto Eco >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > -------------- next part -------------- A non-text attachment was scrubbed... Name: mvecdf.py Type: text/x-python Size: 5168 bytes Desc: not available URL: From bsouthey at gmail.com Sat Aug 27 22:15:01 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Sat, 27 Aug 2011 21:15:01 -0500 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Sat, Aug 27, 2011 at 5:06 PM, Wes McKinney wrote: > On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout > wrote: >> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>> This comparison might be useful to some people, so I stuck it up on a >>> github repo. My overall impression is that R is much stronger for >>> interactive data analysis. Click on the link for more details why, >>> which are summarized in the README file. >> >> ?From the README: >> >> "In fact, using Python without the IPython qtconsole is practically >> impossible for this sort of cut and paste, interactive analysis. >> The shell IPython doesn't allow it because it automatically adds >> whitespace on multiline bits of code, breaking pre-formatted code's >> alignment. Cutting and pasting works for the standard python shell, >> but then you lose all the advantages of IPython." 
>> >> >> >> You might use %cpaste in the ipython normal shell to paste without it >> automatically inserting spaces: >> >> In [5]: %cpaste >> Pasting code; enter '--' alone on the line to stop. >> :if 1>0: >> : ? ?print 'hi' >> :-- >> hi >> >> Thanks, >> >> Jason >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > This strikes me as a textbook example of why we need an integrated > formula framework in statsmodels. I'll make a pass through when I get > a chance and see if there are some places where pandas would really > help out. We used to have a formula class is scipy.stats and I do not follow nipy (http://nipy.sourceforge.net/nipy/stable/index.html) as it also had this (extremely flexible but very hard to comprehend). It was what I had argued was needed ages ago for statsmodel. But it needs a community effort because the syntax required serves multiple communities with different annotations and needs. That is also seen from the different approaches taken by the stats packages from S/R, SAS, Genstat (and those are just are ones I have used). Bruce From dbigbear at gmail.com Sun Aug 28 02:16:06 2011 From: dbigbear at gmail.com (Johnny) Date: Sat, 27 Aug 2011 23:16:06 -0700 (PDT) Subject: [SciPy-User] Install Scipy Errors: ImportError: /path_to/liblapack.so: undefined symbol: ztbsv_ Message-ID: <7b0b4836-fb15-4489-860a-c1684529f30c@p37g2000prp.googlegroups.com> Hi all, I am installing lapack, atlas, numpy, scipy on my LINUX for TEN times, but always encountering the problem: [work XXX]$ python -c 'import scipy.optimize; scipy.optimize.test()' Traceback (most recent call last): File "", line 1, in File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ scipy/optimize/__init__.py", line 11, in from lbfgsb import fmin_l_bfgs_b File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ scipy/optimize/lbfgsb.py", line 28, in import _lbfgsb ImportError: /home/work/local/lib/liblapack.so: undefined symbol: ztbsv_ I can pass some other tests like: [work XXX:~/local]$ python -c 'import scipy.ndimage; scipy.ndimage.test()' Running unit tests for scipy.ndimage NumPy version 1.6.1 NumPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site- packages/numpy SciPy version 0.9.0 SciPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site- packages/scipy Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 20051201 (Red Hat 3.4.5-2)] nose version 1.1.2 .........S................................................................................................................................................................................................................................................................................................................................................................................................................. ---------------------------------------------------------------------- Ran 411 tests in 1.247s OK (SKIP=1) The problem seems due to the lib of Lapack. So I tried the solutions posted on the internet before. 1) The liblapack.so may be not complete...SO I tried this: # integrate lapack with atlas: cd lib/ mkdir tmp cd tmp/ ar x ../liblapack.a cp ~/path_to/lapack-3.1.1/lapack_LINUX.a ../liblapack.a ar r ../liblapack.a *.o cd ../.. 
make check make ptcheck cp include/* ~/include/ cp lib/*.a ~/lib/ That is, after installing atlas, there is another liblapack.a (in addition to the lapack_LINUX.a after Lapack) in its lib, but it is about 500k, so I integrate it with the lapack_LINUX.a from installing Lapack. The final liblapack.a is about 9.3m, The liblapack.so is about 5m 2) re-install Lapack and atlas many times....No use 3) I found there is a lapack.so under scipy/lib, and it is about 500K, but I think it may be not the problem, becaues the failure is "ImportError: /home/work/local/lib/liblapack.so: undefined symbol: ztbsv_". Scipy seemed to import liblapack.so in my general lib directory... 4) One thing I am not sure is that I used gcc 4.7 and gfortran to compile lapack and atlas, but my python 2.7 was built using gcc 3.4.5.....Is this a problem? Anyone can help? _______________________________________________________________ My configuration of the installation: * ATLAS 3.8.4 * lapack 3.3.1 * numpy 1.6.1 * SciPy version 0.9.0 * dateutil 1.5 * Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 20051201 (Red Hat 3.4.5-2)] * nose version 1.1.2 * gcc (GCC) 4.7.0 20110820 (experimental) * LINUX: Linux XXX 2.6.9_5-9-0-0 #1 SMP Wed Jun 23 14:03:19 CST 2010 x86_64 x86_64 x86_64 GNU/Linux site.cfg of Scipy: [DEFAULT] library_dirs = /home/work/local/lib include_dirs = /home/work/local/include [lapack_opt] libraries = lapack, f77blas, cblas, atlas site.cfg of Numpy: [DEFAULT] library_dirs = /home/work/local/lib include_dirs = /home/work/local/include [lapack_opt] libraries = lapack, f77blas, cblas, atlas In addition, there are failures as well when test Numpy: >>> import numpy >>> numpy.test('1') Running unit tests for numpy NumPy version 1.6.1 NumPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site- packages/numpy Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 20051201 (Red Hat 3.4.5-2)] nose version 1.1.2 ====================================================================== FAIL: Test basic arithmetic function errors ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ numpy/testing/decorators.py", line 215, in knownfailer return f(*args, **kwargs) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ numpy/core/tests/test_numeric.py", line 367, in test_floating_exceptions_power np.power, ftype(2), ftype(2**fi.nexp)) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ numpy/core/tests/test_numeric.py", line 271, in assert_raises_fpe "Type %s did not raise fpe error '%s'." % (ftype, fpeerr)) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ numpy/testing/utils.py", line 34, in assert_ raise AssertionError(msg) AssertionError: Type did not raise fpe error 'overflow'. ====================================================================== FAIL: Test generic loops. 
---------------------------------------------------------------------- Traceback (most recent call last): File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ numpy/core/tests/test_ufunc.py", line 86, in test_generic_loops assert_almost_equal(fone(x), fone_val, err_msg=msg) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ numpy/testing/utils.py", line 448, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal to 7 decimals PyUFunc_F_F ACTUAL: array([ 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j], dtype=complex64) DESIRED: 1 ====================================================================== FAIL: test_umath.TestComplexFunctions.test_loss_of_precision(,) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/nose/ case.py", line 197, in runTest self.test(*self.arg) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ numpy/core/tests/test_umath.py", line 931, in check_loss_of_precision check(x_basic, 2*eps/1e-3) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ numpy/core/tests/test_umath.py", line 901, in check 'arcsinh') AssertionError: (0, 0.0010023052, 0.9987238, 'arcsinh') ====================================================================== FAIL: test_umath.TestComplexFunctions.test_precisions_consistent ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/nose/ case.py", line 197, in runTest self.test(*self.arg) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ numpy/core/tests/test_umath.py", line 812, in test_precisions_consistent assert_almost_equal(fcf, fcd, decimal=6, err_msg='fch-fcd %s'%f) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ numpy/testing/utils.py", line 448, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal to 6 decimals fch-fcd ACTUAL: 2.3561945j DESIRED: (0.66623943249251527+1.0612750619050355j) ====================================================================== FAIL: test_kind.TestKind.test_all ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/nose/ case.py", line 197, in runTest self.test(*self.arg) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ numpy/f2py/tests/test_kind.py", line 30, in test_all 'selectedrealkind(%s): expected %r but got %r' % (i, selected_real_kind(i), selectedrealkind(i))) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ numpy/testing/utils.py", line 34, in assert_ raise AssertionError(msg) AssertionError: selectedrealkind(19): expected -1 but got 16 ---------------------------------------------------------------------- Ran 3552 tests in 29.977s FAILED (KNOWNFAIL=3, failures=5) From dbigbear at gmail.com Sun Aug 28 02:18:54 2011 From: dbigbear at gmail.com (Johnny) Date: Sat, 27 Aug 2011 23:18:54 -0700 (PDT) Subject: [SciPy-User] How can I solve a equation like sqrt(log(x)*erfc(exp(log(x)+1)) - 1) = 2 and exp((log(x) - 1.5)**2 - 3) = 5 Message-ID: <57952d36-c803-451c-bdeb-b2f1299290e7@y39g2000prd.googlegroups.com> Hi, I am trying to solve the follow equation: solve(x * ((1.0 / sqrt(2 * pi) * x * sigma) * exp(-0.5 * (log(x) - mu)**2 / sigma**2)) + 0.5 * erfc((mu - log(x)) / (sigma * sqrt(2))) - 
1, x) I am not sure Scipy can do it and how it can do it ? Many thanks Xiong From robert.kern at gmail.com Sun Aug 28 03:36:06 2011 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 28 Aug 2011 02:36:06 -0500 Subject: [SciPy-User] How can I solve a equation like sqrt(log(x)*erfc(exp(log(x)+1)) - 1) = 2 and exp((log(x) - 1.5)**2 - 3) = 5 In-Reply-To: <57952d36-c803-451c-bdeb-b2f1299290e7@y39g2000prd.googlegroups.com> References: <57952d36-c803-451c-bdeb-b2f1299290e7@y39g2000prd.googlegroups.com> Message-ID: On Sun, Aug 28, 2011 at 01:18, Johnny wrote: > Hi, I am trying to solve the follow equation: > > solve(x * ((1.0 / sqrt(2 * pi) * x * sigma) * exp(-0.5 * (log(x) - > mu)**2 / sigma**2)) + 0.5 * erfc((mu - log(x)) / (sigma * sqrt(2))) - > 1, x) > > I am not sure Scipy can do it and how it can do it ? [~] |18> from numpy import sqrt, log, exp [~] |19> from scipy.special import erfc [~] |20> def f(x, mu=1.0, sigma=0.1): ...> return x * ((1.0 / sqrt(2 * pi) * x * sigma) * exp(-0.5 * (log(x) - ...> mu)**2 / sigma**2)) + 0.5 * erfc((mu - log(x)) / (sigma * sqrt(2))) - 1.0 ...> [~] |21> from scipy.optimize import fsolve [~] |22> fsolve(f, 3.0) array([ 2.88207063]) [~] |23> f(_) array([ 4.44089210e-16]) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From wilson.andrew.j at gmail.com Sat Aug 27 10:47:28 2011 From: wilson.andrew.j at gmail.com (Andy Wilson) Date: Sat, 27 Aug 2011 09:47:28 -0500 Subject: [SciPy-User] Return variable value by function value In-Reply-To: <20110826172510.2EC696F446@smtp.hushmail.com> References: <20110826172510.2EC696F446@smtp.hushmail.com> Message-ID: If an approximation is good enough, you can use scipy.interpolate.interp1d to get a function that returns interpolated values. Your example doesn't quite work because 0.95 is out of the range of the initial input. import numpy as np import scipy.interpolate x = np.arange(0,100) y = np.sqrt(1 - x**2/10E+4) interp_func = scipy.interpolate.interp1d(x, y, kind='quadratic') new_x = 0.95 interp_y = interp_func(new_x) actual_y = np.sqrt(1 - new_x**2/10E+4) print "actual value: %s" % actual_y print "interpolated value %s" % interp_y print "difference: %s" % (actual_y - interp_y) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ali.franco95 at gmail.com Sat Aug 27 23:21:40 2011 From: ali.franco95 at gmail.com (ali franco) Date: Sun, 28 Aug 2011 15:21:40 +1200 Subject: [SciPy-User] Sotring data for fast access Message-ID: There are two parts to my question. One: I have to do a double integration on a grid answer = integrate ( f(x,y) times besselfunction(x,y)) Now, I have read that the besselfunction can be precomputed and saved to disk for fast access. How do I do this? Right now, I am evaluating the besselfunction from scipy.special as it is required. Second question: I have numerically integrated a differential equation and I use the splined solution to solve other differential equations. However the splined solution is slow. Is there a way to make this faster? thanks guys -------------- next part -------------- An HTML attachment was scrubbed... URL: From ali.franco95 at gmail.com Sun Aug 28 07:02:54 2011 From: ali.franco95 at gmail.com (ali franco) Date: Sun, 28 Aug 2011 23:02:54 +1200 Subject: [SciPy-User] Does odeint return derivative Message-ID: I am solving a system of differential equations using odeint. 
Is there a simpler way of also getting the derivative other than calculating the derivative from the solutions obtained which in my case is going to take alot of extra time. thanks -------------- next part -------------- An HTML attachment was scrubbed... URL: From ali.franco95 at gmail.com Sun Aug 28 07:48:57 2011 From: ali.franco95 at gmail.com (ali franco) Date: Sun, 28 Aug 2011 23:48:57 +1200 Subject: [SciPy-User] RectBivariateSpline Message-ID: Can RectBivariateSpline be used to calculated derivatives and integrals? If it can't, can you please suggest some thing else that does. I have to use a two dimensional spline on a rectangular mesh. The problem with alternatives to RectBivariateSpline such as BivariateSpline , And UnivariateSpline I found was that they require the data be specified either on a square grid or that they be equally spaced neither of which my data points satisfy. thanks -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsouthey at gmail.com Sun Aug 28 10:40:15 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Sun, 28 Aug 2011 09:40:15 -0500 Subject: [SciPy-User] Install Scipy Errors: ImportError: /path_to/liblapack.so: undefined symbol: ztbsv_ In-Reply-To: <7b0b4836-fb15-4489-860a-c1684529f30c@p37g2000prp.googlegroups.com> References: <7b0b4836-fb15-4489-860a-c1684529f30c@p37g2000prp.googlegroups.com> Message-ID: On Sun, Aug 28, 2011 at 1:16 AM, Johnny wrote: > Hi all, > > I am installing lapack, atlas, numpy, scipy on my LINUX for TEN times, > but always encountering the problem: > > [work XXX]$ python -c 'import scipy.optimize; > scipy.optimize.test()' > Traceback (most recent call last): > ?File "", line 1, in > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > scipy/optimize/__init__.py", line 11, in > ? ?from lbfgsb import fmin_l_bfgs_b > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > scipy/optimize/lbfgsb.py", line 28, in > ? ?import _lbfgsb > ImportError: /home/work/local/lib/liblapack.so: undefined symbol: > ztbsv_ > > I can pass some other tests like: > > > [work XXX:~/local]$ python -c 'import scipy.ndimage; > scipy.ndimage.test()' > Running unit tests for scipy.ndimage > NumPy version 1.6.1 > NumPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site- > packages/numpy > SciPy version 0.9.0 > SciPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site- > packages/scipy > Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 > 20051201 (Red Hat 3.4.5-2)] > nose version 1.1.2 > .........S................................................................................................................................................................................................................................................................................................................................................................................................................. > ---------------------------------------------------------------------- > Ran 411 tests in 1.247s > > OK (SKIP=1) > > The problem seems due to the lib of Lapack. So I tried the solutions > posted on the internet before. > > 1) The liblapack.so may be not complete...SO I tried this: > ? ?# integrate lapack with atlas: > ? ?cd lib/ > ? ?mkdir tmp > ? ?cd tmp/ > ? ?ar x ../liblapack.a > ? ?cp ~/path_to/lapack-3.1.1/lapack_LINUX.a ../liblapack.a > ? ?ar r ../liblapack.a *.o > ? ?cd ../.. > ? ?make check > ? ?make ptcheck > ? ?cp include/* ~/include/ > ? 
?cp lib/*.a ~/lib/ > > That is, after installing atlas, there is another liblapack.a (in > addition to the lapack_LINUX.a after Lapack) in its lib, but it is > about 500k, so I integrate it with the lapack_LINUX.a from installing > Lapack. The final liblapack.a is about 9.3m, The liblapack.so is about > 5m > > 2) re-install Lapack and atlas many times....No use > > 3) I found there is a lapack.so under scipy/lib, and it is about 500K, > but I think it may be not the problem, becaues the failure is > "ImportError: /home/work/local/lib/liblapack.so: undefined symbol: > ztbsv_". Scipy seemed to import liblapack.so in my general lib > directory... > > 4) One thing ?I am not sure is that I used gcc 4.7 and gfortran to > compile lapack and atlas, but my python 2.7 was built using gcc > 3.4.5.....Is this a problem? > > > Anyone can help? > _______________________________________________________________ > My configuration of the installation: > > * ATLAS 3.8.4 > * lapack 3.3.1 > * numpy 1.6.1 > * SciPy version 0.9.0 > * dateutil 1.5 > * Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 > 20051201 (Red Hat 3.4.5-2)] > * nose version 1.1.2 > * gcc (GCC) 4.7.0 20110820 (experimental) > * LINUX: Linux XXX 2.6.9_5-9-0-0 #1 SMP Wed Jun 23 14:03:19 CST 2010 > x86_64 x86_64 x86_64 GNU/Linux > > site.cfg of Scipy: > > [DEFAULT] > library_dirs = /home/work/local/lib > include_dirs = /home/work/local/include > [lapack_opt] > libraries = lapack, f77blas, cblas, atlas > > site.cfg of Numpy: > > [DEFAULT] > library_dirs = /home/work/local/lib > include_dirs = /home/work/local/include > [lapack_opt] > libraries = lapack, f77blas, cblas, atlas > > > In addition, there are failures as well when test Numpy: > >>>> import numpy >>>> numpy.test('1') > Running unit tests for numpy > NumPy version 1.6.1 > NumPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site- > packages/numpy > Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 > 20051201 (Red Hat 3.4.5-2)] > nose version 1.1.2 > ====================================================================== > FAIL: Test basic arithmetic function errors > ---------------------------------------------------------------------- > Traceback (most recent call last): > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/testing/decorators.py", line 215, in knownfailer > ? ?return f(*args, **kwargs) > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/core/tests/test_numeric.py", line 367, in > test_floating_exceptions_power > ? ?np.power, ftype(2), ftype(2**fi.nexp)) > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/core/tests/test_numeric.py", line 271, in assert_raises_fpe > ? ?"Type %s did not raise fpe error '%s'." % (ftype, fpeerr)) > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/testing/utils.py", line 34, in assert_ > ? ?raise AssertionError(msg) > AssertionError: Type did not raise fpe error > 'overflow'. > > ====================================================================== > FAIL: Test generic loops. > ---------------------------------------------------------------------- > Traceback (most recent call last): > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/core/tests/test_ufunc.py", line 86, in test_generic_loops > ? 
?assert_almost_equal(fone(x), fone_val, err_msg=msg) > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/testing/utils.py", line 448, in assert_almost_equal > ? ?raise AssertionError(msg) > AssertionError: > Arrays are not almost equal to 7 decimals PyUFunc_F_F > ?ACTUAL: array([ 0.+0.j, ?0.+0.j, ?0.+0.j, ?0.+0.j, ?0.+0.j], > dtype=complex64) > ?DESIRED: 1 > > ====================================================================== > FAIL: test_umath.TestComplexFunctions.test_loss_of_precision( 'numpy.complex64'>,) > ---------------------------------------------------------------------- > Traceback (most recent call last): > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/nose/ > case.py", line 197, in runTest > ? ?self.test(*self.arg) > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/core/tests/test_umath.py", line 931, in check_loss_of_precision > ? ?check(x_basic, 2*eps/1e-3) > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/core/tests/test_umath.py", line 901, in check > ? ?'arcsinh') > AssertionError: (0, 0.0010023052, 0.9987238, 'arcsinh') > > ====================================================================== > FAIL: test_umath.TestComplexFunctions.test_precisions_consistent > ---------------------------------------------------------------------- > Traceback (most recent call last): > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/nose/ > case.py", line 197, in runTest > ? ?self.test(*self.arg) > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/core/tests/test_umath.py", line 812, in > test_precisions_consistent > ? ?assert_almost_equal(fcf, fcd, decimal=6, err_msg='fch-fcd %s'%f) > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/testing/utils.py", line 448, in assert_almost_equal > ? ?raise AssertionError(msg) > AssertionError: > Arrays are not almost equal to 6 decimals fch-fcd > ?ACTUAL: 2.3561945j > ?DESIRED: (0.66623943249251527+1.0612750619050355j) > > ====================================================================== > FAIL: test_kind.TestKind.test_all > ---------------------------------------------------------------------- > Traceback (most recent call last): > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/nose/ > case.py", line 197, in runTest > ? ?self.test(*self.arg) > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/f2py/tests/test_kind.py", line 30, in test_all > ? ?'selectedrealkind(%s): expected %r but got %r' % ?(i, > selected_real_kind(i), selectedrealkind(i))) > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/testing/utils.py", line 34, in assert_ > ? ?raise AssertionError(msg) > AssertionError: selectedrealkind(19): expected -1 but got 16 > > ---------------------------------------------------------------------- > Ran 3552 tests in 29.977s > > FAILED (KNOWNFAIL=3, failures=5) > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > Hi, What Linux distro are you actually using? Unless you have some issue, I would install the atlas version provided by the distro as I have long term success with Fedora's packages across multiple versions. If you still want to build it yourself, then you need to be using the same compiler version everywhere. The ztbsv_ error suggests that you have not build blas, lapack and atlas correctly. 
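One quick sanity check is to ask, in the same interpreter that fails, which BLAS/LAPACK the stack is actually picking up; a sketch, not a full diagnosis:

import numpy as np
from numpy.distutils.system_info import get_info

np.show_config()               # BLAS/LAPACK that numpy was built against
print get_info('lapack_opt')   # what the build machinery resolves from site.cfg right now

From the shell, "nm -D /home/work/local/lib/liblapack.so | grep ztbsv" will also show whether ztbsv_ is defined in that library (T) or merely expected from a BLAS at load time (U).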
It is hard to get those right, so very carefully check the build logs and run the associated tests.

Bruce

From rob.clewley at gmail.com  Sun Aug 28 12:32:42 2011
From: rob.clewley at gmail.com (Rob Clewley)
Date: Sun, 28 Aug 2011 12:32:42 -0400
Subject: [SciPy-User] Does odeint return derivative
In-Reply-To:
References:
Message-ID:

Hi,

On Sun, Aug 28, 2011 at 7:02 AM, ali franco wrote:
> I am solving a system of differential equations using odeint. Is there a
> simpler way of also getting the derivative other than calculating the
> derivative from the solutions obtained which in my case is going to take
> alot of extra time.

Maybe I misunderstand which derivative you are interested in, but if you have a system x' = f(x, t) then the rates of change of the state variables x at any given time and known state position are simply given by calling function f directly. That's what the ODE means by definition. So if you have solved for a trajectory and have an array of time t and state x values, just pass a pair of x, t values to f to find out how fast x is changing at that point. This *is* a very simple and cheap way to get the derivative (you can even vectorize it), but I'm guessing that you were considering doing some kind of finite differencing to obtain approximate derivatives from the trajectory.

Anyway, hope that helps.
Rob

From rob.clewley at gmail.com  Sun Aug 28 14:40:52 2011
From: rob.clewley at gmail.com (Rob Clewley)
Date: Sun, 28 Aug 2011 14:40:52 -0400
Subject: [SciPy-User] Storing data for fast access
In-Reply-To:
References:
Message-ID:

Hi Ali,

On Sat, Aug 27, 2011 at 11:21 PM, ali franco wrote:
> There are two parts to my question.

I'm not sure I understand enough about the first question to answer it with authority, but my gut instinct, FWIW, is that using a table of pre-computed values on some mesh of (x,y) means you'll have to accept interpolated values for the Bessel function when it's needed at new (x,y) values. I mean, if you already know all the (x,y) values you'll need, I don't see the benefit in any precomputation. Maybe the best splines to use in this and your ODE problem are ones where you impose the knots from the sampled values and their first derivatives there, since you can explicitly compute the derivatives for Bessel functions and for ODE right-hand sides. That guarantees pretty good accuracy of the fit, and you'll be using quadratics between every pair of knots. If you don't specify the derivatives and use cubics, you'll run the risk of nasty behavior such as Runge's phenomenon.

> Second question: I have numerically integrated a differential equation and I use the splined solution to solve other differential equations. However the splined solution is slow. Is there a way to make this faster?

What exactly do you mean by using the solution to solve other DEs? It would help for you to provide concrete examples when posting on forums. If you mean that you are using a spline-interpolated curve as a time-dependent term in another DE's right-hand side, then yes, that's going to be slower in pure Python. But if speed is such an issue, you shouldn't be using odeint. PyDSTool supports external input signals of a similar kind running C-based integrators very quickly compared to odeint, but currently only supports piecewise-linear interpolation of those signals. This can be accurate enough if you have a very high sampling rate for that signal relative to your new DE's time steps, particularly because the integrator is guaranteed to step to the knot points.
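A minimal sketch of the pattern being discussed, with a previously computed solution, splined, feeding the right-hand side of a second odeint call. The two toy systems and the grids are made up purely for illustration:

import numpy as np
from scipy.integrate import odeint
from scipy.interpolate import InterpolatedUnivariateSpline

def f1(y, t):
    return -0.5 * y

t1 = np.linspace(0, 10, 2001)        # fine grid keeps the interpolation error small
y1 = odeint(f1, 1.0, t1)[:, 0]
dy1 = f1(y1, t1)                     # derivative along the trajectory: just evaluate the RHS

u = InterpolatedUnivariateSpline(t1, y1, k=3)   # splined solution

def f2(y, t):
    return -y + u(t)                 # splined solution used as a time-dependent input

t2 = np.linspace(0, 10, 201)
y2 = odeint(f2, 0.0, t2)[:, 0]

Evaluating u(t) in pure Python on every right-hand-side call is exactly the overhead described above; the sketch only shows the wiring.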
Higher order splines in C are on my to do list, though. PyDSTool's interface to scipy's vode wrapper does support higher order splines but it's done in python, so probably won't be faster than what you're already doing. -Rob From jsseabold at gmail.com Sun Aug 28 15:07:19 2011 From: jsseabold at gmail.com (Skipper Seabold) Date: Sun, 28 Aug 2011 15:07:19 -0400 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: Message-ID: On Sat, Aug 27, 2011 at 2:44 PM, Christopher Jordan-Squire wrote: > On Sat, Aug 27, 2011 at 2:27 PM, Matthew Brett wrote: >> Hi, >> >> On Sat, Aug 27, 2011 at 11:19 AM, Christopher Jordan-Squire >> wrote: >>> Hi--I've been a moderately heavy R user for the past two years, so >>> about a month ago I took an (abbreviated) version of a simple data >>> analysis I did in R and tried to rewrite as much of it as possible, >>> line by line, into python using numpy and statsmodels. I didn't use >>> pandas, and I can't comment on how much it might have simplified >>> things. >>> >>> This comparison might be useful to some people, so I stuck it up on a >>> github repo. My overall impression is that R is much stronger for >>> interactive data analysis. Click on the link for more details why, >>> which are summarized in the README file. >>> >>> https://github.com/chrisjordansquire/r_vs_py >>> >>> The code examples should run out of the box with no downloads (other >>> than R, Python, numpy, scipy, and statsmodels) required. >> >> Thank you very much for doing that - it's a very useful exercise. ?I >> hope we can make use of it to discuss how to get better, in the true > > Hopefully. I suppose I should also mention, for those that don't want > to click on the link, that the two largest reasons R was much simpler > to use were because it was easier to construct models and easier to > view entries I'd stuck into matrices. R's graphing capabilities seemed > slightly more friendly, but that might have just been my familiarity > with them. > > (As an aside, numpy arrays' print method don't make them friendly for > interactive viewing. Even ipython couldn't make a few of the matrices > I made very intelligible, and it's easy to construct examples that > make numpy arrays hideous to behold. For example, > > x = np.arange(5).reshape(5,1) > y = np.ones(5).reshape(1,5) > z = x*y > z[0,0] += 0.0001 > print z > > [[ ?1.00000000e-04 ? 0.00000000e+00 ? 0.00000000e+00 ? 0.00000000e+00 > ? ?0.00000000e+00] > ?[ ?1.00000000e+00 ? 1.00000000e+00 ? 1.00000000e+00 ? 1.00000000e+00 > ? ?1.00000000e+00] > ?[ ?2.00000000e+00 ? 2.00000000e+00 ? 2.00000000e+00 ? 2.00000000e+00 > ? ?2.00000000e+00] > ?[ ?3.00000000e+00 ? 3.00000000e+00 ? 3.00000000e+00 ? 3.00000000e+00 > ? ?3.00000000e+00] > ?[ ?4.00000000e+00 ? 4.00000000e+00 ? 4.00000000e+00 ? 4.00000000e+00 > ? ?4.00000000e+00]] > My default [~/statsmodels/] [1]: [~/statsmodels/] [1]: x = np.arange(5).reshape(5,1) [~/statsmodels/] [2]: y = np.ones(5).reshape(1,5) [~/statsmodels/] [3]: z = x*y [~/statsmodels/] [4]: z[0,0] += 0.0001 [~/statsmodels/] [5]: print z [[ 0.0001 0. 0. 0. 0. ] [ 1. 1. 1. 1. 1. ] [ 2. 2. 2. 2. 2. ] [ 3. 3. 3. 3. 3. ] [ 4. 4. 4. 4. 4. 
]] [~/statsmodels/] [6]: np.set_printoptions(suppress=False) [~/statsmodels/] [7]: print z [[ 1.00000000e-04 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00] [ 1.00000000e+00 1.00000000e+00 1.00000000e+00 1.00000000e+00 1.00000000e+00] [ 2.00000000e+00 2.00000000e+00 2.00000000e+00 2.00000000e+00 2.00000000e+00] [ 3.00000000e+00 3.00000000e+00 3.00000000e+00 3.00000000e+00 3.00000000e+00] [ 4.00000000e+00 4.00000000e+00 4.00000000e+00 4.00000000e+00 4.00000000e+00]] Skipper > (Strangely, it looks much more tolerable if x ?= > np.arange(1,6).reshape(5,1) instead.) > > If you do the same thing in R, > > x = rep(0:4,5) > x = matrix(x,ncol=5) > x[1,1] = 0.000001 > x > > you get > > ? ? ?[,1] [,2] [,3] [,4] [,5] > [1,] 1e-06 ? ?0 ? ?0 ? ?0 ? ?0 > [2,] 1e+00 ? ?1 ? ?1 ? ?1 ? ?1 > [3,] 2e+00 ? ?2 ? ?2 ? ?2 ? ?2 > [4,] 3e+00 ? ?3 ? ?3 ? ?3 ? ?3 > [5,] 4e+00 ? ?4 ? ?4 ? ?4 ? ?4 > > much more readable.) > > > As a simple metric, my .r file was about 1/2 the size of the .py file, > even though I couldn't do everything in python that I could in R. > (These commands were meant to be entered interactively, so the length > of the length of the file is, perhaps, a more valid metric then usual > to be concerned about.) > > -Chris Jordan-Squire > > >> spirit of: >> >> Confront the Brutal Facts >> http://en.wikipedia.org/wiki/Good_to_Great >> >> See you, >> >> Matthew >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From jsseabold at gmail.com Sun Aug 28 15:54:49 2011 From: jsseabold at gmail.com (Skipper Seabold) Date: Sun, 28 Aug 2011 15:54:49 -0400 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Sat, Aug 27, 2011 at 10:15 PM, Bruce Southey wrote: > On Sat, Aug 27, 2011 at 5:06 PM, Wes McKinney wrote: >> On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout >> wrote: >>> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>>> This comparison might be useful to some people, so I stuck it up on a >>>> github repo. My overall impression is that R is much stronger for >>>> interactive data analysis. Click on the link for more details why, >>>> which are summarized in the README file. >>> >>> ?From the README: >>> >>> "In fact, using Python without the IPython qtconsole is practically >>> impossible for this sort of cut and paste, interactive analysis. >>> The shell IPython doesn't allow it because it automatically adds >>> whitespace on multiline bits of code, breaking pre-formatted code's >>> alignment. Cutting and pasting works for the standard python shell, >>> but then you lose all the advantages of IPython." >>> >>> >>> >>> You might use %cpaste in the ipython normal shell to paste without it >>> automatically inserting spaces: >>> >>> In [5]: %cpaste >>> Pasting code; enter '--' alone on the line to stop. >>> :if 1>0: >>> : ? ?print 'hi' >>> :-- >>> hi >>> >>> Thanks, >>> >>> Jason >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> >> This strikes me as a textbook example of why we need an integrated >> formula framework in statsmodels. 
I'll make a pass through when I get >> a chance and see if there are some places where pandas would really >> help out. > > We used to have a formula class is scipy.stats and I do not follow > nipy (http://nipy.sourceforge.net/nipy/stable/index.html) as it also > had this (extremely flexible but very hard to comprehend). It was what > I had argued was needed ages ago for statsmodel. But it needs a > community effort because the syntax required serves multiple > communities with different annotations and needs. That is also seen > from the different approaches taken by the stats packages from S/R, > SAS, Genstat (and those are just are ones I have used). > We have held this discussion at _great_ length multiple times on the statsmodels list and are in the process of trying to integrate Charlton (from Nathaniel) and/or Formula (from Jonathan / NiPy) into the statsmodels base. http://statsmodels.sourceforge.net/dev/roadmap_todo.html#formula-framework and more recently https://groups.google.com/group/pystatsmodels/browse_thread/thread/a76ea5de9e96964b/fd85b80ae46c4931? https://github.com/statsmodels/formula https://github.com/statsmodels/charlton Wes and I made some effort to go through this at SciPy. From where I sit, I think it's difficult to disentangle the data structures from the formula implementation, or maybe I'd just prefer to finish tackling the former because it's much more straightforward. So I'd like to first finish the pandas-integration branch that we've started and then focus on the formula support. This is on my (our, I hope...) immediate long-term goal list. Then I'd like to come back to the community and hash out the 'rules of the game' details for formulas after we have some code for people to play with, which promises to be "fun." https://github.com/statsmodels/statsmodels/tree/pandas-integration FWIW, I could also improve the categorical function to be much nicer for the given examples (ie., take a list, drop a reference category), but I don't know that it's worth it, because it's really just a stop-gap and ideally users shouldn't have to rely on it. Thoughts on more stop-gap? If I understand Chris' concerns, I think pandas + formula will go a long way towards bridging the gap between Python and R usability, but it's a large effort and there are only a handful (at best) of people writing code -- Wes being the only one who's more or less "full time" as far as I can tell. The 0.4 statsmodels release should be very exciting though, I hope. I'm looking forward to it, at least. Then there's only the small problem of building an infrastructure and community like CRAN so we can have specialists writing and maintaining code...but I hope once all the tools are in place this will seem much less daunting. There certainly seems to be the right sentiment for it. Skipper From bsouthey at gmail.com Sun Aug 28 21:16:08 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Sun, 28 Aug 2011 20:16:08 -0500 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Sun, Aug 28, 2011 at 2:54 PM, Skipper Seabold wrote: > On Sat, Aug 27, 2011 at 10:15 PM, Bruce Southey wrote: >> On Sat, Aug 27, 2011 at 5:06 PM, Wes McKinney wrote: >>> On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout >>> wrote: >>>> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>>>> This comparison might be useful to some people, so I stuck it up on a >>>>> github repo. 
My overall impression is that R is much stronger for >>>>> interactive data analysis. Click on the link for more details why, >>>>> which are summarized in the README file. >>>> >>>> ?From the README: >>>> >>>> "In fact, using Python without the IPython qtconsole is practically >>>> impossible for this sort of cut and paste, interactive analysis. >>>> The shell IPython doesn't allow it because it automatically adds >>>> whitespace on multiline bits of code, breaking pre-formatted code's >>>> alignment. Cutting and pasting works for the standard python shell, >>>> but then you lose all the advantages of IPython." >>>> >>>> >>>> >>>> You might use %cpaste in the ipython normal shell to paste without it >>>> automatically inserting spaces: >>>> >>>> In [5]: %cpaste >>>> Pasting code; enter '--' alone on the line to stop. >>>> :if 1>0: >>>> : ? ?print 'hi' >>>> :-- >>>> hi >>>> >>>> Thanks, >>>> >>>> Jason >>>> >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>> >>> This strikes me as a textbook example of why we need an integrated >>> formula framework in statsmodels. I'll make a pass through when I get >>> a chance and see if there are some places where pandas would really >>> help out. >> >> We used to have a formula class is scipy.stats and I do not follow >> nipy (http://nipy.sourceforge.net/nipy/stable/index.html) as it also >> had this (extremely flexible but very hard to comprehend). It was what >> I had argued was needed ages ago for statsmodel. But it needs a >> community effort because the syntax required serves multiple >> communities with different annotations and needs. That is also seen >> from the different approaches taken by the stats packages from S/R, >> SAS, Genstat (and those are just are ones I have used). >> > > We have held this discussion at _great_ length multiple times on the > statsmodels list and are in the process of trying to integrate > Charlton (from Nathaniel) and/or Formula (from Jonathan / NiPy) into > the statsmodels base. > > http://statsmodels.sourceforge.net/dev/roadmap_todo.html#formula-framework > > and more recently > > https://groups.google.com/group/pystatsmodels/browse_thread/thread/a76ea5de9e96964b/fd85b80ae46c4931? > > https://github.com/statsmodels/formula > https://github.com/statsmodels/charlton > > Wes and I made some effort to go through this at SciPy. From where I > sit, I think it's difficult to disentangle the data structures from > the formula implementation, or maybe I'd just prefer to finish > tackling the former because it's much more straightforward. So I'd > like to first finish the pandas-integration branch that we've started > and then focus on the formula support. This is on my (our, I hope...) > immediate long-term goal list. Then I'd like to come back to the > community and hash out the 'rules of the game' details for formulas > after we have some code for people to play with, which promises to be > "fun." > > https://github.com/statsmodels/statsmodels/tree/pandas-integration > > FWIW, I could also improve the categorical function to be much nicer > for the given examples (ie., take a list, drop a reference category), > but I don't know that it's worth it, because it's really just a > stop-gap and ideally users shouldn't have to rely on it. Thoughts on > more stop-gap? 
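A rough sketch of the kind of stop-gap helper described just above (take a list of labels, build an indicator matrix, drop a reference category). The function name and signature are illustrative, not the actual statsmodels API:

import numpy as np

def dummies(labels, drop_reference=True):
    # One 0/1 column per level; optionally drop the first level so the
    # remaining columns are interpreted relative to a reference category.
    labels = np.asarray(labels)
    levels = np.unique(labels)
    mat = (labels[:, None] == levels[None, :]).astype(float)
    if drop_reference:
        levels, mat = levels[1:], mat[:, 1:]
    return levels, mat

levels, d = dummies(['a', 'b', 'a', 'c', 'b'])
# levels -> ['b' 'c'], d is 5x2 with one column per non-reference level
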
> > If I understand Chris' concerns, I think pandas + formula will go a > long way towards bridging the gap between Python and R usability, but > it's a large effort and there are only a handful (at best) of people > writing code -- Wes being the only one who's more or less "full time" > as far as I can tell. The 0.4 statsmodels release should be very > exciting though, I hope. I'm looking forward to it, at least. Then > there's only the small problem of building an infrastructure and > community like CRAN so we can have specialists writing and maintaining > code...but I hope once all the tools are in place this will seem much > less daunting. There certainly seems to be the right sentiment for it. > > Skipper > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > Thanks for the info! Actually it is impossible to "disentangle the data structures from the formula implementation". You have to make design designs for example defining factors- R does that in the dataframe (as.factor() is not part of the formula), SAS using class statements etc. Bruce From dbigbear at gmail.com Mon Aug 29 02:54:42 2011 From: dbigbear at gmail.com (Xiong Deng) Date: Mon, 29 Aug 2011 14:54:42 +0800 Subject: [SciPy-User] SciPy-User Digest, Vol 96, Issue 47 In-Reply-To: References: Message-ID: Dear Bruce, My Linux is? Red Hat Enterprise Linux AS release 4 (Nahant Update 3) Many thanks John On 28 August 2011 22:39, wrote: > Send SciPy-User mailing list submissions to > scipy-user at scipy.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://mail.scipy.org/mailman/listinfo/scipy-user > or, via email, send a message with subject or body 'help' to > scipy-user-request at scipy.org > > You can reach the person managing the list at > scipy-user-owner at scipy.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of SciPy-User digest..." > > > Today's Topics: > > 1. Install Scipy Errors: ImportError: /path_to/liblapack.so: > undefined symbol: ztbsv_ (Johnny) > 2. How can I solve a equation like > sqrt(log(x)*erfc(exp(log(x)+1)) - 1) = 2 and exp((log(x) - > 1.5)**2 - 3) = 5 (Johnny) > 3. Re: How can I solve a equation like > sqrt(log(x)*erfc(exp(log(x)+1)) - 1) = 2 and exp((log(x) - > 1.5)**2 - 3) = 5 (Robert Kern) > 4. Re: Return variable value by function value (Andy Wilson) > 5. Sotring data for fast access (ali franco) > 6. Does odeint return derivative (ali franco) > 7. RectBivariateSpline (ali franco) > 8. 
Re: Install Scipy Errors: ImportError: /path_to/liblapack.so: > undefined symbol: ztbsv_ (Bruce Southey) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Sat, 27 Aug 2011 23:16:06 -0700 (PDT) > From: Johnny > Subject: [SciPy-User] Install Scipy Errors: ImportError: > /path_to/liblapack.so: undefined symbol: ztbsv_ > To: scipy-user at scipy.org > Message-ID: > <7b0b4836-fb15-4489-860a-c1684529f30c at p37g2000prp.googlegroups.com> > Content-Type: text/plain; charset=ISO-8859-1 > > Hi all, > > I am installing lapack, atlas, numpy, scipy on my LINUX for TEN times, > but always encountering the problem: > > [work XXX]$ python -c 'import scipy.optimize; > scipy.optimize.test()' > Traceback (most recent call last): > File "", line 1, in > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > scipy/optimize/__init__.py", line 11, in > from lbfgsb import fmin_l_bfgs_b > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > scipy/optimize/lbfgsb.py", line 28, in > import _lbfgsb > ImportError: /home/work/local/lib/liblapack.so: undefined symbol: > ztbsv_ > > I can pass some other tests like: > > > [work XXX:~/local]$ python -c 'import scipy.ndimage; > scipy.ndimage.test()' > Running unit tests for scipy.ndimage > NumPy version 1.6.1 > NumPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site- > packages/numpy > SciPy version 0.9.0 > SciPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site- > packages/scipy > Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 > 20051201 (Red Hat 3.4.5-2)] > nose version 1.1.2 > > .........S................................................................................................................................................................................................................................................................................................................................................................................................................. > ---------------------------------------------------------------------- > Ran 411 tests in 1.247s > > OK (SKIP=1) > > The problem seems due to the lib of Lapack. So I tried the solutions > posted on the internet before. > > 1) The liblapack.so may be not complete...SO I tried this: > # integrate lapack with atlas: > cd lib/ > mkdir tmp > cd tmp/ > ar x ../liblapack.a > cp ~/path_to/lapack-3.1.1/lapack_LINUX.a ../liblapack.a > ar r ../liblapack.a *.o > cd ../.. > make check > make ptcheck > cp include/* ~/include/ > cp lib/*.a ~/lib/ > > That is, after installing atlas, there is another liblapack.a (in > addition to the lapack_LINUX.a after Lapack) in its lib, but it is > about 500k, so I integrate it with the lapack_LINUX.a from installing > Lapack. The final liblapack.a is about 9.3m, The liblapack.so is about > 5m > > 2) re-install Lapack and atlas many times....No use > > 3) I found there is a lapack.so under scipy/lib, and it is about 500K, > but I think it may be not the problem, becaues the failure is > "ImportError: /home/work/local/lib/liblapack.so: undefined symbol: > ztbsv_". Scipy seemed to import liblapack.so in my general lib > directory... > > 4) One thing I am not sure is that I used gcc 4.7 and gfortran to > compile lapack and atlas, but my python 2.7 was built using gcc > 3.4.5.....Is this a problem? > > > Anyone can help? 
> _______________________________________________________________ > My configuration of the installation: > > * ATLAS 3.8.4 > * lapack 3.3.1 > * numpy 1.6.1 > * SciPy version 0.9.0 > * dateutil 1.5 > * Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 > 20051201 (Red Hat 3.4.5-2)] > * nose version 1.1.2 > * gcc (GCC) 4.7.0 20110820 (experimental) > * LINUX: Linux XXX 2.6.9_5-9-0-0 #1 SMP Wed Jun 23 14:03:19 CST 2010 > x86_64 x86_64 x86_64 GNU/Linux > > site.cfg of Scipy: > > [DEFAULT] > library_dirs = /home/work/local/lib > include_dirs = /home/work/local/include > [lapack_opt] > libraries = lapack, f77blas, cblas, atlas > > site.cfg of Numpy: > > [DEFAULT] > library_dirs = /home/work/local/lib > include_dirs = /home/work/local/include > [lapack_opt] > libraries = lapack, f77blas, cblas, atlas > > > In addition, there are failures as well when test Numpy: > > >>> import numpy > >>> numpy.test('1') > Running unit tests for numpy > NumPy version 1.6.1 > NumPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site- > packages/numpy > Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 > 20051201 (Red Hat 3.4.5-2)] > nose version 1.1.2 > ====================================================================== > FAIL: Test basic arithmetic function errors > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/testing/decorators.py", line 215, in knownfailer > return f(*args, **kwargs) > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/core/tests/test_numeric.py", line 367, in > test_floating_exceptions_power > np.power, ftype(2), ftype(2**fi.nexp)) > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/core/tests/test_numeric.py", line 271, in assert_raises_fpe > "Type %s did not raise fpe error '%s'." % (ftype, fpeerr)) > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/testing/utils.py", line 34, in assert_ > raise AssertionError(msg) > AssertionError: Type did not raise fpe error > 'overflow'. > > ====================================================================== > FAIL: Test generic loops. 
> ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/core/tests/test_ufunc.py", line 86, in test_generic_loops > assert_almost_equal(fone(x), fone_val, err_msg=msg) > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/testing/utils.py", line 448, in assert_almost_equal > raise AssertionError(msg) > AssertionError: > Arrays are not almost equal to 7 decimals PyUFunc_F_F > ACTUAL: array([ 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j], > dtype=complex64) > DESIRED: 1 > > ====================================================================== > FAIL: test_umath.TestComplexFunctions.test_loss_of_precision( 'numpy.complex64'>,) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/nose/ > case.py", line 197, in runTest > self.test(*self.arg) > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/core/tests/test_umath.py", line 931, in check_loss_of_precision > check(x_basic, 2*eps/1e-3) > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/core/tests/test_umath.py", line 901, in check > 'arcsinh') > AssertionError: (0, 0.0010023052, 0.9987238, 'arcsinh') > > ====================================================================== > FAIL: test_umath.TestComplexFunctions.test_precisions_consistent > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/nose/ > case.py", line 197, in runTest > self.test(*self.arg) > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/core/tests/test_umath.py", line 812, in > test_precisions_consistent > assert_almost_equal(fcf, fcd, decimal=6, err_msg='fch-fcd %s'%f) > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/testing/utils.py", line 448, in assert_almost_equal > raise AssertionError(msg) > AssertionError: > Arrays are not almost equal to 6 decimals fch-fcd > ACTUAL: 2.3561945j > DESIRED: (0.66623943249251527+1.0612750619050355j) > > ====================================================================== > FAIL: test_kind.TestKind.test_all > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/nose/ > case.py", line 197, in runTest > self.test(*self.arg) > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/f2py/tests/test_kind.py", line 30, in test_all > 'selectedrealkind(%s): expected %r but got %r' % (i, > selected_real_kind(i), selectedrealkind(i))) > File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > numpy/testing/utils.py", line 34, in assert_ > raise AssertionError(msg) > AssertionError: selectedrealkind(19): expected -1 but got 16 > > ---------------------------------------------------------------------- > Ran 3552 tests in 29.977s > > FAILED (KNOWNFAIL=3, failures=5) > > > > ------------------------------ > > Message: 2 > Date: Sat, 27 Aug 2011 23:18:54 -0700 (PDT) > From: Johnny > Subject: [SciPy-User] How can I solve a equation like > sqrt(log(x)*erfc(exp(log(x)+1)) - 1) = 2 and exp((log(x) - 1.5)**2 - > 3) = 5 > To: scipy-user at scipy.org > Message-ID: > <57952d36-c803-451c-bdeb-b2f1299290e7 at y39g2000prd.googlegroups.com> > 
Content-Type: text/plain; charset=ISO-8859-1 > > Hi, I am trying to solve the follow equation: > > solve(x * ((1.0 / sqrt(2 * pi) * x * sigma) * exp(-0.5 * (log(x) - > mu)**2 / sigma**2)) + 0.5 * erfc((mu - log(x)) / (sigma * sqrt(2))) - > 1, x) > > I am not sure Scipy can do it and how it can do it ? > > > Many thanks > Xiong > > > > > > ------------------------------ > > Message: 3 > Date: Sun, 28 Aug 2011 02:36:06 -0500 > From: Robert Kern > Subject: Re: [SciPy-User] How can I solve a equation like > sqrt(log(x)*erfc(exp(log(x)+1)) - 1) = 2 and exp((log(x) - 1.5)**2 - > 3) = 5 > To: SciPy Users List > Message-ID: > > > Content-Type: text/plain; charset=UTF-8 > > On Sun, Aug 28, 2011 at 01:18, Johnny wrote: > > Hi, I am trying to solve the follow equation: > > > > solve(x * ((1.0 / sqrt(2 * pi) * x * sigma) * exp(-0.5 * (log(x) - > > mu)**2 / sigma**2)) + 0.5 * erfc((mu - log(x)) / (sigma * sqrt(2))) - > > 1, x) > > > > I am not sure Scipy can do it and how it can do it ? > > [~] > |18> from numpy import sqrt, log, exp > > [~] > |19> from scipy.special import erfc > > [~] > |20> def f(x, mu=1.0, sigma=0.1): > ...> return x * ((1.0 / sqrt(2 * pi) * x * sigma) * exp(-0.5 * (log(x) > - > ...> mu)**2 / sigma**2)) + 0.5 * erfc((mu - log(x)) / (sigma * > sqrt(2))) - 1.0 > ...> > > [~] > |21> from scipy.optimize import fsolve > > [~] > |22> fsolve(f, 3.0) > array([ 2.88207063]) > > [~] > |23> f(_) > array([ 4.44089210e-16]) > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > ? -- Umberto Eco > > > ------------------------------ > > Message: 4 > Date: Sat, 27 Aug 2011 09:47:28 -0500 > From: Andy Wilson > Subject: Re: [SciPy-User] Return variable value by function value > To: SciPy Users List > Message-ID: > > > Content-Type: text/plain; charset="iso-8859-1" > > If an approximation is good enough, you can use scipy.interpolate.interp1d > to get a function that returns interpolated values. Your example doesn't > quite work because 0.95 is out of the range of the initial input. > > > import numpy as np > import scipy.interpolate > > x = np.arange(0,100) > y = np.sqrt(1 - x**2/10E+4) > > interp_func = scipy.interpolate.interp1d(x, y, kind='quadratic') > > new_x = 0.95 > interp_y = interp_func(new_x) > actual_y = np.sqrt(1 - new_x**2/10E+4) > > print "actual value: %s" % actual_y > print "interpolated value %s" % interp_y > print "difference: %s" % (actual_y - interp_y) > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: > http://mail.scipy.org/pipermail/scipy-user/attachments/20110827/544ae7a2/attachment-0001.html > > ------------------------------ > > Message: 5 > Date: Sun, 28 Aug 2011 15:21:40 +1200 > From: ali franco > Subject: [SciPy-User] Sotring data for fast access > To: scipy-user at scipy.org > Message-ID: > > > Content-Type: text/plain; charset="iso-8859-1" > > There are two parts to my question. One: I have to do a double integration > on a grid > > answer = integrate ( f(x,y) times besselfunction(x,y)) > > Now, I have read that the besselfunction can be precomputed and saved to > disk for fast access. How do I do this? Right now, I am evaluating the > besselfunction from scipy.special as it is required. > > Second question: I have numerically integrated a differential equation and > I > use the splined solution to solve other differential equations. However the > splined solution is slow. 
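For the first question, a minimal sketch of precomputing a Bessel table and caching it on disk. The order (J0), the grid, the example kernel argument x*y and the file name are all assumptions made only to keep the snippet self-contained:

import numpy as np
from scipy import special

x = np.linspace(0.0, 50.0, 2001)
y = np.linspace(0.0, 50.0, 2001)
# Evaluate J0 over the whole grid once and store it.
table = special.j0(x[:, None] * y[None, :])
np.save('j0_table.npy', table)

# Later runs just reload the table instead of calling special.j0 again.
table = np.load('j0_table.npy')

The integration then multiplies the cached table by f(x, y) sampled on the same grid.
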
Is there a way to make this faster? > > thanks guys > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: > http://mail.scipy.org/pipermail/scipy-user/attachments/20110828/3f4acea4/attachment-0001.html > > ------------------------------ > > Message: 6 > Date: Sun, 28 Aug 2011 23:02:54 +1200 > From: ali franco > Subject: [SciPy-User] Does odeint return derivative > To: scipy-user at scipy.org > Message-ID: > > > Content-Type: text/plain; charset="iso-8859-1" > > I am solving a system of differential equations using odeint. Is there a > simpler way of also getting the derivative other than calculating the > derivative from the solutions obtained which in my case is going to take > alot of extra time. > > thanks > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: > http://mail.scipy.org/pipermail/scipy-user/attachments/20110828/f2f55861/attachment-0001.html > > ------------------------------ > > Message: 7 > Date: Sun, 28 Aug 2011 23:48:57 +1200 > From: ali franco > Subject: [SciPy-User] RectBivariateSpline > To: scipy-user at scipy.org > Message-ID: > > > Content-Type: text/plain; charset="iso-8859-1" > > Can RectBivariateSpline be used to calculated derivatives and integrals? If > it can't, can you please suggest some thing else that does. I have to use a > two dimensional spline on a rectangular mesh. The problem with alternatives > to RectBivariateSpline such as BivariateSpline , And UnivariateSpline I > found was that they require the data be specified either on a square grid > or > that they be equally spaced neither of which my data points satisfy. > thanks > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: > http://mail.scipy.org/pipermail/scipy-user/attachments/20110828/1251bbf9/attachment-0001.html > > ------------------------------ > > Message: 8 > Date: Sun, 28 Aug 2011 09:40:15 -0500 > From: Bruce Southey > Subject: Re: [SciPy-User] Install Scipy Errors: ImportError: > /path_to/liblapack.so: undefined symbol: ztbsv_ > To: SciPy Users List > Message-ID: > > > Content-Type: text/plain; charset=ISO-8859-1 > > On Sun, Aug 28, 2011 at 1:16 AM, Johnny wrote: > > Hi all, > > > > I am installing lapack, atlas, numpy, scipy on my LINUX for TEN times, > > but always encountering the problem: > > > > [work XXX]$ python -c 'import scipy.optimize; > > scipy.optimize.test()' > > Traceback (most recent call last): > > ?File "", line 1, in > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > > scipy/optimize/__init__.py", line 11, in > > ? ?from lbfgsb import fmin_l_bfgs_b > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > > scipy/optimize/lbfgsb.py", line 28, in > > ? 
?import _lbfgsb > > ImportError: /home/work/local/lib/liblapack.so: undefined symbol: > > ztbsv_ > > > > I can pass some other tests like: > > > > > > [work XXX:~/local]$ python -c 'import scipy.ndimage; > > scipy.ndimage.test()' > > Running unit tests for scipy.ndimage > > NumPy version 1.6.1 > > NumPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site- > > packages/numpy > > SciPy version 0.9.0 > > SciPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site- > > packages/scipy > > Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 > > 20051201 (Red Hat 3.4.5-2)] > > nose version 1.1.2 > > > .........S................................................................................................................................................................................................................................................................................................................................................................................................................. > > ---------------------------------------------------------------------- > > Ran 411 tests in 1.247s > > > > OK (SKIP=1) > > > > The problem seems due to the lib of Lapack. So I tried the solutions > > posted on the internet before. > > > > 1) The liblapack.so may be not complete...SO I tried this: > > ? ?# integrate lapack with atlas: > > ? ?cd lib/ > > ? ?mkdir tmp > > ? ?cd tmp/ > > ? ?ar x ../liblapack.a > > ? ?cp ~/path_to/lapack-3.1.1/lapack_LINUX.a ../liblapack.a > > ? ?ar r ../liblapack.a *.o > > ? ?cd ../.. > > ? ?make check > > ? ?make ptcheck > > ? ?cp include/* ~/include/ > > ? ?cp lib/*.a ~/lib/ > > > > That is, after installing atlas, there is another liblapack.a (in > > addition to the lapack_LINUX.a after Lapack) in its lib, but it is > > about 500k, so I integrate it with the lapack_LINUX.a from installing > > Lapack. The final liblapack.a is about 9.3m, The liblapack.so is about > > 5m > > > > 2) re-install Lapack and atlas many times....No use > > > > 3) I found there is a lapack.so under scipy/lib, and it is about 500K, > > but I think it may be not the problem, becaues the failure is > > "ImportError: /home/work/local/lib/liblapack.so: undefined symbol: > > ztbsv_". Scipy seemed to import liblapack.so in my general lib > > directory... > > > > 4) One thing ?I am not sure is that I used gcc 4.7 and gfortran to > > compile lapack and atlas, but my python 2.7 was built using gcc > > 3.4.5.....Is this a problem? > > > > > > Anyone can help? 
> > _______________________________________________________________ > > My configuration of the installation: > > > > * ATLAS 3.8.4 > > * lapack 3.3.1 > > * numpy 1.6.1 > > * SciPy version 0.9.0 > > * dateutil 1.5 > > * Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 > > 20051201 (Red Hat 3.4.5-2)] > > * nose version 1.1.2 > > * gcc (GCC) 4.7.0 20110820 (experimental) > > * LINUX: Linux XXX 2.6.9_5-9-0-0 #1 SMP Wed Jun 23 14:03:19 CST 2010 > > x86_64 x86_64 x86_64 GNU/Linux > > > > site.cfg of Scipy: > > > > [DEFAULT] > > library_dirs = /home/work/local/lib > > include_dirs = /home/work/local/include > > [lapack_opt] > > libraries = lapack, f77blas, cblas, atlas > > > > site.cfg of Numpy: > > > > [DEFAULT] > > library_dirs = /home/work/local/lib > > include_dirs = /home/work/local/include > > [lapack_opt] > > libraries = lapack, f77blas, cblas, atlas > > > > > > In addition, there are failures as well when test Numpy: > > > >>>> import numpy > >>>> numpy.test('1') > > Running unit tests for numpy > > NumPy version 1.6.1 > > NumPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site- > > packages/numpy > > Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 > > 20051201 (Red Hat 3.4.5-2)] > > nose version 1.1.2 > > ====================================================================== > > FAIL: Test basic arithmetic function errors > > ---------------------------------------------------------------------- > > Traceback (most recent call last): > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > > numpy/testing/decorators.py", line 215, in knownfailer > > ? ?return f(*args, **kwargs) > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > > numpy/core/tests/test_numeric.py", line 367, in > > test_floating_exceptions_power > > ? ?np.power, ftype(2), ftype(2**fi.nexp)) > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > > numpy/core/tests/test_numeric.py", line 271, in assert_raises_fpe > > ? ?"Type %s did not raise fpe error '%s'." % (ftype, fpeerr)) > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > > numpy/testing/utils.py", line 34, in assert_ > > ? ?raise AssertionError(msg) > > AssertionError: Type did not raise fpe error > > 'overflow'. > > > > ====================================================================== > > FAIL: Test generic loops. > > ---------------------------------------------------------------------- > > Traceback (most recent call last): > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > > numpy/core/tests/test_ufunc.py", line 86, in test_generic_loops > > ? ?assert_almost_equal(fone(x), fone_val, err_msg=msg) > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > > numpy/testing/utils.py", line 448, in assert_almost_equal > > ? ?raise AssertionError(msg) > > AssertionError: > > Arrays are not almost equal to 7 decimals PyUFunc_F_F > > ?ACTUAL: array([ 0.+0.j, ?0.+0.j, ?0.+0.j, ?0.+0.j, ?0.+0.j], > > dtype=complex64) > > ?DESIRED: 1 > > > > ====================================================================== > > FAIL: test_umath.TestComplexFunctions.test_loss_of_precision( > 'numpy.complex64'>,) > > ---------------------------------------------------------------------- > > Traceback (most recent call last): > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/nose/ > > case.py", line 197, in runTest > > ? 
?self.test(*self.arg) > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > > numpy/core/tests/test_umath.py", line 931, in check_loss_of_precision > > ? ?check(x_basic, 2*eps/1e-3) > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > > numpy/core/tests/test_umath.py", line 901, in check > > ? ?'arcsinh') > > AssertionError: (0, 0.0010023052, 0.9987238, 'arcsinh') > > > > ====================================================================== > > FAIL: test_umath.TestComplexFunctions.test_precisions_consistent > > ---------------------------------------------------------------------- > > Traceback (most recent call last): > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/nose/ > > case.py", line 197, in runTest > > ? ?self.test(*self.arg) > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > > numpy/core/tests/test_umath.py", line 812, in > > test_precisions_consistent > > ? ?assert_almost_equal(fcf, fcd, decimal=6, err_msg='fch-fcd %s'%f) > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > > numpy/testing/utils.py", line 448, in assert_almost_equal > > ? ?raise AssertionError(msg) > > AssertionError: > > Arrays are not almost equal to 6 decimals fch-fcd > > ?ACTUAL: 2.3561945j > > ?DESIRED: (0.66623943249251527+1.0612750619050355j) > > > > ====================================================================== > > FAIL: test_kind.TestKind.test_all > > ---------------------------------------------------------------------- > > Traceback (most recent call last): > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/nose/ > > case.py", line 197, in runTest > > ? ?self.test(*self.arg) > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > > numpy/f2py/tests/test_kind.py", line 30, in test_all > > ? ?'selectedrealkind(%s): expected %r but got %r' % ?(i, > > selected_real_kind(i), selectedrealkind(i))) > > ?File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/ > > numpy/testing/utils.py", line 34, in assert_ > > ? ?raise AssertionError(msg) > > AssertionError: selectedrealkind(19): expected -1 but got 16 > > > > ---------------------------------------------------------------------- > > Ran 3552 tests in 29.977s > > > > FAILED (KNOWNFAIL=3, failures=5) > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > Hi, > What Linux distro are you actually using? > Unless you have some issue, I would install the atlas version provided > by the distro as I have long term success with Fedora's packages > across multiple versions. > > If you still want to build it yourself, then you need to be using the > same compiler version everywhere. > > The ztbsv_ error suggests that you have not build blas, lapack and > atlas correctly. It is hard to get those right so very carefully check > the build logs and run the associated tests. > > Bruce > > > ------------------------------ > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > End of SciPy-User Digest, Vol 96, Issue 47 > ****************************************** > -------------- next part -------------- An HTML attachment was scrubbed... 
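One quick way to act on that advice from Python itself, before rebuilding anything, is to check which BLAS/LAPACK the installed numpy was configured against and whether the shared library in question really exports the missing symbol. The library path below is the one from the report, so adjust as needed:

import ctypes
import numpy as np

np.__config__.show()    # shows the BLAS/LAPACK numpy was built against

lib = ctypes.CDLL('/home/work/local/lib/liblapack.so')
try:
    lib.ztbsv_          # resolves only if the symbol is really exported
    print('ztbsv_ found')
except AttributeError:
    print('ztbsv_ missing -- this liblapack.so is incomplete')

If the symbol is missing, relinking scipy will not help; the LAPACK/ATLAS build itself has to be redone.
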
URL: From dbigbear at gmail.com Mon Aug 29 03:13:08 2011 From: dbigbear at gmail.com (Xiong Deng) Date: Mon, 29 Aug 2011 15:13:08 +0800 Subject: [SciPy-User] Error when building scipy.0.9.0 - "f951: undefined symbol: mpfr_get_z_exp" Message-ID: Hi ALL, I am trying to install numpy, scipy on my Linux. I have build and installed numpy on it kindof correctly with only one failure shown that: [work at tc-fcr-bid03.tc.baidu.com:~/xiongdeng/soft/scipy-0.9.0]$ python Python 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 20051201 (Red Hat 3.4.5-2)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy >>> numpy.test() Running unit tests for numpy NumPy version 1.6.1 NumPy is installed in /home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 20051201 (Red Hat 3.4.5-2)] nose version 1.1.2 ..................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................F...........................................................................................................................................................................................................................................................................................................................................................................................................................K.................................................................................................K......................K.....................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................ ====================================================================== FAIL: Test basic arithmetic function errors ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/testing/decorators.py", line 215, in knownfailer return f(*args, **kwargs) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/core/tests/test_numeric.py", line 367, in test_floating_exceptions_power np.power, ftype(2), ftype(2**fi.nexp)) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/core/tests/test_numeric.py", line 271, in assert_raises_fpe "Type %s did not raise fpe error '%s'." % (ftype, fpeerr)) File "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/testing/utils.py", line 34, in assert_ raise AssertionError(msg) AssertionError: Type did not raise fpe error 'overflow'. ---------------------------------------------------------------------- Ran 3533 tests in 12.494s FAILED (KNOWNFAIL=3, failures=1) HOWEVER, when I am building my scipy, there is a big error, causing termination of the building process. The messages are as below: /home/work/local/gcc-4.7/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/f951: symbol lookup error: /home/work/local/gcc-4.7/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/f951: undefined symbol: mpfr_get_z_exp error: Command "/home/work/local/gcc-4.7/bin/gfortran -Wall -ffixed-form -fno-second-underscore -fPIC -O3 -funroll-loops -I/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/core/include -c -c scipy/special/specfun/specfun.f -o build/temp.linux-x86_64-2.7/scipy/special/specfun/specfun.o" failed with exit status 1 IN ADDITION: One thing I have to say: when I compiled and installed gcc-4.7 locally, I did not install GMP, *MPFR*, and MPC. They are installed after gcc-4.7....The problem may be due to this???? But How can I fix it without re-installing gcc-4.7 ??? 
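A small diagnostic sketch (not part of the build) to confirm whether the failure is in gfortran itself rather than in anything scipy does: compile a trivial Fortran file with the same compiler. If this already dies with the f951/mpfr error, the gcc-4.7 binary distribution cannot find a compatible libmpfr and no scipy-side change will fix it:

import subprocess, tempfile

src = tempfile.NamedTemporaryFile(suffix='.f90', delete=False)
src.write(b'program t\nend program t\n')
src.close()
ret = subprocess.call(['gfortran', '-c', src.name, '-o', src.name + '.o'])
print('gfortran works' if ret == 0 else 'gfortran itself is broken')
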
----------------------------------------------------------------------------------------------------------------- My configuration: My configuration of the installation: * ATLAS 3.8.4 * lapack 3.3.1 * numpy 1.6.1 * SciPy version 0.9.0 (when building scipy version 0.8.0, the error is the same ) * dateutil 1.5 * Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 20051201 (Red Hat 3.4.5-2)] * nose version 1.1.2 * gcc (GCC) 4.7.0 20110820 (experimental) * LINUX: Linux XXX 2.6.9_5-9-0-0 #1 SMP Wed Jun 23 14:03:19 CST 2010 x86_64 x86_64 x86_64 GNU/Linux Red Hat Enterprise Linux AS release 4 (Nahant Update 3) export: declare -x ATLAS="/home/work/local/lib/libatlas.so" declare -x G_BROKEN_FILENAMES="1" declare -x HISTSIZE="1000" declare -x HOME="/home/work" declare -x INPUTRC="/etc/inputrc" declare -x LANG="en_US" declare -x LAPACK="/home/work/local/lib/liblapack.so" declare -x LC_CTYPE="zh_CN.gb18030" declare -x LD_LIBRARY_PATH=":/home/work/local/lib/:/home/work/local/gcc-4.7/lib/" declare -x LESSOPEN="|/usr/bin/lesspipe.sh %s" declare -x LOGNAME="work" declare -x LS_COLORS="no=00:fi=00:di=01;34:ln=01;36:pi=40;33:so=01;35:bd=40;33;01:cd=40;33;01:or=01;05;37;41:mi=01;05;37;41:ex=01;32:*.cmd=01;32:*.exe=01;32:*.com=01;32:*.btm=01;32:*.bat=01;32:*.sh=01;32:*.csh=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.gz=01;31:*.bz2=01;31:*.bz=01;31:*.tz=01;31:*.rpm=01;31:*.cpio=01;31:*.jpg=01;35:*.gif=01;35:*.bmp=01;35:*.xbm=01;35:*.xpm=01;35:*.png=01;35:*.tif=01;35:" declare -x MAC="64" declare -x MAIL="/var/spool/mail/work" declare -x OLDPWD="/home/work/xiongdeng/soft" declare -x PATH="/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/usr/share/baidu/bin:/home/work/local/python-2.7.1/bin/:/home/work/local/vim/bin/:/home/work/local/svn/bin/:/home/work/local/hadoop-client/hadoop/bin/:/home/work/local/script/:/home/work/local/bin/:/home/work/local/gcc-4.7/bin/:/home/work/bin" declare -x PROMPT_COMMAND="echo -ne \"\\e]0;tc-fcr-bid03\\a\"" declare -x PWD="/home/work/xiongdeng/soft/scipy-0.9.0" declare -x SDIR_FILE="/home/work/.sdir_label" declare -x SHELL="/bin/bash" declare -x SHLVL="1" declare -x SVN_EDITOR="/home/work/local/vim/bin/vim" declare -x TERM="linux" declare -x USER="work" ll /home/work/local/lib -rw-r--r-- 1 work work 13424422 Aug 29 01:55 libatlas.a -rwxr-xr-x 1 work work 8463221 Aug 29 01:55 libatlas.so -rw-r--r-- 1 work work 466344 Aug 29 01:55 libcblas.a -rwxr-xr-x 1 work work 142677 Aug 29 01:55 libcblas.so -rw-r--r-- 1 work work 566602 Aug 29 01:55 libf77blas.a -rwxr-xr-x 1 work work 158579 Aug 29 01:55 libf77blas.so -rw-r--r-- 1 work work 515618 Aug 23 21:05 libgcc_s.so -rw-r--r-- 1 work work 515618 Aug 23 21:05 libgcc_s.so.1 -rw-r--r-- 1 work work 16135904 Aug 27 02:04 libgfortran.a -rwxr-xr-x 1 work work 1034 Aug 27 02:04 libgfortran.la -rw-r--r-- 1 work work 5962327 Aug 27 02:04 libgfortran.so -rwxr-xr-x 1 work work 5962327 Aug 27 02:04 libgfortran.so.3 -rwxr-xr-x 1 work work 5962327 Aug 27 02:04 libgfortran.so.3.0.0 -rw-r--r-- 1 work work 269 Aug 27 02:04 libgfortran.spec -rw-r--r-- 1 work work 1134848 Aug 24 16:01 libgmp.a -rwxr-xr-x 1 work work 923 Aug 24 16:01 libgmp.la lrwxrwxrwx 1 work work 16 Aug 24 16:01 libgmp.so -> libgmp.so.10.0.2 lrwxrwxrwx 1 work work 16 Aug 24 16:01 libgmp.so.10 -> libgmp.so.10.0.2 -rw-r--r-- 1 work work 471045 Aug 24 16:01 libgmp.so.10.0.2 -rw-r--r-- 1 work work 642892 Aug 23 21:05 libgomp.a -rwxr-xr-x 1 work work 958 Aug 23 21:05 libgomp.la -rw-r--r-- 1 work work 329946 Aug 
23 21:05 libgomp.so -rwxr-xr-x 1 work work 329946 Aug 23 21:05 libgomp.so.1 -rwxr-xr-x 1 work work 329946 Aug 23 21:05 libgomp.so.1.0.0 -rw-r--r-- 1 work work 170 Aug 23 21:05 libgomp.spec -rw-r--r-- 1 work work 1275386 Aug 23 21:05 libiberty.a -rw-r--r-- 1 work work 10561756 Aug 29 01:55 liblapack.a -rwxr-xr-x 1 work work 5683831 Aug 29 01:55 liblapack.so -rw-r--r-- 1 work work 1133984 Aug 23 20:48 liblzma.a -rwxr-xr-x 1 work work 924 Aug 23 20:48 liblzma.la lrwxrwxrwx 1 work work 16 Aug 23 20:48 liblzma.so -> liblzma.so.5.0.3 lrwxrwxrwx 1 work work 16 Aug 23 20:48 liblzma.so.5 -> liblzma.so.5.0.3 -rw-r--r-- 1 work work 582278 Aug 23 20:48 liblzma.so.5.0.3 -rw-r--r-- 1 work work 199486 Aug 24 16:06 libmpc.a -rwxr-xr-x 1 work work 1027 Aug 24 16:06 libmpc.la lrwxrwxrwx 1 work work 15 Aug 24 16:06 libmpc.so -> libmpc.so.2.0.0 lrwxrwxrwx 1 work work 15 Aug 24 16:06 libmpc.so.2 -> libmpc.so.2.0.0 -rw-r--r-- 1 work work 91064 Aug 24 16:06 libmpc.so.2.0.0 -rw-r--r-- 1 work work 2905606 Aug 24 15:53 libmpfr.a -rwxr-xr-x 1 work work 948 Aug 24 15:53 libmpfr.la lrwxrwxrwx 1 work work 16 Aug 24 15:53 libmpfr.so -> libmpfr.so.4.0.1 lrwxrwxrwx 1 work work 16 Aug 24 20:58 libmpfr.so.1 -> libmpfr.so.4.0.1 lrwxrwxrwx 1 work work 16 Aug 24 15:53 libmpfr.so.4 -> libmpfr.so.4.0.1 -rw-r--r-- 1 work work 1317215 Aug 24 15:53 libmpfr.so.4.0.1 -rw-r--r-- 1 work work 466902 Aug 29 01:55 libptcblas.a -rw-r--r-- 1 work work 567162 Aug 29 01:55 libptf77blas.a -rw-r--r-- 1 work work 1583884 Aug 23 21:05 libquadmath.a -rwxr-xr-x 1 work work 985 Aug 23 21:05 libquadmath.la -rw-r--r-- 1 work work 903165 Aug 23 21:05 libquadmath.so -rwxr-xr-x 1 work work 903165 Aug 23 21:05 libquadmath.so.0 -rwxr-xr-x 1 work work 903165 Aug 23 21:05 libquadmath.so.0.0.0 -rw-r--r-- 1 work work 103546 Aug 23 21:05 libssp.a -rwxr-xr-x 1 work work 946 Aug 23 21:05 libssp.la -rw-r--r-- 1 work work 3534 Aug 23 21:05 libssp_nonshared.a -rwxr-xr-x 1 work work 928 Aug 23 21:05 libssp_nonshared.la -rw-r--r-- 1 work work 48981 Aug 23 21:05 libssp.so -rwxr-xr-x 1 work work 48981 Aug 23 21:05 libssp.so.0 -rwxr-xr-x 1 work work 48981 Aug 23 21:05 libssp.so.0.0.0 -rw-r--r-- 1 work work 15669078 Aug 23 21:05 libstdc++.a -rwxr-xr-x 1 work work 973 Aug 23 21:05 libstdc++.la -rw-r--r-- 1 work work 6408505 Aug 23 21:05 libstdc++.so -rwxr-xr-x 1 work work 6408505 Aug 23 21:05 libstdc++.so.6 -rwxr-xr-x 1 work work 6408505 Aug 23 21:05 libstdc++.so.6.0.17 -rw-r--r-- 1 work work 2330 Aug 23 21:05 libstdc++.so.6.0.17-gdb.py -rw-r--r-- 1 work work 1092892 Aug 23 21:05 libsupc++.a -rwxr-xr-x 1 work work 907 Aug 23 21:05 libsupc++.la -rw-rw-r-- 1 work work 490328 Aug 29 01:55 libtstatlas.a -rw-rw-r-- 1 work work 515296 Aug 27 12:15 lilapack.a ll /home/work/local/include drwxr-xr-x 2 work work 4096 Aug 24 14:43 atlas -rw-rw-r-- 1 work work 1773 Aug 27 11:48 atlas_buildinfo.h -rw-rw-r-- 1 work work 90 Aug 27 11:48 atlas_cacheedge.h -rw-rw-r-- 1 work work 147 Aug 27 11:48 atlas_cmv.h -rw-rw-r-- 1 work work 451 Aug 27 11:48 atlas_cmvN.h -rw-rw-r-- 1 work work 387 Aug 27 11:48 atlas_cmvS.h -rw-rw-r-- 1 work work 481 Aug 27 11:48 atlas_cmvT.h -rw-rw-r-- 1 work work 455 Aug 27 11:48 atlas_cNCmm.h -rw-rw-r-- 1 work work 353 Aug 27 11:48 atlas_cr1.h -rw-rw-r-- 1 work work 77 Aug 27 11:48 atlas_csNKB.h -rw-rw-r-- 1 work work 195 Aug 27 11:48 atlas_csysinfo.h -rw-rw-r-- 1 work work 0 Aug 27 11:48 atlas_ctrsmXover.h -rw-rw-r-- 1 work work 147 Aug 27 11:48 atlas_dmv.h -rw-rw-r-- 1 work work 451 Aug 27 11:48 atlas_dmvN.h -rw-rw-r-- 1 work work 387 Aug 27 11:48 atlas_dmvS.h 
-rw-rw-r-- 1 work work 482 Aug 27 11:48 atlas_dmvT.h -rw-rw-r-- 1 work work 455 Aug 27 11:48 atlas_dNCmm.h -rw-rw-r-- 1 work work 352 Aug 27 11:48 atlas_dr1.h -rw-rw-r-- 1 work work 195 Aug 27 11:48 atlas_dsysinfo.h -rw-rw-r-- 1 work work 112 Aug 27 11:48 atlas_dtrsmXover.h -rw-rw-r-- 1 work work 112 Aug 27 11:48 atlas_pthreads.h -rw-rw-r-- 1 work work 147 Aug 27 11:48 atlas_smv.h -rw-rw-r-- 1 work work 451 Aug 27 11:48 atlas_smvN.h -rw-rw-r-- 1 work work 438 Aug 27 11:48 atlas_smvS.h -rw-rw-r-- 1 work work 637 Aug 27 11:48 atlas_smvT.h -rw-rw-r-- 1 work work 455 Aug 27 11:48 atlas_sNCmm.h -rw-rw-r-- 1 work work 353 Aug 27 11:48 atlas_sr1.h -rw-rw-r-- 1 work work 195 Aug 27 11:48 atlas_ssysinfo.h -rw-rw-r-- 1 work work 112 Aug 27 11:48 atlas_strsmXover.h -rw-rw-r-- 1 work work 191 Aug 27 11:48 atlas_trsmNB.h -rw-rw-r-- 1 work work 562 Aug 27 11:48 atlas_type.h -rw-rw-r-- 1 work work 77 Aug 27 11:48 atlas_zdNKB.h -rw-rw-r-- 1 work work 147 Aug 27 11:48 atlas_zmv.h -rw-rw-r-- 1 work work 451 Aug 27 11:48 atlas_zmvN.h -rw-rw-r-- 1 work work 386 Aug 27 11:48 atlas_zmvS.h -rw-rw-r-- 1 work work 481 Aug 27 11:48 atlas_zmvT.h -rw-rw-r-- 1 work work 455 Aug 27 11:48 atlas_zNCmm.h -rw-rw-r-- 1 work work 353 Aug 27 11:48 atlas_zr1.h -rw-rw-r-- 1 work work 195 Aug 27 11:48 atlas_zsysinfo.h -rw-rw-r-- 1 work work 0 Aug 27 11:48 atlas_ztrsmXover.h -rw-r--r-- 1 work work 33895 Aug 28 16:17 cblas.h -rw-r--r-- 1 work work 8225 Aug 28 16:17 clapack.h -rw-rw-r-- 1 work work 2719 Aug 27 11:48 cmm.h -rw-rw-r-- 1 work work 540 Aug 27 11:48 cXover.h -rw-rw-r-- 1 work work 657 Aug 27 11:48 dmm.h -rw-rw-r-- 1 work work 526 Aug 27 11:48 dXover.h -rw-r--r-- 1 work work 86216 Aug 24 16:01 gmp.h drwxrwxr-x 2 work work 4096 Aug 23 20:48 lzma -rw-r--r-- 1 work work 9274 Aug 23 20:48 lzma.h -rw-r--r-- 1 work work 13049 Aug 24 16:06 mpc.h -rw-r--r-- 1 work work 6288 Aug 24 15:53 mpf2mpfr.h -rw-r--r-- 1 work work 47981 Aug 24 15:53 mpfr.h -rw-rw-r-- 1 work work 658 Aug 27 11:48 smm.h -rw-rw-r-- 1 work work 523 Aug 27 11:48 sXover.h -rw-rw-r-- 1 work work 2718 Aug 27 11:48 zmm.h -rw-rw-r-- 1 work work 555 Aug 27 11:48 zXover.h Many thanks John -------------- next part -------------- An HTML attachment was scrubbed... URL: From hhh.guo at gmail.com Mon Aug 29 03:36:21 2011 From: hhh.guo at gmail.com (Ning Guo) Date: Mon, 29 Aug 2011 15:36:21 +0800 Subject: [SciPy-User] 3d convex hull In-Reply-To: References: <201108241338.50940.alexandre.fayolle@logilab.fr> <4E551F34.4@gmail.com> <4E590958.30103@gmail.com> Message-ID: <4E5B4175.10300@gmail.com> On Sunday, August 28, 2011 06:08 AM, Pauli Virtanen wrote: It seems qhull does not output the vertices in a consistent order. I have to use the inner product of the normal with a side edge to determine the sign :-( > Sat, 27 Aug 2011 23:12:24 +0800, Ning Guo wrote: >> On Saturday, August 27, 2011 03:53 PM, Pauli Virtanen wrote: > [clip] >> Also, the formula to calculate normal may be like this: >> >> face_normals[:,0] = >> np.cross(tetra_points[:,0]-tetra_points[:,2],tetra_points[:,1]-tetra_points[:,2]) > [clip] > > Ah yes, exactly like that, my brain apparently wasn't working properly. > >> Regarding to the order of the vertices, I'm also not sure about their >> convention. I'm trying to figure it out. > If you find it out, please let us know, as this would be an useful thing > to mention in the documentation. However, I'm not sure at the moment > whether Qhull provides such ordering guarantees. 
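A small sketch of a variant of that sign fix: orient each face normal of a tetrahedron by testing it against the vector towards the opposite vertex (rather than a side edge), treating the vertex order returned by Qhull as arbitrary. Only numpy is needed; with scipy.spatial.Delaunay the (4, 3) point array for each simplex would come from indexing the input points with that simplex's vertex indices:

import numpy as np

def outward_normals(tetra):
    # tetra: (4, 3) array with the vertex coordinates of one tetrahedron.
    tetra = np.asarray(tetra, dtype=float)
    normals = np.empty((4, 3))
    for i in range(4):
        face = np.delete(tetra, i, axis=0)   # the 3 vertices opposite vertex i
        n = np.cross(face[1] - face[0], face[2] - face[0])
        if np.dot(n, tetra[i] - face[0]) > 0:
            n = -n                           # was pointing inwards; flip it
        normals[i] = n / np.linalg.norm(n)
    return normals
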
> -- Geotechnical Group Department of Civil and Environmental Engineering Hong Kong University of Science and Technology Clear Water Bay, Kowloon, Hong Kong From cournape at gmail.com Mon Aug 29 04:12:03 2011 From: cournape at gmail.com (David Cournapeau) Date: Mon, 29 Aug 2011 10:12:03 +0200 Subject: [SciPy-User] Error when building scipy.0.9.0 - "f951: undefined symbol: mpfr_get_z_exp" In-Reply-To: References: Message-ID: On Mon, Aug 29, 2011 at 9:13 AM, Xiong Deng wrote: > Hi ALL, > > I am trying to install numpy, scipy on my Linux. I have build and installed > numpy on it kindof correctly with only one failure shown that: > > [work at tc-fcr-bid03.tc.baidu.com:~/xiongdeng/soft/scipy-0.9.0]$ python > Python 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) > [GCC 3.4.5 20051201 (Red Hat 3.4.5-2)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. >>>> import numpy >>>> numpy.test() > Running unit tests for numpy > NumPy version 1.6.1 > NumPy is installed in > /home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy > Python version 2.7.1 (r271:86832, Jan 13 2011, 22:17:56) [GCC 3.4.5 20051201 > (Red Hat 3.4.5-2)] > nose version 1.1.2 > ..................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................F...........................................................................................................................................................................................................................................................................................................................................................................................................................K.................................................................................................K......................K.......................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
.............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................. > ====================================================================== > FAIL: Test basic arithmetic function errors > ---------------------------------------------------------------------- > Traceback (most recent call last): > ? File > "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/testing/decorators.py", > line 215, in knownfailer > ??? return f(*args, **kwargs) > ? File > "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/core/tests/test_numeric.py", > line 367, in test_floating_exceptions_power > ??? np.power, ftype(2), ftype(2**fi.nexp)) > ? File > "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/core/tests/test_numeric.py", > line 271, in assert_raises_fpe > ??? "Type %s did not raise fpe error '%s'." % (ftype, fpeerr)) > ? File > "/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/testing/utils.py", > line 34, in assert_ > ??? raise AssertionError(msg) > AssertionError: Type did not raise fpe error > 'overflow'. > > ---------------------------------------------------------------------- > Ran 3533 tests in 12.494s > > FAILED (KNOWNFAIL=3, failures=1) > > > > HOWEVER, when I am building my scipy, there is a big error, causing > termination of the building process. The messages are as below: > > /home/work/local/gcc-4.7/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/f951: > symbol lookup error: > /home/work/local/gcc-4.7/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/f951: > undefined symbol: mpfr_get_z_exp > error: Command "/home/work/local/gcc-4.7/bin/gfortran -Wall -ffixed-form > -fno-second-underscore -fPIC -O3 -funroll-loops > -I/home/work/local/python-2.7.1/lib/python2.7/site-packages/numpy/core/include > -c -c scipy/special/specfun/specfun.f -o > build/temp.linux-x86_64-2.7/scipy/special/specfun/specfun.o" failed with > exit status 1 > > IN ADDITION: One thing I have to say: when I compiled and installed gcc-4.7 > locally, I did not install GMP, MPFR, and MPC. They are installed after > gcc-4.7....The problem may be due to this???? But How can I fix it without > re-installing gcc-4.7 ??? Most likely, you did not build gcc and gfortran correctly. Why don't you use the gcc included on your system ? 
cheers,

David

From collinstocks at gmail.com Mon Aug 29 04:49:02 2011
From: collinstocks at gmail.com (Collin Stocks)
Date: Mon, 29 Aug 2011 04:49:02 -0400
Subject: [SciPy-User] SciPy-User Digest, Vol 96, Issue 47
In-Reply-To: References: Message-ID: <1314607742.4449.14.camel@SietchTabr>

When you use digest email for the list, it makes it very difficult to keep the
thread of the conversation. Please try to keep this in mind when replying; it is
just common courtesy when being active on mailing lists. Most email clients
support filters of some sort, so that is generally a better alternative to using
digest email. My $0.02 USD.

--
Collin
-------------- next part --------------
An embedded message was scrubbed...
From: Xiong Deng
Subject: Re: [SciPy-User] SciPy-User Digest, Vol 96, Issue 47
Date: Mon, 29 Aug 2011 14:54:42 +0800
Size: 68581
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: This is a digitally signed message part
URL:

From dbigbear at gmail.com Mon Aug 29 05:13:53 2011
From: dbigbear at gmail.com (Xiong Deng)
Date: Mon, 29 Aug 2011 17:13:53 +0800
Subject: [SciPy-User] Error when building scipy.0.9.0 - "f951: undefined symbol: mpfr_get_z_exp"
Message-ID:

Hi,

I just found out that the gcc-4.7 is downloaded as a binary distribution. I did not
compile gcc-4.7 myself... The gcc included on my system is gcc 3.4.5 and there seems
no gfortran built on it (however, there is a g77 on it, which causes problems while
building numpy/scipy). In addition, there are no mpc, mpfr, gmp on it with gcc 3.4.5,
so I need a new gcc with gfortran, mpc, mpfr, gmp, which is necessary for numpy/scipy.

Btw: I am not sure if there is a way to install mpc, mpfr, and gmp after gcc has been
installed, and then link gcc with mpc, mpfr, gmp...

Many thanks
Xiong

Message: 2
Date: Mon, 29 Aug 2011 10:12:03 +0200
From: David Cournapeau
Subject: Re: [SciPy-User] Error when building scipy.0.9.0 - "f951: undefined symbol: mpfr_get_z_exp"
To: SciPy Users List
Message-ID:
Content-Type: text/plain; charset=UTF-8

On Mon, Aug 29, 2011 at 9:13 AM, Xiong Deng wrote:
> Hi ALL,
>
> I am trying to install numpy, scipy on my Linux. I have build and installed
> numpy on it kindof correctly with only one failure shown that:
>
> [work at tc-fcr-bid03.tc.baidu.com:~/xiongdeng/soft/scipy-0.9.0]$ python
>
> Python 2.7.1 (r271:86832, Jan 13 2011, 22:17:56)
>
> [GCC 3.4.5 20051201 (Red Hat 3.4.5-2)] on linux2
>
> Type "help", "copyright", "credits" or "license" for more information.
> [snip: quoted numpy.test() output and scipy build error, reproduced in full earlier in this digest]
>
> Most likely, you did not build gcc and gfortran correctly. Why don't
> you use the gcc included on your system ?
>
> cheers,
>
> David

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From cournape at gmail.com Mon Aug 29 07:20:17 2011
From: cournape at gmail.com (David Cournapeau)
Date: Mon, 29 Aug 2011 13:20:17 +0200
Subject: [SciPy-User] Error when building scipy.0.9.0 - "f951: undefined symbol: mpfr_get_z_exp"
In-Reply-To: References: Message-ID:

On Mon, Aug 29, 2011 at 11:13 AM, Xiong Deng wrote:
> Hi,
>
> I just find out that the gcc-4.7 is downloaded as a binary distri. I did not
> compile gcc-4.7 myself...

Then the binary is buggy or not adapted to your platform.
> > The gcc included on my system is gcc 3.4.5 and there seems no gfortran built > on it (However there is a g77 on it, which cause problems while building > numpy/scipy....). You should be able to build numpy and scipy with g77. You should not try mixing compiler versions unlesss you are willing to spend quite some time debugging subtle mismatches issues. >..In addition, there are not mpc, mpfr, gmp on it with gcc > 3.4.5, so I need a new gcc with gfortran, mpc ,mpfr ,gmp, which is necessay > for numpy/scipy.... The usual way to do this is to first build mpfr and gmp with whatever compiler you have (gcc 3.4.5 here), and then build the new version of gcc and gfortran. But again, you would be better just using the compilers you have on your machine. cheers, David From bsouthey at gmail.com Mon Aug 29 09:42:44 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Mon, 29 Aug 2011 08:42:44 -0500 Subject: [SciPy-User] Error when building scipy.0.9.0 - "f951: undefined symbol: mpfr_get_z_exp" In-Reply-To: References: Message-ID: <4E5B9754.1070204@gmail.com> On 08/29/2011 06:20 AM, David Cournapeau wrote: > On Mon, Aug 29, 2011 at 11:13 AM, Xiong Deng wrote: >> Hi, >> >> I just find out that the gcc-4.7 is downloaded as a binary distri. I did not >> compile gcc-4.7 myself... > Then the binary is buggy or not adapted to your platform. > >> The gcc included on my system is gcc 3.4.5 and there seems no gfortran built >> on it (However there is a g77 on it, which cause problems while building >> numpy/scipy....). > You should be able to build numpy and scipy with g77. You should not > try mixing compiler versions unlesss you are willing to spend quite > some time debugging subtle mismatches issues. > >> ..In addition, there are not mpc, mpfr, gmp on it with gcc >> 3.4.5, so I need a new gcc with gfortran, mpc ,mpfr ,gmp, which is necessay >> for numpy/scipy.... > The usual way to do this is to first build mpfr and gmp with whatever > compiler you have (gcc 3.4.5 here), and then build the new version of > gcc and gfortran. But again, you would be better just using the > compilers you have on your machine. > > cheers, > > David > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user Assuming this is the same John or Johnny as before, RHEL does provide Atlas as a package so you should get it from Red Hat package manager. If you build things yourself, you must ensure that all previous versions have been removed from everywhere and that you are linking paths are to the correct locations. Bruce PS please use one name and one thread From cjordan1 at uw.edu Mon Aug 29 10:57:43 2011 From: cjordan1 at uw.edu (Christopher Jordan-Squire) Date: Mon, 29 Aug 2011 09:57:43 -0500 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Sun, Aug 28, 2011 at 2:54 PM, Skipper Seabold wrote: > On Sat, Aug 27, 2011 at 10:15 PM, Bruce Southey wrote: >> On Sat, Aug 27, 2011 at 5:06 PM, Wes McKinney wrote: >>> On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout >>> wrote: >>>> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>>>> This comparison might be useful to some people, so I stuck it up on a >>>>> github repo. My overall impression is that R is much stronger for >>>>> interactive data analysis. Click on the link for more details why, >>>>> which are summarized in the README file. 
>>>> >>>> ?From the README: >>>> >>>> "In fact, using Python without the IPython qtconsole is practically >>>> impossible for this sort of cut and paste, interactive analysis. >>>> The shell IPython doesn't allow it because it automatically adds >>>> whitespace on multiline bits of code, breaking pre-formatted code's >>>> alignment. Cutting and pasting works for the standard python shell, >>>> but then you lose all the advantages of IPython." >>>> >>>> >>>> >>>> You might use %cpaste in the ipython normal shell to paste without it >>>> automatically inserting spaces: >>>> >>>> In [5]: %cpaste >>>> Pasting code; enter '--' alone on the line to stop. >>>> :if 1>0: >>>> : ? ?print 'hi' >>>> :-- >>>> hi >>>> >>>> Thanks, >>>> >>>> Jason >>>> >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>> >>> This strikes me as a textbook example of why we need an integrated >>> formula framework in statsmodels. I'll make a pass through when I get >>> a chance and see if there are some places where pandas would really >>> help out. >> >> We used to have a formula class is scipy.stats and I do not follow >> nipy (http://nipy.sourceforge.net/nipy/stable/index.html) as it also >> had this (extremely flexible but very hard to comprehend). It was what >> I had argued was needed ages ago for statsmodel. But it needs a >> community effort because the syntax required serves multiple >> communities with different annotations and needs. That is also seen >> from the different approaches taken by the stats packages from S/R, >> SAS, Genstat (and those are just are ones I have used). >> > > We have held this discussion at _great_ length multiple times on the > statsmodels list and are in the process of trying to integrate > Charlton (from Nathaniel) and/or Formula (from Jonathan / NiPy) into > the statsmodels base. > > http://statsmodels.sourceforge.net/dev/roadmap_todo.html#formula-framework > > and more recently > > https://groups.google.com/group/pystatsmodels/browse_thread/thread/a76ea5de9e96964b/fd85b80ae46c4931? > > https://github.com/statsmodels/formula > https://github.com/statsmodels/charlton > > Wes and I made some effort to go through this at SciPy. From where I > sit, I think it's difficult to disentangle the data structures from > the formula implementation, or maybe I'd just prefer to finish > tackling the former because it's much more straightforward. So I'd > like to first finish the pandas-integration branch that we've started > and then focus on the formula support. This is on my (our, I hope...) > immediate long-term goal list. Then I'd like to come back to the > community and hash out the 'rules of the game' details for formulas > after we have some code for people to play with, which promises to be > "fun." > > https://github.com/statsmodels/statsmodels/tree/pandas-integration > > FWIW, I could also improve the categorical function to be much nicer > for the given examples (ie., take a list, drop a reference category), > but I don't know that it's worth it, because it's really just a > stop-gap and ideally users shouldn't have to rely on it. Thoughts on > more stop-gap? > I want more usability, but I agree that a stop-gap probably isn't the right way to go, unless it has things we'd eventually want anyways. > If I understand Chris' concerns, I think pandas + formula will go a > long way towards bridging the gap between Python and R usability, but Yes, I agree. 
pandas + formulas would go a long, long way towards more usability. Though I really, really want a scatterplot smoother (i.e., lowess) in statsmodels. I use it a lot, and the final part of my R file was entirely lowess. (And, I should add, that was the part people liked best since one of the main goals of the assignment was to generate nifty pictures that could be used to summarize the data.) > it's a large effort and there are only a handful (at best) of people > writing code -- Wes being the only one who's more or less "full time" > as far as I can tell. The 0.4 statsmodels release should be very > exciting though, I hope. I'm looking forward to it, at least. Then > there's only the small problem of building an infrastructure and > community like CRAN so we can have specialists writing and maintaining > code...but I hope once all the tools are in place this will seem much > less daunting. There certainly seems to be the right sentiment for it. > At the very least creating and testing models would be much simpler. For weeks I've been wanting to see if gmm is the same as gee by fitting both models to the same dataset, but I've been putting it off because I didn't want to construct the design matrices by hand for such a simple question. (GMM--Generalized Method of Moments--is a standard econometrics model and GEE--Generalized Estimating Equations--is a standard biostatics model. They're both generalizations of quasi-likelihood and appear very similar, but I want to fit some models to figure out if they're exactly the same.) -Chris JS > Skipper > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From jsseabold at gmail.com Mon Aug 29 11:10:17 2011 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 29 Aug 2011 11:10:17 -0400 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Mon, Aug 29, 2011 at 10:57 AM, Christopher Jordan-Squire wrote: > On Sun, Aug 28, 2011 at 2:54 PM, Skipper Seabold wrote: >> On Sat, Aug 27, 2011 at 10:15 PM, Bruce Southey wrote: >>> On Sat, Aug 27, 2011 at 5:06 PM, Wes McKinney wrote: >>>> On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout >>>> wrote: >>>>> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>>>>> This comparison might be useful to some people, so I stuck it up on a >>>>>> github repo. My overall impression is that R is much stronger for >>>>>> interactive data analysis. Click on the link for more details why, >>>>>> which are summarized in the README file. >>>>> >>>>> ?From the README: >>>>> >>>>> "In fact, using Python without the IPython qtconsole is practically >>>>> impossible for this sort of cut and paste, interactive analysis. >>>>> The shell IPython doesn't allow it because it automatically adds >>>>> whitespace on multiline bits of code, breaking pre-formatted code's >>>>> alignment. Cutting and pasting works for the standard python shell, >>>>> but then you lose all the advantages of IPython." >>>>> >>>>> >>>>> >>>>> You might use %cpaste in the ipython normal shell to paste without it >>>>> automatically inserting spaces: >>>>> >>>>> In [5]: %cpaste >>>>> Pasting code; enter '--' alone on the line to stop. >>>>> :if 1>0: >>>>> : ? 
?print 'hi' >>>>> :-- >>>>> hi >>>>> >>>>> Thanks, >>>>> >>>>> Jason >>>>> >>>>> _______________________________________________ >>>>> SciPy-User mailing list >>>>> SciPy-User at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>> >>>> >>>> This strikes me as a textbook example of why we need an integrated >>>> formula framework in statsmodels. I'll make a pass through when I get >>>> a chance and see if there are some places where pandas would really >>>> help out. >>> >>> We used to have a formula class is scipy.stats and I do not follow >>> nipy (http://nipy.sourceforge.net/nipy/stable/index.html) as it also >>> had this (extremely flexible but very hard to comprehend). It was what >>> I had argued was needed ages ago for statsmodel. But it needs a >>> community effort because the syntax required serves multiple >>> communities with different annotations and needs. That is also seen >>> from the different approaches taken by the stats packages from S/R, >>> SAS, Genstat (and those are just are ones I have used). >>> >> >> We have held this discussion at _great_ length multiple times on the >> statsmodels list and are in the process of trying to integrate >> Charlton (from Nathaniel) and/or Formula (from Jonathan / NiPy) into >> the statsmodels base. >> >> http://statsmodels.sourceforge.net/dev/roadmap_todo.html#formula-framework >> >> and more recently >> >> https://groups.google.com/group/pystatsmodels/browse_thread/thread/a76ea5de9e96964b/fd85b80ae46c4931? >> >> https://github.com/statsmodels/formula >> https://github.com/statsmodels/charlton >> >> Wes and I made some effort to go through this at SciPy. From where I >> sit, I think it's difficult to disentangle the data structures from >> the formula implementation, or maybe I'd just prefer to finish >> tackling the former because it's much more straightforward. So I'd >> like to first finish the pandas-integration branch that we've started >> and then focus on the formula support. This is on my (our, I hope...) >> immediate long-term goal list. Then I'd like to come back to the >> community and hash out the 'rules of the game' details for formulas >> after we have some code for people to play with, which promises to be >> "fun." >> >> https://github.com/statsmodels/statsmodels/tree/pandas-integration >> >> FWIW, I could also improve the categorical function to be much nicer >> for the given examples (ie., take a list, drop a reference category), >> but I don't know that it's worth it, because it's really just a >> stop-gap and ideally users shouldn't have to rely on it. Thoughts on >> more stop-gap? >> > > I want more usability, but I agree that a stop-gap probably isn't the > right way to go, unless it has things we'd eventually want anyways. > >> If I understand Chris' concerns, I think pandas + formula will go a >> long way towards bridging the gap between Python and R usability, but > > Yes, I agree. pandas + formulas would go a long, long way towards more > usability. > > Though I really, really want a scatterplot smoother (i.e., lowess) in > statsmodels. I use it a lot, and the final part of my R file was > entirely lowess. (And, I should add, that was the part people liked > best since one of the main goals of the assignment was to generate > nifty pictures that could be used to summarize the data.) > Working my way through the pull requests. Very time poor... 
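(For anyone who wants to experiment before a lowess lands in statsmodels, here is a
minimal, self-contained sketch of the kind of scatterplot smoother under discussion:
tricube-weighted local linear fits, without the robustifying iterations of real lowess.
It is only an illustration in plain numpy; the name lowess_sketch and its frac argument
are made up here and are not the interface of the pull request above.)

import numpy as np

def lowess_sketch(x, y, frac=0.5):
    # Smooth y as a function of x with local linear fits, weighted by the
    # tricube kernel over the frac*n nearest neighbours of each point.
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    k = max(2, int(np.ceil(frac * n)))         # neighbourhood size
    y_smooth = np.empty(n)
    for i in range(n):
        dist = np.abs(x - x[i])
        idx = np.argsort(dist)[:k]             # k nearest neighbours of x[i]
        h = dist[idx].max()
        if h == 0:                             # all neighbours share the same x
            y_smooth[i] = y[idx].mean()
            continue
        w = (1.0 - (dist[idx] / h) ** 3) ** 3  # tricube weights in [0, 1]
        sw = np.sqrt(w)
        # weighted least squares for a local line a + b*x
        A = np.column_stack((np.ones(k), x[idx])) * sw[:, None]
        coef = np.linalg.lstsq(A, y[idx] * sw)[0]
        y_smooth[i] = coef[0] + coef[1] * x[i]
    return y_smooth

Calling it on noisy data, e.g. lowess_sketch(x, y, frac=0.3), returns the smoothed
values at the original x positions; a production implementation would add the
robust reweighting passes that make lowess resistant to outliers.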
>> it's a large effort and there are only a handful (at best) of people >> writing code -- Wes being the only one who's more or less "full time" >> as far as I can tell. The 0.4 statsmodels release should be very >> exciting though, I hope. I'm looking forward to it, at least. Then >> there's only the small problem of building an infrastructure and >> community like CRAN so we can have specialists writing and maintaining >> code...but I hope once all the tools are in place this will seem much >> less daunting. There certainly seems to be the right sentiment for it. >> > > At the very least creating and testing models would be much simpler. > For weeks I've been wanting to see if gmm is the same as gee by > fitting both models to the same dataset, but I've been putting it off > because I didn't want to construct the design matrices by hand for > such a simple question. (GMM--Generalized Method of Moments--is a > standard econometrics model and GEE--Generalized Estimating > Equations--is a standard biostatics model. They're both > generalizations of quasi-likelihood and appear very similar, but I > want to fit some models to figure out if they're exactly the same.) > Oh, it's not *that* bad. I agree, of course, that it could be better, but I've been using mainly Python for my work, including GMM and estimating equations models (mainly empirical likelihood and generalized maximum entropy) for the last ~two years. Skipper From cjordan1 at uw.edu Mon Aug 29 11:21:35 2011 From: cjordan1 at uw.edu (Christopher Jordan-Squire) Date: Mon, 29 Aug 2011 10:21:35 -0500 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Mon, Aug 29, 2011 at 10:10 AM, Skipper Seabold wrote: > On Mon, Aug 29, 2011 at 10:57 AM, Christopher Jordan-Squire > wrote: >> On Sun, Aug 28, 2011 at 2:54 PM, Skipper Seabold wrote: >>> On Sat, Aug 27, 2011 at 10:15 PM, Bruce Southey wrote: >>>> On Sat, Aug 27, 2011 at 5:06 PM, Wes McKinney wrote: >>>>> On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout >>>>> wrote: >>>>>> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>>>>>> This comparison might be useful to some people, so I stuck it up on a >>>>>>> github repo. My overall impression is that R is much stronger for >>>>>>> interactive data analysis. Click on the link for more details why, >>>>>>> which are summarized in the README file. >>>>>> >>>>>> ?From the README: >>>>>> >>>>>> "In fact, using Python without the IPython qtconsole is practically >>>>>> impossible for this sort of cut and paste, interactive analysis. >>>>>> The shell IPython doesn't allow it because it automatically adds >>>>>> whitespace on multiline bits of code, breaking pre-formatted code's >>>>>> alignment. Cutting and pasting works for the standard python shell, >>>>>> but then you lose all the advantages of IPython." >>>>>> >>>>>> >>>>>> >>>>>> You might use %cpaste in the ipython normal shell to paste without it >>>>>> automatically inserting spaces: >>>>>> >>>>>> In [5]: %cpaste >>>>>> Pasting code; enter '--' alone on the line to stop. >>>>>> :if 1>0: >>>>>> : ? ?print 'hi' >>>>>> :-- >>>>>> hi >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Jason >>>>>> >>>>>> _______________________________________________ >>>>>> SciPy-User mailing list >>>>>> SciPy-User at scipy.org >>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>>> >>>>> >>>>> This strikes me as a textbook example of why we need an integrated >>>>> formula framework in statsmodels. 
I'll make a pass through when I get >>>>> a chance and see if there are some places where pandas would really >>>>> help out. >>>> >>>> We used to have a formula class is scipy.stats and I do not follow >>>> nipy (http://nipy.sourceforge.net/nipy/stable/index.html) as it also >>>> had this (extremely flexible but very hard to comprehend). It was what >>>> I had argued was needed ages ago for statsmodel. But it needs a >>>> community effort because the syntax required serves multiple >>>> communities with different annotations and needs. That is also seen >>>> from the different approaches taken by the stats packages from S/R, >>>> SAS, Genstat (and those are just are ones I have used). >>>> >>> >>> We have held this discussion at _great_ length multiple times on the >>> statsmodels list and are in the process of trying to integrate >>> Charlton (from Nathaniel) and/or Formula (from Jonathan / NiPy) into >>> the statsmodels base. >>> >>> http://statsmodels.sourceforge.net/dev/roadmap_todo.html#formula-framework >>> >>> and more recently >>> >>> https://groups.google.com/group/pystatsmodels/browse_thread/thread/a76ea5de9e96964b/fd85b80ae46c4931? >>> >>> https://github.com/statsmodels/formula >>> https://github.com/statsmodels/charlton >>> >>> Wes and I made some effort to go through this at SciPy. From where I >>> sit, I think it's difficult to disentangle the data structures from >>> the formula implementation, or maybe I'd just prefer to finish >>> tackling the former because it's much more straightforward. So I'd >>> like to first finish the pandas-integration branch that we've started >>> and then focus on the formula support. This is on my (our, I hope...) >>> immediate long-term goal list. Then I'd like to come back to the >>> community and hash out the 'rules of the game' details for formulas >>> after we have some code for people to play with, which promises to be >>> "fun." >>> >>> https://github.com/statsmodels/statsmodels/tree/pandas-integration >>> >>> FWIW, I could also improve the categorical function to be much nicer >>> for the given examples (ie., take a list, drop a reference category), >>> but I don't know that it's worth it, because it's really just a >>> stop-gap and ideally users shouldn't have to rely on it. Thoughts on >>> more stop-gap? >>> >> >> I want more usability, but I agree that a stop-gap probably isn't the >> right way to go, unless it has things we'd eventually want anyways. >> >>> If I understand Chris' concerns, I think pandas + formula will go a >>> long way towards bridging the gap between Python and R usability, but >> >> Yes, I agree. pandas + formulas would go a long, long way towards more >> usability. >> >> Though I really, really want a scatterplot smoother (i.e., lowess) in >> statsmodels. I use it a lot, and the final part of my R file was >> entirely lowess. (And, I should add, that was the part people liked >> best since one of the main goals of the assignment was to generate >> nifty pictures that could be used to summarize the data.) >> > > Working my way through the pull requests. Very time poor... :-) Thanks Skipper! > >>> it's a large effort and there are only a handful (at best) of people >>> writing code -- Wes being the only one who's more or less "full time" >>> as far as I can tell. The 0.4 statsmodels release should be very >>> exciting though, I hope. I'm looking forward to it, at least. 
Then >>> there's only the small problem of building an infrastructure and >>> community like CRAN so we can have specialists writing and maintaining >>> code...but I hope once all the tools are in place this will seem much >>> less daunting. There certainly seems to be the right sentiment for it. >>> >> >> At the very least creating and testing models would be much simpler. >> For weeks I've been wanting to see if gmm is the same as gee by >> fitting both models to the same dataset, but I've been putting it off >> because I didn't want to construct the design matrices by hand for >> such a simple question. (GMM--Generalized Method of Moments--is a >> standard econometrics model and GEE--Generalized Estimating >> Equations--is a standard biostatics model. They're both >> generalizations of quasi-likelihood and appear very similar, but I >> want to fit some models to figure out if they're exactly the same.) >> > > Oh, it's not *that* bad. I agree, of course, that it could be better, > but I've been using mainly Python for my work, including GMM and > estimating equations models (mainly empirical likelihood and > generalized maximum entropy) for the last ~two years. > Yes, I didn't mean to imply it was unusable. Merely that it's kinda time consuming but not fun to think about design matrices. I'm sure it becomes easier if you keep doing it for awhile. My main point was that it would be a simpler to try to put new models into statsmodels with the formula because it'd make testing easier. Since you could add/remove terms and interactions from the model in attempts to break the fitting procedure. -Chris JS > Skipper > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Mon Aug 29 11:27:06 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 29 Aug 2011 11:27:06 -0400 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Mon, Aug 29, 2011 at 11:10 AM, Skipper Seabold wrote: > On Mon, Aug 29, 2011 at 10:57 AM, Christopher Jordan-Squire > wrote: >> On Sun, Aug 28, 2011 at 2:54 PM, Skipper Seabold wrote: >>> On Sat, Aug 27, 2011 at 10:15 PM, Bruce Southey wrote: >>>> On Sat, Aug 27, 2011 at 5:06 PM, Wes McKinney wrote: >>>>> On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout >>>>> wrote: >>>>>> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>>>>>> This comparison might be useful to some people, so I stuck it up on a >>>>>>> github repo. My overall impression is that R is much stronger for >>>>>>> interactive data analysis. Click on the link for more details why, >>>>>>> which are summarized in the README file. >>>>>> >>>>>> ?From the README: >>>>>> >>>>>> "In fact, using Python without the IPython qtconsole is practically >>>>>> impossible for this sort of cut and paste, interactive analysis. >>>>>> The shell IPython doesn't allow it because it automatically adds >>>>>> whitespace on multiline bits of code, breaking pre-formatted code's >>>>>> alignment. Cutting and pasting works for the standard python shell, >>>>>> but then you lose all the advantages of IPython." >>>>>> >>>>>> >>>>>> >>>>>> You might use %cpaste in the ipython normal shell to paste without it >>>>>> automatically inserting spaces: >>>>>> >>>>>> In [5]: %cpaste >>>>>> Pasting code; enter '--' alone on the line to stop. >>>>>> :if 1>0: >>>>>> : ? 
?print 'hi' >>>>>> :-- >>>>>> hi >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Jason >>>>>> >>>>>> _______________________________________________ >>>>>> SciPy-User mailing list >>>>>> SciPy-User at scipy.org >>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>>> >>>>> >>>>> This strikes me as a textbook example of why we need an integrated >>>>> formula framework in statsmodels. I'll make a pass through when I get >>>>> a chance and see if there are some places where pandas would really >>>>> help out. >>>> >>>> We used to have a formula class is scipy.stats and I do not follow >>>> nipy (http://nipy.sourceforge.net/nipy/stable/index.html) as it also >>>> had this (extremely flexible but very hard to comprehend). It was what >>>> I had argued was needed ages ago for statsmodel. But it needs a >>>> community effort because the syntax required serves multiple >>>> communities with different annotations and needs. That is also seen >>>> from the different approaches taken by the stats packages from S/R, >>>> SAS, Genstat (and those are just are ones I have used). >>>> >>> >>> We have held this discussion at _great_ length multiple times on the >>> statsmodels list and are in the process of trying to integrate >>> Charlton (from Nathaniel) and/or Formula (from Jonathan / NiPy) into >>> the statsmodels base. >>> >>> http://statsmodels.sourceforge.net/dev/roadmap_todo.html#formula-framework >>> >>> and more recently >>> >>> https://groups.google.com/group/pystatsmodels/browse_thread/thread/a76ea5de9e96964b/fd85b80ae46c4931? >>> >>> https://github.com/statsmodels/formula >>> https://github.com/statsmodels/charlton >>> >>> Wes and I made some effort to go through this at SciPy. From where I >>> sit, I think it's difficult to disentangle the data structures from >>> the formula implementation, or maybe I'd just prefer to finish >>> tackling the former because it's much more straightforward. So I'd >>> like to first finish the pandas-integration branch that we've started >>> and then focus on the formula support. This is on my (our, I hope...) >>> immediate long-term goal list. Then I'd like to come back to the >>> community and hash out the 'rules of the game' details for formulas >>> after we have some code for people to play with, which promises to be >>> "fun." >>> >>> https://github.com/statsmodels/statsmodels/tree/pandas-integration >>> >>> FWIW, I could also improve the categorical function to be much nicer >>> for the given examples (ie., take a list, drop a reference category), >>> but I don't know that it's worth it, because it's really just a >>> stop-gap and ideally users shouldn't have to rely on it. Thoughts on >>> more stop-gap? >>> >> >> I want more usability, but I agree that a stop-gap probably isn't the >> right way to go, unless it has things we'd eventually want anyways. >> >>> If I understand Chris' concerns, I think pandas + formula will go a >>> long way towards bridging the gap between Python and R usability, but >> >> Yes, I agree. pandas + formulas would go a long, long way towards more >> usability. >> >> Though I really, really want a scatterplot smoother (i.e., lowess) in >> statsmodels. I use it a lot, and the final part of my R file was >> entirely lowess. (And, I should add, that was the part people liked >> best since one of the main goals of the assignment was to generate >> nifty pictures that could be used to summarize the data.) >> > > Working my way through the pull requests. Very time poor... 
> >>> it's a large effort and there are only a handful (at best) of people >>> writing code -- Wes being the only one who's more or less "full time" >>> as far as I can tell. The 0.4 statsmodels release should be very >>> exciting though, I hope. I'm looking forward to it, at least. Then >>> there's only the small problem of building an infrastructure and >>> community like CRAN so we can have specialists writing and maintaining >>> code...but I hope once all the tools are in place this will seem much >>> less daunting. There certainly seems to be the right sentiment for it. >>> >> >> At the very least creating and testing models would be much simpler. >> For weeks I've been wanting to see if gmm is the same as gee by >> fitting both models to the same dataset, but I've been putting it off >> because I didn't want to construct the design matrices by hand for >> such a simple question. (GMM--Generalized Method of Moments--is a >> standard econometrics model and GEE--Generalized Estimating >> Equations--is a standard biostatics model. They're both >> generalizations of quasi-likelihood and appear very similar, but I >> want to fit some models to figure out if they're exactly the same.) Since GMM is still in the sandbox, the interface is not very polished, and it's missing some enhancements. I recommend asking on the mailing list if it's not clear. Note GMM itself is very general and will never be a quick interactive method. The main work will always be to define the moment conditions (a bit similar to non-linear function estimation, optimize.leastsq). There are and will be special subclasses, eg. IV2SLS, that have predefined moment conditions, but, still, it's up to the user do construct design and instrument arrays. And as far as I remember, the GMM/GEE package in R doesn't have a formula interface either. Josef >> > > Oh, it's not *that* bad. I agree, of course, that it could be better, > but I've been using mainly Python for my work, including GMM and > estimating equations models (mainly empirical likelihood and > generalized maximum entropy) for the last ~two years. > > Skipper > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From cjordan1 at uw.edu Mon Aug 29 11:34:06 2011 From: cjordan1 at uw.edu (Christopher Jordan-Squire) Date: Mon, 29 Aug 2011 10:34:06 -0500 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Mon, Aug 29, 2011 at 10:27 AM, wrote: > On Mon, Aug 29, 2011 at 11:10 AM, Skipper Seabold wrote: >> On Mon, Aug 29, 2011 at 10:57 AM, Christopher Jordan-Squire >> wrote: >>> On Sun, Aug 28, 2011 at 2:54 PM, Skipper Seabold wrote: >>>> On Sat, Aug 27, 2011 at 10:15 PM, Bruce Southey wrote: >>>>> On Sat, Aug 27, 2011 at 5:06 PM, Wes McKinney wrote: >>>>>> On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout >>>>>> wrote: >>>>>>> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>>>>>>> This comparison might be useful to some people, so I stuck it up on a >>>>>>>> github repo. My overall impression is that R is much stronger for >>>>>>>> interactive data analysis. Click on the link for more details why, >>>>>>>> which are summarized in the README file. >>>>>>> >>>>>>> ?From the README: >>>>>>> >>>>>>> "In fact, using Python without the IPython qtconsole is practically >>>>>>> impossible for this sort of cut and paste, interactive analysis. 
>>>>>>> The shell IPython doesn't allow it because it automatically adds >>>>>>> whitespace on multiline bits of code, breaking pre-formatted code's >>>>>>> alignment. Cutting and pasting works for the standard python shell, >>>>>>> but then you lose all the advantages of IPython." >>>>>>> >>>>>>> >>>>>>> >>>>>>> You might use %cpaste in the ipython normal shell to paste without it >>>>>>> automatically inserting spaces: >>>>>>> >>>>>>> In [5]: %cpaste >>>>>>> Pasting code; enter '--' alone on the line to stop. >>>>>>> :if 1>0: >>>>>>> : ? ?print 'hi' >>>>>>> :-- >>>>>>> hi >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Jason >>>>>>> >>>>>>> _______________________________________________ >>>>>>> SciPy-User mailing list >>>>>>> SciPy-User at scipy.org >>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>>>> >>>>>> >>>>>> This strikes me as a textbook example of why we need an integrated >>>>>> formula framework in statsmodels. I'll make a pass through when I get >>>>>> a chance and see if there are some places where pandas would really >>>>>> help out. >>>>> >>>>> We used to have a formula class is scipy.stats and I do not follow >>>>> nipy (http://nipy.sourceforge.net/nipy/stable/index.html) as it also >>>>> had this (extremely flexible but very hard to comprehend). It was what >>>>> I had argued was needed ages ago for statsmodel. But it needs a >>>>> community effort because the syntax required serves multiple >>>>> communities with different annotations and needs. That is also seen >>>>> from the different approaches taken by the stats packages from S/R, >>>>> SAS, Genstat (and those are just are ones I have used). >>>>> >>>> >>>> We have held this discussion at _great_ length multiple times on the >>>> statsmodels list and are in the process of trying to integrate >>>> Charlton (from Nathaniel) and/or Formula (from Jonathan / NiPy) into >>>> the statsmodels base. >>>> >>>> http://statsmodels.sourceforge.net/dev/roadmap_todo.html#formula-framework >>>> >>>> and more recently >>>> >>>> https://groups.google.com/group/pystatsmodels/browse_thread/thread/a76ea5de9e96964b/fd85b80ae46c4931? >>>> >>>> https://github.com/statsmodels/formula >>>> https://github.com/statsmodels/charlton >>>> >>>> Wes and I made some effort to go through this at SciPy. From where I >>>> sit, I think it's difficult to disentangle the data structures from >>>> the formula implementation, or maybe I'd just prefer to finish >>>> tackling the former because it's much more straightforward. So I'd >>>> like to first finish the pandas-integration branch that we've started >>>> and then focus on the formula support. This is on my (our, I hope...) >>>> immediate long-term goal list. Then I'd like to come back to the >>>> community and hash out the 'rules of the game' details for formulas >>>> after we have some code for people to play with, which promises to be >>>> "fun." >>>> >>>> https://github.com/statsmodels/statsmodels/tree/pandas-integration >>>> >>>> FWIW, I could also improve the categorical function to be much nicer >>>> for the given examples (ie., take a list, drop a reference category), >>>> but I don't know that it's worth it, because it's really just a >>>> stop-gap and ideally users shouldn't have to rely on it. Thoughts on >>>> more stop-gap? >>>> >>> >>> I want more usability, but I agree that a stop-gap probably isn't the >>> right way to go, unless it has things we'd eventually want anyways. 
>>> >>>> If I understand Chris' concerns, I think pandas + formula will go a >>>> long way towards bridging the gap between Python and R usability, but >>> >>> Yes, I agree. pandas + formulas would go a long, long way towards more >>> usability. >>> >>> Though I really, really want a scatterplot smoother (i.e., lowess) in >>> statsmodels. I use it a lot, and the final part of my R file was >>> entirely lowess. (And, I should add, that was the part people liked >>> best since one of the main goals of the assignment was to generate >>> nifty pictures that could be used to summarize the data.) >>> >> >> Working my way through the pull requests. Very time poor... >> >>>> it's a large effort and there are only a handful (at best) of people >>>> writing code -- Wes being the only one who's more or less "full time" >>>> as far as I can tell. The 0.4 statsmodels release should be very >>>> exciting though, I hope. I'm looking forward to it, at least. Then >>>> there's only the small problem of building an infrastructure and >>>> community like CRAN so we can have specialists writing and maintaining >>>> code...but I hope once all the tools are in place this will seem much >>>> less daunting. There certainly seems to be the right sentiment for it. >>>> >>> >>> At the very least creating and testing models would be much simpler. >>> For weeks I've been wanting to see if gmm is the same as gee by >>> fitting both models to the same dataset, but I've been putting it off >>> because I didn't want to construct the design matrices by hand for >>> such a simple question. (GMM--Generalized Method of Moments--is a >>> standard econometrics model and GEE--Generalized Estimating >>> Equations--is a standard biostatics model. They're both >>> generalizations of quasi-likelihood and appear very similar, but I >>> want to fit some models to figure out if they're exactly the same.) > > Since GMM is still in the sandbox, the interface is not very polished, > and it's missing some enhancements. I recommend asking on the mailing > list if it's not clear. > > Note GMM itself is very general and will never be a quick interactive > method. The main work will always be to define the moment conditions > (a bit similar to non-linear function estimation, optimize.leastsq). > > There are and will be special subclasses, eg. IV2SLS, that have > predefined moment conditions, but, still, it's up to the user do > construct design and instrument arrays. > And as far as I remember, the GMM/GEE package in R doesn't have a > formula interface either. > Both of the two gee packages in R I know of have formula interfaces. http://cran.r-project.org/web/packages/geepack/ http://cran.r-project.org/web/packages/gee/index.html -Chris JS > Josef > >>> >> >> Oh, it's not *that* bad. I agree, of course, that it could be better, >> but I've been using mainly Python for my work, including GMM and >> estimating equations models (mainly empirical likelihood and >> generalized maximum entropy) for the last ~two years. 
>> >> Skipper >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From jjstickel at vcn.com Mon Aug 29 11:36:32 2011 From: jjstickel at vcn.com (Jonathan Stickel) Date: Mon, 29 Aug 2011 09:36:32 -0600 Subject: [SciPy-User] R vs Python for simple interactive data, analysis In-Reply-To: References: Message-ID: <4E5BB200.1060300@vcn.com> On 8/29/11 08:57 , scipy-user-request at scipy.org wrote: > Though I really, really want a scatterplot smoother (i.e., lowess) in > statsmodels. I use it a lot, and the final part of my R file was > entirely lowess. (And, I should add, that was the part people liked > best since one of the main goals of the assignment was to generate > nifty pictures that could be used to summarize the data.) I have an interest in smoothing methods and created the scikits.datasmooth package: http://pypi.python.org/pypi/scikits.datasmooth/ Right now it just contains a regularization method, but it might be a good place for loess/lowess if someone is interested in contributing it there. From a google search it seems that there are some implementations floating around. Alternatively, I would be satisfied with moving my smoothing by regularization code over to another module/package if it would get more use. Regards, Jonathan From josef.pktd at gmail.com Mon Aug 29 11:42:32 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 29 Aug 2011 11:42:32 -0400 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Mon, Aug 29, 2011 at 11:34 AM, Christopher Jordan-Squire wrote: > On Mon, Aug 29, 2011 at 10:27 AM, ? wrote: >> On Mon, Aug 29, 2011 at 11:10 AM, Skipper Seabold wrote: >>> On Mon, Aug 29, 2011 at 10:57 AM, Christopher Jordan-Squire >>> wrote: >>>> On Sun, Aug 28, 2011 at 2:54 PM, Skipper Seabold wrote: >>>>> On Sat, Aug 27, 2011 at 10:15 PM, Bruce Southey wrote: >>>>>> On Sat, Aug 27, 2011 at 5:06 PM, Wes McKinney wrote: >>>>>>> On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout >>>>>>> wrote: >>>>>>>> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>>>>>>>> This comparison might be useful to some people, so I stuck it up on a >>>>>>>>> github repo. My overall impression is that R is much stronger for >>>>>>>>> interactive data analysis. Click on the link for more details why, >>>>>>>>> which are summarized in the README file. >>>>>>>> >>>>>>>> ?From the README: >>>>>>>> >>>>>>>> "In fact, using Python without the IPython qtconsole is practically >>>>>>>> impossible for this sort of cut and paste, interactive analysis. >>>>>>>> The shell IPython doesn't allow it because it automatically adds >>>>>>>> whitespace on multiline bits of code, breaking pre-formatted code's >>>>>>>> alignment. Cutting and pasting works for the standard python shell, >>>>>>>> but then you lose all the advantages of IPython." >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> You might use %cpaste in the ipython normal shell to paste without it >>>>>>>> automatically inserting spaces: >>>>>>>> >>>>>>>> In [5]: %cpaste >>>>>>>> Pasting code; enter '--' alone on the line to stop. >>>>>>>> :if 1>0: >>>>>>>> : ? 
?print 'hi' >>>>>>>> :-- >>>>>>>> hi >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Jason >>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> SciPy-User mailing list >>>>>>>> SciPy-User at scipy.org >>>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>>>>> >>>>>>> >>>>>>> This strikes me as a textbook example of why we need an integrated >>>>>>> formula framework in statsmodels. I'll make a pass through when I get >>>>>>> a chance and see if there are some places where pandas would really >>>>>>> help out. >>>>>> >>>>>> We used to have a formula class is scipy.stats and I do not follow >>>>>> nipy (http://nipy.sourceforge.net/nipy/stable/index.html) as it also >>>>>> had this (extremely flexible but very hard to comprehend). It was what >>>>>> I had argued was needed ages ago for statsmodel. But it needs a >>>>>> community effort because the syntax required serves multiple >>>>>> communities with different annotations and needs. That is also seen >>>>>> from the different approaches taken by the stats packages from S/R, >>>>>> SAS, Genstat (and those are just are ones I have used). >>>>>> >>>>> >>>>> We have held this discussion at _great_ length multiple times on the >>>>> statsmodels list and are in the process of trying to integrate >>>>> Charlton (from Nathaniel) and/or Formula (from Jonathan / NiPy) into >>>>> the statsmodels base. >>>>> >>>>> http://statsmodels.sourceforge.net/dev/roadmap_todo.html#formula-framework >>>>> >>>>> and more recently >>>>> >>>>> https://groups.google.com/group/pystatsmodels/browse_thread/thread/a76ea5de9e96964b/fd85b80ae46c4931? >>>>> >>>>> https://github.com/statsmodels/formula >>>>> https://github.com/statsmodels/charlton >>>>> >>>>> Wes and I made some effort to go through this at SciPy. From where I >>>>> sit, I think it's difficult to disentangle the data structures from >>>>> the formula implementation, or maybe I'd just prefer to finish >>>>> tackling the former because it's much more straightforward. So I'd >>>>> like to first finish the pandas-integration branch that we've started >>>>> and then focus on the formula support. This is on my (our, I hope...) >>>>> immediate long-term goal list. Then I'd like to come back to the >>>>> community and hash out the 'rules of the game' details for formulas >>>>> after we have some code for people to play with, which promises to be >>>>> "fun." >>>>> >>>>> https://github.com/statsmodels/statsmodels/tree/pandas-integration >>>>> >>>>> FWIW, I could also improve the categorical function to be much nicer >>>>> for the given examples (ie., take a list, drop a reference category), >>>>> but I don't know that it's worth it, because it's really just a >>>>> stop-gap and ideally users shouldn't have to rely on it. Thoughts on >>>>> more stop-gap? >>>>> >>>> >>>> I want more usability, but I agree that a stop-gap probably isn't the >>>> right way to go, unless it has things we'd eventually want anyways. >>>> >>>>> If I understand Chris' concerns, I think pandas + formula will go a >>>>> long way towards bridging the gap between Python and R usability, but >>>> >>>> Yes, I agree. pandas + formulas would go a long, long way towards more >>>> usability. >>>> >>>> Though I really, really want a scatterplot smoother (i.e., lowess) in >>>> statsmodels. I use it a lot, and the final part of my R file was >>>> entirely lowess. 
(And, I should add, that was the part people liked >>>> best since one of the main goals of the assignment was to generate >>>> nifty pictures that could be used to summarize the data.) >>>> >>> >>> Working my way through the pull requests. Very time poor... >>> >>>>> it's a large effort and there are only a handful (at best) of people >>>>> writing code -- Wes being the only one who's more or less "full time" >>>>> as far as I can tell. The 0.4 statsmodels release should be very >>>>> exciting though, I hope. I'm looking forward to it, at least. Then >>>>> there's only the small problem of building an infrastructure and >>>>> community like CRAN so we can have specialists writing and maintaining >>>>> code...but I hope once all the tools are in place this will seem much >>>>> less daunting. There certainly seems to be the right sentiment for it. >>>>> >>>> >>>> At the very least creating and testing models would be much simpler. >>>> For weeks I've been wanting to see if gmm is the same as gee by >>>> fitting both models to the same dataset, but I've been putting it off >>>> because I didn't want to construct the design matrices by hand for >>>> such a simple question. (GMM--Generalized Method of Moments--is a >>>> standard econometrics model and GEE--Generalized Estimating >>>> Equations--is a standard biostatics model. They're both >>>> generalizations of quasi-likelihood and appear very similar, but I >>>> want to fit some models to figure out if they're exactly the same.) >> >> Since GMM is still in the sandbox, the interface is not very polished, >> and it's missing some enhancements. I recommend asking on the mailing >> list if it's not clear. >> >> Note GMM itself is very general and will never be a quick interactive >> method. The main work will always be to define the moment conditions >> (a bit similar to non-linear function estimation, optimize.leastsq). >> >> There are and will be special subclasses, eg. IV2SLS, that have >> predefined moment conditions, but, still, it's up to the user do >> construct design and instrument arrays. >> And as far as I remember, the GMM/GEE package in R doesn't have a >> formula interface either. >> > > Both of the two gee packages in R I know of have formula interfaces. > > http://cran.r-project.org/web/packages/geepack/ > http://cran.r-project.org/web/packages/gee/index.html I have to look at this. I mixed up some acronyms, I meant GEL and GMM http://cran.r-project.org/web/packages/gmm/index.html the vignette was one of my readings, and the STATA description for GMM. I never really looked at GEE. (That's Skipper's private work so far.) Josef > > -Chris JS > >> Josef >> >>>> >>> >>> Oh, it's not *that* bad. I agree, of course, that it could be better, >>> but I've been using mainly Python for my work, including GMM and >>> estimating equations models (mainly empirical likelihood and >>> generalized maximum entropy) for the last ~two years. 
>>> >>> Skipper >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From jsseabold at gmail.com Mon Aug 29 12:27:49 2011 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 29 Aug 2011 12:27:49 -0400 Subject: [SciPy-User] R vs Python for simple interactive data, analysis In-Reply-To: <4E5BB200.1060300@vcn.com> References: <4E5BB200.1060300@vcn.com> Message-ID: On Mon, Aug 29, 2011 at 11:36 AM, Jonathan Stickel wrote: > On 8/29/11 08:57 , scipy-user-request at scipy.org wrote: >> Though I really, really want a scatterplot smoother (i.e., lowess) in >> statsmodels. I use it a lot, and the final part of my R file was >> entirely lowess. (And, I should add, that was the part people liked >> best since one of the main goals of the assignment was to generate >> nifty pictures that could be used to summarize the data.) > > I have an interest in smoothing methods and created the > scikits.datasmooth package: > > http://pypi.python.org/pypi/scikits.datasmooth/ > > Right now it just contains a regularization method, but it might be a > good place for loess/lowess if someone is interested in contributing it > there. ?From a google search it seems that there are some > implementations floating around. ?Alternatively, I would be satisfied > with moving my smoothing by regularization code over to another > module/package if it would get more use. > Chris has a pending pull request for lowess in statsmodels. https://github.com/statsmodels/statsmodels/pull/5 Perhaps there is some desire for keeping these tools together? I don't know what's out there well enough. Skipper From josef.pktd at gmail.com Mon Aug 29 12:59:08 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 29 Aug 2011 12:59:08 -0400 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Mon, Aug 29, 2011 at 11:42 AM, wrote: > On Mon, Aug 29, 2011 at 11:34 AM, Christopher Jordan-Squire > wrote: >> On Mon, Aug 29, 2011 at 10:27 AM, ? wrote: >>> On Mon, Aug 29, 2011 at 11:10 AM, Skipper Seabold wrote: >>>> On Mon, Aug 29, 2011 at 10:57 AM, Christopher Jordan-Squire >>>> wrote: >>>>> On Sun, Aug 28, 2011 at 2:54 PM, Skipper Seabold wrote: >>>>>> On Sat, Aug 27, 2011 at 10:15 PM, Bruce Southey wrote: >>>>>>> On Sat, Aug 27, 2011 at 5:06 PM, Wes McKinney wrote: >>>>>>>> On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout >>>>>>>> wrote: >>>>>>>>> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>>>>>>>>> This comparison might be useful to some people, so I stuck it up on a >>>>>>>>>> github repo. My overall impression is that R is much stronger for >>>>>>>>>> interactive data analysis. Click on the link for more details why, >>>>>>>>>> which are summarized in the README file. >>>>>>>>> >>>>>>>>> ?From the README: >>>>>>>>> >>>>>>>>> "In fact, using Python without the IPython qtconsole is practically >>>>>>>>> impossible for this sort of cut and paste, interactive analysis. 
>>>>>>>>> The shell IPython doesn't allow it because it automatically adds >>>>>>>>> whitespace on multiline bits of code, breaking pre-formatted code's >>>>>>>>> alignment. Cutting and pasting works for the standard python shell, >>>>>>>>> but then you lose all the advantages of IPython." >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> You might use %cpaste in the ipython normal shell to paste without it >>>>>>>>> automatically inserting spaces: >>>>>>>>> >>>>>>>>> In [5]: %cpaste >>>>>>>>> Pasting code; enter '--' alone on the line to stop. >>>>>>>>> :if 1>0: >>>>>>>>> : ? ?print 'hi' >>>>>>>>> :-- >>>>>>>>> hi >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Jason >>>>>>>>> >>>>>>>>> _______________________________________________ >>>>>>>>> SciPy-User mailing list >>>>>>>>> SciPy-User at scipy.org >>>>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>>>>>> >>>>>>>> >>>>>>>> This strikes me as a textbook example of why we need an integrated >>>>>>>> formula framework in statsmodels. I'll make a pass through when I get >>>>>>>> a chance and see if there are some places where pandas would really >>>>>>>> help out. >>>>>>> >>>>>>> We used to have a formula class is scipy.stats and I do not follow >>>>>>> nipy (http://nipy.sourceforge.net/nipy/stable/index.html) as it also >>>>>>> had this (extremely flexible but very hard to comprehend). It was what >>>>>>> I had argued was needed ages ago for statsmodel. But it needs a >>>>>>> community effort because the syntax required serves multiple >>>>>>> communities with different annotations and needs. That is also seen >>>>>>> from the different approaches taken by the stats packages from S/R, >>>>>>> SAS, Genstat (and those are just are ones I have used). >>>>>>> >>>>>> >>>>>> We have held this discussion at _great_ length multiple times on the >>>>>> statsmodels list and are in the process of trying to integrate >>>>>> Charlton (from Nathaniel) and/or Formula (from Jonathan / NiPy) into >>>>>> the statsmodels base. >>>>>> >>>>>> http://statsmodels.sourceforge.net/dev/roadmap_todo.html#formula-framework >>>>>> >>>>>> and more recently >>>>>> >>>>>> https://groups.google.com/group/pystatsmodels/browse_thread/thread/a76ea5de9e96964b/fd85b80ae46c4931? >>>>>> >>>>>> https://github.com/statsmodels/formula >>>>>> https://github.com/statsmodels/charlton >>>>>> >>>>>> Wes and I made some effort to go through this at SciPy. From where I >>>>>> sit, I think it's difficult to disentangle the data structures from >>>>>> the formula implementation, or maybe I'd just prefer to finish >>>>>> tackling the former because it's much more straightforward. So I'd >>>>>> like to first finish the pandas-integration branch that we've started >>>>>> and then focus on the formula support. This is on my (our, I hope...) >>>>>> immediate long-term goal list. Then I'd like to come back to the >>>>>> community and hash out the 'rules of the game' details for formulas >>>>>> after we have some code for people to play with, which promises to be >>>>>> "fun." >>>>>> >>>>>> https://github.com/statsmodels/statsmodels/tree/pandas-integration >>>>>> >>>>>> FWIW, I could also improve the categorical function to be much nicer >>>>>> for the given examples (ie., take a list, drop a reference category), >>>>>> but I don't know that it's worth it, because it's really just a >>>>>> stop-gap and ideally users shouldn't have to rely on it. Thoughts on >>>>>> more stop-gap? 
>>>>>> >>>>> >>>>> I want more usability, but I agree that a stop-gap probably isn't the >>>>> right way to go, unless it has things we'd eventually want anyways. >>>>> >>>>>> If I understand Chris' concerns, I think pandas + formula will go a >>>>>> long way towards bridging the gap between Python and R usability, but >>>>> >>>>> Yes, I agree. pandas + formulas would go a long, long way towards more >>>>> usability. >>>>> >>>>> Though I really, really want a scatterplot smoother (i.e., lowess) in >>>>> statsmodels. I use it a lot, and the final part of my R file was >>>>> entirely lowess. (And, I should add, that was the part people liked >>>>> best since one of the main goals of the assignment was to generate >>>>> nifty pictures that could be used to summarize the data.) >>>>> >>>> >>>> Working my way through the pull requests. Very time poor... >>>> >>>>>> it's a large effort and there are only a handful (at best) of people >>>>>> writing code -- Wes being the only one who's more or less "full time" >>>>>> as far as I can tell. The 0.4 statsmodels release should be very >>>>>> exciting though, I hope. I'm looking forward to it, at least. Then >>>>>> there's only the small problem of building an infrastructure and >>>>>> community like CRAN so we can have specialists writing and maintaining >>>>>> code...but I hope once all the tools are in place this will seem much >>>>>> less daunting. There certainly seems to be the right sentiment for it. >>>>>> >>>>> >>>>> At the very least creating and testing models would be much simpler. >>>>> For weeks I've been wanting to see if gmm is the same as gee by >>>>> fitting both models to the same dataset, but I've been putting it off >>>>> because I didn't want to construct the design matrices by hand for >>>>> such a simple question. (GMM--Generalized Method of Moments--is a >>>>> standard econometrics model and GEE--Generalized Estimating >>>>> Equations--is a standard biostatics model. They're both >>>>> generalizations of quasi-likelihood and appear very similar, but I >>>>> want to fit some models to figure out if they're exactly the same.) >>> >>> Since GMM is still in the sandbox, the interface is not very polished, >>> and it's missing some enhancements. I recommend asking on the mailing >>> list if it's not clear. >>> >>> Note GMM itself is very general and will never be a quick interactive >>> method. The main work will always be to define the moment conditions >>> (a bit similar to non-linear function estimation, optimize.leastsq). >>> >>> There are and will be special subclasses, eg. IV2SLS, that have >>> predefined moment conditions, but, still, it's up to the user do >>> construct design and instrument arrays. >>> And as far as I remember, the GMM/GEE package in R doesn't have a >>> formula interface either. >>> >> >> Both of the two gee packages in R I know of have formula interfaces. >> >> http://cran.r-project.org/web/packages/geepack/ >> http://cran.r-project.org/web/packages/gee/index.html This is very different from what's in GMM in statsmodels so far. The help file is very short, so I'm mostly guessing. It seems to be for (a subset) of generalized linear models with longitudinal/panel covariance structures. Something like this will eventually (once we get panel data models) as a special case of GMM in statsmodels, assuming it's similar to what I know from the econometrics literature. Most of the subclasses of GMM that I currently have, are focused on instrumental variable estimation, including non-linear regression. 
This should be expanded over time. But GMM itself is designed for subclassing by someone who wants to use her/his own moment conditions, as in http://cran.r-project.org/web/packages/gmm/index.html or for us to implement specific models with it. If someone wants to use it, then I have to quickly add the options for the kernels of the weighting matrix, which I keep postponing. Currently there is only a truncated, uniform kernel that assumes observations are order by time, but users can provide their own weighting function. Josef > > I have to look at this. I mixed up some acronyms, I meant GEL and GMM > http://cran.r-project.org/web/packages/gmm/index.html > the vignette was one of my readings, and the STATA description for GMM. > > I never really looked at GEE. (That's Skipper's private work so far.) > > Josef > >> >> -Chris JS >> >>> Josef >>> >>>>> >>>> >>>> Oh, it's not *that* bad. I agree, of course, that it could be better, >>>> but I've been using mainly Python for my work, including GMM and >>>> estimating equations models (mainly empirical likelihood and >>>> generalized maximum entropy) for the last ~two years. >>>> >>>> Skipper >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > From josef.pktd at gmail.com Mon Aug 29 13:13:48 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 29 Aug 2011 13:13:48 -0400 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Mon, Aug 29, 2011 at 12:59 PM, wrote: > On Mon, Aug 29, 2011 at 11:42 AM, ? wrote: >> On Mon, Aug 29, 2011 at 11:34 AM, Christopher Jordan-Squire >> wrote: >>> On Mon, Aug 29, 2011 at 10:27 AM, ? wrote: >>>> On Mon, Aug 29, 2011 at 11:10 AM, Skipper Seabold wrote: >>>>> On Mon, Aug 29, 2011 at 10:57 AM, Christopher Jordan-Squire >>>>> wrote: >>>>>> On Sun, Aug 28, 2011 at 2:54 PM, Skipper Seabold wrote: >>>>>>> On Sat, Aug 27, 2011 at 10:15 PM, Bruce Southey wrote: >>>>>>>> On Sat, Aug 27, 2011 at 5:06 PM, Wes McKinney wrote: >>>>>>>>> On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout >>>>>>>>> wrote: >>>>>>>>>> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>>>>>>>>>> This comparison might be useful to some people, so I stuck it up on a >>>>>>>>>>> github repo. My overall impression is that R is much stronger for >>>>>>>>>>> interactive data analysis. Click on the link for more details why, >>>>>>>>>>> which are summarized in the README file. >>>>>>>>>> >>>>>>>>>> ?From the README: >>>>>>>>>> >>>>>>>>>> "In fact, using Python without the IPython qtconsole is practically >>>>>>>>>> impossible for this sort of cut and paste, interactive analysis. >>>>>>>>>> The shell IPython doesn't allow it because it automatically adds >>>>>>>>>> whitespace on multiline bits of code, breaking pre-formatted code's >>>>>>>>>> alignment. Cutting and pasting works for the standard python shell, >>>>>>>>>> but then you lose all the advantages of IPython." 
>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> You might use %cpaste in the ipython normal shell to paste without it >>>>>>>>>> automatically inserting spaces: >>>>>>>>>> >>>>>>>>>> In [5]: %cpaste >>>>>>>>>> Pasting code; enter '--' alone on the line to stop. >>>>>>>>>> :if 1>0: >>>>>>>>>> : ? ?print 'hi' >>>>>>>>>> :-- >>>>>>>>>> hi >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Jason >>>>>>>>>> >>>>>>>>>> _______________________________________________ >>>>>>>>>> SciPy-User mailing list >>>>>>>>>> SciPy-User at scipy.org >>>>>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>>>>>>> >>>>>>>>> >>>>>>>>> This strikes me as a textbook example of why we need an integrated >>>>>>>>> formula framework in statsmodels. I'll make a pass through when I get >>>>>>>>> a chance and see if there are some places where pandas would really >>>>>>>>> help out. >>>>>>>> >>>>>>>> We used to have a formula class is scipy.stats and I do not follow >>>>>>>> nipy (http://nipy.sourceforge.net/nipy/stable/index.html) as it also >>>>>>>> had this (extremely flexible but very hard to comprehend). It was what >>>>>>>> I had argued was needed ages ago for statsmodel. But it needs a >>>>>>>> community effort because the syntax required serves multiple >>>>>>>> communities with different annotations and needs. That is also seen >>>>>>>> from the different approaches taken by the stats packages from S/R, >>>>>>>> SAS, Genstat (and those are just are ones I have used). >>>>>>>> >>>>>>> >>>>>>> We have held this discussion at _great_ length multiple times on the >>>>>>> statsmodels list and are in the process of trying to integrate >>>>>>> Charlton (from Nathaniel) and/or Formula (from Jonathan / NiPy) into >>>>>>> the statsmodels base. >>>>>>> >>>>>>> http://statsmodels.sourceforge.net/dev/roadmap_todo.html#formula-framework >>>>>>> >>>>>>> and more recently >>>>>>> >>>>>>> https://groups.google.com/group/pystatsmodels/browse_thread/thread/a76ea5de9e96964b/fd85b80ae46c4931? >>>>>>> >>>>>>> https://github.com/statsmodels/formula >>>>>>> https://github.com/statsmodels/charlton >>>>>>> >>>>>>> Wes and I made some effort to go through this at SciPy. From where I >>>>>>> sit, I think it's difficult to disentangle the data structures from >>>>>>> the formula implementation, or maybe I'd just prefer to finish >>>>>>> tackling the former because it's much more straightforward. So I'd >>>>>>> like to first finish the pandas-integration branch that we've started >>>>>>> and then focus on the formula support. This is on my (our, I hope...) >>>>>>> immediate long-term goal list. Then I'd like to come back to the >>>>>>> community and hash out the 'rules of the game' details for formulas >>>>>>> after we have some code for people to play with, which promises to be >>>>>>> "fun." >>>>>>> >>>>>>> https://github.com/statsmodels/statsmodels/tree/pandas-integration >>>>>>> >>>>>>> FWIW, I could also improve the categorical function to be much nicer >>>>>>> for the given examples (ie., take a list, drop a reference category), >>>>>>> but I don't know that it's worth it, because it's really just a >>>>>>> stop-gap and ideally users shouldn't have to rely on it. Thoughts on >>>>>>> more stop-gap? >>>>>>> >>>>>> >>>>>> I want more usability, but I agree that a stop-gap probably isn't the >>>>>> right way to go, unless it has things we'd eventually want anyways. 
>>>>>> >>>>>>> If I understand Chris' concerns, I think pandas + formula will go a >>>>>>> long way towards bridging the gap between Python and R usability, but >>>>>> >>>>>> Yes, I agree. pandas + formulas would go a long, long way towards more >>>>>> usability. >>>>>> >>>>>> Though I really, really want a scatterplot smoother (i.e., lowess) in >>>>>> statsmodels. I use it a lot, and the final part of my R file was >>>>>> entirely lowess. (And, I should add, that was the part people liked >>>>>> best since one of the main goals of the assignment was to generate >>>>>> nifty pictures that could be used to summarize the data.) >>>>>> >>>>> >>>>> Working my way through the pull requests. Very time poor... >>>>> >>>>>>> it's a large effort and there are only a handful (at best) of people >>>>>>> writing code -- Wes being the only one who's more or less "full time" >>>>>>> as far as I can tell. The 0.4 statsmodels release should be very >>>>>>> exciting though, I hope. I'm looking forward to it, at least. Then >>>>>>> there's only the small problem of building an infrastructure and >>>>>>> community like CRAN so we can have specialists writing and maintaining >>>>>>> code...but I hope once all the tools are in place this will seem much >>>>>>> less daunting. There certainly seems to be the right sentiment for it. >>>>>>> >>>>>> >>>>>> At the very least creating and testing models would be much simpler. >>>>>> For weeks I've been wanting to see if gmm is the same as gee by >>>>>> fitting both models to the same dataset, but I've been putting it off >>>>>> because I didn't want to construct the design matrices by hand for >>>>>> such a simple question. (GMM--Generalized Method of Moments--is a >>>>>> standard econometrics model and GEE--Generalized Estimating >>>>>> Equations--is a standard biostatics model. They're both >>>>>> generalizations of quasi-likelihood and appear very similar, but I >>>>>> want to fit some models to figure out if they're exactly the same.) >>>> >>>> Since GMM is still in the sandbox, the interface is not very polished, >>>> and it's missing some enhancements. I recommend asking on the mailing >>>> list if it's not clear. >>>> >>>> Note GMM itself is very general and will never be a quick interactive >>>> method. The main work will always be to define the moment conditions >>>> (a bit similar to non-linear function estimation, optimize.leastsq). >>>> >>>> There are and will be special subclasses, eg. IV2SLS, that have >>>> predefined moment conditions, but, still, it's up to the user do >>>> construct design and instrument arrays. >>>> And as far as I remember, the GMM/GEE package in R doesn't have a >>>> formula interface either. >>>> >>> >>> Both of the two gee packages in R I know of have formula interfaces. >>> >>> http://cran.r-project.org/web/packages/geepack/ >>> http://cran.r-project.org/web/packages/gee/index.html > > This is very different from what's in GMM in statsmodels so far. The > help file is very short, so I'm mostly guessing. > It seems to be for (a subset) of generalized linear models with > longitudinal/panel covariance structures. Something like this will > eventually (once we get panel data models) ?as a special case of GMM > in statsmodels, assuming it's similar to what I know from the > econometrics literature. > > Most of the subclasses of GMM that I currently have, are focused on > instrumental variable estimation, including non-linear regression. > This should be expanded over time. 
> > But GMM itself is designed for subclassing by someone who wants to use > her/his own moment conditions, as in > http://cran.r-project.org/web/packages/gmm/index.html > or for us to implement specific models with it. > > If someone wants to use it, then I have to quickly add the options for > the kernels of the weighting matrix, which I keep postponing. > Currently there is only a truncated, uniform kernel that assumes > observations are order by time, but users can provide their own > weighting function. > > Josef > >> >> I have to look at this. I mixed up some acronyms, I meant GEL and GMM >> http://cran.r-project.org/web/packages/gmm/index.html >> the vignette was one of my readings, and the STATA description for GMM. >> >> I never really looked at GEE. (That's Skipper's private work so far.) >> >> Josef >> >>> >>> -Chris JS >>> >>>> Josef >>>> >>>>>> >>>>> >>>>> Oh, it's not *that* bad. I agree, of course, that it could be better, >>>>> but I've been using mainly Python for my work, including GMM and >>>>> estimating equations models (mainly empirical likelihood and >>>>> generalized maximum entropy) for the last ~two years. >>>>> >>>>> Skipper >>>>> _______________________________________________ >>>>> SciPy-User mailing list >>>>> SciPy-User at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>> >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> > just to make another point: Without someone adding mixed effects, hierachical, panel/longitudinal models, and .... it will not help to have a formula interface to them. (Thanks to Scott we will soon have survival) Josef From alacast at gmail.com Mon Aug 29 13:38:09 2011 From: alacast at gmail.com (Alacast) Date: Mon, 29 Aug 2011 18:38:09 +0100 Subject: [SciPy-User] Hilbert transform Message-ID: I'm doing some analyses on sets of real-valued time series in which I want to know the envelope/instantaneous amplitude of each series in the set. Consequently, I've been taking the Hilbert transform (using scipy.signal.hilbert), then taking the absolute value of the result. The problem is that sometimes this process is far too slow. These time series can have on the order of 10^5 to 10^6 data points, and the sets can have up to 128 time series. Some datasets have been taking an hour or hours to compute on a perfectly modern computing node (1TB of RAM, plenty of 2.27Ghz cores, etc.). Is this expected behavior? I learned that Scipy's Hilbert transform implementation uses FFT, and that Scipy's FFT implementation can run in O(n^2) time when the number of time points is prime. This happened in a few of my datasets, but I've now included a check and correction for that (drop the last data point, so now the number is even and consequently not prime). Still, I observe a good amount of variability in run times, and they are rather long. Thoughts? Thanks! -------------- next part -------------- An HTML attachment was scrubbed... 
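A minimal sketch of the zero-padding workaround suggested in the reply below: pad each series to the next power of two so the FFT inside the transform never sees a length with large prime factors, then keep only the original samples. This assumes scipy.signal.hilbert's optional N argument; the helper name is made up, and the envelope is only approximate near the ends of the series because of the padding:

import numpy as np
from scipy.signal import hilbert

def envelope(x):
    # Pad to the next power of two (a cheap FFT length), then truncate
    # the analytic signal back to the original number of samples.
    n = len(x)
    nfft = 2 ** int(np.ceil(np.log2(n)))
    return np.abs(hilbert(x, N=nfft)[:n])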
URL: From robert.kern at gmail.com Mon Aug 29 14:06:02 2011 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 29 Aug 2011 13:06:02 -0500 Subject: [SciPy-User] Hilbert transform In-Reply-To: References: Message-ID: On Mon, Aug 29, 2011 at 12:38, Alacast wrote: > I'm doing some analyses on sets of real-valued time series in which I want > to know the envelope/instantaneous amplitude of each series in the set. > Consequently, I've been taking the Hilbert transform (using > scipy.signal.hilbert), then taking the absolute value of the result. > The problem is that sometimes this process is far too slow. These time > series can have on the order of 10^5 to 10^6 data points, and the sets can > have up to 128 time series. Some datasets have been taking an hour or hours > to compute on a perfectly modern computing node (1TB of RAM, plenty of > 2.27Ghz cores, etc.). Is this expected behavior? > I learned that Scipy's Hilbert transform implementation uses FFT, and that > Scipy's FFT implementation can run in O(n^2) time when the number of time > points is prime. This happened in a few of my datasets, but I've now > included a check and correction for that (drop the last data point, so now > the number is even and consequently not prime). Still, I observe a good > amount of variability in run times, and they are rather long. Thoughts? Having N be prime is just the extreme case. Basically, the FFT recursively computes the DFT. It can only recurse on integral factors of N, so any prime factor M must be computed the slow way, taking O(M^2) steps. You probably have large prime factors sitting around. A typical approach is to pad your signal with 0s until the next power of 2 or other reasonably-factorable size. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From otrov at hush.ai Sun Aug 28 10:04:25 2011 From: otrov at hush.ai (Kliment) Date: Sun, 28 Aug 2011 16:04:25 +0200 Subject: [SciPy-User] Return variable value by function value Message-ID: <20110828140425.E64D9E6719@smtp.hushmail.com> Thanks for your input guys So in similar cases I should use interpolation function (or solver depending on initial function) from SciPy package Example I provided was from scratch of course, but it seems that 0.95 is still in y range: >>> sqrt(1 - 98**2/10E+4) 0.95076811052958654 >>> sqrt(1 - 99**2/10E+4) 0.94973154101567037 Regards, Kliment From cjordan1 at uw.edu Mon Aug 29 16:55:08 2011 From: cjordan1 at uw.edu (Christopher Jordan-Squire) Date: Mon, 29 Aug 2011 15:55:08 -0500 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: Message-ID: I've just pushed an updated version of the .r and .py files to github, as well as a summary of the corrections/suggestions from the mailing list. I'd appreciate any further comments/suggestions. 
Compared to the original .r and .py files, in these revised versions: -The R code was cleaned up because I realized I didn't need to use as.factor if I made the relevant variables into factors -The python code was cleaned up by computing the 'sub-design matrices' associated with each factor variable beforehand and stashing them in a dictionary -Names were added to the variables in the regression by creating them from the calls to sm.categorical and stashing them in a dictionary Notably, the helper functions and stashing of the pieces of design matrices simplified the calls for model fitting, but they didn't noticeably shorten the code. They also required a small increase in complexity. (In terms of the data structures and function calls used to create the list of names and the design matrices.) I also added some comments to the effect that: *one can use %paste or %cpaste in the IPython shell *np.set_printoptions or sm.iolib.SimpleTable can be used to help with printing of numpy arrays *names can be added by the user to regression model summaries *one can make helper functions to construct design matrices and keep track of names, but the simplest way of doing it isn't robust to subsetting the data in the presence of categorical variables Did I miss anything? -Chris JS On Sat, Aug 27, 2011 at 1:19 PM, Christopher Jordan-Squire wrote: > Hi--I've been a moderately heavy R user for the past two years, so > about a month ago I took an (abbreviated) version of a simple data > analysis I did in R and tried to rewrite as much of it as possible, > line by line, into python using numpy and statsmodels. I didn't use > pandas, and I can't comment on how much it might have simplified > things. > > This comparison might be useful to some people, so I stuck it up on a > github repo. My overall impression is that R is much stronger for > interactive data analysis. Click on the link for more details why, > which are summarized in the README file. > > https://github.com/chrisjordansquire/r_vs_py > > The code examples should run out of the box with no downloads (other > than R, Python, numpy, scipy, and statsmodels) required. > > -Chris Jordan-Squire > From cjordan1 at uw.edu Mon Aug 29 17:03:00 2011 From: cjordan1 at uw.edu (Christopher Jordan-Squire) Date: Mon, 29 Aug 2011 16:03:00 -0500 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Mon, Aug 29, 2011 at 12:13 PM, wrote: > On Mon, Aug 29, 2011 at 12:59 PM, ? wrote: >> On Mon, Aug 29, 2011 at 11:42 AM, ? wrote: >>> On Mon, Aug 29, 2011 at 11:34 AM, Christopher Jordan-Squire >>> wrote: >>>> On Mon, Aug 29, 2011 at 10:27 AM, ? wrote: >>>>> On Mon, Aug 29, 2011 at 11:10 AM, Skipper Seabold wrote: >>>>>> On Mon, Aug 29, 2011 at 10:57 AM, Christopher Jordan-Squire >>>>>> wrote: >>>>>>> On Sun, Aug 28, 2011 at 2:54 PM, Skipper Seabold wrote: >>>>>>>> On Sat, Aug 27, 2011 at 10:15 PM, Bruce Southey wrote: >>>>>>>>> On Sat, Aug 27, 2011 at 5:06 PM, Wes McKinney wrote: >>>>>>>>>> On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout >>>>>>>>>> wrote: >>>>>>>>>>> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>>>>>>>>>>> This comparison might be useful to some people, so I stuck it up on a >>>>>>>>>>>> github repo. My overall impression is that R is much stronger for >>>>>>>>>>>> interactive data analysis. Click on the link for more details why, >>>>>>>>>>>> which are summarized in the README file.
>>>>>>>>>>> >>>>>>>>>>> ?From the README: >>>>>>>>>>> >>>>>>>>>>> "In fact, using Python without the IPython qtconsole is practically >>>>>>>>>>> impossible for this sort of cut and paste, interactive analysis. >>>>>>>>>>> The shell IPython doesn't allow it because it automatically adds >>>>>>>>>>> whitespace on multiline bits of code, breaking pre-formatted code's >>>>>>>>>>> alignment. Cutting and pasting works for the standard python shell, >>>>>>>>>>> but then you lose all the advantages of IPython." >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> You might use %cpaste in the ipython normal shell to paste without it >>>>>>>>>>> automatically inserting spaces: >>>>>>>>>>> >>>>>>>>>>> In [5]: %cpaste >>>>>>>>>>> Pasting code; enter '--' alone on the line to stop. >>>>>>>>>>> :if 1>0: >>>>>>>>>>> : ? ?print 'hi' >>>>>>>>>>> :-- >>>>>>>>>>> hi >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Jason >>>>>>>>>>> >>>>>>>>>>> _______________________________________________ >>>>>>>>>>> SciPy-User mailing list >>>>>>>>>>> SciPy-User at scipy.org >>>>>>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> This strikes me as a textbook example of why we need an integrated >>>>>>>>>> formula framework in statsmodels. I'll make a pass through when I get >>>>>>>>>> a chance and see if there are some places where pandas would really >>>>>>>>>> help out. >>>>>>>>> >>>>>>>>> We used to have a formula class is scipy.stats and I do not follow >>>>>>>>> nipy (http://nipy.sourceforge.net/nipy/stable/index.html) as it also >>>>>>>>> had this (extremely flexible but very hard to comprehend). It was what >>>>>>>>> I had argued was needed ages ago for statsmodel. But it needs a >>>>>>>>> community effort because the syntax required serves multiple >>>>>>>>> communities with different annotations and needs. That is also seen >>>>>>>>> from the different approaches taken by the stats packages from S/R, >>>>>>>>> SAS, Genstat (and those are just are ones I have used). >>>>>>>>> >>>>>>>> >>>>>>>> We have held this discussion at _great_ length multiple times on the >>>>>>>> statsmodels list and are in the process of trying to integrate >>>>>>>> Charlton (from Nathaniel) and/or Formula (from Jonathan / NiPy) into >>>>>>>> the statsmodels base. >>>>>>>> >>>>>>>> http://statsmodels.sourceforge.net/dev/roadmap_todo.html#formula-framework >>>>>>>> >>>>>>>> and more recently >>>>>>>> >>>>>>>> https://groups.google.com/group/pystatsmodels/browse_thread/thread/a76ea5de9e96964b/fd85b80ae46c4931? >>>>>>>> >>>>>>>> https://github.com/statsmodels/formula >>>>>>>> https://github.com/statsmodels/charlton >>>>>>>> >>>>>>>> Wes and I made some effort to go through this at SciPy. From where I >>>>>>>> sit, I think it's difficult to disentangle the data structures from >>>>>>>> the formula implementation, or maybe I'd just prefer to finish >>>>>>>> tackling the former because it's much more straightforward. So I'd >>>>>>>> like to first finish the pandas-integration branch that we've started >>>>>>>> and then focus on the formula support. This is on my (our, I hope...) >>>>>>>> immediate long-term goal list. Then I'd like to come back to the >>>>>>>> community and hash out the 'rules of the game' details for formulas >>>>>>>> after we have some code for people to play with, which promises to be >>>>>>>> "fun." 
>>>>>>>> >>>>>>>> https://github.com/statsmodels/statsmodels/tree/pandas-integration >>>>>>>> >>>>>>>> FWIW, I could also improve the categorical function to be much nicer >>>>>>>> for the given examples (ie., take a list, drop a reference category), >>>>>>>> but I don't know that it's worth it, because it's really just a >>>>>>>> stop-gap and ideally users shouldn't have to rely on it. Thoughts on >>>>>>>> more stop-gap? >>>>>>>> >>>>>>> >>>>>>> I want more usability, but I agree that a stop-gap probably isn't the >>>>>>> right way to go, unless it has things we'd eventually want anyways. >>>>>>> >>>>>>>> If I understand Chris' concerns, I think pandas + formula will go a >>>>>>>> long way towards bridging the gap between Python and R usability, but >>>>>>> >>>>>>> Yes, I agree. pandas + formulas would go a long, long way towards more >>>>>>> usability. >>>>>>> >>>>>>> Though I really, really want a scatterplot smoother (i.e., lowess) in >>>>>>> statsmodels. I use it a lot, and the final part of my R file was >>>>>>> entirely lowess. (And, I should add, that was the part people liked >>>>>>> best since one of the main goals of the assignment was to generate >>>>>>> nifty pictures that could be used to summarize the data.) >>>>>>> >>>>>> >>>>>> Working my way through the pull requests. Very time poor... >>>>>> >>>>>>>> it's a large effort and there are only a handful (at best) of people >>>>>>>> writing code -- Wes being the only one who's more or less "full time" >>>>>>>> as far as I can tell. The 0.4 statsmodels release should be very >>>>>>>> exciting though, I hope. I'm looking forward to it, at least. Then >>>>>>>> there's only the small problem of building an infrastructure and >>>>>>>> community like CRAN so we can have specialists writing and maintaining >>>>>>>> code...but I hope once all the tools are in place this will seem much >>>>>>>> less daunting. There certainly seems to be the right sentiment for it. >>>>>>>> >>>>>>> >>>>>>> At the very least creating and testing models would be much simpler. >>>>>>> For weeks I've been wanting to see if gmm is the same as gee by >>>>>>> fitting both models to the same dataset, but I've been putting it off >>>>>>> because I didn't want to construct the design matrices by hand for >>>>>>> such a simple question. (GMM--Generalized Method of Moments--is a >>>>>>> standard econometrics model and GEE--Generalized Estimating >>>>>>> Equations--is a standard biostatics model. They're both >>>>>>> generalizations of quasi-likelihood and appear very similar, but I >>>>>>> want to fit some models to figure out if they're exactly the same.) >>>>> >>>>> Since GMM is still in the sandbox, the interface is not very polished, >>>>> and it's missing some enhancements. I recommend asking on the mailing >>>>> list if it's not clear. >>>>> >>>>> Note GMM itself is very general and will never be a quick interactive >>>>> method. The main work will always be to define the moment conditions >>>>> (a bit similar to non-linear function estimation, optimize.leastsq). >>>>> >>>>> There are and will be special subclasses, eg. IV2SLS, that have >>>>> predefined moment conditions, but, still, it's up to the user do >>>>> construct design and instrument arrays. >>>>> And as far as I remember, the GMM/GEE package in R doesn't have a >>>>> formula interface either. >>>>> >>>> >>>> Both of the two gee packages in R I know of have formula interfaces. 
>>>> >>>> http://cran.r-project.org/web/packages/geepack/ >>>> http://cran.r-project.org/web/packages/gee/index.html >> >> This is very different from what's in GMM in statsmodels so far. The >> help file is very short, so I'm mostly guessing. >> It seems to be for (a subset) of generalized linear models with >> longitudinal/panel covariance structures. Something like this will >> eventually (once we get panel data models) ?as a special case of GMM >> in statsmodels, assuming it's similar to what I know from the >> econometrics literature. >> >> Most of the subclasses of GMM that I currently have, are focused on >> instrumental variable estimation, including non-linear regression. >> This should be expanded over time. >> >> But GMM itself is designed for subclassing by someone who wants to use >> her/his own moment conditions, as in >> http://cran.r-project.org/web/packages/gmm/index.html >> or for us to implement specific models with it. >> >> If someone wants to use it, then I have to quickly add the options for >> the kernels of the weighting matrix, which I keep postponing. >> Currently there is only a truncated, uniform kernel that assumes >> observations are order by time, but users can provide their own >> weighting function. >> >> Josef >> >>> >>> I have to look at this. I mixed up some acronyms, I meant GEL and GMM >>> http://cran.r-project.org/web/packages/gmm/index.html >>> the vignette was one of my readings, and the STATA description for GMM. >>> >>> I never really looked at GEE. (That's Skipper's private work so far.) >>> >>> Josef >>> >>>> >>>> -Chris JS >>>> >>>>> Josef >>>>> >>>>>>> >>>>>> >>>>>> Oh, it's not *that* bad. I agree, of course, that it could be better, >>>>>> but I've been using mainly Python for my work, including GMM and >>>>>> estimating equations models (mainly empirical likelihood and >>>>>> generalized maximum entropy) for the last ~two years. >>>>>> >>>>>> Skipper >>>>>> _______________________________________________ >>>>>> SciPy-User mailing list >>>>>> SciPy-User at scipy.org >>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>>> >>>>> _______________________________________________ >>>>> SciPy-User mailing list >>>>> SciPy-User at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>> >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>> >> > > just to make another point: > > Without someone adding mixed effects, hierachical, panel/longitudinal > models, and .... it will not help to have a formula interface to them. > (Thanks to Scott we will soon have survival) > I don't think I understand. I assumed that the formula framework is essentially orthogonal to the models themselves. In the sense that it should be simple to adapt a formula framework to new models. At least if they're some variety of linear model, and provided the formula framework is designed to allow for grouping syntax from the beginning. I think easy of extension to new models is a major goal, in fact, since we want it to be easy for people to contribute new models. 
-Chris JS > Josef > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Mon Aug 29 17:51:02 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 29 Aug 2011 17:51:02 -0400 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Mon, Aug 29, 2011 at 5:03 PM, Christopher Jordan-Squire wrote: > On Mon, Aug 29, 2011 at 12:13 PM, ? wrote: >> On Mon, Aug 29, 2011 at 12:59 PM, ? wrote: >>> On Mon, Aug 29, 2011 at 11:42 AM, ? wrote: >>>> On Mon, Aug 29, 2011 at 11:34 AM, Christopher Jordan-Squire >>>> wrote: >>>>> On Mon, Aug 29, 2011 at 10:27 AM, ? wrote: >>>>>> On Mon, Aug 29, 2011 at 11:10 AM, Skipper Seabold wrote: >>>>>>> On Mon, Aug 29, 2011 at 10:57 AM, Christopher Jordan-Squire >>>>>>> wrote: >>>>>>>> On Sun, Aug 28, 2011 at 2:54 PM, Skipper Seabold wrote: >>>>>>>>> On Sat, Aug 27, 2011 at 10:15 PM, Bruce Southey wrote: >>>>>>>>>> On Sat, Aug 27, 2011 at 5:06 PM, Wes McKinney wrote: >>>>>>>>>>> On Sat, Aug 27, 2011 at 5:03 PM, Jason Grout >>>>>>>>>>> wrote: >>>>>>>>>>>> On 8/27/11 1:19 PM, Christopher Jordan-Squire wrote: >>>>>>>>>>>>> This comparison might be useful to some people, so I stuck it up on a >>>>>>>>>>>>> github repo. My overall impression is that R is much stronger for >>>>>>>>>>>>> interactive data analysis. Click on the link for more details why, >>>>>>>>>>>>> which are summarized in the README file. >>>>>>>>>>>> >>>>>>>>>>>> ?From the README: >>>>>>>>>>>> >>>>>>>>>>>> "In fact, using Python without the IPython qtconsole is practically >>>>>>>>>>>> impossible for this sort of cut and paste, interactive analysis. >>>>>>>>>>>> The shell IPython doesn't allow it because it automatically adds >>>>>>>>>>>> whitespace on multiline bits of code, breaking pre-formatted code's >>>>>>>>>>>> alignment. Cutting and pasting works for the standard python shell, >>>>>>>>>>>> but then you lose all the advantages of IPython." >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> You might use %cpaste in the ipython normal shell to paste without it >>>>>>>>>>>> automatically inserting spaces: >>>>>>>>>>>> >>>>>>>>>>>> In [5]: %cpaste >>>>>>>>>>>> Pasting code; enter '--' alone on the line to stop. >>>>>>>>>>>> :if 1>0: >>>>>>>>>>>> : ? ?print 'hi' >>>>>>>>>>>> :-- >>>>>>>>>>>> hi >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Jason >>>>>>>>>>>> >>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>> SciPy-User mailing list >>>>>>>>>>>> SciPy-User at scipy.org >>>>>>>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> This strikes me as a textbook example of why we need an integrated >>>>>>>>>>> formula framework in statsmodels. I'll make a pass through when I get >>>>>>>>>>> a chance and see if there are some places where pandas would really >>>>>>>>>>> help out. >>>>>>>>>> >>>>>>>>>> We used to have a formula class is scipy.stats and I do not follow >>>>>>>>>> nipy (http://nipy.sourceforge.net/nipy/stable/index.html) as it also >>>>>>>>>> had this (extremely flexible but very hard to comprehend). It was what >>>>>>>>>> I had argued was needed ages ago for statsmodel. But it needs a >>>>>>>>>> community effort because the syntax required serves multiple >>>>>>>>>> communities with different annotations and needs. 
That is also seen >>>>>>>>>> from the different approaches taken by the stats packages from S/R, >>>>>>>>>> SAS, Genstat (and those are just are ones I have used). >>>>>>>>>> >>>>>>>>> >>>>>>>>> We have held this discussion at _great_ length multiple times on the >>>>>>>>> statsmodels list and are in the process of trying to integrate >>>>>>>>> Charlton (from Nathaniel) and/or Formula (from Jonathan / NiPy) into >>>>>>>>> the statsmodels base. >>>>>>>>> >>>>>>>>> http://statsmodels.sourceforge.net/dev/roadmap_todo.html#formula-framework >>>>>>>>> >>>>>>>>> and more recently >>>>>>>>> >>>>>>>>> https://groups.google.com/group/pystatsmodels/browse_thread/thread/a76ea5de9e96964b/fd85b80ae46c4931? >>>>>>>>> >>>>>>>>> https://github.com/statsmodels/formula >>>>>>>>> https://github.com/statsmodels/charlton >>>>>>>>> >>>>>>>>> Wes and I made some effort to go through this at SciPy. From where I >>>>>>>>> sit, I think it's difficult to disentangle the data structures from >>>>>>>>> the formula implementation, or maybe I'd just prefer to finish >>>>>>>>> tackling the former because it's much more straightforward. So I'd >>>>>>>>> like to first finish the pandas-integration branch that we've started >>>>>>>>> and then focus on the formula support. This is on my (our, I hope...) >>>>>>>>> immediate long-term goal list. Then I'd like to come back to the >>>>>>>>> community and hash out the 'rules of the game' details for formulas >>>>>>>>> after we have some code for people to play with, which promises to be >>>>>>>>> "fun." >>>>>>>>> >>>>>>>>> https://github.com/statsmodels/statsmodels/tree/pandas-integration >>>>>>>>> >>>>>>>>> FWIW, I could also improve the categorical function to be much nicer >>>>>>>>> for the given examples (ie., take a list, drop a reference category), >>>>>>>>> but I don't know that it's worth it, because it's really just a >>>>>>>>> stop-gap and ideally users shouldn't have to rely on it. Thoughts on >>>>>>>>> more stop-gap? >>>>>>>>> >>>>>>>> >>>>>>>> I want more usability, but I agree that a stop-gap probably isn't the >>>>>>>> right way to go, unless it has things we'd eventually want anyways. >>>>>>>> >>>>>>>>> If I understand Chris' concerns, I think pandas + formula will go a >>>>>>>>> long way towards bridging the gap between Python and R usability, but >>>>>>>> >>>>>>>> Yes, I agree. pandas + formulas would go a long, long way towards more >>>>>>>> usability. >>>>>>>> >>>>>>>> Though I really, really want a scatterplot smoother (i.e., lowess) in >>>>>>>> statsmodels. I use it a lot, and the final part of my R file was >>>>>>>> entirely lowess. (And, I should add, that was the part people liked >>>>>>>> best since one of the main goals of the assignment was to generate >>>>>>>> nifty pictures that could be used to summarize the data.) >>>>>>>> >>>>>>> >>>>>>> Working my way through the pull requests. Very time poor... >>>>>>> >>>>>>>>> it's a large effort and there are only a handful (at best) of people >>>>>>>>> writing code -- Wes being the only one who's more or less "full time" >>>>>>>>> as far as I can tell. The 0.4 statsmodels release should be very >>>>>>>>> exciting though, I hope. I'm looking forward to it, at least. Then >>>>>>>>> there's only the small problem of building an infrastructure and >>>>>>>>> community like CRAN so we can have specialists writing and maintaining >>>>>>>>> code...but I hope once all the tools are in place this will seem much >>>>>>>>> less daunting. There certainly seems to be the right sentiment for it. 
>>>>>>>>> >>>>>>>> >>>>>>>> At the very least creating and testing models would be much simpler. >>>>>>>> For weeks I've been wanting to see if gmm is the same as gee by >>>>>>>> fitting both models to the same dataset, but I've been putting it off >>>>>>>> because I didn't want to construct the design matrices by hand for >>>>>>>> such a simple question. (GMM--Generalized Method of Moments--is a >>>>>>>> standard econometrics model and GEE--Generalized Estimating >>>>>>>> Equations--is a standard biostatics model. They're both >>>>>>>> generalizations of quasi-likelihood and appear very similar, but I >>>>>>>> want to fit some models to figure out if they're exactly the same.) >>>>>> >>>>>> Since GMM is still in the sandbox, the interface is not very polished, >>>>>> and it's missing some enhancements. I recommend asking on the mailing >>>>>> list if it's not clear. >>>>>> >>>>>> Note GMM itself is very general and will never be a quick interactive >>>>>> method. The main work will always be to define the moment conditions >>>>>> (a bit similar to non-linear function estimation, optimize.leastsq). >>>>>> >>>>>> There are and will be special subclasses, eg. IV2SLS, that have >>>>>> predefined moment conditions, but, still, it's up to the user do >>>>>> construct design and instrument arrays. >>>>>> And as far as I remember, the GMM/GEE package in R doesn't have a >>>>>> formula interface either. >>>>>> >>>>> >>>>> Both of the two gee packages in R I know of have formula interfaces. >>>>> >>>>> http://cran.r-project.org/web/packages/geepack/ >>>>> http://cran.r-project.org/web/packages/gee/index.html >>> >>> This is very different from what's in GMM in statsmodels so far. The >>> help file is very short, so I'm mostly guessing. >>> It seems to be for (a subset) of generalized linear models with >>> longitudinal/panel covariance structures. Something like this will >>> eventually (once we get panel data models) ?as a special case of GMM >>> in statsmodels, assuming it's similar to what I know from the >>> econometrics literature. >>> >>> Most of the subclasses of GMM that I currently have, are focused on >>> instrumental variable estimation, including non-linear regression. >>> This should be expanded over time. >>> >>> But GMM itself is designed for subclassing by someone who wants to use >>> her/his own moment conditions, as in >>> http://cran.r-project.org/web/packages/gmm/index.html >>> or for us to implement specific models with it. >>> >>> If someone wants to use it, then I have to quickly add the options for >>> the kernels of the weighting matrix, which I keep postponing. >>> Currently there is only a truncated, uniform kernel that assumes >>> observations are order by time, but users can provide their own >>> weighting function. >>> >>> Josef >>> >>>> >>>> I have to look at this. I mixed up some acronyms, I meant GEL and GMM >>>> http://cran.r-project.org/web/packages/gmm/index.html >>>> the vignette was one of my readings, and the STATA description for GMM. >>>> >>>> I never really looked at GEE. (That's Skipper's private work so far.) >>>> >>>> Josef >>>> >>>>> >>>>> -Chris JS >>>>> >>>>>> Josef >>>>>> >>>>>>>> >>>>>>> >>>>>>> Oh, it's not *that* bad. I agree, of course, that it could be better, >>>>>>> but I've been using mainly Python for my work, including GMM and >>>>>>> estimating equations models (mainly empirical likelihood and >>>>>>> generalized maximum entropy) for the last ~two years. 
>>>>>>> >>>>>>> Skipper >>>>>>> _______________________________________________ >>>>>>> SciPy-User mailing list >>>>>>> SciPy-User at scipy.org >>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>>>> >>>>>> _______________________________________________ >>>>>> SciPy-User mailing list >>>>>> SciPy-User at scipy.org >>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>>> >>>>> _______________________________________________ >>>>> SciPy-User mailing list >>>>> SciPy-User at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>> >>>> >>> >> >> just to make another point: >> >> Without someone adding mixed effects, hierachical, panel/longitudinal >> models, and .... it will not help to have a formula interface to them. >> (Thanks to Scott we will soon have survival) >> > > I don't think I understand. > > I assumed that the formula framework is essentially orthogonal to the > models themselves. In the sense that it should be simple to adapt a > formula framework to new models. At least if they're some variety of > linear model, and provided the formula framework is designed to allow > for grouping syntax from the beginning. I think easy of extension to > new models is a major goal, in fact, since we want it to be easy for > people to contribute new models. We still need to program the linear algebra to find the estimator, and we need to define and calculate all the result statistics for the different models. (generic GLS won't work well because of the nobs*nobs covariance matrix, I tried a little bit in the sandbox.) As an example: mixed effects model with REML, ... y = X*b + Z*g, with X fixed regressors/effects and Z random effects. assume design matrices X and Z are already constructed. Since I don't know the statistics literature well (in contrast to econometrics panel data), I started to translate a matlab version to help me understand this. But the results don't match up, and I haven't had access to matlab for a while now. And I think now literal translation of long matlab functions doesn't really help, compared to writing from a good textbook with checking of some crucial steps. It's only 250 lines of code, but dense, and I had spent quite some time on this. The standard solution of normal equation looks still simple, but that's just the beginning and writing the tests often takes almost as much time as writing the code. My experience for the things I don't know well: It takes 2 weeks of staring at it and playing with it, and then it ends up just as a few lines (or a few hundred lines) of code. The old mixed effects model with repeated measurements (EM algorithm) based on the original formula code still sits in the sandbox. It doesn't quite work, but the formula code makes it difficult to understand, and it would require a week or five to cleanup, enhance, test, ... Since neither Skipper nor I are specifically interested (in the sense of: It is not what we know and would use ourselves), it is still waiting there. The old survival is also still sitting in the sandbox, but Scott wrote a new version without formula, I looks like it is also soon ready for a pull request, or review leading up to a pull request. (I find Scott's version much easier to read because it uses basic python and numpy data structures, instead of several layers of formula abstraction.) 
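For reference, here is a minimal sketch of just that normal-equation step (Henderson's mixed-model equations) in plain numpy. It assumes dense X and Z and, more importantly, that the variance components are already known -- estimating them is exactly the REML part under discussion and is not shown. The function name and the simple R = sigma2_e*I, G = sigma2_g*I covariance structure are illustrative assumptions, not code from statsmodels:

import numpy as np

def solve_mme(y, X, Z, sigma2_e, sigma2_g):
    # Henderson's mixed-model equations for y = X*b + Z*g with
    # R = sigma2_e * I (residual) and G = sigma2_g * I (random effect):
    #   [ X'X   X'Z          ] [b]   [ X'y ]
    #   [ Z'X   Z'Z + lam*I  ] [g] = [ Z'y ],   lam = sigma2_e / sigma2_g
    lam = sigma2_e / float(sigma2_g)
    k, q = X.shape[1], Z.shape[1]
    lhs = np.empty((k + q, k + q))
    lhs[:k, :k] = np.dot(X.T, X)
    lhs[:k, k:] = np.dot(X.T, Z)
    lhs[k:, :k] = lhs[:k, k:].T
    lhs[k:, k:] = np.dot(Z.T, Z) + lam * np.eye(q)
    rhs = np.concatenate([np.dot(X.T, y), np.dot(Z.T, y)])
    sol = np.linalg.solve(lhs, rhs)
    return sol[:k], sol[k:]  # fixed-effect estimates b, predicted random effects g

Iterating between solving these equations and updating the variance components is the slow-but-simple route mentioned further down the thread; the tests and result statistics around it are where the real work is.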
Josef > > -Chris JS > > >> Josef >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- A non-text attachment was scrubbed... Name: mixed.py Type: text/x-python Size: 15805 bytes Desc: not available URL: From questions.anon at gmail.com Mon Aug 29 18:55:50 2011 From: questions.anon at gmail.com (questions anon) Date: Tue, 30 Aug 2011 08:55:50 +1000 Subject: [SciPy-User] memory error - numpy mean - netcdf4 In-Reply-To: References: <1314379860.63640.YahooMailNeo@web161317.mail.bf1.yahoo.com> <1314387233.14273.YahooMailNeo@web161303.mail.bf1.yahoo.com> Message-ID: Thanks for all of the responses. I have tried adding in the code you mentioned (see below). I am not sure if I am putting it in the correct place? and I am now receiving another error: "UserWarning: Warning: converting a masked element to nan." Not sure if that is bringing me any closer? Any feedback will be greatly appreciated. from netCDF4 import Dataset import matplotlib.pyplot as plt import numpy as N from mpl_toolkits.basemap import Basemap import os MainFolder=r"E:/DSE_BushfireClimatologyProject/griddeddatasamples/GriddedData/T_SFC/" all_TSFC=[] for (path, dirs, files) in os.walk(MainFolder): for dir in dirs: print dir path=path+'/' for ncfile in files: if ncfile[-3:]=='.nc': ncfile=os.path.join(path,ncfile) ncfile=Dataset(ncfile, 'r+', 'NETCDF4') TSFC=ncfile.variables['T_SFC'][4::24,:,:] LAT=ncfile.variables['latitude'][:] LON=ncfile.variables['longitude'][:] #TIME=ncfile.variables['time'][:] fillvalue=ncfile.variables['T_SFC']._FillValue ncfile.close() array=N.true_divide(TSFC[0],len(TSFC)) for i in xrange(1, len(TSFC)-1,1): array=N.add(array, N.true_divide(array[i],len(TSFC))) #plot output summary stats map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33, llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i') x,y=map(*N.meshgrid(LON,LAT)) plt.title('TSFC Mean at 3pm') ticks=[-5,0,5,10,15,20,25,30,35,40,45,50] CS = map.contourf(x,y,array, cmap=plt.cm.jet) l,b,w,h =0.1,0.1,0.8,0.8 cax = plt.axes([l+w+0.025, b, 0.025, h]) plt.colorbar(CS,cax=cax, drawedges=True) plt.savefig((os.path.join(MainFolder, 'Mean.png'))) plt.show() plt.close() On Sat, Aug 27, 2011 at 10:54 AM, srean wrote: > > On Fri, Aug 26, 2011 at 2:33 PM, Phil Morefield wrote: > >> >> The formula you have written looks like you're collapsing everything into >> a single value. I think he's trying to average a bunch of 2D arrays into a >> single 2D array. >> > > You are correct, the form that I posted can be read as if it is for > updating single mean vector \mu, but you can use the same for an nd-array > trivially. Just have \mu and t as nd-arrays. m can be one too. Numpy > broadcasting will take care of the rest. > > One advantage is that it requires only a constant amount of memory for the > computation, you can even read the data in from an infinite pipe or > generator that yields a single vector or a matrix at a time (or bundles them > up m at a time). It will always be uptodate with the current estimate of the > means. In fact will work for any moment too. 
> > --srean > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sbassi at gmail.com Mon Aug 29 18:58:44 2011 From: sbassi at gmail.com (Sebastian Bassi) Date: Mon, 29 Aug 2011 19:58:44 -0300 Subject: [SciPy-User] density map? Message-ID: Hello, I have a 2-D Numpy array with intensity data. I'd like to plot it like this http://crocdoc.ifas.ufl.edu/images/posters/ecologyofgatorholes/9_figure6.gif For each value in a position, it will be colored with a color, if the value is higher the color will be more intense (maybe from blue to red). All examples I found on http://www.scipy.org/Cookbook/Matplotlib/ were using functions instead of data from a matrix/array. Any idea? Best, SB. From josef.pktd at gmail.com Mon Aug 29 19:03:14 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 29 Aug 2011 19:03:14 -0400 Subject: [SciPy-User] density map? In-Reply-To: References: Message-ID: On Mon, Aug 29, 2011 at 6:58 PM, Sebastian Bassi wrote: > Hello, > > I have a 2-D Numpy array with intensity data. > I'd like to plot it like this > http://crocdoc.ifas.ufl.edu/images/posters/ecologyofgatorholes/9_figure6.gif > For each value in a position, it will be colored with a color, if the > value is higher the color will be more intense (maybe from blue to > red). > All examples I found on http://www.scipy.org/Cookbook/Matplotlib/ were > using functions instead of data from a matrix/array. > Any idea? scipy.stats.gaussian_kde https://picasaweb.google.com/106983885143680349926/Joepy#5611180522655961714 or some other non-parametric density estimator Josef > Best, > SB. > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Mon Aug 29 19:05:29 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 29 Aug 2011 19:05:29 -0400 Subject: [SciPy-User] density map? In-Reply-To: References: Message-ID: On Mon, Aug 29, 2011 at 7:03 PM, wrote: > On Mon, Aug 29, 2011 at 6:58 PM, Sebastian Bassi wrote: >> Hello, >> >> I have a 2-D Numpy array with intensity data. >> I'd like to plot it like this >> http://crocdoc.ifas.ufl.edu/images/posters/ecologyofgatorholes/9_figure6.gif >> For each value in a position, it will be colored with a color, if the >> value is higher the color will be more intense (maybe from blue to >> red). >> All examples I found on http://www.scipy.org/Cookbook/Matplotlib/ were >> using functions instead of data from a matrix/array. >> Any idea? > > scipy.stats.gaussian_kde > > https://picasaweb.google.com/106983885143680349926/Joepy#5611180522655961714 > > or some other non-parametric density estimator That's not the right answer, I guess if you have already intensities, then you don't need to estimate the density anymore. Is it interpolation to a meshgrid that you need? Josef > > Josef > > >> Best, >> SB. 
>> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > From njs at pobox.com Mon Aug 29 19:19:55 2011 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 29 Aug 2011 16:19:55 -0700 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Mon, Aug 29, 2011 at 2:51 PM, wrote: > As an example: ? mixed effects model with REML, ... > > y = X*b + Z*g, with X fixed regressors/effects and Z random effects. > assume design matrices X and Z are already constructed. > > Since I don't know the statistics literature well (in contrast to > econometrics panel data), I started to translate a matlab version to > help me understand this. > But the results don't match up, and I haven't had access to matlab for > a while now. > And I think now literal translation of long matlab functions doesn't > really help, compared to writing from a good textbook with checking of > some crucial steps. I found the "vignettes" that Doug Bates wrote alongside the lme4 package to be pretty good descriptions of the relevant implementation tricks: http://cran.r-project.org/web/packages/lme4/index.html -- Nathaniel From fspaolo at gmail.com Mon Aug 29 19:35:37 2011 From: fspaolo at gmail.com (Fernando Paolo) Date: Mon, 29 Aug 2011 16:35:37 -0700 Subject: [SciPy-User] density map? In-Reply-To: References: Message-ID: If what you want is to plot the intensities ("on a grid"), and you have a 2-D Numpy array (`data`) where the columns are (say) `x`, `y`, `z`, you can do: import numpy as np import matplotlib.pyplot as plt from matplotlib.mlab import griddata x = data[:,0] y = data[:,1] z = data[:,2] # define the grid: nx, ny == number of grid points xi = np.linspace(x.min(), x.max(), nx) yi = np.linspace(y.min(), y.max(), ny) # interpolate your data to a regular grid Zi = ml.griddata(x, y, z, xi, yi) # plot a continuous surface plt.contourf(xi, yi, Zi, 15, cmap=plt.cm.jet) plt.colorbar() plt.show() you can check: http://www.scipy.org/Cookbook/Matplotlib/Gridding_irregularly_spaced_data -Fernando On Mon, Aug 29, 2011 at 4:05 PM, wrote: > On Mon, Aug 29, 2011 at 7:03 PM, ? wrote: >> On Mon, Aug 29, 2011 at 6:58 PM, Sebastian Bassi wrote: >>> Hello, >>> >>> I have a 2-D Numpy array with intensity data. >>> I'd like to plot it like this >>> http://crocdoc.ifas.ufl.edu/images/posters/ecologyofgatorholes/9_figure6.gif >>> For each value in a position, it will be colored with a color, if the >>> value is higher the color will be more intense (maybe from blue to >>> red). >>> All examples I found on http://www.scipy.org/Cookbook/Matplotlib/ were >>> using functions instead of data from a matrix/array. >>> Any idea? >> >> scipy.stats.gaussian_kde >> >> https://picasaweb.google.com/106983885143680349926/Joepy#5611180522655961714 >> >> or some other non-parametric density estimator > > That's not the right answer, I guess if you have already intensities, > then you don't need to estimate the density anymore. > > Is it interpolation to a meshgrid that you need? > > Josef > >> >> Josef >> >> >>> Best, >>> SB. 
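One small inconsistency in the snippet above: griddata is imported directly but then called as ml.griddata. A consistent version of the interpolation and plotting step, under the same assumptions (columns x, y, z and grid vectors xi, yi already defined as shown):

from matplotlib.mlab import griddata
import matplotlib.pyplot as plt

Zi = griddata(x, y, z, xi, yi)          # scattered points onto the regular grid
plt.contourf(xi, yi, Zi, 15, cmap=plt.cm.jet)
plt.colorbar()
plt.show()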
>>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Fernando Paolo Institute of Geophysics & Planetary Physics Scripps Institution of Oceanography University of California, San Diego 9500 Gilman Drive La Jolla, CA 92093-0225 From bsouthey at gmail.com Mon Aug 29 20:57:03 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Mon, 29 Aug 2011 19:57:03 -0500 Subject: [SciPy-User] R vs Python for simple interactive data analysis In-Reply-To: References: <4E595BAF.1080509@creativetrax.com> Message-ID: On Mon, Aug 29, 2011 at 6:19 PM, Nathaniel Smith wrote: > On Mon, Aug 29, 2011 at 2:51 PM, ? wrote: >> As an example: ? mixed effects model with REML, ... >> >> y = X*b + Z*g, with X fixed regressors/effects and Z random effects. >> assume design matrices X and Z are already constructed. >> >> Since I don't know the statistics literature well (in contrast to >> econometrics panel data), I started to translate a matlab version to >> help me understand this. >> But the results don't match up, and I haven't had access to matlab for >> a while now. >> And I think now literal translation of long matlab functions doesn't >> really help, compared to writing from a good textbook with checking of >> some crucial steps. > > I found the "vignettes" that Doug Bates wrote alongside the lme4 > package to be pretty good descriptions of the relevant implementation > tricks: http://cran.r-project.org/web/packages/lme4/index.html > > -- Nathaniel > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > Lots of memories... As Josef said, you need the formula to create: 1) The design matrix of the fixed effects - nothing special 2) The design matrix for the random effects - somewhat interesting 3) The variance-covariance structure of the random effects - 'lots of fun' 4) The variance-covariance structure of the residual effects - 'lots of fun' The combination of 3) and 4) addresses a huge range of models but it gets hard really quickly. That excludes methodology: 1) Maximum likelihood and restricted maximum likelihood are done via iterative MIVQUE in the file Josef provided. Basically you are iterating the mixed model equations so somewhat easy but rather slow. 2) R (Bates' with Lindstrom or Pinheiro) and SAS use second derivative methods (here Mixed procedure with REML or ML) - probably the fast approach 3) ASReml uses average information REML - neat approach but probably rather uncommon for the vast majority of people. But I don' recall Jonathan's approach with his formula code. Bruce From dbigbear at gmail.com Tue Aug 30 01:42:57 2011 From: dbigbear at gmail.com (Xiong Deng) Date: Tue, 30 Aug 2011 13:42:57 +0800 Subject: [SciPy-User] How to install cpufreq-selector Message-ID: Hi, I am installing numpy, scipy, atlas which requires disabling CPU throttling. http://math-atlas.sourceforge.net/atlas_install/ OS: * LINUX: Linux XXX 2.6.9_5-9-0-0 #1 SMP Wed Jun 23 14:03:19 CST 2010 x86_64 x86_64 x86_64 GNU/Linux Red Hat Enterprise Linux AS release 4 (Nahant Update 3) It is very strange that cpufreq-selector seems not exist in the system...I tried to install it myself, but cannot find any source code or install package of it on the internet... 
So how can get the CPU throttling disabled and how can I have cpufreq-selector installed ? Tried to manipulate the file /proc/acpi/processor/CPU/throttling but it does not exist. Thank you John -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Aug 30 02:28:02 2011 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 30 Aug 2011 01:28:02 -0500 Subject: [SciPy-User] How to install cpufreq-selector In-Reply-To: References: Message-ID: On Tue, Aug 30, 2011 at 00:42, Xiong Deng wrote: > Hi, > > I am installing numpy, scipy, atlas which requires disabling CPU throttling. > > http://math-atlas.sourceforge.net/atlas_install/ > > OS: > * LINUX: Linux XXX 2.6.9_5-9-0-0 #1 SMP Wed Jun 23 14:03:19 CST 2010 > x86_64 x86_64 x86_64 GNU/Linux > Red Hat Enterprise Linux AS release 4 (Nahant Update 3) > > It is very strange that cpufreq-selector seems not exist in the system...I > tried to install it myself, but cannot find any source code or install > package of it on the internet... Googling suggests that it may be available on some Red Hat versions as cpufreq-utils: http://forums.fedoraforum.org/archive/index.php/t-92619.html Or it simply may not exist on RHEL4. > So how can get the CPU throttling disabled and how can I have > cpufreq-selector installed ? > > Tried to manipulate the file > /proc/acpi/processor/CPU/throttling > > but it does not exist. The above link has other locations on the filesystem where these settings may be manipulated directly. It can vary from version to version of the Linux kernel and even the configuration of the particular build of the kernel. You may need to ask on the RHEL4 support mailing lists to get better information. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From lists at hilboll.de Tue Aug 30 05:29:34 2011 From: lists at hilboll.de (lists at hilboll.de) Date: Tue, 30 Aug 2011 11:29:34 +0200 Subject: [SciPy-User] 2d spline interpolation with periodic boundaries Message-ID: <6b2c33c620e2aeb2757283797f6223be.squirrel@srv2.s4y.tournesol-consulting.eu> Hi there, I want to do 2d interpolation with periodic boundary conditions. Basically, I have data on a global (as in Earth) rectangular (as in degrees latitude/longitude) grid, and I'd like to interpolate to arbitrary points on the Earth's surface. So I need periodic boundary conditions in the zonal direction. Now, I'd like to look into ``scipy.interpolate.interp2d`` and ``scipy.interpolate.RectBivariateSpline``. Now my question is: Is it enough to give an Western boundary at 360? in addition to the Eastern values at 0? to really give me period boundary conditions? Another question is: How does RectBivariateSpline work? There's not much info in the docs as to what the function actually does, math-wise. Any help is greatly appreciated :) Cheers, Andreas. From lists at hilboll.de Tue Aug 30 06:01:06 2011 From: lists at hilboll.de (Andreas H.) Date: Tue, 30 Aug 2011 12:01:06 +0200 Subject: [SciPy-User] Calculation of weights depending on area Message-ID: Hi, again a question coming from analysis of geodata. Say, I have 3d (lat/lon/z) data, in the easiest case on a rectangular grid. Now I would like to re-grid these data to a new (again rectangular, in the simplest case) grid by calculating the volume-weighted mean of the original grid. 
So for each cell of the new grid, the algorithm should take the volume-weighted average of those grid cells from the first grid which "are part of" the new cell. Is there any algorithm in SciPy to do this? If not, do you have any suggestion on where to start? Perhaps there's some library from a more low-level language that could be wrapped? Any help is greatly appreciated :) Cheers, Andreas. From ralf.gommers at googlemail.com Tue Aug 30 08:11:18 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 30 Aug 2011 14:11:18 +0200 Subject: [SciPy-User] RectBivariateSpline In-Reply-To: References: Message-ID: On Sun, Aug 28, 2011 at 1:48 PM, ali franco wrote: > Can RectBivariateSpline be used to calculated derivatives and integrals? RectBivariateSpline has an "integral" method that should do what it says. bisplev should be able to evaluate derivates for you, you can feed it the RectBivariateSpline.tck attribute (which I just notice is undocumented). It may be useful for BivariateSpline to grow a "derivative" method that does this. A patch would be very welcome. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From Jerome.Kieffer at esrf.fr Tue Aug 30 09:06:38 2011 From: Jerome.Kieffer at esrf.fr (ESRF) Date: Tue, 30 Aug 2011 15:06:38 +0200 Subject: [SciPy-User] 2d spline interpolation with periodic boundaries In-Reply-To: <6b2c33c620e2aeb2757283797f6223be.squirrel@srv2.s4y.tournesol-consulting.eu> References: <6b2c33c620e2aeb2757283797f6223be.squirrel@srv2.s4y.tournesol-consulting.eu> Message-ID: <20110830150638.5510422d.Jerome.Kieffer@esrf.fr> On Tue, 30 Aug 2011 11:29:34 +0200 lists at hilboll.de wrote: > Another question is: How does RectBivariateSpline work? There's not much > info in the docs as to what the function actually does, math-wise. > > Any help is greatly appreciated :) Hi, it is a wrapper for "FITPACK" from Dierckx http://www.netlib.org/dierckx/ Have a look at Fitpack's documentation to understand how it works (control points have to be ordered and other oddities ...) Cheers -- ESRF From pav at iki.fi Tue Aug 30 09:10:01 2011 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 30 Aug 2011 13:10:01 +0000 (UTC) Subject: [SciPy-User] Calculation of weights depending on area References: Message-ID: Hi, Tue, 30 Aug 2011 12:01:06 +0200, Andreas H. wrote: > again a question coming from analysis of geodata. Say, I have 3d > (lat/lon/z) data, in the easiest case on a rectangular grid. Now I would > like to re-grid these data to a new (again rectangular, in the simplest > case) grid by calculating the volume-weighted mean of the original grid. [clip] Some suggestions: (i) For a rectangular grid, the operation in 3D seems to be a tensor product of 1D operations. If so, you can write it as follows data = regrid_volume_1d(x, x_new, data, axis=0) data = regrid_volume_1d(y, y_new, data, axis=1) data = regrid_volume_1d(z, z_new, data, axis=2) So it would be enough to first write a 1D version of the algorithm, and make it such that it can operate on one axis at a time. (ii) An implementation of the 1D version can probably done first in Python. Because it will (for 3D data) operate across slices with many points, the result should be fast enough. 
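A rough sketch of the 1D step along the lines Pauli describes, assuming the cell edges (not centres) of both grids are known and the new grid lies inside the old one; the name regrid_volume_1d is taken from the pseudo-code above. For clarity it builds the full overlap matrix between old and new cells, whereas numpy.searchsorted would be the memory-friendlier route:

import numpy as np

def regrid_volume_1d(edges_old, edges_new, values, axis=0):
    edges_old = np.asarray(edges_old, dtype=float)
    edges_new = np.asarray(edges_new, dtype=float)
    # Length of the overlap between every new cell and every old cell.
    lo = np.maximum(edges_new[:-1, None], edges_old[None, :-1])
    hi = np.minimum(edges_new[1:, None], edges_old[None, 1:])
    overlap = np.maximum(hi - lo, 0.0)                 # shape (m_new, n_old)
    # Normalise so each new cell averages the old cells it covers.
    weights = overlap / overlap.sum(axis=1)[:, None]
    # Contract the old-cell axis of `values` against the weight matrix.
    values = np.swapaxes(values, 0, axis)
    out = np.tensordot(weights, values, axes=(1, 0))
    return np.swapaxes(out, 0, axis)

# applied once per dimension, as suggested:
# data = regrid_volume_1d(x_edges, x_edges_new, data, axis=0)
# data = regrid_volume_1d(y_edges, y_edges_new, data, axis=1)
# data = regrid_volume_1d(z_edges, z_edges_new, data, axis=2)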
A function that may be useful here: numpy.searchsorted -- Pauli Virtanen From Dharhas.Pothina at twdb.state.tx.us Tue Aug 30 09:44:27 2011 From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina) Date: Tue, 30 Aug 2011 08:44:27 -0500 Subject: [SciPy-User] Calculate surface area and volume from intersection of volume and plane. Message-ID: <4E5CA2EB0200009B0003DA97@GWWEB.twdb.state.tx.us> Hi, We have an old ArcGIS aml script that we are trying to replace. The original script takes the input from an ArcGIS TIN model (basically a 2D delaunay triangulation of irregular xy data points with z's defining the depth at each xy) and calculates the surface area and volume of the lake at different elevations (i.e. z cut planes) >From my googling it looks like I have options for the delaunay triangulation using scipy, matplotlib, cgal or mayavi. I'm not sure how to do the surface area and volume calculations at various z planes once I have the triangulation. I would appreciate any pointers. thanks, - dharhas -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Aug 30 09:47:22 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 30 Aug 2011 07:47:22 -0600 Subject: [SciPy-User] Calculation of weights depending on area In-Reply-To: References: Message-ID: On Tue, Aug 30, 2011 at 4:01 AM, Andreas H. wrote: > Hi, > > again a question coming from analysis of geodata. Say, I have 3d > (lat/lon/z) data, in the easiest case on a rectangular grid. Now I would > like to re-grid these data to a new (again rectangular, in the simplest > case) grid by calculating the volume-weighted mean of the original grid. > > So for each cell of the new grid, the algorithm should take the > volume-weighted average of those grid cells from the first grid which "are > part of" the new cell. > > Is there any algorithm in SciPy to do this? If not, do you have any > suggestion on where to start? Perhaps there's some library from a more > low-level language that could be wrapped? > > Any help is greatly appreciated :) > > Sounds vaguely like the drizzle algorithm from astronomy. Another approach would be to subsample and convolve, or smooth and resample. Choosing a suitable method will depend on the smoothness/sampling of the original data. For the original approach, if your sample points are on an evenly spaced grid you can use an fft approach. The sampled data gives rise to a periodic spectrum, multiplication by the transform of a rectangular spot gives the data convolved by 'pillars', essentially subsampling in the Fourier Domain. Or you can compute the overlaps as you originally proposed. I don't know of any software for that but someone is bound to have done it before. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Aug 30 10:42:33 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 30 Aug 2011 08:42:33 -0600 Subject: [SciPy-User] Calculation of weights depending on area In-Reply-To: References: Message-ID: On Tue, Aug 30, 2011 at 7:47 AM, Charles R Harris wrote: > > > On Tue, Aug 30, 2011 at 4:01 AM, Andreas H. wrote: > >> Hi, >> >> again a question coming from analysis of geodata. Say, I have 3d >> (lat/lon/z) data, in the easiest case on a rectangular grid. Now I would >> like to re-grid these data to a new (again rectangular, in the simplest >> case) grid by calculating the volume-weighted mean of the original grid. 
>> >> So for each cell of the new grid, the algorithm should take the >> volume-weighted average of those grid cells from the first grid which "are >> part of" the new cell. >> >> Is there any algorithm in SciPy to do this? If not, do you have any >> suggestion on where to start? Perhaps there's some library from a more >> low-level language that could be wrapped? >> >> Any help is greatly appreciated :) >> >> > Sounds vaguely like the drizzle algorithm from astronomy. Another approach > would be to subsample and convolve, or smooth and resample. Choosing a > suitable method will depend on the smoothness/sampling of the original data. > > For the original approach, if your sample points are on an evenly spaced > grid you can use an fft approach. The sampled data gives rise to a periodic > spectrum, multiplication by the transform of a rectangular spot gives the > data convolved by 'pillars', essentially subsampling in the Fourier Domain. > > Or you can compute the overlaps as you originally proposed. I don't know of > any software for that but someone is bound to have done it before. > > I should mention that if you have a rectangular grid and the overlap is with a rectangle of the same shape as the basic grid, then I think bilinear interpolation will do what you want. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Aug 30 14:41:33 2011 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 30 Aug 2011 13:41:33 -0500 Subject: [SciPy-User] Calculate surface area and volume from intersection of volume and plane. In-Reply-To: <4E5CA2EB0200009B0003DA97@GWWEB.twdb.state.tx.us> References: <4E5CA2EB0200009B0003DA97@GWWEB.twdb.state.tx.us> Message-ID: On Tue, Aug 30, 2011 at 08:44, Dharhas Pothina wrote: > Hi, > > We have an old ArcGIS aml script that we are trying to replace. The original > script takes the input from an ArcGIS TIN model (basically a 2D delaunay > triangulation of irregular xy data points with z's defining the depth at > each xy) and calculates the surface area and volume of the lake at different > elevations (i.e. z cut planes) > > From my googling it looks like I have options for the delaunay triangulation > using scipy, matplotlib, cgal or mayavi. I'm not sure how to do the surface > area and volume calculations at various z planes once I have the > triangulation. I would appreciate any pointers. Your previous email came through fine. There is no need to repeat it. It's relatively straightforward to find the polygon of intersection between the Z plane and the TIN. Just loop through the triangles and check each of the 3 sides to see if one end is above while the other end is below. Simple geometry determines the point of contact of that side. Join up the two sides into a line segment and add that to your list of line segments. The line segments join up into an irregular polygon, probably with holes. The area of this polygon can be found by a formula that you can Google for. E.g.: http://paulbourke.net/geometry/polyarea/ The volume can be calculated similarly. You can break up the volume into triangular prisms projecting up from each of the triangles in the TIN below the Z-plane. You can calculate the volume of each of those prisms easily. Just be sure to properly take into account the triangles that intersect the Z-plane. You only want to count the part that's below the Z-plane. 
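A rough sketch along these lines, assuming the TIN is available as an (N, 3) array of x, y, bed elevation plus an (M, 3) array of triangle vertex indices (for instance from a Delaunay triangulation). It takes a slightly different route from assembling the shoreline polygon: each triangle is clipped against the plane and the clipped pieces are summed directly, which deals with the partially submerged triangles in the same pass:

import numpy as np

def clip_triangle_below(tri_xyz, z0):
    # Part of a single TIN triangle lying at or below the plane z = z0,
    # returned as an ordered list of (x, y, z) vertices.  Because z varies
    # linearly over the triangle, the extra vertices sit exactly where an
    # edge crosses the plane.
    poly = []
    for i in range(3):
        p, q = tri_xyz[i], tri_xyz[(i + 1) % 3]
        if p[2] <= z0:
            poly.append(p)
        if (p[2] - z0) * (q[2] - z0) < 0:           # edge crosses the plane
            t = (z0 - p[2]) / (q[2] - p[2])
            poly.append(p + t * (q - p))
    return poly

def area_volume_at_stage(points, triangles, z0):
    # points:    (N, 3) array of x, y, bed elevation
    # triangles: (M, 3) vertex indices into `points`
    points = np.asarray(points, dtype=float)
    area = 0.0
    volume = 0.0
    for tri in triangles:
        poly = clip_triangle_below(points[tri], z0)
        # Fan-triangulate the clipped piece; the plan area and the integral
        # of (z0 - z) over each fan triangle are exact since z is linear.
        for k in range(1, len(poly) - 1):
            a, b, c = poly[0], poly[k], poly[k + 1]
            tri_area = 0.5 * abs((b[0] - a[0]) * (c[1] - a[1])
                                 - (c[0] - a[0]) * (b[1] - a[1]))
            area += tri_area
            volume += tri_area * (z0 - (a[2] + b[2] + c[2]) / 3.0)
    return area, volume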
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From deil.christoph at googlemail.com Tue Aug 30 16:15:17 2011 From: deil.christoph at googlemail.com (Christoph Deil) Date: Tue, 30 Aug 2011 22:15:17 +0200 Subject: [SciPy-User] Unexpected covariance matrix from scipy.optimize.curve_fit Message-ID: <089E4569-0C53-4FC9-9B0F-353C4EF64478@googlemail.com> I noticed that scipy.optimize.curve_fit returns parameter errors that don't scale with sigma, the standard deviation of ydata, as I expected. Here is a code snippet to illustrate my point, which fits a straight line to five data points: import numpy as np from scipy.optimize import curve_fit x = np.arange(5) y = np.array([1, -2, 1, -2, 1]) sigma = np.array([1, 2, 1, 2, 1]) def f(x, a, b): return a + b * x popt, pcov = curve_fit(f, x, y, p0=(0.42, 0.42), sigma=sigma) perr = np.sqrt(pcov.diagonal()) print('*** sigma = {0} ***'.format(sigma)) print('popt: {0}'.format(popt)) print('perr: {0}'.format(perr)) I get the following result: *** sigma = [1 2 1 2 1] *** popt: [ 5.71428536e-01 1.19956213e-08] perr: [ 0.93867933 0.40391117] Increasing sigma by a factor of 10, sigma = 10 * np.array([1, 2, 1, 2, 1]) I get the following result: *** sigma = [10 20 10 20 10] *** popt: [ 5.71428580e-01 -2.27625699e-09] perr: [ 0.93895295 0.37079075] The best-fit values stayed the same as expected. But the error on the slope b decreased by 8% (the error on the offset a didn't change much) I would have expected fit parameter errors to increase with increasing errors on the data!? Is this a bug? Looking at the source code I see that scipy.optimize.curve_fit multiplies the pcov obtained from scipy.optimize.leastsq by a factor s_sq: https://github.com/scipy/scipy/blob/master/scipy/optimize/minpack.py#L438 if (len(ydata) > len(p0)) and pcov is not None: s_sq = (func(popt, *args)**2).sum()/(len(ydata)-len(p0)) pcov = pcov * s_sq If so is it possible to add an explanation to http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html that pcov is multiplied with this s_sq factor and why that will give correct errors? After I noticed this issue I saw that this s_sq factor is mentioned in the cov_x return parameter description of leastsq, but I think it should be explained in curve_fit where it is applied, maybe leaving a reference in the cov_x leastsq description. Also it would be nice to mention the full_output option in the curve_fit docu, I only realized after looking at the source code that this was possible. Christoph -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue Aug 30 17:25:21 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 30 Aug 2011 17:25:21 -0400 Subject: [SciPy-User] Unexpected covariance matrix from scipy.optimize.curve_fit In-Reply-To: <089E4569-0C53-4FC9-9B0F-353C4EF64478@googlemail.com> References: <089E4569-0C53-4FC9-9B0F-353C4EF64478@googlemail.com> Message-ID: On Tue, Aug 30, 2011 at 4:15 PM, Christoph Deil wrote: > I noticed that scipy.optimize.curve_fit returns parameter errors that don't > scale with sigma, the standard deviation of ydata, as I expected. 
> Here is a code snippet to illustrate my point, which fits a straight line to > five data points: > import numpy as np > from scipy.optimize import curve_fit > x = np.arange(5) > y = np.array([1, -2, 1, -2, 1]) > sigma = np.array([1,? 2, 1,? 2, 1]) > def f(x, a, b): > ? ? return a + b * x > popt, pcov = curve_fit(f, x, y, p0=(0.42, 0.42), sigma=sigma) > perr = np.sqrt(pcov.diagonal()) > print('*** sigma = {0} ***'.format(sigma)) > print('popt: {0}'.format(popt)) > print('perr: {0}'.format(perr)) > I get the following result: > *** sigma = [1 2 1 2 1] *** > popt: [? 5.71428536e-01 ? 1.19956213e-08] > perr: [ 0.93867933? 0.40391117] > Increasing sigma by a factor of 10, > sigma = 10 * np.array([1,? 2, 1,? 2, 1]) > I get the following result: > *** sigma = [10 20 10 20 10] *** > popt: [? 5.71428580e-01? -2.27625699e-09] > perr: [ 0.93895295? 0.37079075] > The best-fit values stayed the same as expected. > But the error on the slope b?decreased by 8% (the error on the offset a > didn't change much) > I would have expected fit parameter errors to increase with increasing > errors on the data!? > Is this a bug? No bug in the formulas. I tested all of them when curve_fit was added. However in your example the numerical cov lacks quite a bit of precision. Trying your example with different starting values, I get a 0.05 difference in your perr (std of parameter estimates). Trying smaller xtol and ftol doesn't change anything. (?) Since it's linear >>> import scikits.statsmodels.api as sm >>> x = np.arange(5.) >>> y = np.array([1, -2, 1, -2, 1.]) >>> sigma = np.array([1, 2, 1, 2, 1.]) >>> res = sm.WLS(y, sm.add_constant(x, prepend=True), weights=1./sigma**2).fit() >>> res.params array([ 5.71428571e-01, 1.11022302e-16]) >>> res.bse array([ 0.98609784, 0.38892223]) >>> res = sm.WLS(y, sm.add_constant(x, prepend=True), weights=1./(sigma*10)**2).fit() >>> res.params array([ 5.71428571e-01, 1.94289029e-16]) >>> res.bse array([ 0.98609784, 0.38892223]) rescaling doesn't change parameter estimates nor perr Josef > Looking at the source code I see that scipy.optimize.curve_fit multiplies > the pcov obtained from scipy.optimize.leastsq by a factor s_sq: > https://github.com/scipy/scipy/blob/master/scipy/optimize/minpack.py#L438 > > ????if (len(ydata) > len(p0)) and pcov is not None: > ????????s_sq = (func(popt, *args)**2).sum()/(len(ydata)-len(p0)) > ????????pcov = pcov * s_sq > > If so is it possible to add an explanation to > http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html > that pcov is multiplied with this s_sq factor and why that will give correct > errors? > After I noticed this issue I saw that this s_sq factor is mentioned in the > cov_x return parameter description of leastsq, > but I think it should be explained in curve_fit where it is applied, maybe > leaving a reference in the cov_x leastsq description. > > Also it would be nice to mention the full_output option in the curve_fit > docu, I only realized after looking at the source code that this was > possible. 
> Christoph > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From charlesr.harris at gmail.com Tue Aug 30 23:19:36 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 30 Aug 2011 21:19:36 -0600 Subject: [SciPy-User] Unexpected covariance matrix from scipy.optimize.curve_fit In-Reply-To: <089E4569-0C53-4FC9-9B0F-353C4EF64478@googlemail.com> References: <089E4569-0C53-4FC9-9B0F-353C4EF64478@googlemail.com> Message-ID: On Tue, Aug 30, 2011 at 2:15 PM, Christoph Deil < deil.christoph at googlemail.com> wrote: > I noticed that scipy.optimize.curve_fit returns parameter errors that don't > scale with sigma, the standard deviation of ydata, as I expected. > > Here is a code snippet to illustrate my point, which fits a straight line > to five data points: > import numpy as np > from scipy.optimize import curve_fit > x = np.arange(5) > y = np.array([1, -2, 1, -2, 1]) > sigma = np.array([1, 2, 1, 2, 1]) > def f(x, a, b): > return a + b * x > popt, pcov = curve_fit(f, x, y, p0=(0.42, 0.42), sigma=sigma) > perr = np.sqrt(pcov.diagonal()) > print('*** sigma = {0} ***'.format(sigma)) > print('popt: {0}'.format(popt)) > print('perr: {0}'.format(perr)) > > I get the following result: > *** sigma = [1 2 1 2 1] *** > popt: [ 5.71428536e-01 1.19956213e-08] > perr: [ 0.93867933 0.40391117] > > Increasing sigma by a factor of 10, > sigma = 10 * np.array([1, 2, 1, 2, 1]) > I get the following result: > *** sigma = [10 20 10 20 10] *** > popt: [ 5.71428580e-01 -2.27625699e-09] > perr: [ 0.93895295 0.37079075] > > The best-fit values stayed the same as expected. > But the error on the slope b decreased by 8% (the error on the offset a > didn't change much) > I would have expected fit parameter errors to increase with increasing > errors on the data!? > Is this a bug? > > Looking at the source code I see that scipy.optimize.curve_fit multiplies > the pcov obtained from scipy.optimize.leastsq by a factor s_sq: > https://github.com/scipy/scipy/blob/master/scipy/optimize/minpack.py#L438 > > if (len(ydata) > len(p0)) and pcov is not None: > s_sq = (func(popt, *args)**2).sum()/(len(ydata)-len(p0)) > pcov = pcov * s_sq > > If so is it possible to add an explanation to > > http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html > that pcov is multiplied with this s_sq factor and why that will give > correct errors? > > After I noticed this issue I saw that this s_sq factor is mentioned in the > cov_x return parameter description of leastsq, > but I think it should be explained in curve_fit where it is applied, maybe > leaving a reference in the cov_x leastsq description. > > Also it would be nice to mention the full_output option in the curve_fit > docu, I only realized after looking at the source code that this was > possible. > > Five points, minus two parameters, doesn't give you much accuracy in estimating the variance, look at the \Chi^2 distributionfor three degrees of freedom. Generally, you would like a few hundred points for this sort of thing. Note that the leastsq documentation about the cov is incorrect, it needs to be multiplied by the variance fo the residuals, not the standard deviation. Not to say that there isn't a bug here, just that the evidence is thin. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
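For readers who do want errors that scale with the supplied sigma, one workaround is simply to divide the s_sq factor quoted from minpack.py back out of pcov; a sketch using the example from the first message:

import numpy as np
from scipy.optimize import curve_fit

x = np.arange(5)
y = np.array([1., -2., 1., -2., 1.])
sigma = np.array([1., 2., 1., 2., 1.])

def f(x, a, b):
    return a + b * x

popt, pcov = curve_fit(f, x, y, p0=(0.42, 0.42), sigma=sigma)

# curve_fit rescales the covariance by s_sq = chi2 / (N - p), which is why
# the reported errors do not change when sigma is multiplied by a constant.
# Dividing that factor back out treats sigma as absolute errors instead.
resid = (y - f(x, *popt)) / sigma
s_sq = (resid ** 2).sum() / (len(y) - len(popt))
pcov_absolute = pcov / s_sq
print(np.sqrt(np.diag(pcov_absolute)))

With the rescaling undone, doubling sigma doubles the reported parameter errors, as one would expect from absolute measurement errors.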
From alacast at gmail.com  Wed Aug 31 06:30:04 2011
From: alacast at gmail.com (Alacast)
Date: Wed, 31 Aug 2011 11:30:04 +0100
Subject: [SciPy-User] SciPy-User Digest, Vol 96, Issue 55
In-Reply-To:
References:
Message-ID:

Hilbert transform: Padding with zeros to the next power of 2 sped it up
greatly. Thanks!
Is there any reason hilbert doesn't do that automatically, then remove the
padding before returning the analytic signal?
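A sketch of the padding workaround for the envelope calculation, done by hand since hilbert itself does not pad automatically; the zeros do perturb the analytic signal a little near the ends, which is worth keeping in mind:

import numpy as np
from scipy.signal import hilbert

def envelope(x):
    # Pad to the next power of two so the FFT length has no large prime
    # factors, then drop the padded part of the analytic signal again.
    n = len(x)
    nfft = 1 << int(np.ceil(np.log2(n)))
    return np.abs(hilbert(x, N=nfft)[:n])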
> >>> > >>> Josef > >>> > >>>> > >>>> -Chris JS > >>>> > >>>>> Josef > >>>>> > >>>>>>> > >>>>>> > >>>>>> Oh, it's not *that* bad. I agree, of course, that it could be > better, > >>>>>> but I've been using mainly Python for my work, including GMM and > >>>>>> estimating equations models (mainly empirical likelihood and > >>>>>> generalized maximum entropy) for the last ~two years. > >>>>>> > >>>>>> Skipper > >>>>>> _______________________________________________ > >>>>>> SciPy-User mailing list > >>>>>> SciPy-User at scipy.org > >>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user > >>>>>> > >>>>> _______________________________________________ > >>>>> SciPy-User mailing list > >>>>> SciPy-User at scipy.org > >>>>> http://mail.scipy.org/mailman/listinfo/scipy-user > >>>>> > >>>> _______________________________________________ > >>>> SciPy-User mailing list > >>>> SciPy-User at scipy.org > >>>> http://mail.scipy.org/mailman/listinfo/scipy-user > >>>> > >>> > >> > > > > just to make another point: > > > > Without someone adding mixed effects, hierachical, panel/longitudinal > > models, and .... it will not help to have a formula interface to them. > > (Thanks to Scott we will soon have survival) > > > > I don't think I understand. > > I assumed that the formula framework is essentially orthogonal to the > models themselves. In the sense that it should be simple to adapt a > formula framework to new models. At least if they're some variety of > linear model, and provided the formula framework is designed to allow > for grouping syntax from the beginning. I think easy of extension to > new models is a major goal, in fact, since we want it to be easy for > people to contribute new models. > > -Chris JS > > > > Josef > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > ------------------------------ > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > End of SciPy-User Digest, Vol 96, Issue 55 > ****************************************** > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lists at hilboll.de Wed Aug 31 08:17:51 2011 From: lists at hilboll.de (Andreas H.) Date: Wed, 31 Aug 2011 14:17:51 +0200 Subject: [SciPy-User] 2d spline interpolation with periodic boundaries In-Reply-To: <20110830150638.5510422d.Jerome.Kieffer@esrf.fr> References: <6b2c33c620e2aeb2757283797f6223be.squirrel@srv2.s4y.tournesol-consulting.eu> <20110830150638.5510422d.Jerome.Kieffer@esrf.fr> Message-ID: Hi Jerome, >> Another question is: How does RectBivariateSpline work? There's not much >> info in the docs as to what the function actually does, math-wise. > > it is a wrapper for "FITPACK" from Dierckx > http://www.netlib.org/dierckx/ > > Have a look at Fitpack's documentation to understand how it works (control > points have to be ordered and other oddities ...) RectBivariateSpline is defined in scipy/interpolate/fitpack2.py, where all I can find is a call to dfitpack.regrid_smth. However, in the FITPACK library, I cannot find a function by that name -- and I couldn't really find the source code to dfitpack.so to check how FITPACK actually gets called ... Any ideas? Cheers, Andreas. 
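dfitpack is an f2py-generated extension module, so one quick way to see how FITPACK actually gets called is to print the generated docstrings. This assumes a standard SciPy install and only reports what the wrapper itself exposes:

from scipy.interpolate import dfitpack

print(dfitpack.__doc__)               # module docstring lists the wrapped FITPACK routines
print(dfitpack.regrid_smth.__doc__)   # f2py-generated call signature for the regrid wrapper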
From Dharhas.Pothina at twdb.state.tx.us Wed Aug 31 08:20:52 2011 From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina) Date: Wed, 31 Aug 2011 07:20:52 -0500 Subject: [SciPy-User] Calculate surface area and volume from intersection of volume and plane. In-Reply-To: References: <4E5CA2EB0200009B0003DA97@GWWEB.twdb.state.tx.us> Message-ID: <4E5DE0D40200009B0003DB7B@GWWEB.twdb.state.tx.us> Robert, sorry for the dup post and thanks for the pointers I think that gives me enough ideas to build something. - dharhas >>> Robert Kern 8/30/2011 1:41 PM >>> On Tue, Aug 30, 2011 at 08:44, Dharhas Pothina wrote: > Hi, > > We have an old ArcGIS aml script that we are trying to replace. The original > script takes the input from an ArcGIS TIN model (basically a 2D delaunay > triangulation of irregular xy data points with z's defining the depth at > each xy) and calculates the surface area and volume of the lake at different > elevations (i.e. z cut planes) > > From my googling it looks like I have options for the delaunay triangulation > using scipy, matplotlib, cgal or mayavi. I'm not sure how to do the surface > area and volume calculations at various z planes once I have the > triangulation. I would appreciate any pointers. Your previous email came through fine. There is no need to repeat it. It's relatively straightforward to find the polygon of intersection between the Z plane and the TIN. Just loop through the triangles and check each of the 3 sides to see if one end is above while the other end is below. Simple geometry determines the point of contact of that side. Join up the two sides into a line segment and add that to your list of line segments. The line segments join up into an irregular polygon, probably with holes. The area of this polygon can be found by a formula that you can Google for. E.g.: http://paulbourke.net/geometry/polyarea/ The volume can be calculated similarly. You can break up the volume into triangular prisms projecting up from each of the triangles in the TIN below the Z-plane. You can calculate the volume of each of those prisms easily. Just be sure to properly take into account the triangles that intersect the Z-plane. You only want to count the part that's below the Z-plane. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... 
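A minimal sketch of the two steps Robert describes above: intersecting the TIN with the plane z = z0 and summing vertical prism volumes. The (points, triangles) array layout and the function names are assumptions made for illustration, not an existing API; vertices lying exactly on the plane and the assembly of segments into ordered rings are left out.

import numpy as np

def contour_segments(points, triangles, z0):
    """Line segments where the TIN crosses the plane z = z0.

    points    : (N, 3) array of x, y, z vertex coordinates
    triangles : (M, 3) integer array of vertex indices
    Joined end to end, the segments form the (possibly multi-part)
    contour polygon bounding the lake surface at elevation z0.
    """
    segments = []
    for tri in triangles:
        p = points[tri]
        crossings = []
        for i, j in ((0, 1), (1, 2), (2, 0)):
            zi, zj = p[i, 2], p[j, 2]
            if (zi - z0) * (zj - z0) < 0:        # endpoints on opposite sides of the plane
                t = (z0 - zi) / (zj - zi)        # linear interpolation along the edge
                crossings.append(p[i, :2] + t * (p[j, :2] - p[i, :2]))
        if len(crossings) == 2:
            segments.append((tuple(crossings[0]), tuple(crossings[1])))
    return segments

def polygon_area(xy):
    """Shoelace formula for an ordered polygon given as a (K, 2) array."""
    x, y = np.asarray(xy, dtype=float).T
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def submerged_volume(points, triangles, z0):
    """Sum of vertical prisms between the TIN and the plane z = z0.

    Only triangles lying entirely below the plane are counted here;
    triangles cut by the plane must be clipped to the part below z0
    first, as noted in the post.
    """
    vol = 0.0
    for tri in triangles:
        p = points[tri]
        if np.all(p[:, 2] <= z0):
            base = polygon_area(p[:, :2])          # plan-view area of the triangle
            vol += base * np.mean(z0 - p[:, 2])    # prism volume = base area * mean depth
    return vol

For the surface area, chain the segments end to end into closed rings and pass each ring to polygon_area; rings around islands are subtracted rather than added.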
URL: From dbigbear at gmail.com Wed Aug 31 08:29:30 2011 From: dbigbear at gmail.com (Xiong Deng) Date: Wed, 31 Aug 2011 20:29:30 +0800 Subject: [SciPy-User] Problem with Python + Hadoop: how to link .so outside Python Message-ID: Hi, I have successfully installed scipy on my Python 2.7 on my local Linux, and I want to pack my Python2.7 (with scipy) onto Hadoop and run my Python MapReduce scipts, like this: 20 ${HADOOP_HOME}/bin/hadoop streaming \$ 21 -input "${input}" \$ 22 -output "${output}" \$ 23 -mapper "python27/bin/python27.sh rp_extractMap.py" \$ 24 -reducer "python27/bin/python27.sh rp_extractReduce.py" \$ 25 -partitioner org.apache.hadoop.mapred.lib.KeyFieldBasedPartitioner \$ 26 -file rp_extractMap.py \$ 27 -file rp_extractReduce.py \$ 28 -file shitu_conf.py \$ 29 -cacheArchive "/share/python27.tar.gz#python27" \$ 30 -outputformat org.apache.hadoop.mapred.TextOutputFormat \$ 31 -inputformat org.apache.hadoop.mapred.CombineTextInputFormat \$ 32 -jobconf mapred.max.split.size="512000000" \$ 33 -jobconf mapred.job.name="[reserve_price][rp_extract]" \$ 34 -jobconf mapred.job.priority=HIGH \$ 35 -jobconf mapred.job.map.capacity=1000 \$ 36 -jobconf mapred.job.reduce.capacity=200 \$ 37 -jobconf mapred.reduce.tasks=200$ 38 -jobconf num.key.fields.for.partition=2$ I have to do this, because the Hadoop server installed its own python of very low version which may not support some of my python scripts, and I do not have privelege to install scipy lib on the server. So,I have to use the -cacheArchieve command to use my own python2.7 with scipy.... But, I find out that some of the .so in scipy are linked to other dynamic libs outside Python2.7.. For example $ ldd ~/local/python-2.7.2/lib/python2.7/site-packages/scipy/linalg/flapack.so liblapack.so => /usr/local/atlas/lib/liblapack.so (0x0000002a956fd000) libatlas.so => /usr/local/atlas/lib/libatlas.so (0x0000002a95df3000) libgfortran.so.3 => /home/work/local/gcc-4.6.1/lib64/libgfortran.so.3 (0x0000002a9668d000) libm.so.6 => /lib64/tls/libm.so.6 (0x0000002a968b6000) libgcc_s.so.1 => /home/work/local/gcc-4.6.1/lib64/libgcc_s.so.1 (0x0000002a96a3c000) libquadmath.so.0 => /home/work/local/gcc-4.6.1/lib64/libquadmath.so.0 (0x0000002a96b51000) libc.so.6 => /lib64/tls/libc.so.6 (0x0000002a96c87000) libpthread.so.0 => /lib64/tls/libpthread.so.0 (0x0000002a96ebb000) /lib64/ld-linux-x86-64.so.2 (0x000000552aaaa000) So, my question is: how can I include this libs? Should I search for all the linked .so and .a under my local linux and pack them together with Python2.7??? If yes, How can I get a full list of the libs needed and How can make Python2.7 know where to find the new libs?? Thanks Xiong -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Aug 31 09:54:01 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 31 Aug 2011 09:54:01 -0400 Subject: [SciPy-User] 2d spline interpolation with periodic boundaries In-Reply-To: References: <6b2c33c620e2aeb2757283797f6223be.squirrel@srv2.s4y.tournesol-consulting.eu> <20110830150638.5510422d.Jerome.Kieffer@esrf.fr> Message-ID: On Wed, Aug 31, 2011 at 8:17 AM, Andreas H. wrote: > Hi Jerome, > >>> Another question is: How does RectBivariateSpline work? There's not much >>> info in the docs as to what the function actually does, math-wise. 
>> >> it is a wrapper for "FITPACK" from Dierckx >> http://www.netlib.org/dierckx/ >> >> Have a look at Fitpack's documentation to understand how it works (control >> points have to be ordered and other oddities ?...) > > RectBivariateSpline is defined in scipy/interpolate/fitpack2.py, where all > I can find is a call to dfitpack.regrid_smth. However, in the FITPACK > library, I cannot find a function by that name -- and I couldn't really > find the source code to dfitpack.so to check how FITPACK actually gets > called ... > > Any ideas? dfitpack is created by f2py regrid_smth is defined in "scipy\interpolate\src\fitpack.pyf" and points to fortranname regrid "\scipy\interpolate\fitpack\regrid.f" Josef > > Cheers, > Andreas. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From deil.christoph at googlemail.com Wed Aug 31 11:09:20 2011 From: deil.christoph at googlemail.com (Christoph Deil) Date: Wed, 31 Aug 2011 17:09:20 +0200 Subject: [SciPy-User] Unexpected covariance matrix from scipy.optimize.curve_fit In-Reply-To: References: <089E4569-0C53-4FC9-9B0F-353C4EF64478@googlemail.com> Message-ID: On Aug 30, 2011, at 11:25 PM, josef.pktd at gmail.com wrote: > On Tue, Aug 30, 2011 at 4:15 PM, Christoph Deil > wrote: >> I noticed that scipy.optimize.curve_fit returns parameter errors that don't >> scale with sigma, the standard deviation of ydata, as I expected. >> Here is a code snippet to illustrate my point, which fits a straight line to >> five data points: >> import numpy as np >> from scipy.optimize import curve_fit >> x = np.arange(5) >> y = np.array([1, -2, 1, -2, 1]) >> sigma = np.array([1, 2, 1, 2, 1]) >> def f(x, a, b): >> return a + b * x >> popt, pcov = curve_fit(f, x, y, p0=(0.42, 0.42), sigma=sigma) >> perr = np.sqrt(pcov.diagonal()) >> print('*** sigma = {0} ***'.format(sigma)) >> print('popt: {0}'.format(popt)) >> print('perr: {0}'.format(perr)) >> I get the following result: >> *** sigma = [1 2 1 2 1] *** >> popt: [ 5.71428536e-01 1.19956213e-08] >> perr: [ 0.93867933 0.40391117] >> Increasing sigma by a factor of 10, >> sigma = 10 * np.array([1, 2, 1, 2, 1]) >> I get the following result: >> *** sigma = [10 20 10 20 10] *** >> popt: [ 5.71428580e-01 -2.27625699e-09] >> perr: [ 0.93895295 0.37079075] >> The best-fit values stayed the same as expected. >> But the error on the slope b decreased by 8% (the error on the offset a >> didn't change much) >> I would have expected fit parameter errors to increase with increasing >> errors on the data!? >> Is this a bug? > > No bug in the formulas. I tested all of them when curve_fit was added. > > However in your example the numerical cov lacks quite a bit of > precision. Trying your example with different starting values, I get a > 0.05 difference in your perr (std of parameter estimates). > > Trying smaller xtol and ftol doesn't change anything. (?) Making ftol = 1e-15 very small I get a different wrong result: popt: [ 5.71428580e-01 -2.27625699e-09] perr: [ 0.92582011 0.59868281] What do I have to do to get a correct answer (say to 5 significant digits) from curve_fit for this simple example? > > Since it's linear > >>>> import scikits.statsmodels.api as sm >>>> x = np.arange(5.) 
>>>> y = np.array([1, -2, 1, -2, 1.]) >>>> sigma = np.array([1, 2, 1, 2, 1.]) >>>> res = sm.WLS(y, sm.add_constant(x, prepend=True), weights=1./sigma**2).fit() >>>> res.params > array([ 5.71428571e-01, 1.11022302e-16]) >>>> res.bse > array([ 0.98609784, 0.38892223]) > >>>> res = sm.WLS(y, sm.add_constant(x, prepend=True), weights=1./(sigma*10)**2).fit() >>>> res.params > array([ 5.71428571e-01, 1.94289029e-16]) >>>> res.bse > array([ 0.98609784, 0.38892223]) > > rescaling doesn't change parameter estimates nor perr This is what I don't understand. Why don't the parameter estimate errors increase with increasing errors sigma on the data points? If I have less precise measurements, the model parameters should be less constrained?! I was using MINUIT before I learned Scipy and the error definition for a chi2 fit given in the MINUIT User Guide http://wwwasdoc.web.cern.ch/wwwasdoc/minuit/node7.html as well as the example results here http://code.google.com/p/pyminuit/wiki/GettingStartedGuide don't mention the factor s_sq that is used in curve_fit to scale pcov. Is the error definition in the MINUIT manual wrong? Can you point me to a web resource that explains why the s_sq factor needs to be applied to the covariance matrix? > > Josef > > PS: I've attached a script to fit the two examples using statsmodels, scipy and minuit (applying the s_sq factor myself). Here are the results I get (who's right for the first example? why does statsmodels only return on parameter value and error?): """Example from http://code.google.com/p/pyminuit/wiki/GettingStartedGuide""" x = np.array([1 , 2 , 3 , 4 ]) y = np.array([1.1, 2.1, 2.4, 4.3]) sigma = np.array([0.1, 0.1, 0.2, 0.1]) statsmodels.api.WLS popt: [ 1.04516129] perr: [ 0.0467711] scipy.optimize.curve_fit popt: [ 8.53964011e-08 1.04516128e+00] perr: [ 0.27452122 0.09784324] minuit popt: [-4.851674617611934e-14, 1.0451612903225629] perr: [ 0.33828315 0.12647671] """Example from http://mail.scipy.org/pipermail/scipy-user/2011-August/030412.html""" x = np.arange(5) y = np.array([1, -2, 1, -2, 1]) sigma = 10 * np.array([1, 2, 1, 2, 1]) statsmodels.api.WLS popt: [ 5.71428571e-01 7.63278329e-17] perr: [ 0.98609784 0.38892223] scipy.optimize.curve_fit popt: [ 5.71428662e-01 -8.73679511e-08] perr: [ 0.97804034 0.3818681 ] minuit popt: [0.5714285714294132, 2.1449508835758024e-13] perr: [ 0.98609784 0.38892223] > > >> Looking at the source code I see that scipy.optimize.curve_fit multiplies >> the pcov obtained from scipy.optimize.leastsq by a factor s_sq: >> https://github.com/scipy/scipy/blob/master/scipy/optimize/minpack.py#L438 >> >> if (len(ydata) > len(p0)) and pcov is not None: >> s_sq = (func(popt, *args)**2).sum()/(len(ydata)-len(p0)) >> pcov = pcov * s_sq >> >> If so is it possible to add an explanation to >> http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html >> that pcov is multiplied with this s_sq factor and why that will give correct >> errors? >> After I noticed this issue I saw that this s_sq factor is mentioned in the >> cov_x return parameter description of leastsq, >> but I think it should be explained in curve_fit where it is applied, maybe >> leaving a reference in the cov_x leastsq description. >> >> Also it would be nice to mention the full_output option in the curve_fit >> docu, I only realized after looking at the source code that this was >> possible. 
>> Christoph >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: chi2_example.py Type: text/x-python-script Size: 1802 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Aug 31 12:10:52 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 31 Aug 2011 12:10:52 -0400 Subject: [SciPy-User] Unexpected covariance matrix from scipy.optimize.curve_fit In-Reply-To: References: <089E4569-0C53-4FC9-9B0F-353C4EF64478@googlemail.com> Message-ID: On Wed, Aug 31, 2011 at 11:09 AM, Christoph Deil wrote: > > On Aug 30, 2011, at 11:25 PM, josef.pktd at gmail.com wrote: > > On Tue, Aug 30, 2011 at 4:15 PM, Christoph Deil > wrote: > > I noticed that scipy.optimize.curve_fit returns parameter errors that don't > > scale with sigma, the standard deviation of ydata, as I expected. > > Here is a code snippet to illustrate my point, which fits a straight line to > > five data points: > > import numpy as np > > from scipy.optimize import curve_fit > > x = np.arange(5) > > y = np.array([1, -2, 1, -2, 1]) > > sigma = np.array([1,? 2, 1,? 2, 1]) > > def f(x, a, b): > > ? ? return a + b * x > > popt, pcov = curve_fit(f, x, y, p0=(0.42, 0.42), sigma=sigma) > > perr = np.sqrt(pcov.diagonal()) > > print('*** sigma = {0} ***'.format(sigma)) > > print('popt: {0}'.format(popt)) > > print('perr: {0}'.format(perr)) > > I get the following result: > > *** sigma = [1 2 1 2 1] *** > > popt: [? 5.71428536e-01 ? 1.19956213e-08] > > perr: [ 0.93867933? 0.40391117] > > Increasing sigma by a factor of 10, > > sigma = 10 * np.array([1,? 2, 1,? 2, 1]) > > I get the following result: > > *** sigma = [10 20 10 20 10] *** > > popt: [? 5.71428580e-01? -2.27625699e-09] > > perr: [ 0.93895295? 0.37079075] > > The best-fit values stayed the same as expected. > > But the error on the slope b?decreased by 8% (the error on the offset a > > didn't change much) > > I would have expected fit parameter errors to increase with increasing > > errors on the data!? > > Is this a bug? > > No bug in the formulas. I tested all of them when curve_fit was added. > > However in your example the numerical cov lacks quite a bit of > precision. Trying your example with different starting values, I get a > 0.05 difference in your perr (std of parameter estimates). > > Trying smaller xtol and ftol doesn't change anything. (?) > > Making ftol = 1e-15 very small I get a different wrong result: > popt: [? 5.71428580e-01? -2.27625699e-09] > perr: [ 0.92582011? 0.59868281] > What do I have to do to get a correct answer (say to 5 significant digits) > from curve_fit for this simple example? > > Since it's linear > > import scikits.statsmodels.api as sm > > x = np.arange(5.) 
> > y = np.array([1, -2, 1, -2, 1.]) > > sigma = np.array([1, ?2, 1, ?2, 1.]) > > res = sm.WLS(y, sm.add_constant(x, prepend=True), weights=1./sigma**2).fit() > > res.params > > array([ ?5.71428571e-01, ??1.11022302e-16]) > > res.bse > > array([ 0.98609784, ?0.38892223]) > > res = sm.WLS(y, sm.add_constant(x, prepend=True), > weights=1./(sigma*10)**2).fit() > > res.params > > array([ ?5.71428571e-01, ??1.94289029e-16]) > > res.bse > > array([ 0.98609784, ?0.38892223]) > > rescaling doesn't change parameter estimates nor perr > > This is what I don't understand. > Why don't the parameter estimate errors increase with increasing errors > sigma on the data points? > If I have less precise measurements, the model parameters should be less > constrained?! > I was using MINUIT before I learned Scipy and the error definition for a > chi2 fit given in the MINUIT User Guide > http://wwwasdoc.web.cern.ch/wwwasdoc/minuit/node7.html > as well as the example results here > http://code.google.com/p/pyminuit/wiki/GettingStartedGuide > don't mention the factor s_sq that is used in curve_fit to scale pcov. > Is the error definition in the MINUIT manual wrong? > Can you point me to a web resource that explains why the s_sq factor needs > to be applied to the covariance matrix? It's standard text book information, but Wikipedia seems to be lacking a bit in this. for the linear case http://en.wikipedia.org/wiki/Ordinary_least_squares#Assuming_normality cov_params = sigma^2 (X'X)^{-1} for the non-linear case with leastsq, X is replaced by Jacobian, otherwise everything is the same. However, in your minuit links I saw only the Hessian mentioned (from very fast skimming the pages) With maximum likelihood, the inverse Hessian is the complete covariance matrix, no additional multiplication is necessary. Essentially, these are implementation details depending on how the estimation is calculated, and there are various ways of numerically approximating the Hessian. That's why this is described for optimize.leastsq (incorrectly as Chuck pointed out) and but not in optimize.curve_fit. With leastsquares are maximum likelihood, rescaling both y and f(x,params) has no effect on the parameter estimates, it's just like changing units of y, meters instead of centimeters. I guess scipy.odr would work differently, since it is splitting up the errors between y and x's, but I never looked at the details. > > Josef > > > > PS: I've attached a script to fit the two examples using statsmodels, scipy > and minuit (applying the s_sq factor myself). > Here are the results I get (who's right for the first example? why does > statsmodels only return on parameter value and error?): > ? ??"""Example from > http://code.google.com/p/pyminuit/wiki/GettingStartedGuide""" > ? ? x = np.array([1? , 2? , 3? , 4? ]) > ? ? y = np.array([1.1, 2.1, 2.4, 4.3]) > ? ? sigma = np.array([0.1, 0.1, 0.2, 0.1]) > statsmodels.api.WLS > popt: [ 1.04516129] > perr: [ 0.0467711] > scipy.optimize.curve_fit > popt: [? 8.53964011e-08 ? 1.04516128e+00] > perr: [ 0.27452122? 0.09784324] that's what I get with example 1 when I run your script, I don't know why you have one params in your case (full_output threw an exception in curve_fit with scipy.__version__ '0.9.0' statsmodels.api.WLS popt: [ -6.66133815e-16 1.04516129e+00] perr: [ 0.33828314 0.12647671] scipy.optimize.curve_fit popt: [ 8.53964011e-08 1.04516128e+00] perr: [ 0.27452122 0.09784324] > minuit > popt: [-4.851674617611934e-14, 1.0451612903225629] > perr: [ 0.33828315? 
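To put numbers on the formula: writing out the textbook WLS covariance for the five-point example in this thread reproduces the statsmodels bse quoted above, and it shows why multiplying sigma by 10 changes nothing, since s_sq shrinks by exactly the factor 100 that (X'WX)^{-1} grows by. This is only an illustrative sketch of the algebra, not the statsmodels or curve_fit internals.

import numpy as np

x = np.arange(5.)
y = np.array([1., -2., 1., -2., 1.])
X = np.column_stack((np.ones_like(x), x))       # design matrix: constant and slope

def wls_params_bse(sigma):
    W = np.diag(1.0 / sigma**2)                 # weights = 1 / sigma**2
    XtWX_inv = np.linalg.inv(np.dot(X.T, np.dot(W, X)))
    beta = np.dot(XtWX_inv, np.dot(X.T, np.dot(W, y)))
    resid = y - np.dot(X, beta)
    s_sq = np.dot(resid, np.dot(W, resid)) / (len(y) - len(beta))
    cov = s_sq * XtWX_inv                       # cov_params = s_sq * (X'WX)^{-1}
    return beta, np.sqrt(np.diag(cov))

sigma = np.array([1., 2., 1., 2., 1.])
for scale in (1., 10.):
    beta, bse = wls_params_bse(scale * sigma)
    print('scale {0}: params {1} bse {2}'.format(scale, beta, bse))
# both scales print bse close to [0.98609784  0.38892223], matching WLS above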
0.12647671] statsmodels and minuit agree pretty well > ? ??"""Example from > http://mail.scipy.org/pipermail/scipy-user/2011-August/030412.html""" > ? ? x = np.arange(5) > ? ? y = np.array([1, -2, 1, -2, 1]) > ? ? sigma = 10 * np.array([1,? 2, 1,? 2, 1]) > statsmodels.api.WLS > popt: [? 5.71428571e-01 ? 7.63278329e-17] > perr: [ 0.98609784? 0.38892223] > scipy.optimize.curve_fit > popt: [? 5.71428662e-01? -8.73679511e-08] > perr: [ 0.97804034? 0.3818681 ] > minuit > popt: [0.5714285714294132, 2.1449508835758024e-13] > perr: [ 0.98609784? 0.38892223] statsmodels and minuit agree, my guess is that the jacobian calculation of leastsq (curve_fit) is not very good in these examples. Maybe trying Dfun or the other options, epsfcn, will help. I was trying to see whether I get better results calculation the numerical derivatives in a different way, but had to spend the time fixing bugs. (NonlinearLS didn't work correctly with weights.) Josef > > > > > Looking at the source code I see that scipy.optimize.curve_fit multiplies > > the pcov obtained from scipy.optimize.leastsq by a factor s_sq: > > https://github.com/scipy/scipy/blob/master/scipy/optimize/minpack.py#L438 > > ????if (len(ydata) > len(p0)) and pcov is not None: > > ????????s_sq = (func(popt, *args)**2).sum()/(len(ydata)-len(p0)) > > ????????pcov = pcov * s_sq > > If so is it possible to add an explanation to > > http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html > > that pcov is multiplied with this s_sq factor and why that will give correct > > errors? > > After I noticed this issue I saw that this s_sq factor is mentioned in the > > cov_x return parameter description of leastsq, > > but I think it should be explained in curve_fit where it is applied, maybe > > leaving a reference in the cov_x leastsq description. > > Also it would be nice to mention the full_output option in the curve_fit > > docu, I only realized after looking at the source code that this was > > possible. > > Christoph > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From robert.kern at gmail.com Wed Aug 31 15:12:37 2011 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 31 Aug 2011 14:12:37 -0500 Subject: [SciPy-User] SciPy-User Digest, Vol 96, Issue 55 In-Reply-To: References: Message-ID: Please do no reply to digest messages. Please consider the digests to be read-only. If you wish to participate in the mailing list, please subscribe normally. If you must reply to digest messages, please edit what you quote to just the portion that you respond to and adjust the Subject line accordingly. Thank you. On Wed, Aug 31, 2011 at 05:30, Alacast wrote: > Hilbert transform: > Padding with zeros to the next power of 2 sped it up greatly. Thanks! Is > there any reason hilbert doesn't do that automatically, then remove the > padding before returning the analytic signal? It's not always the right thing to do. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? 
-- Umberto Eco From robert.kern at gmail.com Wed Aug 31 15:26:08 2011 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 31 Aug 2011 14:26:08 -0500 Subject: [SciPy-User] Problem with Python + Hadoop: how to link .so outside Python In-Reply-To: References: Message-ID: On Wed, Aug 31, 2011 at 07:29, Xiong Deng wrote: > So, my question is: how can I include this libs? Should I search for all the > linked .so and .a under my local linux and pack them together with > Python2.7??? If yes, How can I get a full list of the libs needed and How > can make Python2.7 know where to find the new libs?? You may get the best advice on a Hadoop mailing list. Some of this depends on how -cacheArchive will unpack the archive and how Hadoop Streaming will set up the environment for the subprocesses. You may be able to use this tool to help you bundle up everything that is necessary: http://stanford.edu/~pgbovine/cde.html -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From cweisiger at msg.ucsf.edu Wed Aug 31 18:35:57 2011 From: cweisiger at msg.ucsf.edu (Chris Weisiger) Date: Wed, 31 Aug 2011 15:35:57 -0700 Subject: [SciPy-User] Projecting volumes down to 2D Message-ID: Briefly, I'm working on a visualization tool for five-dimensional microscopy data (X/Y/Z/time/wavelength). Different wavelengths can be transformed with respect to each other: X/Y/Z translation, rotation about the Z axis, and uniform scaling in X and Y. We can then show various 2D slices of the data that pass through a specific XYZT point: an X-Y slice, an X-Z slice, a Y-Z slice, and slices through time. These slices are generated by transforming the view coordinates and using scipy.ndimage.map_coordinates. Now we want to be able to project an entire row/column/etc. of pixels into a single pixel. For example, in the X-Y slice, each pixel shown is actually the brightest pixel from the entire Z column. This example is easily done by taking the maximum along the Z axis and then proceeding as normal with generating the slice, albeit with a Z transformation of 0. That's because the other transformation parameters don't move data through the Z axis. Thus I still only have to transform X by Y pixels. I'm having trouble with an edge case for transformed data, though: if the projection axis is X or Y, and there is a rotation/scale factor, then I can't see a way to avoid having to transform every single pixel in a 3D volume to obtain the projection -- that is, transforming X by Y by Z pixels. This is expensive. Obviously each pixel in the volume must be considered to generate these projections, but does every pixel have to be transformed? I don't suppose anyone knows of a way to simplify the problem? -Chris From justinbois at gmail.com Wed Aug 31 18:47:27 2011 From: justinbois at gmail.com (Justin Bois) Date: Wed, 31 Aug 2011 15:47:27 -0700 Subject: [SciPy-User] Importing OpenCV makes Python segfault on Mac OS X Message-ID: I am trying to use the OpenCV library with Python bindings on Mac OS X. I am using the Enthought Python Distribution for my Python/NumPy/etc. and installed OpenCV 2.2.0 using Homebrew. The installation of OpenCV seems to work ok, and but when I try to import OpenCV, I get a segmentation fault. I get the same behavior if I build OpenCV 2.3.1 from source. Below is what I see. 
(Note: when I use Python installed from MacPorts, I do not have this problem, but I would like to stick with EPD.) Any help with this would be greatly appreciated! % echo $PYTHONPATH /Library/Frameworks/EPD64.framework/Versions/Current/lib/python2.7/site-packages:/usr/local/lib/python2.7/site-packages % which python /Library/Frameworks/EPD64.framework/Versions/Current/bin/python % more test.py import cv print 'hello world' % gdb python Starting program: /Library/Frameworks/EPD64.framework/Versions/7.1/bin/python test.py Reading symbols for shared libraries .+++..... done Reading symbols for shared libraries ............................................................................................................. done Program received signal EXC_BAD_ACCESS, Could not access memory. Reason: KERN_INVALID_ADDRESS at address: 0x0000000000000000 0x0000000000000000 in ?? () (gdb) backtrace #0 0x0000000000000000 in ?? () #1 0x00000001006814a4 in PyEval_GetGlobals () #2 0x00000001006977f2 in PyImport_Import () #3 0x000000010069799f in PyImport_ImportModule () #4 0x00000001004c2ee2 in initcv () #5 0x00000001000e4b9a in import_submodule () #6 0x00000001000e4dea in load_next () #7 0x00000001000e5778 in PyImport_ImportModuleLevel () #8 0x00000001000be2b3 in builtin___import__ () #9 0x000000010000d002 in PyObject_Call () #10 0x00000001000c3d27 in PyEval_CallObjectWithKeywords () #11 0x00000001000c72ae in PyEval_EvalFrameEx () #12 0x00000001000cca15 in PyEval_EvalCodeEx () #13 0x00000001000ccd16 in PyEval_EvalCode () #14 0x00000001000f11ee in PyRun_FileExFlags () #15 0x00000001000f2001 in PyRun_SimpleFileExFlags () #16 0x0000000100107c65 in Py_Main () #17 0x0000000100000f54 in start () (gdb) -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Aug 31 19:13:34 2011 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 31 Aug 2011 18:13:34 -0500 Subject: [SciPy-User] Importing OpenCV makes Python segfault on Mac OS X In-Reply-To: References: Message-ID: On Wed, Aug 31, 2011 at 17:47, Justin Bois wrote: > I am trying to use the OpenCV library with Python bindings on Mac OS X.? I > am using the Enthought Python Distribution for my Python/NumPy/etc. and > installed OpenCV 2.2.0 using Homebrew. Bug reports for EPD should be directed to epd.support at enthought.com. Thank you. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From josef.pktd at gmail.com Wed Aug 31 21:45:14 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 31 Aug 2011 21:45:14 -0400 Subject: [SciPy-User] Unexpected covariance matrix from scipy.optimize.curve_fit In-Reply-To: References: <089E4569-0C53-4FC9-9B0F-353C4EF64478@googlemail.com> Message-ID: On Wed, Aug 31, 2011 at 12:10 PM, wrote: > On Wed, Aug 31, 2011 at 11:09 AM, Christoph Deil > wrote: >> >> On Aug 30, 2011, at 11:25 PM, josef.pktd at gmail.com wrote: >> >> On Tue, Aug 30, 2011 at 4:15 PM, Christoph Deil >> wrote: >> >> I noticed that scipy.optimize.curve_fit returns parameter errors that don't >> >> scale with sigma, the standard deviation of ydata, as I expected. >> >> Here is a code snippet to illustrate my point, which fits a straight line to >> >> five data points: >> >> import numpy as np >> >> from scipy.optimize import curve_fit >> >> x = np.arange(5) >> >> y = np.array([1, -2, 1, -2, 1]) >> >> sigma = np.array([1,? 2, 1,? 
2, 1]) >> >> def f(x, a, b): >> >> ? ? return a + b * x >> >> popt, pcov = curve_fit(f, x, y, p0=(0.42, 0.42), sigma=sigma) >> >> perr = np.sqrt(pcov.diagonal()) >> >> print('*** sigma = {0} ***'.format(sigma)) >> >> print('popt: {0}'.format(popt)) >> >> print('perr: {0}'.format(perr)) >> >> I get the following result: >> >> *** sigma = [1 2 1 2 1] *** >> >> popt: [? 5.71428536e-01 ? 1.19956213e-08] >> >> perr: [ 0.93867933? 0.40391117] >> >> Increasing sigma by a factor of 10, >> >> sigma = 10 * np.array([1,? 2, 1,? 2, 1]) >> >> I get the following result: >> >> *** sigma = [10 20 10 20 10] *** >> >> popt: [? 5.71428580e-01? -2.27625699e-09] >> >> perr: [ 0.93895295? 0.37079075] >> >> The best-fit values stayed the same as expected. >> >> But the error on the slope b?decreased by 8% (the error on the offset a >> >> didn't change much) >> >> I would have expected fit parameter errors to increase with increasing >> >> errors on the data!? >> >> Is this a bug? >> >> No bug in the formulas. I tested all of them when curve_fit was added. >> >> However in your example the numerical cov lacks quite a bit of >> precision. Trying your example with different starting values, I get a >> 0.05 difference in your perr (std of parameter estimates). >> >> Trying smaller xtol and ftol doesn't change anything. (?) >> >> Making ftol = 1e-15 very small I get a different wrong result: >> popt: [? 5.71428580e-01? -2.27625699e-09] >> perr: [ 0.92582011? 0.59868281] >> What do I have to do to get a correct answer (say to 5 significant digits) >> from curve_fit for this simple example? >> >> Since it's linear >> >> import scikits.statsmodels.api as sm >> >> x = np.arange(5.) >> >> y = np.array([1, -2, 1, -2, 1.]) >> >> sigma = np.array([1, ?2, 1, ?2, 1.]) >> >> res = sm.WLS(y, sm.add_constant(x, prepend=True), weights=1./sigma**2).fit() >> >> res.params >> >> array([ ?5.71428571e-01, ??1.11022302e-16]) >> >> res.bse >> >> array([ 0.98609784, ?0.38892223]) >> >> res = sm.WLS(y, sm.add_constant(x, prepend=True), >> weights=1./(sigma*10)**2).fit() >> >> res.params >> >> array([ ?5.71428571e-01, ??1.94289029e-16]) >> >> res.bse >> >> array([ 0.98609784, ?0.38892223]) >> >> rescaling doesn't change parameter estimates nor perr >> >> This is what I don't understand. >> Why don't the parameter estimate errors increase with increasing errors >> sigma on the data points? >> If I have less precise measurements, the model parameters should be less >> constrained?! >> I was using MINUIT before I learned Scipy and the error definition for a >> chi2 fit given in the MINUIT User Guide >> http://wwwasdoc.web.cern.ch/wwwasdoc/minuit/node7.html >> as well as the example results here >> http://code.google.com/p/pyminuit/wiki/GettingStartedGuide >> don't mention the factor s_sq that is used in curve_fit to scale pcov. >> Is the error definition in the MINUIT manual wrong? >> Can you point me to a web resource that explains why the s_sq factor needs >> to be applied to the covariance matrix? > > It's standard text book information, but Wikipedia seems to be lacking > a bit in this. > > for the linear case > http://en.wikipedia.org/wiki/Ordinary_least_squares#Assuming_normality > > cov_params = sigma^2 (X'X)^{-1} > > for the non-linear case with leastsq, X is replaced by Jacobian, > otherwise everything is the same. 
> > However, in your minuit links I saw only the Hessian mentioned (from > very fast skimming the pages) > > With maximum likelihood, the inverse Hessian is the complete > covariance matrix, no additional multiplication is necessary. > > Essentially, these are implementation details depending on how the > estimation is calculated, and there are various ways of numerically > approximating the Hessian. > That's why this is described for optimize.leastsq (incorrectly as > Chuck pointed out) and but not in optimize.curve_fit. > > With leastsquares are maximum likelihood, rescaling both y and > f(x,params) has no effect on the parameter estimates, it's just like > changing units of y, meters instead of centimeters. > > I guess scipy.odr would work differently, since it is splitting up the > errors between y and x's, but I never looked at the details. > > >> >> Josef >> >> >> >> PS: I've attached a script to fit the two examples using statsmodels, scipy >> and minuit (applying the s_sq factor myself). >> Here are the results I get (who's right for the first example? why does >> statsmodels only return on parameter value and error?): >> ? ??"""Example from >> http://code.google.com/p/pyminuit/wiki/GettingStartedGuide""" >> ? ? x = np.array([1? , 2? , 3? , 4? ]) >> ? ? y = np.array([1.1, 2.1, 2.4, 4.3]) >> ? ? sigma = np.array([0.1, 0.1, 0.2, 0.1]) >> statsmodels.api.WLS >> popt: [ 1.04516129] >> perr: [ 0.0467711] >> scipy.optimize.curve_fit >> popt: [? 8.53964011e-08 ? 1.04516128e+00] >> perr: [ 0.27452122? 0.09784324] > > that's what I get with example 1 when I run your script, > I don't know why you have one params in your case > (full_output threw an exception in curve_fit with scipy.__version__ '0.9.0' > > statsmodels.api.WLS > popt: [ -6.66133815e-16 ? 1.04516129e+00] > perr: [ 0.33828314 ?0.12647671] > scipy.optimize.curve_fit > popt: [ ?8.53964011e-08 ? 1.04516128e+00] > perr: [ 0.27452122 ?0.09784324] > > >> minuit >> popt: [-4.851674617611934e-14, 1.0451612903225629] >> perr: [ 0.33828315? 0.12647671] statsmodels.api.WLS popt: [ -4.90926744e-16 1.04516129e+00] perr: [ 0.33828314 0.12647671] statsmodels NonlinearLS popt: [ -3.92166386e-08 1.04516130e+00] perr: [ 0.33828314 0.12647671] finally, I got some bugs out of the weights handling, but still not fully tested def run_nonlinearls(): from scikits.statsmodels.miscmodels.nonlinls import NonlinearLS class Myfunc(NonlinearLS): def _predict(self, params): x = self.exog a, b = params return a + b*x mod = Myfunc(y, x, sigma=sigma**2) res = mod.fit(start_value=(0.042, 0.42)) print ('statsmodels NonlinearLS') print('popt: {0}'.format(res.params)) print('perr: {0}'.format(res.bse)) The basics is the same as curve_fit using leastsq, but it uses complex derivatives which are usually numerically very good. So it looks like the problems with curve_fit in your example are only in the numerically derivatives that leastsq is using for the Jacobian. If leastsq is using only forward differences, then it might be better to calculate the final Jacobian with centered differences. just a guess. > > statsmodels and minuit agree pretty well > >> ? ??"""Example from >> http://mail.scipy.org/pipermail/scipy-user/2011-August/030412.html""" >> ? ? x = np.arange(5) >> ? ? y = np.array([1, -2, 1, -2, 1]) >> ? ? sigma = 10 * np.array([1,? 2, 1,? 2, 1]) >> statsmodels.api.WLS >> popt: [? 5.71428571e-01 ? 7.63278329e-17] >> perr: [ 0.98609784? 0.38892223] >> scipy.optimize.curve_fit >> popt: [? 5.71428662e-01? -8.73679511e-08] >> perr: [ 0.97804034? 
0.3818681 ] >> minuit >> popt: [0.5714285714294132, 2.1449508835758024e-13] >> perr: [ 0.98609784? 0.38892223] statsmodels.api.WLS popt: [ 5.71428571e-01 1.94289029e-16] perr: [ 0.98609784 0.38892223] statsmodels NonlinearLS popt: [ 5.71428387e-01 8.45750929e-08] perr: [ 0.98609784 0.38892223] Josef > > statsmodels and minuit agree, > > my guess is that the jacobian calculation of leastsq (curve_fit) is > not very good in these examples. Maybe trying Dfun or the other > options, epsfcn, will help. > > I was trying to see whether I get better results calculation the > numerical derivatives in a different way, but had to spend the time > fixing bugs. > (NonlinearLS didn't work correctly with weights.) > > Josef > >> >> >> >> >> Looking at the source code I see that scipy.optimize.curve_fit multiplies >> >> the pcov obtained from scipy.optimize.leastsq by a factor s_sq: >> >> https://github.com/scipy/scipy/blob/master/scipy/optimize/minpack.py#L438 >> >> ????if (len(ydata) > len(p0)) and pcov is not None: >> >> ????????s_sq = (func(popt, *args)**2).sum()/(len(ydata)-len(p0)) >> >> ????????pcov = pcov * s_sq >> >> If so is it possible to add an explanation to >> >> http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html >> >> that pcov is multiplied with this s_sq factor and why that will give correct >> >> errors? >> >> After I noticed this issue I saw that this s_sq factor is mentioned in the >> >> cov_x return parameter description of leastsq, >> >> but I think it should be explained in curve_fit where it is applied, maybe >> >> leaving a reference in the cov_x leastsq description. >> >> Also it would be nice to mention the full_output option in the curve_fit >> >> docu, I only realized after looking at the source code that this was >> >> possible. >> >> Christoph >> >> _______________________________________________ >> >> SciPy-User mailing list >> >> SciPy-User at scipy.org >> >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >
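A small, self-contained illustration of Josef's point above that complex ("complex-step") derivatives behave much better numerically than the forward differences leastsq presumably uses for its Jacobian. The function here is arbitrary; this is not the NonlinearLS or MINPACK code, just the idea:

import numpy as np

def g(b):
    return np.sin(b)                               # any smooth function; exact derivative is cos(b)

b0 = 0.42
h = np.sqrt(np.finfo(float).eps)                   # typical forward-difference step size

d_forward = (g(b0 + h) - g(b0)) / h                # subtractive cancellation loses about half the digits
d_cstep = np.imag(g(b0 + 1j * 1e-20)) / 1e-20      # complex step: no cancellation

exact = np.cos(b0)
print('forward difference error: {0:.1e}'.format(abs(d_forward - exact)))   # typically ~1e-8
print('complex-step error:       {0:.1e}'.format(abs(d_cstep - exact)))     # near machine precision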