From bjorn.burr.nyberg at gmail.com Thu Mar 1 05:14:41 2012 From: bjorn.burr.nyberg at gmail.com (Bjorn Burr Nyberg) Date: Thu, 1 Mar 2012 11:14:41 +0100 Subject: [SciPy-User] Loadtxt vs genfromtxt In-Reply-To: References: Message-ID: <967E7468-A62E-47BD-8307-D99B8CC281B1@gmail.com> Hi, I have a general question about loading data into numpy as I want to compare numpy and r by loading the juraset.dat ASCII file from the gstat package. Reading the support documents I have decided that it is better to use the loadtxt function as I do not have any missing data as useful by the genfromtxt function. However I receive this error when running loadtxt: File ..... Numpy\lib\npyio.py, line 796, in loadtxt Items = [conv(Val) for (conv,val) in zip(converts,Vals)] ValueError: invalid literal for float() Using the same parameters but with genfromtxt works, although the first entry of the array is Nan(not a numeric - expected a header like in a data frame of r). I suppose I was wondering if there was any way to save header data of an array whereby one could simply call that header for the data?Do I just have to remember the data associated with each column and call using data[]? Even loading the data as x,y,z = loadtxt is problematic when there are several columns associated with the data that I do not necessarily remember offhand. Thanks for any advice and with your patience as I'm rather new to Numpy. Nyberg From warren.weckesser at enthought.com Thu Mar 1 08:12:11 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Thu, 1 Mar 2012 07:12:11 -0600 Subject: [SciPy-User] Loadtxt vs genfromtxt In-Reply-To: <967E7468-A62E-47BD-8307-D99B8CC281B1@gmail.com> References: <967E7468-A62E-47BD-8307-D99B8CC281B1@gmail.com> Message-ID: On Thu, Mar 1, 2012 at 4:14 AM, Bjorn Burr Nyberg < bjorn.burr.nyberg at gmail.com> wrote: > > Hi, > I have a general question about loading data into numpy as I want to > compare numpy and r by loading the juraset.dat ASCII file from the gstat > package. Reading the support documents I have decided that it is better to > use the loadtxt function as I do not have any missing data as useful by the > genfromtxt function. However I receive this error when running loadtxt: > > File ..... Numpy\lib\npyio.py, line 796, in loadtxt > Items = [conv(Val) for (conv,val) in zip(converts,Vals)] > ValueError: invalid literal for float() > > Using the same parameters but with genfromtxt works, although the first > entry of the array is Nan(not a numeric - expected a header like in a data > frame of r). I suppose I was wondering if there was any way to save header > data of an array whereby one could simply call that header for the data?Do > I just have to remember the data associated with each column and call using > data[]? Even loading the data as x,y,z = loadtxt is problematic when there > are several columns associated with the data that I do not necessarily > remember offhand. > > Thanks for any advice and with your patience as I'm rather new to Numpy. 
> Nyberg > Bjorn, I don't see the text file 'juraset.dat' in the gstat package (gstat_1.0-10), but google finds this: http://www.ualberta.ca/~jbb/files/juraset.dat If that is your file, you can read it with genfromtxt like this: In [1]: data = genfromtxt('juraset.dat', skiprows=26, names=True) In [2]: data[0] Out[2]: (2.386, 3.077, 3.0, 3.0, 1.74, 25.72, 77.36, 9.32, 38.32, 21.32, 92.56) In [3]: data['Zn'][:3] Out[3]: array([ 92.56, 73.56, 64.8 ]) In [4]: data.dtype.names Out[4]: ('X', 'Y', 'Rock', 'Land', 'Cd', 'Cu', 'Pb', 'Co', 'Cr', 'Ni', 'Zn') The option 'names=True' tells genfromtxt to create a structured array, using the fields in first line (after skiprows) as the field names for the array. Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From barthpi at gmail.com Thu Mar 1 10:54:48 2012 From: barthpi at gmail.com (Pierre Barthelemy) Date: Thu, 1 Mar 2012 16:54:48 +0100 Subject: [SciPy-User] Scipy fitting Message-ID: Dear all, i am writing a program for data analysis. One of the functions of this program gives the possibility to fit the functions. I therefore use the recipe described in : http://www.scipy.org/Cookbook/FittingData under the section "Simplifying the syntax". This recipe make use of the function: scipy.optimize.leastsq. One thing that i would like to know is how can i get the error on the parameters ? From what i understood from the "Cookbook" page, and from the scipy manual ( http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.leastsq.html#scipy.optimize.leastsq), the second argument returned by the leastsq function gives access to these errors. std_error=std(y-function(x)) param_error=sqrt(diagonal(out[1])*std_error) The param_errors that i get in this case are extremely small. Much smaller than what i expected, and much smaller than what i can get fitting the function with matlab. So i guess i made an error here. Can someone tell me how i should do to retrieve the parameter errors ? Bests, Pierre -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla at molden.no Thu Mar 1 10:54:53 2012 From: sturla at molden.no (Sturla Molden) Date: Thu, 01 Mar 2012 16:54:53 +0100 Subject: [SciPy-User] Running NumPy or SciPy scripts in the background on Windows Message-ID: <4F4F9BCD.1070807@molden.no> Often when running long computations with NumPy or SciPy we want to "run the task in the background". This is particularly important on Windows, where a process that is greedy on CPU or RAM can almost make the system unresponsive (e.g. the desktop seems to hang). On Unix there is the "nice" command, but it is not available on Windows. We can set a process priority with the Windows task manager, but that is of no help if the system is unresponsive -- i.e. you cannot get to the task manager. This is very simple to do with the Windows API (or pywin32). Here is how a Python script (e.g. running NumPy or SciPy) can put itself in the background using pywin32: from win32process import (GetCurrentProcess, IDLE_PRIORITY_CLASS, SetPriorityClass) SetPriorityClass(GetCurrentProcess(), IDLE_PRIORITY_CLASS) These are the available flags for SetPriorityClass: REALTIME_PRIORITY_CLASS ## absolute highest priority ## NB! Windows is not a RT OS, ## except Windows CE ABOVE_NORMAL_PRIORITY_CLASS NORMAL_PRIORITY_CLASS ## the normal priority BELOW_NORMAL_PRIORITY_CLASS IDLE_PRIORITY_CLASS ## only execute when system is idle Surprisingly many users of NumPy on Windows does not know this. 
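For scripts that have to run on both Windows and Unix, a small helper along these lines can wrap the two cases (just a sketch, not a tested recipe; it assumes pywin32 is installed on Windows and falls back to os.nice elsewhere):

import os
import sys

def run_in_background():
    # Lower the priority of the current process:
    # IDLE_PRIORITY_CLASS via pywin32 on Windows, os.nice on Unix-like systems.
    if sys.platform == 'win32':
        from win32process import (GetCurrentProcess, IDLE_PRIORITY_CLASS,
                                  SetPriorityClass)
        SetPriorityClass(GetCurrentProcess(), IDLE_PRIORITY_CLASS)
    else:
        os.nice(19)  # maximum niceness on most Unix-like systems
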
I thought we might put a "receipe" for doing this in the cookbook? Sometimes we want to do this just for a thread, e.g. to keep an UI responsive. We cannot control thread priorities with the Python stdlib. Using pywin32, a Python thread that does this will put itself in the background "relative to the priority class" for the process -- but not relative to the rest of the system: from win32api import GetCurrentThread from win32process import THREAD_PRIORITY_IDLE, SetThreadPriority SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_IDLE) Python interpreter does not attempt to schedule access to the GIL. That is, a high-priority thread that releases the GIL might win it back, and so get more time in the Python interpreter (at least on Python 2.x). Similary a low-priority thread might more often miss the GIL battle. But if the GIL is locked while an extension library (e.g. NumPy) is doing a long computation, thread priority can be of no relevance. So this is generally less useful than SetPriorityClass. Here are the flags we can use for SetThreadPriority. Note that these are relative to the "priority class" for the process (and complicated by the GIL), not relative to the other processes on the system: THREAD_PRIORITY_TIME_CRITICAL ## higher than 'highest' THREAD_PRIORITY_HIGHEST THREAD_PRIORITY_ABOVE_NORMAL THREAD_PRIORITY_NORMAL THREAD_PRIORITY_BELOW_NORMAL THREAD_PRIORITY_LOWEST THREAD_PRIORITY_IDLE Sturla From josef.pktd at gmail.com Thu Mar 1 11:55:54 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 1 Mar 2012 11:55:54 -0500 Subject: [SciPy-User] Scipy fitting In-Reply-To: References: Message-ID: On Thu, Mar 1, 2012 at 10:54 AM, Pierre Barthelemy wrote: > Dear all, > > i am writing a program for data analysis. One of the functions of this > program gives the possibility to fit the functions. I therefore use the > recipe described in : http://www.scipy.org/Cookbook/FittingData?under the > section "Simplifying the syntax". This recipe make use of the > function:?scipy.optimize.leastsq. > > > One thing that i would like to know is how can i get the error on the > parameters ? From what i understood from the "Cookbook" page, and from the > scipy manual > (http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.leastsq.html#scipy.optimize.leastsq), > the second argument returned by the leastsq function gives access to these > errors. > std_error=std(y-function(x)) > param_error=sqrt(diagonal(out[1])*std_error) you are taking the sqrt twice numpy.std takes the sqrt of the variance and then you take it again. once is enough if I read the snippet correctly, you might also add a ddof correction to std (y - function(x)) should have also mean zero if a constant is included ? Josef > > The param_errors that i get in this case are extremely small. Much smaller > than what i expected, and much smaller than what i can get fitting the > function with matlab. So i guess i made an error here. > > Can someone tell me how i should do to retrieve the parameter errors ? 
> > Bests, > > Pierre > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From bjorn.burr.nyberg at gmail.com Thu Mar 1 11:58:22 2012 From: bjorn.burr.nyberg at gmail.com (Bjorn Burr Nyberg) Date: Thu, 1 Mar 2012 17:58:22 +0100 Subject: [SciPy-User] Loadtxt vs genfromtxt In-Reply-To: References: <967E7468-A62E-47BD-8307-D99B8CC281B1@gmail.com> Message-ID: <76E4351E-C5E2-4B28-98DB-12AEEC037DA3@gmail.com> Thanks that's exactly what I was looking for. Juraset.dat was saved by using require(gstat) data(jura) ... Nyberg Sent from my iPad On 1. mars 2012, at 14:12, Warren Weckesser wrote: > > > On Thu, Mar 1, 2012 at 4:14 AM, Bjorn Burr Nyberg wrote: > > Hi, > I have a general question about loading data into numpy as I want to compare numpy and r by loading the juraset.dat ASCII file from the gstat package. Reading the support documents I have decided that it is better to use the loadtxt function as I do not have any missing data as useful by the genfromtxt function. However I receive this error when running loadtxt: > > File ..... Numpy\lib\npyio.py, line 796, in loadtxt > Items = [conv(Val) for (conv,val) in zip(converts,Vals)] > ValueError: invalid literal for float() > > Using the same parameters but with genfromtxt works, although the first entry of the array is Nan(not a numeric - expected a header like in a data frame of r). I suppose I was wondering if there was any way to save header data of an array whereby one could simply call that header for the data?Do I just have to remember the data associated with each column and call using data[]? Even loading the data as x,y,z = loadtxt is problematic when there are several columns associated with the data that I do not necessarily remember offhand. > > Thanks for any advice and with your patience as I'm rather new to Numpy. > Nyberg > > > Bjorn, > > I don't see the text file 'juraset.dat' in the gstat package (gstat_1.0-10), but google finds this: > http://www.ualberta.ca/~jbb/files/juraset.dat > > If that is your file, you can read it with genfromtxt like this: > > > In [1]: data = genfromtxt('juraset.dat', skiprows=26, names=True) > > In [2]: data[0] > Out[2]: (2.386, 3.077, 3.0, 3.0, 1.74, 25.72, 77.36, 9.32, 38.32, 21.32, 92.56) > > In [3]: data['Zn'][:3] > Out[3]: array([ 92.56, 73.56, 64.8 ]) > > In [4]: data.dtype.names > Out[4]: ('X', 'Y', 'Rock', 'Land', 'Cd', 'Cu', 'Pb', 'Co', 'Cr', 'Ni', 'Zn') > > > The option 'names=True' tells genfromtxt to create a structured array, using the fields in first line (after skiprows) as the field names for the array. > > Warren > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Thu Mar 1 11:59:53 2012 From: travis at continuum.io (Travis Oliphant) Date: Thu, 1 Mar 2012 08:59:53 -0800 Subject: [SciPy-User] Scipy fitting In-Reply-To: References: Message-ID: You could look at the curve_fit function in scipy optimize as well. It returns the error in the fitted parameters. The source code of that function should be useful for you. Travis -- Travis Oliphant (on a mobile) 512-826-7480 On Mar 1, 2012, at 7:54 AM, Pierre Barthelemy wrote: > Dear all, > > i am writing a program for data analysis. One of the functions of this program gives the possibility to fit the functions. 
I therefore use the recipe described in : http://www.scipy.org/Cookbook/FittingData under the section "Simplifying the syntax". This recipe make use of the function: scipy.optimize.leastsq. > > > One thing that i would like to know is how can i get the error on the parameters ? From what i understood from the "Cookbook" page, and from the scipy manual (http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.leastsq.html#scipy.optimize.leastsq), the second argument returned by the leastsq function gives access to these errors. > std_error=std(y-function(x)) > param_error=sqrt(diagonal(out[1])*std_error) > > The param_errors that i get in this case are extremely small. Much smaller than what i expected, and much smaller than what i can get fitting the function with matlab. So i guess i made an error here. > > Can someone tell me how i should do to retrieve the parameter errors ? > > Bests, > > Pierre > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From draft2008 at bk.ru Fri Mar 2 01:02:43 2012 From: draft2008 at bk.ru (=?UTF-8?B?0JLQu9Cw0LTQuNC80LjRgA==?=) Date: Fri, 02 Mar 2012 10:02:43 +0400 Subject: [SciPy-User] =?utf-8?q?Orthogonal_distance_regression_in_3D?= Message-ID: Hello! I'm working with orthogonal distance regression (scipy.odr). I try to fit the curve to a point cloud (3d), but it doesn work properly, it returns wrong results For example I want to fit the simple curve y = a*x + b*z + c to some point cloud (y_data, x_data, z_data) ? ? ? ? def func(p, input): ? ? x,z = input ? ? x = np.array(x) ? ? z = np.array(z) ? ? return (p[0]*x + p[1]*z + p[2]) ? ? initialGuess = [1,1,1] ? ? myModel = Model(func) ? ? myData = Data([x_data, z_daya], y_data) ? ? myOdr = ODR(myData, myModel, beta0 = initialGuess) ? ? myOdr.set_job(fit_type=0) ? ? out = myOdr.run() ? ? print out.beta? It works perfectly in 2d dimension (2 axes), but in 3d dimension the results are not even close to real, moreover it is very sensitive to initial Guess, so it returns different result even if i change InitiaGuess from?[1,1,1] to?[0.99,1,1] What do I do wrong? ? Im not very strong in mathematics, but may be I should specify some additional parameters such as Jacobian matrix or weight matrix or something else? -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Fri Mar 2 06:48:40 2012 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 2 Mar 2012 11:48:40 +0000 Subject: [SciPy-User] Orthogonal distance regression in 3D In-Reply-To: References: Message-ID: On Fri, Mar 2, 2012 at 06:02, ???????? wrote: > Hello! > I'm working with orthogonal distance regression (scipy.odr). > I try to fit the curve to a point cloud (3d), but it doesn work properly, it > returns wrong results > > For example I want to fit the simple curve y = a*x + b*z + c to some point > cloud (y_data, x_data, z_data) > > > ? ? def func(p, input): > > ? ? x,z = input > > ? ? x = np.array(x) > > ? ? z = np.array(z) > > ? ? return (p[0]*x + p[1]*z + p[2]) > > > ? ? initialGuess = [1,1,1] > > ? ? myModel = Model(func) > > ? ? myData = Data([x_data, z_daya], y_data) > > ? ? myOdr = ODR(myData, myModel, beta0 = initialGuess) > > ? ? myOdr.set_job(fit_type=0) > > ? ? out = myOdr.run() > > ? ? 
print out.beta > > It works perfectly in 2d dimension (2 axes), but in 3d dimension the results > are not even close to real, moreover it is very sensitive to initial Guess, > so it returns different result even if i change InitiaGuess from?[1,1,1] > to?[0.99,1,1] > > What do I do wrong? Can you provide a complete runnable example including some data? Note that if you do not specify any errors on your data, they are assumed to correspond to a standard deviation of 1 for all dimensions. If that is wildly different from the actual variance around the "true" surface, then it might lead the optimizer astray. -- Robert Kern From ralf.gommers at googlemail.com Sat Mar 3 09:12:59 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sat, 3 Mar 2012 15:12:59 +0100 Subject: [SciPy-User] Facing a problem with integrations In-Reply-To: References: Message-ID: On Wed, Feb 29, 2012 at 3:47 AM, Marcel Caraciolo wrote: > Hi all, > > My name is Marcel and I am lecturing scientific computing with Python here > at Brazil. One of my students came to me with a problem that he is > currently solving it with matlab but he decided to change his code to > Python (thanks to the course!) > > The problem is calculate numerically the coefficients aim that are defined > by the following integral [1]. > > It must be calculated using integrals. In the example showed above he > wants to use the trapezoid rule adapted for 2-D arrays or if there is any > another solutions easily with scipy it would be match perfectly also. > > Here is the matrix input (U) and the corresponding coefficients > (solution). The goal is to calculate the corresponding coefficients > by the formula (integral) shown at [1]. > > Could anyone give some a solution using scipy.integrate ? I tried several > proposals but it didn't worked. > It's a double integral, so your first try should be integrate.dblquad. I suggest that you try that, then if you get stuck show us where exactly and we'll try to help you. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sat Mar 3 11:41:32 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 3 Mar 2012 11:41:32 -0500 Subject: [SciPy-User] OT: Data Analysis in Python Message-ID: just some thoughts, mostly personal http://jpktd.blogspot.com/2012/03/data-in-python.html Josef (trying to blog to keep the mailing list on its trend to be less noisy) From travis at continuum.io Sat Mar 3 14:16:12 2012 From: travis at continuum.io (Travis Oliphant) Date: Sat, 3 Mar 2012 13:16:12 -0600 Subject: [SciPy-User] OT: Data Analysis in Python In-Reply-To: References: Message-ID: <2401158E-2B49-46EF-A961-CB493F20DEC0@continuum.io> Thanks for posting this. The trend towards blogging is not new. It feels like a pretty good venue for this sort of thing --- and it let's the mailing lists stay more technical. "Big-Data" does have a lot of "Big-Hype" but it is my opinion that there are user-stories that the SciPy community would be wise to address. The PyData workshop was basically an attempt to make sure that the noise around Strata at least has some "Python". It was also a chance to be in the same room as many people active in the larger SciPy community as a pre-cursor to PyCon. With R getting a lot of attention from Venture Capitalists, there are a lot of people who are "new" to so-called "Data-Science" who are being pushed to "R" instead of Python because of the larger market messaging. 
Their use-cases are not informing this community as much as it could be. Wes McKinney, fortunately, is doing a lot of work to change that. -Travis On Mar 3, 2012, at 10:41 AM, josef.pktd at gmail.com wrote: > just some thoughts, mostly personal > > http://jpktd.blogspot.com/2012/03/data-in-python.html > > Josef > (trying to blog to keep the mailing list on its trend to be less noisy) > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From pav at iki.fi Sat Mar 3 14:24:57 2012 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 03 Mar 2012 20:24:57 +0100 Subject: [SciPy-User] OT: Data Analysis in Python In-Reply-To: References: Message-ID: Hi, 03.03.2012 17:41, josef.pktd at gmail.com kirjoitti: > just some thoughts, mostly personal > > http://jpktd.blogspot.com/2012/03/data-in-python.html OT on OT: should your blog be added the planet http://planet.scipy.org/ -- Pauli Virtanen From josef.pktd at gmail.com Sat Mar 3 14:53:27 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 3 Mar 2012 14:53:27 -0500 Subject: [SciPy-User] OT: Data Analysis in Python In-Reply-To: References: Message-ID: On Sat, Mar 3, 2012 at 2:24 PM, Pauli Virtanen wrote: > Hi, > > 03.03.2012 17:41, josef.pktd at gmail.com kirjoitti: >> just some thoughts, mostly personal >> >> http://jpktd.blogspot.com/2012/03/data-in-python.html > > OT on OT: should your blog be added the planet http://planet.scipy.org/ Yes, thank you. I wasn't sure I will keep it up, but now it's one command to go from rst file to blogger. The next posts should be more technical. Josef > > -- > Pauli Virtanen > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From emanuele at relativita.com Sat Mar 3 16:05:14 2012 From: emanuele at relativita.com (Emanuele Olivetti) Date: Sat, 03 Mar 2012 22:05:14 +0100 Subject: [SciPy-User] issue estimating the multivariate Polya distribution: why? Message-ID: <4F52878A.3040808@relativita.com> Dear All, I am playing with the multivariate Polya distribution - also known as the Dirichlet compound multinomial distribution. A brief description from wikipedia: http://en.wikipedia.org/wiki/Multivariate_P%C3%B3lya_distribution I made a straightforward implementation of the probability density function in the log-scale here: https://gist.github.com/1968113 together with a straightforward montecarlo estimation (by sampling first from a Dirichlet and then computing the log-likelihood of the multinomial) in the log-scale as well. The log-scale was chosen in order to improve numerical stability. If you run the code liked above you should get these two examples: ---- X: [ 0 50 50] alpha: [ 1 10 10] analytic: -5.22892710577 montecarlo -5.23470053651 X: [100 0 0] alpha: [ 1 10 10] analytic: -51.737395965 montecarlo -93.5266543113 ---- As you can see in the first case, i.e. X=[0,50,50], there is excellent agreement between the two implementations while in the second case, i.e. x=[100,0,0], there is a dramatic disagreement. Note that the montecarlo estimate is quite stable and if you change the seed of the random number generator you get numbers not too far from -90. So my question is: where does this issue come from? I cannot see mistakes in the implementation (the code is very simple) and I cannot see the source of numerical instability. Any hint? 
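For concreteness, the Monte Carlo estimator is essentially the following (a minimal sketch of what the gist does, not the exact code):

import numpy as np
from scipy.special import gammaln

def log_multivariate_polya_montecarlo(X, alpha, iterations):
    # Monte Carlo estimate of log p(X | alpha): sample p ~ Dirichlet(alpha),
    # evaluate the multinomial log-likelihood of X under each sample,
    # and average in log scale for numerical stability.
    X = np.asarray(X, dtype=float)
    ps = np.random.dirichlet(alpha, size=iterations)       # shape (iterations, k)
    logp_Hs = (gammaln(X.sum() + 1.0) - gammaln(X + 1.0).sum()
               + np.dot(np.log(ps), X))                    # multinomial log-pmf per sample
    # log of the mean of exp(logp_Hs), computed without leaving log scale
    return np.logaddexp.reduce(logp_Hs) - np.log(iterations)
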
Best, Emanuele From josef.pktd at gmail.com Sat Mar 3 18:24:32 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 3 Mar 2012 18:24:32 -0500 Subject: [SciPy-User] issue estimating the multivariate Polya distribution: why? In-Reply-To: <4F52878A.3040808@relativita.com> References: <4F52878A.3040808@relativita.com> Message-ID: On Sat, Mar 3, 2012 at 4:05 PM, Emanuele Olivetti wrote: > Dear All, > > I am playing with the multivariate Polya distribution - also > known as the Dirichlet compound multinomial distribution. A > brief description from wikipedia: > ? http://en.wikipedia.org/wiki/Multivariate_P%C3%B3lya_distribution > > I made a straightforward implementation of the probability > density function in the log-scale here: > ? https://gist.github.com/1968113 > together with a straightforward montecarlo estimation (by > sampling first from a Dirichlet and then computing the log-likelihood > of the multinomial) in the log-scale as well. The log-scale was > chosen in order to improve numerical stability. > > If you run the code liked above you should get these two examples: > ---- > X: [ 0 50 50] > alpha: [ 1 10 10] > analytic: -5.22892710577 > montecarlo -5.23470053651 > > X: [100 ? 0 ? 0] > alpha: [ 1 10 10] > analytic: -51.737395965 > montecarlo -93.5266543113 > ---- > > As you can see in the first case, i.e. X=[0,50,50], there is excellent > agreement between the two implementations while in the second case, i.e. > x=[100,0,0], there is a dramatic disagreement. Note that the montecarlo > estimate is quite stable and if you change the seed of the random > number generator you get numbers not too far from -90. > > So my question is: where does this issue come from? I cannot see > mistakes in the implementation (the code is very simple) and > I cannot see the source of numerical instability. I don't see anything. I was trying out several different parameters, and my only guess is that the logaddexp is not precise enough in this case. My results (numpy 1.5.1) are even worse. The probability that you want to calculate is very low >>> np.exp(-51.737395965) 3.3941765165211696e-23 For larger values it seems to work fine, but it deteriorates fast when the loglikelihood drops below -15 or so (with the versions I have installed). Do you need almost zero probability events? In a similar problem with poisson mixtures but without monte carlo, I was trying out various ways of rescaling, but didn't come up with anything useful. Josef > > Any hint? > > Best, > > Emanuele > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Sat Mar 3 18:51:38 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 3 Mar 2012 18:51:38 -0500 Subject: [SciPy-User] issue estimating the multivariate Polya distribution: why? In-Reply-To: References: <4F52878A.3040808@relativita.com> Message-ID: On Sat, Mar 3, 2012 at 6:24 PM, wrote: > On Sat, Mar 3, 2012 at 4:05 PM, Emanuele Olivetti > wrote: >> Dear All, >> >> I am playing with the multivariate Polya distribution - also >> known as the Dirichlet compound multinomial distribution. A >> brief description from wikipedia: >> ? http://en.wikipedia.org/wiki/Multivariate_P%C3%B3lya_distribution >> >> I made a straightforward implementation of the probability >> density function in the log-scale here: >> ? 
https://gist.github.com/1968113 >> together with a straightforward montecarlo estimation (by >> sampling first from a Dirichlet and then computing the log-likelihood >> of the multinomial) in the log-scale as well. The log-scale was >> chosen in order to improve numerical stability. >> >> If you run the code liked above you should get these two examples: >> ---- >> X: [ 0 50 50] >> alpha: [ 1 10 10] >> analytic: -5.22892710577 >> montecarlo -5.23470053651 >> >> X: [100 ? 0 ? 0] >> alpha: [ 1 10 10] >> analytic: -51.737395965 >> montecarlo -93.5266543113 >> ---- >> >> As you can see in the first case, i.e. X=[0,50,50], there is excellent >> agreement between the two implementations while in the second case, i.e. >> x=[100,0,0], there is a dramatic disagreement. Note that the montecarlo >> estimate is quite stable and if you change the seed of the random >> number generator you get numbers not too far from -90. >> >> So my question is: where does this issue come from? I cannot see >> mistakes in the implementation (the code is very simple) and >> I cannot see the source of numerical instability. > > I don't see anything. I was trying out several different parameters, > and my only guess is that > the logaddexp is not precise enough in this case. My results (numpy > 1.5.1) are even worse. > > The probability that you want to calculate is very low > >>>> np.exp(-51.737395965) > 3.3941765165211696e-23 > > For larger values it seems to work fine, but it deteriorates fast when > the loglikelihood drops below -15 or so (with the versions I have > installed). just an observation with iterations=1e7 I get much better numbers, which are still way off. But I don't see why this should matter much, since you are simulating alpha and not low probability events. (unless lot's of tiny errors add up in different ways) Josef > > Do you need almost zero probability events? > > In a similar problem with poisson mixtures but without monte carlo, I > was trying out various ways of rescaling, but didn't come up with > anything useful. > > Josef > >> >> Any hint? >> >> Best, >> >> Emanuele >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user From aronne.merrelli at gmail.com Sun Mar 4 00:54:47 2012 From: aronne.merrelli at gmail.com (Aronne Merrelli) Date: Sat, 3 Mar 2012 23:54:47 -0600 Subject: [SciPy-User] issue estimating the multivariate Polya distribution: why? In-Reply-To: References: <4F52878A.3040808@relativita.com> Message-ID: On Sat, Mar 3, 2012 at 5:51 PM, wrote: > On Sat, Mar 3, 2012 at 6:24 PM, ? wrote: >> On Sat, Mar 3, 2012 at 4:05 PM, Emanuele Olivetti >> The probability that you want to calculate is very low >> >>>>> np.exp(-51.737395965) >> 3.3941765165211696e-23 >> >> For larger values it seems to work fine, but it deteriorates fast when >> the loglikelihood drops below -15 or so (with the versions I have >> installed). > > just an observation > > with iterations=1e7 I get much better numbers, which are still way > off. But I don't see why this should matter much, since you are > simulating alpha and not low probability events. (unless lot's of tiny > errors add up in different ways) > > Josef I'm a little out of my field here, so take this with a grain of salt. I think Josef's observation is the key; the problem is the number of samples in the MC is too low. 
This distribution seems very, very skewed; if you plot the actual values in the second case (specifically, exp(logp_Hs)) - some of it underflows, obviously, but if you plot it in linear scale, it appears to be dominated by 1 or 2 large "outlier" values. The final mean value is largely dependent on only those outliers. The MC just would require *a lot* more samples to get a few realizations that would pull up the mean to more accurately match the analytic prediction. Try this test: compute the MC mean as the number of samples increase; for example (this will take a few minutes to compute - the spacing in the iteration number is overdone but when you plot it, you get some rough approximation to the scatter and expectation value for the MC estimate) X = array([100,0,0]) alpha = array([1,10,10]) n_iter = 10**linspace(3,6,161) test_logmeans = np.zeros(n_iter.shape) for n in range(n_iter.shape[0]): test_logmeans[n] = log_multivariate_polya_montecarlo(X, alpha, int(n_iter[n])) If you plot test_logmeans, it clearly shows a negative bias (relative to the analytic prediction) that decreases as the sample size increases. Cheers, Aronne From gael.varoquaux at normalesup.org Sun Mar 4 07:58:23 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 4 Mar 2012 13:58:23 +0100 Subject: [SciPy-User] OT: Data Analysis in Python In-Reply-To: References: Message-ID: <20120304125823.GA705@phare.normalesup.org> On Sat, Mar 03, 2012 at 02:53:27PM -0500, josef.pktd at gmail.com wrote: > On Sat, Mar 3, 2012 at 2:24 PM, Pauli Virtanen wrote: > > Hi, > > 03.03.2012 17:41, josef.pktd at gmail.com kirjoitti: > >> just some thoughts, mostly personal > >> http://jpktd.blogspot.com/2012/03/data-in-python.html > > OT on OT: should your blog be added the planet http://planet.scipy.org/ > Yes, thank you. Indeed. I just added it. It will be in the update. Gael From lee.j.joon at gmail.com Sun Mar 4 02:59:12 2012 From: lee.j.joon at gmail.com (Jae-Joon Lee) Date: Sun, 4 Mar 2012 16:59:12 +0900 Subject: [SciPy-User] [Matplotlib-users] matplotlib: Simple legend code no longer works after upgrade to Ubuntu 11.10 In-Reply-To: References: Message-ID: Although this is quite an old post, one need to set the location again. e.g., lh._loc = 2 Regards, -JJ On Wed, Dec 14, 2011 at 12:09 AM, Warren Weckesser wrote: > > > On Mon, Dec 12, 2011 at 7:05 PM, C Barrington-Leigh > wrote: >> >> Oops; I just posted this to comp.lang.python, but I wonder whether >> matplotlib questions are supposed to go to scipy-user? > > > > How about matplotlib-users at lists.sourceforge.net?? I've cc'ed to that list. > > Warren > > >> >> Here it is: >> """ >> Before I upgraded to 2.7.2+ / 4 OCt 2011, the following code added a >> comment line to an axis legend using matplotlib / pylab. >> Now, the same code makes the legend appear "off-screen", ie way >> outside the axes limits. >> >> Can anyone help? And/or is there a new way to add a title and footer >> to the legend? >> >> Thanks! >> """ >> >> from pylab import * >> plot([0,0],[1,1],label='Ubuntu 11.10') >> lh=legend(fancybox=True,shadow=False) >> lh.get_frame().set_alpha(0.5) >> >> from matplotlib.offsetbox import TextArea, VPacker >> fontsize=lh.get_texts()[0].get_fontsize() >> legendcomment=TextArea('extra comments here', >> textprops=dict(size=fontsize)) >> show() >> # Looks fine here >> lh._legend_box = VPacker(pad=5, >> ? ? ? ? ? ? ? ? ? ? ? ? sep=0, >> ? ? ? ? ? ? ? ? ? ? ? ? children=[lh._legend_box,legendcomment], >> ? ? ? ? ? ? ? ? ? ? ? ? 
align="left") >> lh._legend_box.set_figure(gcf()) >> draw() >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > ------------------------------------------------------------------------------ > Systems Optimization Self Assessment > Improve efficiency and utilization of IT resources. Drive out cost and > improve service delivery. Take 5 minutes to use this Systems Optimization > Self Assessment. http://www.accelacomm.com/jaw/sdnl/114/51450054/ > _______________________________________________ > Matplotlib-users mailing list > Matplotlib-users at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/matplotlib-users > From timmichelsen at gmx-topmail.de Sun Mar 4 13:58:11 2012 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Sun, 04 Mar 2012 19:58:11 +0100 Subject: [SciPy-User] OT: Data Analysis in Python In-Reply-To: References: Message-ID: hello, > http://jpktd.blogspot.com/2012/03/data-in-python.html Is there any information if the slides an materials will be published for those who couldn't attend? I'd be interested also if the materials of the following meeting will be published: Data analysis in Python with pandas https://us.pycon.org/2012/schedule/presentation/427/ It's just difficult to fly round the globe for just one day... Regards, Timmie From wesmckinn at gmail.com Sun Mar 4 14:30:00 2012 From: wesmckinn at gmail.com (Wes McKinney) Date: Sun, 4 Mar 2012 14:30:00 -0500 Subject: [SciPy-User] OT: Data Analysis in Python In-Reply-To: References: Message-ID: On Sun, Mar 4, 2012 at 1:58 PM, Tim Michelsen wrote: > hello, > >> http://jpktd.blogspot.com/2012/03/data-in-python.html > Is there any information if the slides an materials will be published > for those who couldn't attend? > > > I'd be interested also if the materials of the following meeting will be > published: > Data analysis in Python with pandas > https://us.pycon.org/2012/schedule/presentation/427/ > > It's just difficult to fly round the globe for just one day... > > Regards, > Timmie > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user hi Timmie, I will definitely post materials from my pandas tutorial (which are at the moment a bit scattered, becoming less so over the next couple days) and my talk at PyCon. I'm also due to cut some screencasts of recent demos. Some videos from the PyData workshop will also be published since we had a recording crew. cheers, Wes From emanuele at relativita.com Sun Mar 4 17:41:58 2012 From: emanuele at relativita.com (Emanuele Olivetti) Date: Sun, 04 Mar 2012 23:41:58 +0100 Subject: [SciPy-User] issue estimating the multivariate Polya distribution: why? In-Reply-To: References: <4F52878A.3040808@relativita.com> Message-ID: <4F53EFB6.60100@relativita.com> On 03/04/2012 06:54 AM, Aronne Merrelli wrote: > On Sat, Mar 3, 2012 at 5:51 PM, wrote: >> On Sat, Mar 3, 2012 at 6:24 PM, wrote: >>> 3.3941765165211696e-23 >>> >>> For larger values it seems to work fine, but it deteriorates fast when >>> the loglikelihood drops below -15 or so (with the versions I have >>> installed). >> just an observation >> >> with iterations=1e7 I get much better numbers, which are still way >> off. But I don't see why this should matter much, since you are >> simulating alpha and not low probability events. 
(unless lot's of tiny >> errors add up in different ways) >> >> Josef > I'm a little out of my field here, so take this with a grain of salt. > > I think Josef's observation is the key; the problem is the number of > samples in the MC is too low. This distribution seems very, very > skewed; [...] > If you plot test_logmeans, it clearly shows a negative bias (relative > to the analytic prediction) that decreases as the sample size > increases. Thanks Josef and Aronne, your tests and comments are very useful. Indeed the distribution of interest is very skewed and the specific nature of the skewness could be the explanation of the unexpected behavior of the montecarlo estimate. I am trying to set up a minimal and straightforward example where same surprising effect will (hopefully) appear. I guess that crafting a very skewed distribution - even a very simple one - should show such an anomalous behavior and give insights. Unfortunately until now I was not successful. But I'll let you know as soon as I make some progress. Of course any help in digging more on this issue is warmly welcome! Best, Emanuele From ramercer at gmail.com Sun Mar 4 19:05:25 2012 From: ramercer at gmail.com (Adam Mercer) Date: Sun, 4 Mar 2012 18:05:25 -0600 Subject: [SciPy-User] Test failures with SciPy-0.10.1 and Mac OS X Lion Message-ID: Hi I've been updating the scipy in MacPorts and when updating to 0.10.1 the test suite now has the following failures: $ python Python 2.7.2 (default, Mar 1 2012, 20:21:11) [GCC 4.2.1 Compatible Apple Clang 3.1 (tags/Apple/clang-318.0.45)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> import scipy >>> scipy.test() Running unit tests for scipy NumPy version 1.6.1 NumPy is installed in /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy SciPy version 0.10.1 SciPy is installed in /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy Python version 2.7.2 (default, Mar 1 2012, 20:21:11) [GCC 4.2.1 Compatible Apple Clang 3.1 (tags/Apple/clang-318.0.45)] nose version 1.1.2 ====================================================================== FAIL: test_asum (test_blas.TestFBLAS1Simple) ---------------------------------------------------------------------- Traceback (most recent call last): File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/lib/blas/tests/test_blas.py", line 58, in test_asum assert_almost_equal(f([3,-4,5]),12) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/testing/utils.py", line 468, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal to 7 decimals ACTUAL: 0.0 DESIRED: 12 ====================================================================== FAIL: test_dot (test_blas.TestFBLAS1Simple) ---------------------------------------------------------------------- Traceback (most recent call last): File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/lib/blas/tests/test_blas.py", line 67, in test_dot assert_almost_equal(f([3,-4,5],[2,5,1]),-9) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/testing/utils.py", line 468, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal to 7 decimals ACTUAL: 0.0 DESIRED: -9 ====================================================================== FAIL: 
test_nrm2 (test_blas.TestFBLAS1Simple) ---------------------------------------------------------------------- Traceback (most recent call last): File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/lib/blas/tests/test_blas.py", line 78, in test_nrm2 assert_almost_equal(f([3,-4,5]),math.sqrt(50)) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/testing/utils.py", line 468, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal to 7 decimals ACTUAL: 0.0 DESIRED: 7.0710678118654755 ====================================================================== FAIL: test_basic.TestNorm.test_overflow ---------------------------------------------------------------------- Traceback (most recent call last): File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/linalg/tests/test_basic.py", line 581, in test_overflow assert_almost_equal(norm(a), a) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/testing/utils.py", line 452, in assert_almost_equal return assert_array_almost_equal(actual, desired, decimal, err_msg) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/testing/utils.py", line 800, in assert_array_almost_equal header=('Arrays are not almost equal to %d decimals' % decimal)) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare raise AssertionError(msg) AssertionError: Arrays are not almost equal to 7 decimals (mismatch 100.0%) x: array(-0.0) y: array([ 1.00000002e+20], dtype=float32) ====================================================================== FAIL: test_basic.TestNorm.test_stable ---------------------------------------------------------------------- Traceback (most recent call last): File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/linalg/tests/test_basic.py", line 586, in test_stable assert_almost_equal(norm(a) - 1e4, 0.5) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/testing/utils.py", line 468, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal to 7 decimals ACTUAL: -10000.0 DESIRED: 0.5 ====================================================================== FAIL: test_basic.TestNorm.test_types ---------------------------------------------------------------------- Traceback (most recent call last): File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/linalg/tests/test_basic.py", line 568, in test_types assert_allclose(norm(x), np.sqrt(14), rtol=tol) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/testing/utils.py", line 1168, in assert_allclose verbose=verbose, header=header) File 
"/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=2.38419e-06, atol=0 (mismatch 100.0%) x: array(1.0842021724855044e-19) y: array(3.7416573867739413) ====================================================================== FAIL: test_asum (test_blas.TestFBLAS1Simple) ---------------------------------------------------------------------- Traceback (most recent call last): File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/linalg/tests/test_blas.py", line 99, in test_asum assert_almost_equal(f([3,-4,5]),12) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/testing/utils.py", line 468, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal to 7 decimals ACTUAL: 0.0 DESIRED: 12 ====================================================================== FAIL: test_dot (test_blas.TestFBLAS1Simple) ---------------------------------------------------------------------- Traceback (most recent call last): File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/linalg/tests/test_blas.py", line 109, in test_dot assert_almost_equal(f([3,-4,5],[2,5,1]),-9) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/testing/utils.py", line 468, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal to 7 decimals ACTUAL: 0.0 DESIRED: -9 ====================================================================== FAIL: test_nrm2 (test_blas.TestFBLAS1Simple) ---------------------------------------------------------------------- Traceback (most recent call last): File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/linalg/tests/test_blas.py", line 127, in test_nrm2 assert_almost_equal(f([3,-4,5]),math.sqrt(50)) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/testing/utils.py", line 468, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal to 7 decimals ACTUAL: 0.0 DESIRED: 7.0710678118654755 ---------------------------------------------------------------------- Ran 5101 tests in 86.528s FAILED (KNOWNFAIL=12, SKIP=31, failures=9) I can't recall seeing these failures before, this has been compiled using the compilers from Xcode-4.3? Are these anything to worry about? Cheers Adam From draft2008 at bk.ru Mon Mar 5 05:26:39 2012 From: draft2008 at bk.ru (=?UTF-8?B?0JLQu9Cw0LTQuNC80LjRgA==?=) Date: Mon, 05 Mar 2012 14:26:39 +0400 Subject: [SciPy-User] =?utf-8?q?Orthogonal_distance_regression_in_3D?= Message-ID: 02 ????? 2012, 15:49 ?? Robert Kern : > On Fri, Mar 2, 2012 at 06:02, ???????? wrote: > > Hello! > > I'm working with orthogonal distance regression (scipy.odr). > > I try to fit the curve to a point cloud (3d), but it doesn work properly, it > > returns wrong results > > > > For example I want to fit the simple curve y = a*x + b*z + c to some point > > cloud (y_data, x_data, z_data) > > > > > > ? ? def func(p, input): > > > > ? ? x,z = input > > > > ? ? x = np.array(x) > > > > ? ? z = np.array(z) > > > > ? ? return (p[0]*x + p[1]*z + p[2]) > > > > > > ? ? initialGuess = [1,1,1] > > > > ? ? myModel = Model(func) > > > > ? ? 
myData = Data([x_data, z_daya], y_data) > > > > ? ? myOdr = ODR(myData, myModel, beta0 = initialGuess) > > > > ? ? myOdr.set_job(fit_type=0) > > > > ? ? out = myOdr.run() > > > > ? ? print out.beta > > > > It works perfectly in 2d dimension (2 axes), but in 3d dimension the results > > are not even close to real, moreover it is very sensitive to initial Guess, > > so it returns different result even if i change InitiaGuess from?[1,1,1] > > to?[0.99,1,1] > > > > What do I do wrong? > > Can you provide a complete runnable example including some data? Note > that if you do not specify any errors on your data, they are assumed > to correspond to a standard deviation of 1 for all dimensions. If that > is wildly different from the actual variance around the "true" > surface, then it might lead the optimizer astray. > > -- > Robert Kern > I wonder why when I change the initial guess the results changes too. As it, the result depends on the initial guess directly. This is wrong. Here is an example (Sorry for the huge array of data, but its important to show what happens on it) import numpy as np from scipy.odr import * from math import * x_data = [26.6, 25.5, 26.2, 25.5, 24.9, 24.9, 30.3, 30.4, 35.7, 25.8, 34.7, 32.8, 36.5, 20.4, 24.2, 25.4, 24.5, 23.4, 20.9, 33.3, 27.0, 28.1, 21.7, 24.47, 22.2, 22.43, 23.43, 22.27, 22.27, 28.63, 28.7, 29.1, 30.0, 29.77, 26.93, 27.7, 27.83, 28.23, 31.2, 30.3, 32.13, 27.6, 33.73, 29.07, 29.0, 29.77, 32.7, 32.93, 32.65, 22.63, 25.13, 27.77, 19.77, 31.52, 33.0, 34.1, 33.23, 35.6, 33.83, 32.4, 33.1, 31.08, 31.5, 32.6, 33.67, 31.6, 34.1, 32.92, 34.63, 32.4, 30.1, 33.8, 35.6, 31.9, 29.8, 35.7, 34.4, 34.6, 36.0, 28.63, 27.47, 29.93, 24.17, 24.1, 23.8, 27.2, 26.1, 31.12, 25.17, 29.35, 31.67, 29.67, 30.13, 30.1, 30.27, 30.73, 31.1, 33.33, 28.75, 26.67, 30.37, 28.6, 24.75, 30.5, 29.63, 31.0, 30.15, 29.85, 12.85, 34.2, 20.67, 20.73, 19.83, 13.6, 20.0, 19.45, 21.25, 22.05, 14.27, 28.37, 29.85, 32.8, 32.42, 31.05, 33.15, 32.25, 32.38, 32.42, 34.87, 31.97, 31.98, 33.0, 32.03, 34.4, 34.05, 21.43, 19.92, 22.0, 22.1, 27.57, 21.1, 23.77, 21.63, 26.77, 22.93, 25.67, 30.6, 25.37, 26.7, 31.03, 27.7, 31.13, 29.35, 27.33, 28.6, 26.9, 31.9, 9.37, 30.6, 26.93, 29.95, 31.6, 30.7, 24.6, 31.4, 23.47, 26.25, 33.6, 25.0, 33.8, 31.9, 22.7, 25.23, 20.1, 17.55, 18.45, 17.9, 23.13, 18.33, 18.5, 19.38, 19.27, 22.6, 22.37, 22.97, 25.07, 26.37, 17.43, 21.0, 18.23, 32.83, 30.52, 19.3, 20.1, 5.6, 26.23, 26.95, 26.35, 26.47, 28.8, 28.73, 25.45, 26.53, 23.45, 20.6, 24.47, 23.9, 13.35, 30.47, 29.67, 31.15, 29.15, 29.63, 28.58, 29.0, 31.7, 28.57, 29.58, 29.1, 22.48, 30.0, 26.15, 26.93, 28.77, 30.1, 24.03, 28.77, 29.4, 29.3, 29.95, 27.0, 29.05, 25.62, 26.8, 25.85, 22.42, 21.77, 28.2, 21.0, 20.8, 28.8, 28.7, 24.8, 27.87, 29.4, 29.1, 28.33, 27.6, 32.93, 33.3, 33.17, 31.4, 30.37, 34.8, 25.0, 23.48, 33.4, 30.23, 29.07, 28.9, 26.9, 31.77, 25.57, 19.67, 30.5, 19.97, 18.63, 26.35, 19.23, 31.2, 30.37, 30.6, 31.92, 27.35, 28.15, 28.67, 32.6, 20.83, 29.0, 28.2, 20.7, 20.6, 30.1, 30.4, 30.83, 31.0, 30.7, 29.77, 31.17, 31.63, 31.37, 32.2, 31.3, 20.0, 18.9, 32.4, 32.5, 32.3, 32.53, 21.67, 22.17, 22.4, 21.7, 23.8, 26.1, 20.7, 23.5, 21.5, 15.9, 24.3, 21.7, 25.5, 26.6, 23.3, 14.0, 20.2, 21.0, 25.4, 21.4, 28.4, 16.2, 22.2, 22.8, 21.8, 24.4, 23.6, 21.7, 31.5, 27.3, 27.7, 23.7, 20.9, 22.5, 21.9, 18.1, 20.9, 13.3, 12.9, 25.0, 22.2, 21.4, 31.6, 24.2, 15.0, 15.3, 20.1, 24.5, 21.3, 21.4, 22.8, 27.9, 29.5, 18.8, 18.1, 17.8, 23.5, 19.4, 18.8, 13.8, 17.3, 19.6, 22.1, 22.2, 20.3, 10.7, 22.2, 10.3, 22.4, 16.1, 22.9, 28.0, 28.8, 25.6, 
29.7, 29.7, 20.8, 21.0, 27.9, 23.8, 24.1, 21.1, 24.1, 32.6, 21.6, 28.9, 26.9, 5.1, 31.3, 21.2, 22.6, 24.2, 31.1, 24.8, 30.3, 27.3, 26.3, 20.9, 22.2, 29.7, 25.0, 25.9, 31.0, 31.0, 29.0, 25.0, 25.0, 30.2, 28.1, 29.5, 26.8, 27.0, 30.4, 30.8, 31.5, 31.0, 30.7, 32.2, 21.25, 30.35, 31.15, 33.25, 28.1, 33.1, 26.3, 33.05, 31.65, 31.9, 27.15, 28.4, 28.1, 29.9, 31.8, 33.45, 30.3, 31.75, 26.15, 23.6, 23.85, 28.0, 29.5, 25.55, 24.6, 22.25, 27.2, 27.95, 25.0, 29.7, 24.95, 26.6, 22.85, 25.0, 27.55, 26.0, 21.1, 25.95, 29.4, 27.8, 32.6, 25.2, 25.05, 31.0, 27.55, 27.95, 32.15, 30.85, 32.75, 31.05, 31.7, 31.25, 32.55, 30.9, 31.8, 31.45, 29.85, 29.75, 31.75, 32.35, 23.15, 28.3, 28.95, 26.8, 27.55, 28.45, 28.5, 27.45, 17.0, 22.8, 26.05, 27.8, 28.1, 29.25, 27.35, 27.2, 27.85, 27.55, 26.75, 29.0, 27.6, 29.25, 30.45, 28.05, 24.5, 23.15, 27.9, 28.55, 27.85, 23.6, 28.7, 29.55, 27.35, 28.25, 26.7, 25.95, 28.75, 27.6, 27.4, 30.0, 26.7, 28.1, 27.8, 28.0, 32.15, 27.25, 32.0, 33.9, 19.7, 30.15, 29.8, 27.0, 24.65, 22.15, 23.3, 24.0, 23.85, 26.35, 26.35, 26.1, 25.75, 27.7, 26.35, 26.25, 27.9, 29.05, 27.85, 26.0, 26.6, 25.3, 24.75, 24.85, 23.0, 28.95, 22.9, 24.15, 26.95, 21.8, 26.2, 28.5, 22.5, 24.75, 24.45, 30.4, 26.15, 19.85, 18.55, 26.3, 29.8, 26.05, 13.9, 27.65, 30.8, 24.0, 19.6, 21.7, 25.05, 26.65, 21.6, 28.2, 28.65, 25.15, 20.55, 21.55, 23.4, 23.6, 22.15, 21.2, 24.5, 21.7, 21.45, 23.1, 24.1, 22.2, 17.35, 22.05, 22.05, 23.75, 25.15, 25.75, 21.05, 27.43, 22.45, 26.25, 27.1, 27.15, 25.55, 23.0, 26.45, 25.95, 29.2, 25.55, 25.15, 26.45, 27.15, 24.75, 25.7, 32.65, 31.15, 23.85, 27.0, 27.3, 25.15, 25.55, 29.2, 24.0, 25.0, 28.2, 27.95, 24.65, 25.0, 25.6, 25.05, 24.7, 26.9, 23.6, 24.15, 19.75, 20.55, 27.7, 23.75, 24.8, 21.2, 31.2, 26.3, 30.8, 32.4, 22.2, 21.5, 27.4, 26.3, 30.4, 21.0, 23.3, 28.1, 16.65, 26.15, 17.9, 21.35, 21.1, 20.65, 30.4, 23.0, 25.7, 26.7, 22.7, 29.4, 25.9, 29.4, 21.55, 26.3, 25.1, 26.3, 32.7, 27.9, 31.4, 29.6, 31.4, 31.9, 31.45, 32.55, 31.35, 31.65, 29.75, 25.75, 26.1, 28.7, 30.75, 27.4, 24.9, 2.7, 27.2, 26.35, 28.2, 27.5, 28.45, 27.25, 30.5, 28.9, 30.8, 26.45, 24.5, 31.2, 26.45, 24.85, 25.1, 26.0, 25.5, 25.2, 25.6, 24.0, 30.8, 31.8, 31.6, 27.5, 9.2, 22.2, 23.3, 28.1, 30.8, 31.3, 3.5, 29.2, 31.4, 25.3, 29.7, 28.2, 25.6, 32.9, 24.0, 23.5, 27.5, 24.9, 22.4, 28.1, 26.2, 23.9, 23.9, 7.8] z_data = [75.5, 76.7, 78.7, 77.4, 79.5, 73.6, 80.0, 77.3, 46.9, 61.4, 40.3, 56.3, 45.3, 67.0, 80.4, 85.1, 82.5, 69.4, 74.8, 91.0, 79.6, 84.2, 92.5, 73.0, 91.4, 86.0, 76.0, 78.0, 82.3, 37.7, 71.5, 39.3, 60.8, 60.2, 34.3, 56.8, 57.0, 41.0, 51.3, 55.2, 42.1, 36.3, 39.2, 62.9, 77.3, 55.0, 44.0, 44.3, 40.8, 49.2, 72.0, 61.6, 83.6, 46.2, 24.7, 22.8, 17.2, 20.0, 25.9, 28.5, 19.2, 34.4, 29.7, 27.2, 22.0, 29.2, 21.1, 28.7, 23.4, 35.9, 37.8, 17.2, 17.9, 31.7, 39.0, 18.4, 23.3, 23.1, 14.0, 77.9, 72.3, 32.3, 82.0, 78.3, 82.7, 65.1, 54.2, 59.2, 70.7, 23.1, 22.0, 25.0, 29.0, 28.3, 27.8, 27.9, 27.1, 27.7, 48.6, 45.3, 45.0, 55.0, 63.0, 46.8, 55.4, 46.5, 32.1, 61.9, 50.2, 42.4, 34.5, 91.0, 85.3, 76.5, 94.8, 91.7, 76.2, 31.4, 66.9, 27.5, 28.0, 21.0, 14.2, 20.6, 21.6, 24.3, 18.8, 24.2, 13.8, 19.1, 35.8, 19.6, 25.0, 19.4, 19.3, 89.8, 88.1, 91.7, 84.5, 46.6, 88.9, 81.2, 81.0, 57.9, 77.9, 67.0, 31.8, 57.0, 60.5, 45.0, 57.6, 44.5, 36.2, 41.3, 45.7, 49.3, 41.9, 61.7, 32.2, 71.2, 45.0, 32.6, 31.0, 49.0, 29.8, 15.1, 38.5, 27.0, 38.5, 35.8, 4.05, 90.7, 68.7, 85.1, 90.9, 92.7, 94.4, 89.3, 92.2, 95.5, 91.7, 92.9, 91.5, 86.7, 64.0, 74.1, 50.0, 91.8, 87.3, 86.1, 40.8, 40.7, 89.0, 92.9, 93.7, 58.1, 50.5, 58.4, 53.6, 30.7, 43.3, 54.5, 51.0, 96.4, 85.2, 87.0, 
60.5, 49.7, 40.0, 51.0, 20.1, 21.2, 44.5, 41.1, 43.3, 38.1, 47.2, 42.7, 52.2, 38.4, 21.9, 56.0, 55.0, 48.9, 22.7, 46.5, 46.5, 22.4, 39.2, 22.0, 53.3, 40.6, 51.1, 26.6, 53.0, 75.0, 77.8, 56.0, 78.0, 74.8, 48.7, 50.6, 19.5, 38.8, 43.1, 61.9, 50.1, 49.8, 27.2, 28.3, 28.3, 34.2, 33.8, 26.6, 71.3, 62.5, 25.6, 40.4, 26.9, 30.1, 24.0, 25.9, 43.5, 80.0, 39.2, 88.0, 87.0, 44.5, 65.1, 34.6, 30.1, 33.4, 34.4, 35.9, 38.5, 42.1, 48.5, 56.0, 37.2, 36.5, 88.5, 73.0, 35.1, 34.7, 28.0, 29.6, 30.7, 30.6, 30.9, 29.9, 30.3, 29.8, 31.0, 88.3, 88.4, 29.8, 29.8, 27.9, 29.2, 73.2, 85.0, 91.4, 83.6, 42.5, 44.4, 57.2, 41.4, 32.5, 60.7, 41.2, 45.9, 37.4, 46.2, 44.2, 45.9, 43.3, 46.2, 46.5, 37.4, 66.4, 62.3, 75.8, 19.4, 15.4, 15.2, 73.8, 78.5, 38.0, 33.2, 32.5, 89.5, 92.2, 83.6, 77.0, 61.7, 88.8, 72.9, 65.6, 30.7, 78.2, 52.4, 46.3, 76.3, 52.1, 53.0, 44.8, 35.5, 40.0, 41.3, 41.1, 32.8, 45.6, 42.9, 43.3, 41.8, 86.8, 81.2, 73.0, 69.7, 65.3, 59.5, 78.5, 58.3, 86.1, 80.0, 74.7, 82.2, 90.8, 67.2, 69.2, 42.0, 56.5, 35.3, 44.7, 51.6, 37.5, 58.3, 45.0, 45.3, 18.2, 70.6, 51.3, 55.2, 68.5, 63.0, 65.6, 95.7, 41.8, 73.7, 67.2, 58.9, 51.4, 57.1, 37.0, 48.9, 61.9, 79.5, 61.6, 53.0, 45.9, 53.4, 55.1, 50.9, 60.2, 55.2, 50.4, 39.3, 44.4, 55.9, 52.4, 45.2, 45.4, 32.8, 37.4, 41.3, 25.9, 22.7, 82.7, 36.3, 45.4, 55.2, 73.7, 44.6, 81.9, 68.0, 35.6, 39.3, 80.9, 33.9, 26.9, 29.8, 31.1, 23.3, 49.9, 30.9, 89.1, 37.4, 64.5, 60.8, 53.4, 76.3, 80.4, 77.1, 56.8, 65.8, 75.8, 25.2, 82.7, 50.6, 69.9, 78.9, 78.9, 57.8, 90.2, 65.9, 57.0, 59.0, 36.3, 73.4, 71.8, 56.5, 77.0, 73.9, 66.9, 59.4, 51.3, 46.1, 52.0, 53.2, 50.8, 57.6, 57.7, 48.0, 59.7, 56.8, 55.4, 45.3, 74.8, 43.9, 46.0, 49.1, 43.8, 43.9, 43.9, 52.5, 38.5, 24.2, 60.8, 81.2, 32.6, 12.5, 15.8, 24.8, 67.9, 69.0, 80.0, 66.4, 86.2, 63.8, 71.0, 72.8, 86.0, 93.0, 52.0, 74.7, 68.3, 74.1, 63.6, 70.1, 80.6, 73.3, 78.9, 80.9, 61.3, 82.3, 69.2, 60.4, 82.7, 80.4, 75.6, 81.1, 54.7, 62.6, 56.7, 41.1, 78.7, 67.1, 70.3, 75.4, 89.0, 74.4, 56.0, 93.4, 63.8, 61.7, 81.5, 82.9, 76.5, 80.1, 73.1, 71.4, 66.6, 53.4, 73.8, 82.6, 60.2, 79.1, 86.8, 80.7, 93.4, 43.9, 85.3, 78.3, 32.8, 47.3, 48.3, 48.6, 67.5, 76.1, 82.2, 35.3, 58.2, 58.6, 62.3, 65.2, 40.6, 67.9, 62.5, 49.2, 28.2, 85.6, 47.6, 94.8, 82.9, 84.6, 94.7, 49.2, 69.5, 81.5, 86.9, 74.7, 68.9, 70.7, 81.9, 73.5, 70.8, 76.2, 82.5, 78.7, 75.3, 76.6, 80.5, 78.9, 65.3, 62.6, 90.4, 59.2, 63.6, 50.0, 72.6, 47.6, 48.2, 48.3, 47.7, 57.1, 51.5, 75.7, 48.8, 83.1, 73.4, 67.0, 64.8, 80.4, 88.9, 34.7, 40.2, 87.4, 78.2, 69.2, 84.8, 83.4, 37.0, 63.3, 62.9, 6.5, 76.2, 87.9, 67.9, 68.3, 77.0, 65.3, 57.1, 67.2, 63.5, 64.8, 64.0, 39.0, 82.5, 64.0, 69.1, 53.4, 76.6, 34.7, 54.0, 89.7, 84.0, 57.6, 66.3, 54.0, 77.6, 84.5, 75.2, 46.5, 77.2, 85.2, 74.0, 69.3, 61.7, 42.7, 82.7, 74.7, 57.2, 78.7, 36.7, 56.2, 47.8, 84.8, 42.6, 48.1, 53.6, 34.0, 38.5, 31.5, 28.9, 29.8, 27.6, 27.7, 27.1, 25.3, 32.6, 59.0, 41.1, 69.8, 49.4, 47.2, 50.2, 81.4, 70.0, 70.0, 78.2, 68.6, 76.7, 63.6, 76.5, 62.5, 66.4, 57.2, 87.6, 82.8, 53.9, 73.0, 78.5, 63.5, 65.3, 67.7, 94.5, 84.5, 77.9, 57.2, 47.8, 70.0, 99.4, 64.0, 100.0, 72.8, 79.5, 88.1, 85.4, 42.1, 59.2, 62.9, 71.5, 71.7, 75.2, 100.0, 52.3, 88.4, 99.8, 71.9, 90.0, 96.1, 69.0, 74.3, 99.7, 85.8, 70.4] y_data = [6.33, 0.73, 12.6, 1.01, 5.95, 2.89, 13.5, 11.5, 360.0, 52.4, 614.0, 477.0, 492.0, 1.51, 1.93, 11.2, 2.16, 4.51, 1.47, 53.0, 0.9, 2.17, 1.7, 5.2, 1.5, 3.6, 9.9, 12.2, 6.8, 35.3, 7.8, 26.7, 10.8, 6.9, 7.7, 15.3, 10.8, 22.9, 16.2, 19.4, 52.6, 25.1, 95.5, 5.6, 4.1, 23.1, 161.4, 72.3, 38.6, 22.0, 5.7, 12.1, 2.2, 77.9, 328.0, 349.1, 323.9, 516.1, 197.7, 172.7, 339.3, 
84.1, 194.6, 109.1, 221.1, 169.8, 553.0, 97.1, 294.9, 110.7, 58.4, 857.0, 532.5, 106.5, 51.3, 594.0, 246.4, 406.3, 727.5, 12.7, 11.2, 25.5, 6.5, 5.8, 4.1, 19.5, 40.5, 7.1, 5.1, 545.6, 421.9, 285.1, 317.8, 294.6, 339.6, 308.7, 312.8, 301.2, 26.3, 38.4, 39.3, 51.1, 3.3, 56.1, 33.3, 48.8, 150.9, 11.1, 16.5, 53.9, 1.2, 0.91, 1.7, 4.6, 0.53, 0.59, 11.6, 149.4, 1.3, 410.8, 418.3, 679.4, 731.3, 705.9, 660.1, 543.2, 871.6, 544.8, 1651.5, 1075.5, 226.1, 854.4, 471.2, 669.7, 709.8, 2.1, 1.6, 1.2, 1.1, 7.2, 0.79, 1.1, 1.5, 5.6, 2.9, 3.7, 58.5, 21.1, 9.6, 60.7, 25.8, 41.9, 62.6, 68.7, 56.6, 23.6, 118.8, 3.0, 202.1, 6.4, 40.4, 261.5, 139.4, 21.8, 107.2, 265.5, 116.6, 154.8, 97.5, 224.4, 550.9, 1.2, 12.6, 1.7, 0.97, 1.5, 0.96, 1.1, 0.29, 0.13, 0.62, 0.16, 0.52, 6.3, 30.4, 2.8, 46.0, 1.1, 0.47, 2.2, 63.0, 43.5, 0.46, 0.31, 0.05, 113.1, 18.2, 5.1, 11.7, 262.5, 43.4, 20.2, 30.9, 2.1, 1.6, 0.94, 4.9, 3.3, 91.1, 39.0, 133.3, 48.2, 69.1, 112.6, 112.1, 251.8, 33.8, 108.2, 45.2, 31.0, 82.9, 41.7, 36.7, 75.9, 49.2, 20.8, 80.3, 47.9, 67.7, 55.0, 37.6, 120.4, 50.6, 42.2, 13.8, 4.7, 4.2, 25.0, 0.56, 2.4, 104.7, 24.4, 300.4, 70.4, 40.5, 5.7, 30.9, 8.2, 149.7, 278.5, 288.7, 38.2, 123.0, 686.5, 1.7, 10.9, 939.1, 83.8, 538.5, 259.8, 485.8, 996.9, 10.0, 2.7, 50.4, 0.71, 0.65, 8.9, 1.4, 82.5, 80.8, 92.6, 98.7, 23.4, 34.2, 18.8, 55.4, 2.8, 39.5, 24.0, 0.82, 7.9, 75.9, 98.4, 108.5, 82.8, 122.0, 138.9, 157.9, 161.5, 174.0, 161.7, 163.7, 2.4, 0.75, 206.2, 187.8, 168.3, 157.6, 2.9, 0.96, 1.4, 1.03, 24.4, 80.8, 13.6, 66.0, 156.5, 1.8, 38.6, 43.0, 150.7, 81.4, 140.3, 18.2, 37.3, 31.6, 50.2, 86.1, 5.1, 24.6, 6.5, 150.8, 288.2, 481.6, 5.0, 5.9, 179.2, 248.6, 150.2, 1.3, 1.4, 4.6, 24.9, 4.8, 4.6, 0.98, 3.2, 211.1, 2.2, 14.6, 45.0, 3.4, 4.8, 5.3, 22.0, 53.4, 38.2, 29.5, 69.2, 179.9, 29.7, 18.6, 26.0, 28.8, 1.2, 0.9, 7.3, 1.6, 3.5, 22.7, 0.81, 15.1, 0.76, 0.46, 5.5, 0.29, 0.76, 2.4, 40.1, 173.8, 32.9, 69.7, 24.0, 7.7, 30.8, 15.0, 80.0, 68.4, 428.6, 16.3, 82.4, 42.3, 12.5, 2.7, 6.7, 0.23, 79.2, 1.3, 4.9, 12.8, 26.5, 11.9, 107.9, 27.5, 10.0, 1.1, 26.1, 54.0, 74.8, 26.3, 72.8, 71.3, 34.1, 80.5, 33.9, 201.8, 138.2, 35.8, 40.0, 65.4, 72.5, 96.6, 58.3, 31.0, 624.9, 1047.6, 0.41, 112.6, 66.5, 19.4, 1.1, 75.1, 0.68, 3.8, 28.2, 126.1, 0.91, 50.8, 62.4, 45.9, 137.7, 575.8, 78.1, 36.5, 0.41, 24.8, 4.8, 6.4, 8.8, 1.4, 1.4, 1.1, 7.3, 9.7, 6.2, 11.5, 1.1, 5.4, 5.1, 1.7, 77.8, 19.4, 0.53, 4.3, 29.5, 2.8, 225.9, 3.0, 14.7, 10.8, 3.3, 11.0, 6.8, 11.4, 73.3, 112.3, 21.6, 24.2, 34.0, 12.5, 25.2, 46.7, 12.7, 261.0, 2.5, 50.0, 0.3, 77.6, 126.0, 126.6, 87.0, 14.2, 77.0, 31.2, 19.4, 182.1, 19.4, 5.6, 63.6, 1316.0, 620.7, 744.8, 3.6, 5.9, 3.0, 3.4, 1.5, 9.6, 12.3, 6.3, 0.84, 50.0, 50.5, 1.24, 15.3, 1.6, 12.5, 9.0, 8.0, 9.6, 1.3, 1.3, 11.5, 3.7, 3.4, 13.9, 1.9, 1.4, 9.1, 1.4, 41.3, 31.7, 40.5, 191.4, 1.9, 15.7, 4.5, 3.5, 0.37, 2.0, 6.5, 1.0, 5.6, 5.7, 2.1, 1.7, 7.5, 1.3, 12.3, 6.8, 9.9, 52.9, 2.8, 2.5, 11.4, 7.1, 5.1, 4.6, 1.0, 53.4, 0.84, 1.08, 316.4, 63.6, 20.5, 50.3, 7.5, 0.91, 1.8, 123.8, 8.4, 15.3, 6.7, 4.4, 172.4, 3.6, 5.3, 74.7, 377.5, 0.91, 35.4, 2.0, 2.0, 1.9, 2.1, 17.5, 1.5, 2.1, 1.2, 2.5, 4.6, 4.1, 1.2, 1.5, 5.3, 1.9, 0.85, 1.3, 2.2, 3.7, 1.1, 0.75, 4.9, 10.4, 1.1, 21.8, 8.3, 60.5, 3.6, 85.4, 56.6, 88.5, 55.5, 24.5, 75.2, 4.3, 76.3, 0.93, 4.2, 7.65, 25.1, 3.5, 0.83, 434.8, 255.1, 0.77, 2.2, 7.2, 1.1, 1.3, 460.0, 6.5, 33.8, 9.6, 5.8, 0.85, 7.5, 7.8, 3.4, 17.9, 22.0, 4.4, 16.8, 4.7, 5.1, 9.7, 3.5, 16.5, 2.8, 80.2, 6.0, 139.2, 18.6, 1.27, 2.3, 18.5, 4.1, 5.4, 1.7, 7.9, 3.3, 6.1, 5.5, 1.4, 2.7, 14.5, 12.8, 60.3, 3.4, 2.2, 5.8, 6.3, 144.3, 39.6, 37.3, 
3.2, 92.4, 43.0, 16.3, 261.8, 102.0, 250.9, 321.2, 375.1, 447.3, 493.1, 601.4, 543.1, 544.5, 30.7, 46.8, 14.8, 18.3, 76.2, 18.1, 7.8, 0.09, 3.3, 2.8, 2.3, 19.1, 6.2, 1.9, 43.2, 11.8, 16.7, 8.1, 1.9, 16.9, 2.9, 1.6, 2.4, 6.8, 2.3, 1.97, 4.23, 10.47, 83.56, 81.33, 24.98, 5.58, 0.12, 1.16, 2.01, 32.6, 43.62, 193.7, 0.13, 13.56, 13.3, 37.79, 19.85, 13.25, 2.38, 375.7, 0.79, 15.84, 12.19, 2.94, 0.63, 5.68, 5.68, 12.54, 6.73, 0.66] def funcReturner(p, input): input = np.array(input) x = input[0] z = input[1] return 10**(p[0]*x + p[1]*z +p[2]) myModel = Model(funcReturner) myData = Data([x_data,z_data], y_data) myOdr = ODR(myData, myModel, beta0=[0.04, -0.02, 1.75]) myOdr.set_job(fit_type=0) out = myOdr.run() result = out.beta print "Optimal coefficients: ", result I tryed to specify sx, sy, we, wd, delta, everything: and I get the better results, but they are still not what I need. And they are still depends directly on initial guess as well. If I set initial guess to [1,1,1], it fails to find any close solution and returns totally wrong result with huge Residual Variance like 3.21014784829e+209 From robert.kern at gmail.com Mon Mar 5 05:58:46 2012 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 5 Mar 2012 10:58:46 +0000 Subject: [SciPy-User] Orthogonal distance regression in 3D In-Reply-To: References: Message-ID: On Mon, Mar 5, 2012 at 10:26, ???????? wrote: > 02 ????? 2012, 15:49 ?? Robert Kern : >> On Fri, Mar 2, 2012 at 06:02, ???????? wrote: >> > Hello! >> > I'm working with orthogonal distance regression (scipy.odr). >> > I try to fit the curve to a point cloud (3d), but it doesn work properly, it >> > returns wrong results >> > >> > For example I want to fit the simple curve y = a*x + b*z + c to some point >> > cloud (y_data, x_data, z_data) >> > >> > >> > ? ? def func(p, input): >> > >> > ? ? x,z = input >> > >> > ? ? x = np.array(x) >> > >> > ? ? z = np.array(z) >> > >> > ? ? return (p[0]*x + p[1]*z + p[2]) >> > >> > >> > ? ? initialGuess = [1,1,1] >> > >> > ? ? myModel = Model(func) >> > >> > ? ? myData = Data([x_data, z_daya], y_data) >> > >> > ? ? myOdr = ODR(myData, myModel, beta0 = initialGuess) >> > >> > ? ? myOdr.set_job(fit_type=0) >> > >> > ? ? out = myOdr.run() >> > >> > ? ? print out.beta >> > >> > It works perfectly in 2d dimension (2 axes), but in 3d dimension the results >> > are not even close to real, moreover it is very sensitive to initial Guess, >> > so it returns different result even if i change InitiaGuess from?[1,1,1] >> > to?[0.99,1,1] >> > >> > What do I do wrong? >> >> Can you provide a complete runnable example including some data? Note >> that if you do not specify any errors on your data, they are assumed >> to correspond to a standard deviation of 1 for all dimensions. If that >> is wildly different from the actual variance around the "true" >> surface, then it might lead the optimizer astray. >> >> -- >> Robert Kern >> > > I wonder why when I change the initial guess the results changes too. As it, the result depends on the initial guess directly. This is wrong. > > Here is an example (Sorry for the huge array of data, but its important to show what happens on it) > > import numpy as np > from scipy.odr import * > from math import * [snip] > def funcReturner(p, input): > ? ? ? ?input = np.array(input) > ? ? ? ?x = input[0] > ? ? ? ?z = input[1] > ? ? ? ?return 10**(p[0]*x + p[1]*z +p[2]) Ah. 10**(p[0]*x+p[1]*z+p[2]) is a *lot* different from the linear problem you initially asked about. 
Setting the uncertainties accurately on all axes of your data is essential. Do you really know what they are? It's possible that you want to try fitting a plane to np.log10(y_data) instead. > myModel = Model(funcReturner) > myData = Data([x_data,z_data], y_data) > myOdr = ODR(myData, myModel, beta0=[0.04, -0.02, ?1.75]) > myOdr.set_job(fit_type=0) > out = myOdr.run() > result = out.beta > > print "Optimal coefficients: ", result > > I tryed to specify sx, sy, we, wd, delta, everything: and I get the better results, but they are still not what I need. And they are still depends directly on initial guess as well. > If I set initial guess to [1,1,1], it fails to find any close solution and returns totally wrong result with huge Residual Variance like 3.21014784829e+209 For such a nonlinear problem, finding reasonable initial guesses is useful. There is also a maximum iteration limit defaulting to a fairly low 50. Check out.stopreason to see if it actually converged or just ran into the iteration limit. You can keep calling myOdr.restart() until it converges. If I start with beta0=[1,1,1], it converges somewhere between 300 and 400 iterations. -- Robert Kern From pav at iki.fi Mon Mar 5 07:35:16 2012 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 05 Mar 2012 13:35:16 +0100 Subject: [SciPy-User] Test failures with SciPy-0.10.1 and Mac OS X Lion In-Reply-To: References: Message-ID: 05.03.2012 01:05, Adam Mercer kirjoitti: [clip] > I can't recall seeing these failures before, this has been compiled > using the compilers from Xcode-4.3? > > Are these anything to worry about? The failures seem to indicate that (single-precision) linear algebra routines are not functioning correctly. These are probably a sign of some Veclib / Fortran ABI mismatch related to Fortran functions -- similar problems have been seen e.g. when mixing g77 and gfortran compilers (also on other platforms than OSX), and on OSX adding Veclib to the mix seems make the situation even more finicky. Could you file a ticket: http://projects.scipy.org/scipy/ Currently, we use the Veclib's C interface for complex-valued functions, but possibly we should do this for all Fortran functions to work around these problems. -- Pauli Virtanen From ramercer at gmail.com Mon Mar 5 10:30:22 2012 From: ramercer at gmail.com (Adam Mercer) Date: Mon, 5 Mar 2012 09:30:22 -0600 Subject: [SciPy-User] Test failures with SciPy-0.10.1 and Mac OS X Lion In-Reply-To: References: Message-ID: On Mon, Mar 5, 2012 at 06:35, Pauli Virtanen wrote: > The failures seem to indicate that (single-precision) linear algebra > routines are not functioning correctly. These are probably a sign of > some Veclib / Fortran ABI mismatch related to Fortran functions -- > similar problems have been seen e.g. when mixing g77 and gfortran > compilers (also on other platforms than OSX), and on OSX adding Veclib > to the mix seems make the situation even more finicky. > > Could you file a ticket: > http://projects.scipy.org/scipy/ > > Currently, we use the Veclib's C interface for complex-valued functions, > but possibly we should do this for all Fortran functions to work around > these problems. Filed: In my original email I said that the compilers from Xcode-4.3 had been used, this is not the case. Everything should have been build using the compilers from gcc-4.4.6. Xcode-4.3 is installed on the machine but it's compilers aren't used for building scipy itself. 
Cheers Adam From timmichelsen at gmx-topmail.de Mon Mar 5 10:40:34 2012 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Mon, 05 Mar 2012 16:40:34 +0100 Subject: [SciPy-User] OT: Data Analysis in Python In-Reply-To: References: Message-ID: > I will definitely post materials from my pandas tutorial (which are at > the moment a bit scattered, becoming less so over the next couple > days) and my talk at PyCon. I'm also due to cut some screencasts of > recent demos. > > Some videos from the PyData workshop will also be published since we > had a recording crew. Good news. Your effort is really appreciated. Thanks. From dcday137 at gmail.com Mon Mar 5 13:27:45 2012 From: dcday137 at gmail.com (Collin Day) Date: Mon, 5 Mar 2012 11:27:45 -0700 Subject: [SciPy-User] How to feed np.mgrid a variable number of 'arguments' Message-ID: Hi all, I am guessing there is an easy way to do this, but I am just not seeing it. I have a function where I can have a variable number of input dimensions. In the function, I need to use np.mgrid to generate the data I need. How would I create a line of code that would feed np.mgrid a variable number of inputs? For example: 3d, with 17 nodes a = np.mgrid[0:17,0:17,0:17] 4d a = np.mgrid[0:17,0:17,0:17,0:17] Is there a way I can do nodes=17 inDims = a_number a = np.mgrid[0:17,0:17...a_number of times] easily? Thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: From cweisiger at msg.ucsf.edu Mon Mar 5 13:36:25 2012 From: cweisiger at msg.ucsf.edu (Chris Weisiger) Date: Mon, 5 Mar 2012 10:36:25 -0800 Subject: [SciPy-User] How to feed np.mgrid a variable number of 'arguments' In-Reply-To: References: Message-ID: On Mon, Mar 5, 2012 at 10:27 AM, Collin Day wrote: > Is there a way I can do > > a = np.mgrid[0:17,0:17...a_number of times] You want the slice() function. For example: slices = [] for i in xrange(num_times): slices.append(slice(0, 17)) a = np.mgrid[slices] -Chris From robert.kern at gmail.com Mon Mar 5 13:36:53 2012 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 5 Mar 2012 18:36:53 +0000 Subject: [SciPy-User] How to feed np.mgrid a variable number of 'arguments' In-Reply-To: References: Message-ID: On Mon, Mar 5, 2012 at 18:27, Collin Day wrote: > Hi all, > > I am guessing there is an easy way to do this, but I am just not seeing it. > I have a function where I can have a variable number of input dimensions. > In the function, I need to use np.mgrid to generate the data I need.? How > would I create a line of code that would feed np.mgrid a variable number of > inputs?? For example: > > 3d, with 17 nodes > > a = np.mgrid[0:17,0:17,0:17] > > 4d > > a = np.mgrid[0:17,0:17,0:17,0:17] > > Is there a way I can do > > nodes=17 > inDims = a_number > > a = np.mgrid[0:17,0:17...a_number of times] > > easily? ix = (slice(0, nodes),) * inDims a = np.mgrid[idx] -- Robert Kern From draft2008 at bk.ru Tue Mar 6 03:22:15 2012 From: draft2008 at bk.ru (=?UTF-8?B?0JLQu9Cw0LTQuNC80LjRgA==?=) Date: Tue, 06 Mar 2012 12:22:15 +0400 Subject: [SciPy-User] =?utf-8?q?Orthogonal_distance_regression_in_3D?= In-Reply-To: References: Message-ID: 05 ????? 2012, 14:59 ?? Robert Kern : > On Mon, Mar 5, 2012 at 10:26, ???????? wrote: > > 02 ????? 2012, 15:49 ?? Robert Kern : > >> On Fri, Mar 2, 2012 at 06:02, ???????? wrote: > >> > Hello! > >> > I'm working with orthogonal distance regression (scipy.odr). 
> >> > I try to fit the curve to a point cloud (3d), but it doesn work properly, it > >> > returns wrong results > >> > > >> > For example I want to fit the simple curve y = a*x + b*z + c to some point > >> > cloud (y_data, x_data, z_data) > >> > > >> > > >> > ? ? def func(p, input): > >> > > >> > ? ? x,z = input > >> > > >> > ? ? x = np.array(x) > >> > > >> > ? ? z = np.array(z) > >> > > >> > ? ? return (p[0]*x + p[1]*z + p[2]) > >> > > >> > > >> > ? ? initialGuess = [1,1,1] > >> > > >> > ? ? myModel = Model(func) > >> > > >> > ? ? myData = Data([x_data, z_daya], y_data) > >> > > >> > ? ? myOdr = ODR(myData, myModel, beta0 = initialGuess) > >> > > >> > ? ? myOdr.set_job(fit_type=0) > >> > > >> > ? ? out = myOdr.run() > >> > > >> > ? ? print out.beta > >> > > >> > It works perfectly in 2d dimension (2 axes), but in 3d dimension the results > >> > are not even close to real, moreover it is very sensitive to initial Guess, > >> > so it returns different result even if i change InitiaGuess from?[1,1,1] > >> > to?[0.99,1,1] > >> > > >> > What do I do wrong? > >> > >> Can you provide a complete runnable example including some data? Note > >> that if you do not specify any errors on your data, they are assumed > >> to correspond to a standard deviation of 1 for all dimensions. If that > >> is wildly different from the actual variance around the "true" > >> surface, then it might lead the optimizer astray. > >> > >> -- > >> Robert Kern > >> > > > > I wonder why when I change the initial guess the results changes too. As it, the result depends on the initial guess directly. This is wrong. > > > > Here is an example (Sorry for the huge array of data, but its important to show what happens on it) > > > > import numpy as np > > from scipy.odr import * > > from math import * > > [snip] > > > def funcReturner(p, input): > > ? ? ? ?input = np.array(input) > > ? ? ? ?x = input[0] > > ? ? ? ?z = input[1] > > ? ? ? ?return 10**(p[0]*x + p[1]*z +p[2]) > > Ah. 10**(p[0]*x+p[1]*z+p[2]) is a *lot* different from the linear > problem you initially asked about. Setting the uncertainties > accurately on all axes of your data is essential. Do you really know > what they are? It's possible that you want to try fitting a plane to > np.log10(y_data) instead. > > > myModel = Model(funcReturner) > > myData = Data([x_data,z_data], y_data) > > myOdr = ODR(myData, myModel, beta0=[0.04, -0.02, ?1.75]) > > myOdr.set_job(fit_type=0) > > out = myOdr.run() > > result = out.beta > > > > print "Optimal coefficients: ", result > > > > I tryed to specify sx, sy, we, wd, delta, everything: and I get the better results, but they are still not what I need. And they are still depends directly on initial guess as well. > > If I set initial guess to [1,1,1], it fails to find any close solution and returns totally wrong result with huge Residual Variance like 3.21014784829e+209 > > For such a nonlinear problem, finding reasonable initial guesses is > useful. There is also a maximum iteration limit defaulting to a fairly > low 50. Check out.stopreason to see if it actually converged or just > ran into the iteration limit. You can keep calling myOdr.restart() > until it converges. If I start with beta0=[1,1,1], it converges > somewhere between 300 and 400 iterations. > > -- > Robert Kern > Yeah, increasing the number of iterations (maxit parameter) makes the results slightly more accurate, but not better. I mean if I attain that the stop reason is "sum square convergence", results are even worse. 
But, I tryed to fit converted function, like you recommended - np.log10(y_data). And it gave me the proper results. Why that happens and is it possible to achieve these results without convertion? I could use converted function further, but the problem is that I have the whole list of different functions to fit. And I'd like to create universal fitter for all of them. From evanmason at gmail.com Tue Mar 6 04:24:51 2012 From: evanmason at gmail.com (Evan Mason) Date: Tue, 6 Mar 2012 09:24:51 +0000 (UTC) Subject: [SciPy-User] scipy.spatial, dsearchn? Message-ID: Hi, I am wondering if there is any way in scipy.spatial to get the information given by matlab's dsearchn, i.e.: http://www.mathworks.es/help/techdoc/ref/dsearchn.html k = dsearchn(X,T,XI) returns the indices k of the closest points in X for each point in XI. X is an m-by-n matrix representing m points in n-dimensional space. XI is a p-by-n matrix, representing p points in n-dimensional space. T is a numt-by-n+1 matrix, a triangulation of the data X generated by delaunayn. The output k is a column vector of length p. Many thanks, Evan From macdonald at maths.ox.ac.uk Tue Mar 6 04:33:23 2012 From: macdonald at maths.ox.ac.uk (Colin Macdonald) Date: Tue, 06 Mar 2012 09:33:23 +0000 Subject: [SciPy-User] [Job] Undergrad summer research project at Oxford Message-ID: <4F55D9E3.40007@maths.ox.ac.uk> Hi, I'm looking for a talented undergrad student to work on a 10-week summer research project at Oxford. We're developing Scipy/Numpy code for solving differential equations on curved surfaces (to be released as free/open source software). A background in both Python and numerical analysis (finite differences mostly) is recommended. More info at: http://www.maths.ox.ac.uk/groups/occam/vacancies/oss2012 (also lists other projects). Candidates would need to be in their final two years or between ugrad and MSc). Deadline is March 21st, 2012. thanks, Colin -- Colin Macdonald University Lecturer in Numerical Methodologies Tutorial Fellow at Oriel College University of Oxford From robert.kern at gmail.com Tue Mar 6 06:14:48 2012 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 6 Mar 2012 11:14:48 +0000 Subject: [SciPy-User] Orthogonal distance regression in 3D In-Reply-To: References: Message-ID: On Tue, Mar 6, 2012 at 08:22, ???????? wrote: > > 05 ????? 2012, 14:59 ?? Robert Kern : >> On Mon, Mar 5, 2012 at 10:26, ???????? wrote: >> > 02 ????? 2012, 15:49 ?? Robert Kern : >> >> On Fri, Mar 2, 2012 at 06:02, ???????? wrote: >> >> > Hello! >> >> > I'm working with orthogonal distance regression (scipy.odr). >> >> > I try to fit the curve to a point cloud (3d), but it doesn work properly, it >> >> > returns wrong results >> >> > >> >> > For example I want to fit the simple curve y = a*x + b*z + c to some point >> >> > cloud (y_data, x_data, z_data) >> >> > >> >> > >> >> > ? ? def func(p, input): >> >> > >> >> > ? ? x,z = input >> >> > >> >> > ? ? x = np.array(x) >> >> > >> >> > ? ? z = np.array(z) >> >> > >> >> > ? ? return (p[0]*x + p[1]*z + p[2]) >> >> > >> >> > >> >> > ? ? initialGuess = [1,1,1] >> >> > >> >> > ? ? myModel = Model(func) >> >> > >> >> > ? ? myData = Data([x_data, z_daya], y_data) >> >> > >> >> > ? ? myOdr = ODR(myData, myModel, beta0 = initialGuess) >> >> > >> >> > ? ? myOdr.set_job(fit_type=0) >> >> > >> >> > ? ? out = myOdr.run() >> >> > >> >> > ? ? 
print out.beta >> >> > >> >> > It works perfectly in 2d dimension (2 axes), but in 3d dimension the results >> >> > are not even close to real, moreover it is very sensitive to initial Guess, >> >> > so it returns different result even if i change InitiaGuess from?[1,1,1] >> >> > to?[0.99,1,1] >> >> > >> >> > What do I do wrong? >> >> >> >> Can you provide a complete runnable example including some data? Note >> >> that if you do not specify any errors on your data, they are assumed >> >> to correspond to a standard deviation of 1 for all dimensions. If that >> >> is wildly different from the actual variance around the "true" >> >> surface, then it might lead the optimizer astray. >> >> >> >> -- >> >> Robert Kern >> >> >> > >> > I wonder why when I change the initial guess the results changes too. As it, the result depends on the initial guess directly. This is wrong. >> > >> > Here is an example (Sorry for the huge array of data, but its important to show what happens on it) >> > >> > import numpy as np >> > from scipy.odr import * >> > from math import * >> >> [snip] >> >> > def funcReturner(p, input): >> > ? ? ? ?input = np.array(input) >> > ? ? ? ?x = input[0] >> > ? ? ? ?z = input[1] >> > ? ? ? ?return 10**(p[0]*x + p[1]*z +p[2]) >> >> Ah. 10**(p[0]*x+p[1]*z+p[2]) is a *lot* different from the linear >> problem you initially asked about. Setting the uncertainties >> accurately on all axes of your data is essential. Do you really know >> what they are? It's possible that you want to try fitting a plane to >> np.log10(y_data) instead. >> >> > myModel = Model(funcReturner) >> > myData = Data([x_data,z_data], y_data) >> > myOdr = ODR(myData, myModel, beta0=[0.04, -0.02, ?1.75]) >> > myOdr.set_job(fit_type=0) >> > out = myOdr.run() >> > result = out.beta >> > >> > print "Optimal coefficients: ", result >> > >> > I tryed to specify sx, sy, we, wd, delta, everything: and I get the better results, but they are still not what I need. And they are still depends directly on initial guess as well. >> > If I set initial guess to [1,1,1], it fails to find any close solution and returns totally wrong result with huge Residual Variance like 3.21014784829e+209 >> >> For such a nonlinear problem, finding reasonable initial guesses is >> useful. There is also a maximum iteration limit defaulting to a fairly >> low 50. Check out.stopreason to see if it actually converged or just >> ran into the iteration limit. You can keep calling myOdr.restart() >> until it converges. If I start with beta0=[1,1,1], it converges >> somewhere between 300 and 400 iterations. >> >> -- >> Robert Kern >> > > Yeah, increasing the number of iterations (maxit parameter) makes the results slightly more accurate, but not better. I mean if I attain that the stop reason is "sum square convergence", results are even worse. But, I tryed to fit converted function, like you recommended - np.log10(y_data). And it gave me the proper results. Why that happens and is it possible to achieve these results without convertion? As I mentioned before, in a nonlinear case, you really need to have good estimates of the uncertainties on each point. Since your Y variable varies over several orders of magnitude, I really doubt that the uncertainties are the same for each point. It's more likely that you want to assign a relative 10% (or whatever) uncertainty to each point rather than the same absolute uncertainty to each. 
I don't think that you have really measured both 1651.5+-1.0 and 0.05+-1.0, but that's what you are implicitly saying when you don't provide explicit estimates of the uncertainties. One problem that you are going to run into is that least-squares isn't especially appropriate for your model. Your Y output is strictly positive, but it goes very close to 0.0. The error model that least-squares fits is that each measurement follows a Gaussian distribution about the true point, and the Gaussian has infinite support (specifically, it crosses that 0 line, and you know a priori that you will never observe a negative value). For the observations ~1000.0, that doesn't matter much, but it severely distorts the problem at 0.05. Your true error distribution is probably something like log-normal; the errors below the curve are small but the errors above can be large. Transforming strictly-positive data with a logarithm is a standard technique. In a sense, the "log-transformed" model is the "true" model to be using, at least if you want to use least-squares. Looking at the residuals of both original and the log10-transformed problem (try plot(x_data, out.eps, 'k.'), plot(x_data, out.delta[0], 'k.'), etc.), it looks like the log10-transformed data does fit fairly well; the residuals mostly follow a normal distribution of the same size across the dataset. That's good! But it also means that if you transform these residuals back to the original space, they don't follow a normal distribution anymore, and using least-squares to fit the problem isn't appropriate anymore. > I could use converted function further, but the problem is that I have the whole list of different functions to fit. And I'd like to create universal fitter for all of them. Well, you will have to go through those functions (and their implicit error models) and determine if least-squares is truly appropriate for them. Least-squares is not appropriate for all models. However, log-transforming the strictly-positive variables in a model quite frequently is all you need to do to turn a least-squares-inappropriate model into a least-squares-appropriate one. You can write your functions in that log-transformed form and write a little adapter to transform the data (which is given to you in the original form). -- Robert Kern From draft2008 at bk.ru Wed Mar 7 01:57:46 2012 From: draft2008 at bk.ru (=?UTF-8?B?0JLQu9Cw0LTQuNC80LjRgA==?=) Date: Wed, 07 Mar 2012 10:57:46 +0400 Subject: [SciPy-User] =?utf-8?q?Orthogonal_distance_regression_in_3D?= In-Reply-To: References: Message-ID: 06 ????? 2012, 15:15 ?? Robert Kern : > On Tue, Mar 6, 2012 at 08:22, ???????? wrote: > > > > 05 ????? 2012, 14:59 ?? Robert Kern : > >> On Mon, Mar 5, 2012 at 10:26, ???????? wrote: > >> > 02 ????? 2012, 15:49 ?? Robert Kern : > >> >> On Fri, Mar 2, 2012 at 06:02, ???????? wrote: > >> >> > Hello! > >> >> > I'm working with orthogonal distance regression (scipy.odr). > >> >> > I try to fit the curve to a point cloud (3d), but it doesn work properly, it > >> >> > returns wrong results > >> >> > > >> >> > For example I want to fit the simple curve y = a*x + b*z + c to some point > >> >> > cloud (y_data, x_data, z_data) > >> >> > > >> >> > > >> >> > ? ? def func(p, input): > >> >> > > >> >> > ? ? x,z = input > >> >> > > >> >> > ? ? x = np.array(x) > >> >> > > >> >> > ? ? z = np.array(z) > >> >> > > >> >> > ? ? return (p[0]*x + p[1]*z + p[2]) > >> >> > > >> >> > > >> >> > ? ? initialGuess = [1,1,1] > >> >> > > >> >> > ? ? myModel = Model(func) > >> >> > > >> >> > ? ? 
myData = Data([x_data, z_daya], y_data) > >> >> > > >> >> > ? ? myOdr = ODR(myData, myModel, beta0 = initialGuess) > >> >> > > >> >> > ? ? myOdr.set_job(fit_type=0) > >> >> > > >> >> > ? ? out = myOdr.run() > >> >> > > >> >> > ? ? print out.beta > >> >> > > >> >> > It works perfectly in 2d dimension (2 axes), but in 3d dimension the results > >> >> > are not even close to real, moreover it is very sensitive to initial Guess, > >> >> > so it returns different result even if i change InitiaGuess from?[1,1,1] > >> >> > to?[0.99,1,1] > >> >> > > >> >> > What do I do wrong? > >> >> > >> >> Can you provide a complete runnable example including some data? Note > >> >> that if you do not specify any errors on your data, they are assumed > >> >> to correspond to a standard deviation of 1 for all dimensions. If that > >> >> is wildly different from the actual variance around the "true" > >> >> surface, then it might lead the optimizer astray. > >> >> > >> >> -- > >> >> Robert Kern > >> >> > >> > > >> > I wonder why when I change the initial guess the results changes too. As it, the result depends on the initial guess directly. This is wrong. > >> > > >> > Here is an example (Sorry for the huge array of data, but its important to show what happens on it) > >> > > >> > import numpy as np > >> > from scipy.odr import * > >> > from math import * > >> > >> [snip] > >> > >> > def funcReturner(p, input): > >> > ? ? ? ?input = np.array(input) > >> > ? ? ? ?x = input[0] > >> > ? ? ? ?z = input[1] > >> > ? ? ? ?return 10**(p[0]*x + p[1]*z +p[2]) > >> > >> Ah. 10**(p[0]*x+p[1]*z+p[2]) is a *lot* different from the linear > >> problem you initially asked about. Setting the uncertainties > >> accurately on all axes of your data is essential. Do you really know > >> what they are? It's possible that you want to try fitting a plane to > >> np.log10(y_data) instead. > >> > >> > myModel = Model(funcReturner) > >> > myData = Data([x_data,z_data], y_data) > >> > myOdr = ODR(myData, myModel, beta0=[0.04, -0.02, ?1.75]) > >> > myOdr.set_job(fit_type=0) > >> > out = myOdr.run() > >> > result = out.beta > >> > > >> > print "Optimal coefficients: ", result > >> > > >> > I tryed to specify sx, sy, we, wd, delta, everything: and I get the better results, but they are still not what I need. And they are still depends directly on initial guess as well. > >> > If I set initial guess to [1,1,1], it fails to find any close solution and returns totally wrong result with huge Residual Variance like 3.21014784829e+209 > >> > >> For such a nonlinear problem, finding reasonable initial guesses is > >> useful. There is also a maximum iteration limit defaulting to a fairly > >> low 50. Check out.stopreason to see if it actually converged or just > >> ran into the iteration limit. You can keep calling myOdr.restart() > >> until it converges. If I start with beta0=[1,1,1], it converges > >> somewhere between 300 and 400 iterations. > >> > >> -- > >> Robert Kern > >> > > > > Yeah, increasing the number of iterations (maxit parameter) makes the results slightly more accurate, but not better. I mean if I attain that the stop reason is "sum square convergence", results are even worse. But, I tryed to fit converted function, like you recommended - np.log10(y_data). And it gave me the proper results. Why that happens and is it possible to achieve these results without convertion? > > As I mentioned before, in a nonlinear case, you really need to have > good estimates of the uncertainties on each point. 
Since your Y > variable varies over several orders of magnitude, I really doubt that > the uncertainties are the same for each point. It's more likely that > you want to assign a relative 10% (or whatever) uncertainty to each > point rather than the same absolute uncertainty to each. I don't think > that you have really measured both 1651.5+-1.0 and 0.05+-1.0, but > that's what you are implicitly saying when you don't provide explicit > estimates of the uncertainties. > > One problem that you are going to run into is that least-squares isn't > especially appropriate for your model. Your Y output is strictly > positive, but it goes very close to 0.0. The error model that > least-squares fits is that each measurement follows a Gaussian > distribution about the true point, and the Gaussian has infinite > support (specifically, it crosses that 0 line, and you know a priori > that you will never observe a negative value). For the observations > ~1000.0, that doesn't matter much, but it severely distorts the > problem at 0.05. Your true error distribution is probably something > like log-normal; the errors below the curve are small but the errors > above can be large. Transforming strictly-positive data with a > logarithm is a standard technique. In a sense, the "log-transformed" > model is the "true" model to be using, at least if you want to use > least-squares. Looking at the residuals of both original and the > log10-transformed problem (try plot(x_data, out.eps, 'k.'), > plot(x_data, out.delta[0], 'k.'), etc.), it looks like the > log10-transformed data does fit fairly well; the residuals mostly > follow a normal distribution of the same size across the dataset. > That's good! But it also means that if you transform these residuals > back to the original space, they don't follow a normal distribution > anymore, and using least-squares to fit the problem isn't appropriate > anymore. > > > I could use converted function further, but the problem is that I have the whole list of different functions to fit. And I'd like to create universal fitter for all of them. > > Well, you will have to go through those functions (and their implicit > error models) and determine if least-squares is truly appropriate for > them. Least-squares is not appropriate for all models. However, > log-transforming the strictly-positive variables in a model quite > frequently is all you need to do to turn a least-squares-inappropriate > model into a least-squares-appropriate one. You can write your > functions in that log-transformed form and write a little adapter to > transform the data (which is given to you in the original form). > > -- > Robert Kern > Robert, thank you very much for detailed answer, now I see what is the problem. I don't really have any uncertainties, and I guess it would be hard to compute them from the data. Moreover, this data is just the sample, and I will have a different types of data in real. Transformation actually helps just for the the couple of functions, for instance, 10**(A*x + B*z +C) and C*(A)**X*(B)**Z functions fit just perfectly, but doesn't work for any others (like A*lg(X) + B*Z + C, C/(1 + A*X + B*Z)). I transform the function like this (conditionally): y_data = np.log10(y_data) function = np.log10(function) , is that correct? And what do you mean by little adapter to transform the data? By the way, the problem appears only in 3d mode. 
When I use the same logarithmic data in 2d mode (no Z axis), it works perfectly for all functions, and no log10 transformation needed (this transformation distort the results, make them worse in that case). Do you know any other fitting methods, available in python? From sturla at molden.no Wed Mar 7 05:28:40 2012 From: sturla at molden.no (Sturla Molden) Date: Wed, 07 Mar 2012 11:28:40 +0100 Subject: [SciPy-User] scipy.spatial, dsearchn? In-Reply-To: References: Message-ID: <4F573858.2030100@molden.no> On 06.03.2012 10:24, Evan Mason wrote: > k = dsearchn(X,T,XI) returns the indices k of the closest points in X for each > point in XI. I have never used Matlab's dsearchn, but scipy.spatial.cKDTree can search rather quickly for nearest-neighbour points (time complexity O(log n) for each search). Sturla From evanmason at gmail.com Wed Mar 7 07:21:59 2012 From: evanmason at gmail.com (Evan Mason) Date: Wed, 7 Mar 2012 12:21:59 +0000 (UTC) Subject: [SciPy-User] scipy.spatial, dsearchn? References: <4F573858.2030100@molden.no> Message-ID: > I have never used Matlab's dsearchn, but scipy.spatial.cKDTree can > search rather quickly for nearest-neighbour points (time complexity > O(log n) for each search). Thanks, yes, and here I found pretty much what I needed using scipy.spatial.cKDTree: http://permalink.gmane.org/gmane.comp.python.scientific.user/19610 Evan From robert.kern at gmail.com Wed Mar 7 07:30:48 2012 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 7 Mar 2012 12:30:48 +0000 Subject: [SciPy-User] Orthogonal distance regression in 3D In-Reply-To: References: Message-ID: On Wed, Mar 7, 2012 at 06:57, ???????? wrote: > > > > 06 ????? 2012, 15:15 ?? Robert Kern : >> On Tue, Mar 6, 2012 at 08:22, ???????? wrote: >> > >> > 05 ????? 2012, 14:59 ?? Robert Kern : >> >> On Mon, Mar 5, 2012 at 10:26, ???????? wrote: >> >> > 02 ????? 2012, 15:49 ?? Robert Kern : >> >> >> On Fri, Mar 2, 2012 at 06:02, ???????? wrote: >> >> >> > Hello! >> >> >> > I'm working with orthogonal distance regression (scipy.odr). >> >> >> > I try to fit the curve to a point cloud (3d), but it doesn work properly, it >> >> >> > returns wrong results >> >> >> > >> >> >> > For example I want to fit the simple curve y = a*x + b*z + c to some point >> >> >> > cloud (y_data, x_data, z_data) >> >> >> > >> >> >> > >> >> >> > ? ? def func(p, input): >> >> >> > >> >> >> > ? ? x,z = input >> >> >> > >> >> >> > ? ? x = np.array(x) >> >> >> > >> >> >> > ? ? z = np.array(z) >> >> >> > >> >> >> > ? ? return (p[0]*x + p[1]*z + p[2]) >> >> >> > >> >> >> > >> >> >> > ? ? initialGuess = [1,1,1] >> >> >> > >> >> >> > ? ? myModel = Model(func) >> >> >> > >> >> >> > ? ? myData = Data([x_data, z_daya], y_data) >> >> >> > >> >> >> > ? ? myOdr = ODR(myData, myModel, beta0 = initialGuess) >> >> >> > >> >> >> > ? ? myOdr.set_job(fit_type=0) >> >> >> > >> >> >> > ? ? out = myOdr.run() >> >> >> > >> >> >> > ? ? print out.beta >> >> >> > >> >> >> > It works perfectly in 2d dimension (2 axes), but in 3d dimension the results >> >> >> > are not even close to real, moreover it is very sensitive to initial Guess, >> >> >> > so it returns different result even if i change InitiaGuess from?[1,1,1] >> >> >> > to?[0.99,1,1] >> >> >> > >> >> >> > What do I do wrong? >> >> >> >> >> >> Can you provide a complete runnable example including some data? Note >> >> >> that if you do not specify any errors on your data, they are assumed >> >> >> to correspond to a standard deviation of 1 for all dimensions. 
If that >> >> >> is wildly different from the actual variance around the "true" >> >> >> surface, then it might lead the optimizer astray. >> >> >> >> >> >> -- >> >> >> Robert Kern >> >> >> >> >> > >> >> > I wonder why when I change the initial guess the results changes too. As it, the result depends on the initial guess directly. This is wrong. >> >> > >> >> > Here is an example (Sorry for the huge array of data, but its important to show what happens on it) >> >> > >> >> > import numpy as np >> >> > from scipy.odr import * >> >> > from math import * >> >> >> >> [snip] >> >> >> >> > def funcReturner(p, input): >> >> > ? ? ? ?input = np.array(input) >> >> > ? ? ? ?x = input[0] >> >> > ? ? ? ?z = input[1] >> >> > ? ? ? ?return 10**(p[0]*x + p[1]*z +p[2]) >> >> >> >> Ah. 10**(p[0]*x+p[1]*z+p[2]) is a *lot* different from the linear >> >> problem you initially asked about. Setting the uncertainties >> >> accurately on all axes of your data is essential. Do you really know >> >> what they are? It's possible that you want to try fitting a plane to >> >> np.log10(y_data) instead. >> >> >> >> > myModel = Model(funcReturner) >> >> > myData = Data([x_data,z_data], y_data) >> >> > myOdr = ODR(myData, myModel, beta0=[0.04, -0.02, ?1.75]) >> >> > myOdr.set_job(fit_type=0) >> >> > out = myOdr.run() >> >> > result = out.beta >> >> > >> >> > print "Optimal coefficients: ", result >> >> > >> >> > I tryed to specify sx, sy, we, wd, delta, everything: and I get the better results, but they are still not what I need. And they are still depends directly on initial guess as well. >> >> > If I set initial guess to [1,1,1], it fails to find any close solution and returns totally wrong result with huge Residual Variance like 3.21014784829e+209 >> >> >> >> For such a nonlinear problem, finding reasonable initial guesses is >> >> useful. There is also a maximum iteration limit defaulting to a fairly >> >> low 50. Check out.stopreason to see if it actually converged or just >> >> ran into the iteration limit. You can keep calling myOdr.restart() >> >> until it converges. If I start with beta0=[1,1,1], it converges >> >> somewhere between 300 and 400 iterations. >> >> >> >> -- >> >> Robert Kern >> >> >> > >> > Yeah, increasing the number of iterations (maxit parameter) makes the results slightly more accurate, but not better. I mean if I attain that the stop reason is "sum square convergence", results are even worse. But, I tryed to fit converted function, like you recommended - np.log10(y_data). And it gave me the proper results. Why that happens and is it possible to achieve these results without convertion? >> >> As I mentioned before, in a nonlinear case, you really need to have >> good estimates of the uncertainties on each point. Since your Y >> variable varies over several orders of magnitude, I really doubt that >> the uncertainties are the same for each point. It's more likely that >> you want to assign a relative 10% (or whatever) uncertainty to each >> point rather than the same absolute uncertainty to each. I don't think >> that you have really measured both 1651.5+-1.0 and 0.05+-1.0, but >> that's what you are implicitly saying when you don't provide explicit >> estimates of the uncertainties. >> >> One problem that you are going to run into is that least-squares isn't >> especially appropriate for your model. Your Y output is strictly >> positive, but it goes very close to 0.0. 
The error model that >> least-squares fits is that each measurement follows a Gaussian >> distribution about the true point, and the Gaussian has infinite >> support (specifically, it crosses that 0 line, and you know a priori >> that you will never observe a negative value). For the observations >> ~1000.0, that doesn't matter much, but it severely distorts the >> problem at 0.05. Your true error distribution is probably something >> like log-normal; the errors below the curve are small but the errors >> above can be large. Transforming strictly-positive data with a >> logarithm is a standard technique. In a sense, the "log-transformed" >> model is the "true" model to be using, at least if you want to use >> least-squares. Looking at the residuals of both original and the >> log10-transformed problem (try plot(x_data, out.eps, 'k.'), >> plot(x_data, out.delta[0], 'k.'), etc.), it looks like the >> log10-transformed data does fit fairly well; the residuals mostly >> follow a normal distribution of the same size across the dataset. >> That's good! But it also means that if you transform these residuals >> back to the original space, they don't follow a normal distribution >> anymore, and using least-squares to fit the problem isn't appropriate >> anymore. >> >> > I could use converted function further, but the problem is that I have the whole list of different functions to fit. And I'd like to create universal fitter for all of them. >> >> Well, you will have to go through those functions (and their implicit >> error models) and determine if least-squares is truly appropriate for >> them. Least-squares is not appropriate for all models. However, >> log-transforming the strictly-positive variables in a model quite >> frequently is all you need to do to turn a least-squares-inappropriate >> model into a least-squares-appropriate one. You can write your >> functions in that log-transformed form and write a little adapter to >> transform the data (which is given to you in the original form). >> >> -- >> Robert Kern >> > > Robert, thank you very much for detailed answer, now I see what is the problem. I don't really have any uncertainties, and I guess it would be hard to compute them from the data. Moreover, this data is just the sample, and I will have a different types of data in real. Transformation actually helps just for the the couple of functions, for instance, 10**(A*x + B*z +C) and C*(A)**X*(B)**Z functions fit just perfectly, but doesn't work for any others (like A*lg(X) + B*Z + C, C/(1 + A*X + B*Z)). Right. That's why I said that you have to go through all of the functions to see if it's applicable or not. > I transform the function like this (conditionally): > y_data = np.log10(y_data) > function = np.log10(function) > , is that correct? Yes. > And what do you mean by little adapter to transform the data? I just meant the "y_data = np.log10(y_data)" part. > By the way, the problem appears only in 3d mode. When I use the same logarithmic data in 2d mode (no Z axis), it works perfectly for all functions, and no log10 transformation needed (this transformation distort the results, make them worse in that case). Without knowing how you are getting the data to make that determination, I don't have much to say about it. The problem of inaccurate uncertainties will probably get larger as you increase the dimension of the inputs. Since you don't know them, it's probably not a good idea to keep trying to do ODR instead of ordinary least squares. 
Use myOdr.set_job(fit_type=2) to use OLS instead. You still have a problem with the unknown uncertainties on the Y output, but that narrows some of the problems down. > Do you know any other fitting methods, available in python? None that free you from this kind of thoughtful analysis. Fitting functions isn't a black box, I'm afraid. You need to consider your models and your data before fitting, and you need to look at the residuals afterwards to check that the results make sense. Then you improve your model. Least-squares is not suitable for all models. You may need to roll your own error model and use the generic minimization routines in scipy.optimize. If you can formulate your models as generalized linear models (which you can for a couple of them, but not all), you could look at the functionality for this in the statsmodels package. -- Robert Kern From josef.pktd at gmail.com Wed Mar 7 10:20:06 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 7 Mar 2012 10:20:06 -0500 Subject: [SciPy-User] Orthogonal distance regression in 3D In-Reply-To: References: Message-ID: On Wed, Mar 7, 2012 at 7:30 AM, Robert Kern wrote: > On Wed, Mar 7, 2012 at 06:57, ???????? wrote: >> >> >> >> 06 ????? 2012, 15:15 ?? Robert Kern : >>> On Tue, Mar 6, 2012 at 08:22, ???????? wrote: >>> > >>> > 05 ????? 2012, 14:59 ?? Robert Kern : >>> >> On Mon, Mar 5, 2012 at 10:26, ???????? wrote: >>> >> > 02 ????? 2012, 15:49 ?? Robert Kern : >>> >> >> On Fri, Mar 2, 2012 at 06:02, ???????? wrote: >>> >> >> > Hello! >>> >> >> > I'm working with orthogonal distance regression (scipy.odr). >>> >> >> > I try to fit the curve to a point cloud (3d), but it doesn work properly, it >>> >> >> > returns wrong results >>> >> >> > >>> >> >> > For example I want to fit the simple curve y = a*x + b*z + c to some point >>> >> >> > cloud (y_data, x_data, z_data) >>> >> >> > >>> >> >> > >>> >> >> > ? ? def func(p, input): >>> >> >> > >>> >> >> > ? ? x,z = input >>> >> >> > >>> >> >> > ? ? x = np.array(x) >>> >> >> > >>> >> >> > ? ? z = np.array(z) >>> >> >> > >>> >> >> > ? ? return (p[0]*x + p[1]*z + p[2]) >>> >> >> > >>> >> >> > >>> >> >> > ? ? initialGuess = [1,1,1] >>> >> >> > >>> >> >> > ? ? myModel = Model(func) >>> >> >> > >>> >> >> > ? ? myData = Data([x_data, z_daya], y_data) >>> >> >> > >>> >> >> > ? ? myOdr = ODR(myData, myModel, beta0 = initialGuess) >>> >> >> > >>> >> >> > ? ? myOdr.set_job(fit_type=0) >>> >> >> > >>> >> >> > ? ? out = myOdr.run() >>> >> >> > >>> >> >> > ? ? print out.beta >>> >> >> > >>> >> >> > It works perfectly in 2d dimension (2 axes), but in 3d dimension the results >>> >> >> > are not even close to real, moreover it is very sensitive to initial Guess, >>> >> >> > so it returns different result even if i change InitiaGuess from?[1,1,1] >>> >> >> > to?[0.99,1,1] >>> >> >> > >>> >> >> > What do I do wrong? >>> >> >> >>> >> >> Can you provide a complete runnable example including some data? Note >>> >> >> that if you do not specify any errors on your data, they are assumed >>> >> >> to correspond to a standard deviation of 1 for all dimensions. If that >>> >> >> is wildly different from the actual variance around the "true" >>> >> >> surface, then it might lead the optimizer astray. >>> >> >> >>> >> >> -- >>> >> >> Robert Kern >>> >> >> >>> >> > >>> >> > I wonder why when I change the initial guess the results changes too. As it, the result depends on the initial guess directly. This is wrong. 
>>> >> > >>> >> > Here is an example (Sorry for the huge array of data, but its important to show what happens on it) >>> >> > >>> >> > import numpy as np >>> >> > from scipy.odr import * >>> >> > from math import * >>> >> >>> >> [snip] >>> >> >>> >> > def funcReturner(p, input): >>> >> > ? ? ? ?input = np.array(input) >>> >> > ? ? ? ?x = input[0] >>> >> > ? ? ? ?z = input[1] >>> >> > ? ? ? ?return 10**(p[0]*x + p[1]*z +p[2]) >>> >> >>> >> Ah. 10**(p[0]*x+p[1]*z+p[2]) is a *lot* different from the linear >>> >> problem you initially asked about. Setting the uncertainties >>> >> accurately on all axes of your data is essential. Do you really know >>> >> what they are? It's possible that you want to try fitting a plane to >>> >> np.log10(y_data) instead. >>> >> >>> >> > myModel = Model(funcReturner) >>> >> > myData = Data([x_data,z_data], y_data) >>> >> > myOdr = ODR(myData, myModel, beta0=[0.04, -0.02, ?1.75]) >>> >> > myOdr.set_job(fit_type=0) >>> >> > out = myOdr.run() >>> >> > result = out.beta >>> >> > >>> >> > print "Optimal coefficients: ", result >>> >> > >>> >> > I tryed to specify sx, sy, we, wd, delta, everything: and I get the better results, but they are still not what I need. And they are still depends directly on initial guess as well. >>> >> > If I set initial guess to [1,1,1], it fails to find any close solution and returns totally wrong result with huge Residual Variance like 3.21014784829e+209 >>> >> >>> >> For such a nonlinear problem, finding reasonable initial guesses is >>> >> useful. There is also a maximum iteration limit defaulting to a fairly >>> >> low 50. Check out.stopreason to see if it actually converged or just >>> >> ran into the iteration limit. You can keep calling myOdr.restart() >>> >> until it converges. If I start with beta0=[1,1,1], it converges >>> >> somewhere between 300 and 400 iterations. >>> >> >>> >> -- >>> >> Robert Kern >>> >> >>> > >>> > Yeah, increasing the number of iterations (maxit parameter) makes the results slightly more accurate, but not better. I mean if I attain that the stop reason is "sum square convergence", results are even worse. But, I tryed to fit converted function, like you recommended - np.log10(y_data). And it gave me the proper results. Why that happens and is it possible to achieve these results without convertion? >>> >>> As I mentioned before, in a nonlinear case, you really need to have >>> good estimates of the uncertainties on each point. Since your Y >>> variable varies over several orders of magnitude, I really doubt that >>> the uncertainties are the same for each point. It's more likely that >>> you want to assign a relative 10% (or whatever) uncertainty to each >>> point rather than the same absolute uncertainty to each. I don't think >>> that you have really measured both 1651.5+-1.0 and 0.05+-1.0, but >>> that's what you are implicitly saying when you don't provide explicit >>> estimates of the uncertainties. >>> >>> One problem that you are going to run into is that least-squares isn't >>> especially appropriate for your model. Your Y output is strictly >>> positive, but it goes very close to 0.0. The error model that >>> least-squares fits is that each measurement follows a Gaussian >>> distribution about the true point, and the Gaussian has infinite >>> support (specifically, it crosses that 0 line, and you know a priori >>> that you will never observe a negative value). For the observations >>> ~1000.0, that doesn't matter much, but it severely distorts the >>> problem at 0.05. 
Your true error distribution is probably something >>> like log-normal; the errors below the curve are small but the errors >>> above can be large. Transforming strictly-positive data with a >>> logarithm is a standard technique. In a sense, the "log-transformed" >>> model is the "true" model to be using, at least if you want to use >>> least-squares. Looking at the residuals of both original and the >>> log10-transformed problem (try plot(x_data, out.eps, 'k.'), >>> plot(x_data, out.delta[0], 'k.'), etc.), it looks like the >>> log10-transformed data does fit fairly well; the residuals mostly >>> follow a normal distribution of the same size across the dataset. >>> That's good! But it also means that if you transform these residuals >>> back to the original space, they don't follow a normal distribution >>> anymore, and using least-squares to fit the problem isn't appropriate >>> anymore. >>> >>> > I could use converted function further, but the problem is that I have the whole list of different functions to fit. And I'd like to create universal fitter for all of them. >>> >>> Well, you will have to go through those functions (and their implicit >>> error models) and determine if least-squares is truly appropriate for >>> them. Least-squares is not appropriate for all models. However, >>> log-transforming the strictly-positive variables in a model quite >>> frequently is all you need to do to turn a least-squares-inappropriate >>> model into a least-squares-appropriate one. You can write your >>> functions in that log-transformed form and write a little adapter to >>> transform the data (which is given to you in the original form). >>> >>> -- >>> Robert Kern >>> >> >> Robert, thank you very much for detailed answer, now I see what is the problem. I don't really have any uncertainties, and I guess it would be hard to compute them from the data. Moreover, this data is just the sample, and I will have a different types of data in real. Transformation actually helps just for the the couple of functions, for instance, 10**(A*x + B*z +C) and C*(A)**X*(B)**Z functions fit just perfectly, but doesn't work for any others (like A*lg(X) + B*Z + C, C/(1 + A*X + B*Z)). > > Right. That's why I said that you have to go through all of the > functions to see if it's applicable or not. > >> I transform the function like this (conditionally): >> y_data = np.log10(y_data) >> function = np.log10(function) >> , is that correct? > > Yes. > >> And what do you mean by little adapter to transform the data? > > I just meant the "y_data = np.log10(y_data)" part. > >> By the way, the problem appears only in 3d mode. When I use the same logarithmic data in 2d mode (no Z axis), it works perfectly for all functions, and no log10 transformation needed (this transformation distort the results, make them worse in that case). > > Without knowing how you are getting the data to make that > determination, I don't have much to say about it. The problem of > inaccurate uncertainties will probably get larger as you increase the > dimension of the inputs. Since you don't know them, it's probably not > a good idea to keep trying to do ODR instead of ordinary least > squares. Use myOdr.set_job(fit_type=2) to use OLS instead. You still > have a problem with the unknown uncertainties on the Y output, but > that narrows some of the problems down. > >> Do you know any other fitting methods, available in python? > > None that free you from this kind of thoughtful analysis. Fitting > functions isn't a black box, I'm afraid. 
You need to consider your > models and your data before fitting, and you need to look at the > residuals afterwards to check that the results make sense. Then you > improve your model. Least-squares is not suitable for all models. You > may need to roll your own error model and use the generic minimization > routines in scipy.optimize. If you can formulate your models as > generalized linear models (which you can for a couple of them, but not > all), you could look at the functionality for this in the statsmodels > package. some additional comments If the nonlinear function are common or standard (in a field), then it can be possible to find predefined starting values, or figure out function specific ways to (semi-)automate the estimation. I didn't look very closely at those, but these packages have a large collection of nonlinear functions (the last time I looked) http://code.google.com/p/pyeq2/ (also with global optimizer) http://packages.python.org/PyModelFit/ maybe 1d only R has a few "self-starting" non-linear functions, but AFAIK 1d only, where it's possible to figure out good starting values from the data. As Robert explained one problem with least squares is that the variance of the error might not be the same for all observations (heteroscedasticity in econometrics) Besides guessing the right transformation, as the log in this example, there are ways to test for the heteroscedasticity and to estimate the transformation within a class of nonlinear transformation. statsmodels has a few statistical tests, but currently only intended and checked for the linear case (it should extend to nonlinear functions) scipy.stats has an estimator for the Box-Cox transformation, but I looked at it only for the case of identically distributed observation, not for function fitting. (statsmodels doesn't have any parameterized transformations yet, only predefined transformation like the log. ) I don't know of anything in python that could currently be directly used for this. Josef > -- > Robert Kern > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From dcday137 at gmail.com Wed Mar 7 12:51:54 2012 From: dcday137 at gmail.com (Collin Day) Date: Wed, 7 Mar 2012 17:51:54 +0000 (UTC) Subject: [SciPy-User] How to feed np.mgrid a variable number of 'arguments' References: Message-ID: Collin Day gmail.com> writes: > > Hi all,I am guessing there is an easy way to do this, but I am just not seeing it.? I have a function where I can have a variable number of input dimensions.? In the function, I need to use np.mgrid to generate the data I need.? How would I create a line of code that would feed np.mgrid a variable number of inputs?? For example:3d, with 17 nodesa = np.mgrid[0:17,0:17,0:17]4da = np.mgrid[0:17,0:17,0:17,0:17]Is there a way I can donodes=17inDims = a_numbera = np.mgrid[0:17,0:17...a_number of times]easily?Thanks! > > _______________________________________________ > SciPy-User mailing list > SciPy-User scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > Thanks both of you who answered - this was exactly what I was looking for, but it would have been forever before I found it! 
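For completeness, a minimal self-contained sketch of the slice-based mgrid construction described above (plain NumPy; the names nodes and inDims are only illustrative):

    import numpy as np

    nodes = 17     # grid points along each axis
    inDims = 4     # number of dimensions

    # Build one slice per dimension and pass the whole tuple to mgrid.
    idx = (slice(0, nodes),) * inDims
    a = np.mgrid[idx]
    print(a.shape)    # (4, 17, 17, 17, 17) for these values

This works because the bracket syntax of mgrid simply receives the slices as a tuple, so building that tuple programmatically is equivalent to writing the slices out by hand.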
From peter.cimermancic at gmail.com Wed Mar 7 21:39:17 2012 From: peter.cimermancic at gmail.com (=?UTF-8?Q?Peter_Cimerman=C4=8Di=C4=8D?=) Date: Wed, 7 Mar 2012 18:39:17 -0800 Subject: [SciPy-User] Generalized least square on large dataset Message-ID: Hi, I'd like to linearly fit the data that were NOT sampled independently. I came across generalized least square method: b=(X'*V^(-1)*X)^(-1)*X'*V^(-1)*Y X and Y are coordinates of the data points, and V is a "variance matrix". The equation is Matlab format - I've tried solving problem there too, bit it didn't work - but eventually I'd like to be able to solve problems like that in python. The problem is that due to its size (1000 rows and columns), the V matrix becomes singular, thus un-invertable. Any suggestions for how to get around this problem? Maybe using a way of solving generalized linear regression problem other than GLS? Regards, Peter -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Mar 7 22:09:14 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 7 Mar 2012 22:09:14 -0500 Subject: [SciPy-User] Generalized least square on large dataset In-Reply-To: References: Message-ID: On Wed, Mar 7, 2012 at 9:39 PM, Peter Cimerman?i? wrote: > Hi, > > I'd like to linearly fit the data that were NOT sampled independently. I > came across generalized least square method: > > b=(X'*V^(-1)*X)^(-1)*X'*V^(-1)*Y > > > X and Y are coordinates of the data points, and V is a "variance matrix". > > The equation is Matlab format - I've tried solving problem there too, bit it > didn't work - but eventually I'd like to be able to solve problems like that > in python. The problem is that due to its size (1000 rows and columns), the > V matrix becomes singular, thus un-invertable. Any suggestions for how to > get around this problem? Maybe using a way of solving generalized linear > regression problem other than GLS? V is (nobs,nobs) has nobs*(nobs-1)/2 parameter (or something like this) (nobs number of observations rows) I don't think there is a general solution without imposing a lot of structure on V that reduces the effective number of parameters. (singular matrix in itself is not necessarily a problem using pinv or a small Ridge penalty) Most of the solutions I'm looking for for GLS is for cases where V is a kronecker product (or block matrix) or where V^{-0.5} (the matrix to transform the X) has a nice solution. I only looked at a few special cases so far. There are some papers that claim that they have an efficient cholesky of the V inverse for mixed effects models for example but I never looked at the details. General rule: if you manage to get it to work in matlab, then it should be possible to get it to work in numpython (unless ... which shouldn't apply in this case) I think the main question is: What kind of V matrix do you have? What kind of violation if independent sampling? I'm also still struggling with this question, and with which linear algebra to use, so I'm also interested in any solutions. 
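For concreteness, a minimal numpy sketch of the estimator written above, b = (X'*V^(-1)*X)^(-1)*X'*V^(-1)*Y, on made-up stand-in data. The variable names, the toy similarity kernel and the use of pinv (so that a singular V does not fail outright) are illustrative assumptions, not a recommendation for this particular dataset.

import numpy as np

# Hypothetical stand-in data: n genome lengths x, gene counts y, and an
# error covariance V built from a toy similarity matrix with values in (0, 1].
rng = np.random.RandomState(0)
n = 50
x = rng.uniform(1.0, 10.0, size=n)
V = np.exp(-np.abs(x[:, None] - x[None, :]))     # toy similarity used as covariance
L = np.linalg.cholesky(V + 1e-10 * np.eye(n))    # tiny ridge for numerical safety
y = 2.0 + 3.0 * x + L.dot(rng.randn(n))          # correlated noise

X = np.column_stack((np.ones(n), x))             # intercept + slope

Vinv = np.linalg.pinv(V)                         # pseudo-inverse tolerates a singular V
XtVinv = X.T.dot(Vinv)
beta = np.linalg.solve(XtVinv.dot(X), XtVinv.dot(y))
cov_beta = np.linalg.inv(XtVinv.dot(X))          # covariance of the GLS estimates
print(beta, np.sqrt(np.diag(cov_beta)))
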
Josef > > Regards, > > Peter > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From charlesr.harris at gmail.com Wed Mar 7 22:46:53 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 7 Mar 2012 20:46:53 -0700 Subject: [SciPy-User] Generalized least square on large dataset In-Reply-To: References: Message-ID: On Wed, Mar 7, 2012 at 7:39 PM, Peter Cimerman?i? < peter.cimermancic at gmail.com> wrote: > Hi, > > I'd like to linearly fit the data that were NOT sampled independently. I > came across generalized least square method: > > b=(X'*V^(-1)*X)^(-1)*X'*V^(-1)*Y > > X and Y are coordinates of the data points, and V is a "variance matrix". > > The equation is Matlab format - I've tried solving problem there too, bit > it didn't work - but eventually I'd like to be able to solve problems like > that in python. The problem is that due to its size (1000 rows and > columns), the V matrix becomes singular, thus un-invertable. Any > suggestions for how to get around this problem? Maybe using a way of > solving generalized linear regression problem other than GLS? > > Plain old least squares will probably do a decent job for the fit, where you will run into trouble is if you want to estimate the covariance. The idea of using the variance matrix is to transform the data set into independent observations of equal variance, but except in extreme cases that shouldn't really be necessary if you have sufficient data points. Weighting the data is a simple case of this that merely equalizes the variance, and it often doesn't make that much difference. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Mar 7 22:58:10 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 7 Mar 2012 22:58:10 -0500 Subject: [SciPy-User] Generalized least square on large dataset In-Reply-To: References: Message-ID: On Wed, Mar 7, 2012 at 10:46 PM, Charles R Harris wrote: > > > On Wed, Mar 7, 2012 at 7:39 PM, Peter Cimerman?i? > wrote: >> >> Hi, >> >> I'd like to linearly fit the data that were NOT sampled independently. I >> came across generalized least square method: >> >> b=(X'*V^(-1)*X)^(-1)*X'*V^(-1)*Y >> >> >> X and Y are coordinates of the data points, and V is a "variance matrix". >> >> The equation is Matlab format - I've tried solving problem there too, bit >> it didn't work - but eventually I'd like to be able to solve problems like >> that in python. The problem is that due to its size (1000 rows and columns), >> the V matrix becomes singular, thus un-invertable. Any suggestions for how >> to get around this problem? Maybe using a way of solving generalized linear >> regression problem other than GLS? >> > > Plain old least squares will probably do a decent job for the fit, where you > will run into trouble is if you want to estimate the covariance. side question: Are heteroscedasticity and (auto)correlation robust standard errors popular in any field outside of economics/econometrics, so called sandwich estimators of covariance matrix? (estimate with OLS ignoring non-independent and non-identical noise, but correct the covariance matrix) I recently expanded this in statsmodels, and would like to start soon some advertising in favor of sandwiches. 
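To make the sandwich idea concrete, here is a minimal numpy sketch of the simplest, heteroscedasticity-only (HC0) version. The data are invented for illustration and this is not the statsmodels implementation.

import numpy as np

# Toy data whose noise variance grows with x (purely illustrative).
rng = np.random.RandomState(1)
n = 200
x = rng.uniform(0.0, 10.0, size=n)
y = 1.0 + 0.5 * x + x * rng.randn(n)
X = np.column_stack((np.ones(n), x))

beta = np.linalg.lstsq(X, y)[0]                  # plain OLS point estimates
resid = y - X.dot(beta)
XtX_inv = np.linalg.inv(X.T.dot(X))

# Classical OLS covariance assumes one common error variance:
cov_ols = XtX_inv * resid.dot(resid) / (n - X.shape[1])
# HC0 sandwich: bread * meat * bread, with no common-variance assumption.
meat = (X * (resid ** 2)[:, None]).T.dot(X)      # X' diag(e_i^2) X
cov_hc0 = XtX_inv.dot(meat).dot(XtX_inv)

print(np.sqrt(np.diag(cov_ols)))                 # tends to understate the slope uncertainty here
print(np.sqrt(np.diag(cov_hc0)))
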
Josef > The idea of > using the variance matrix is to transform the data set into independent > observations of equal variance, but except in extreme cases that shouldn't > really be necessary if you have sufficient data points. Weighting the data > is a simple case of this that merely equalizes the variance, and it often > doesn't make that much difference. > > Chuck > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From charlesr.harris at gmail.com Wed Mar 7 23:00:24 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 7 Mar 2012 21:00:24 -0700 Subject: [SciPy-User] Generalized least square on large dataset In-Reply-To: References: Message-ID: On Wed, Mar 7, 2012 at 8:46 PM, Charles R Harris wrote: > > > On Wed, Mar 7, 2012 at 7:39 PM, Peter Cimerman?i? < > peter.cimermancic at gmail.com> wrote: > >> Hi, >> >> I'd like to linearly fit the data that were NOT sampled independently. I >> came across generalized least square method: >> >> b=(X'*V^(-1)*X)^(-1)*X'*V^(-1)*Y >> >> X and Y are coordinates of the data points, and V is a "variance matrix". >> >> The equation is Matlab format - I've tried solving problem there too, bit >> it didn't work - but eventually I'd like to be able to solve problems like >> that in python. The problem is that due to its size (1000 rows and >> columns), the V matrix becomes singular, thus un-invertable. Any >> suggestions for how to get around this problem? Maybe using a way of >> solving generalized linear regression problem other than GLS? >> >> > Plain old least squares will probably do a decent job for the fit, where > you will run into trouble is if you want to estimate the covariance. The > idea of using the variance matrix is to transform the data set into > independent observations of equal variance, but except in extreme cases > that shouldn't really be necessary if you have sufficient data points. > Weighting the data is a simple case of this that merely equalizes the > variance, and it often doesn't make that much difference. > > To expand a bit, if it is simply the case that the measurement errors aren't independent and you know their covariance, then you want to minimize (y - Ax)^T * cov^-1 * (y - ax) and if you factor cov^-1 into U^T * U, then you can solve the ordinary least squares problem U*A*x = U*y. I can't really tell what your data/problem is like without more details. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Mar 7 23:04:04 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 7 Mar 2012 21:04:04 -0700 Subject: [SciPy-User] Generalized least square on large dataset In-Reply-To: References: Message-ID: On Wed, Mar 7, 2012 at 8:58 PM, wrote: > On Wed, Mar 7, 2012 at 10:46 PM, Charles R Harris > wrote: > > > > > > On Wed, Mar 7, 2012 at 7:39 PM, Peter Cimerman?i? > > wrote: > >> > >> Hi, > >> > >> I'd like to linearly fit the data that were NOT sampled independently. I > >> came across generalized least square method: > >> > >> b=(X'*V^(-1)*X)^(-1)*X'*V^(-1)*Y > >> > >> > >> X and Y are coordinates of the data points, and V is a "variance > matrix". > >> > >> The equation is Matlab format - I've tried solving problem there too, > bit > >> it didn't work - but eventually I'd like to be able to solve problems > like > >> that in python. 
The problem is that due to its size (1000 rows and > columns), > >> the V matrix becomes singular, thus un-invertable. Any suggestions for > how > >> to get around this problem? Maybe using a way of solving generalized > linear > >> regression problem other than GLS? > >> > > > > Plain old least squares will probably do a decent job for the fit, where > you > > will run into trouble is if you want to estimate the covariance. > > side question: > Are heteroscedasticity and (auto)correlation robust standard errors > popular in any field outside of economics/econometrics, so called > sandwich estimators of covariance matrix? > (estimate with OLS ignoring non-independent and non-identical noise, > but correct the covariance matrix) > > I recently expanded this in statsmodels, and would like to start soon > some advertising in favor of sandwiches. > > I'm not familiar with them, but I can't speak for many. Indeed, there seems to be the most rudimentary understanding of statistics in many fields, basically reducible to root sum of squares for the more sophisticated ;) But I think I was contemplating something similar to what you mention. Sounds interesting. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From peter.cimermancic at gmail.com Wed Mar 7 23:25:36 2012 From: peter.cimermancic at gmail.com (=?UTF-8?Q?Peter_Cimerman=C4=8Di=C4=8D?=) Date: Wed, 7 Mar 2012 20:25:36 -0800 Subject: [SciPy-User] Generalized least square on large dataset In-Reply-To: References: Message-ID: To describe my problem into more details, I have a list of ~1000 bacterial genome lengths and number of certain genes for each one of them. I'd like to see if there is any correlation between genome lengths and number of the genes. It may look like an easy linear regression problem; however, one has to be a bit more careful as the measurements aren't sampled independently. Bacteria, whose genomes are similar, tend to also contain similar number of the genes. Bacterial similarity is what is described with matrix V - it contains similarity values for each pair of bacteria, ranging from 0 to 1. Anybody encountered similar problem already? On Wed, Mar 7, 2012 at 8:00 PM, Charles R Harris wrote: > > > On Wed, Mar 7, 2012 at 8:46 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Wed, Mar 7, 2012 at 7:39 PM, Peter Cimerman?i? < >> peter.cimermancic at gmail.com> wrote: >> >>> Hi, >>> >>> I'd like to linearly fit the data that were NOT sampled independently. I >>> came across generalized least square method: >>> >>> b=(X'*V^(-1)*X)^(-1)*X'*V^(-1)*Y >>> >>> X and Y are coordinates of the data points, and V is a "variance matrix". >>> >>> The equation is Matlab format - I've tried solving problem there too, >>> bit it didn't work - but eventually I'd like to be able to solve problems >>> like that in python. The problem is that due to its size (1000 rows and >>> columns), the V matrix becomes singular, thus un-invertable. Any >>> suggestions for how to get around this problem? Maybe using a way of >>> solving generalized linear regression problem other than GLS? >>> >>> >> Plain old least squares will probably do a decent job for the fit, where >> you will run into trouble is if you want to estimate the covariance. The >> idea of using the variance matrix is to transform the data set into >> independent observations of equal variance, but except in extreme cases >> that shouldn't really be necessary if you have sufficient data points. 
>> Weighting the data is a simple case of this that merely equalizes the >> variance, and it often doesn't make that much difference. >> >> > To expand a bit, if it is simply the case that the measurement errors > aren't independent and you know their covariance, then you want to minimize > (y - Ax)^T * cov^-1 * (y - ax) and if you factor cov^-1 into U^T * U, then > you can solve the ordinary least squares problem U*A*x = U*y. I can't > really tell what your data/problem is like without more details. > > Chuck > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Mar 7 23:26:07 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 7 Mar 2012 23:26:07 -0500 Subject: [SciPy-User] Generalized least square on large dataset In-Reply-To: References: Message-ID: On Wed, Mar 7, 2012 at 11:04 PM, Charles R Harris wrote: > > > On Wed, Mar 7, 2012 at 8:58 PM, wrote: >> >> On Wed, Mar 7, 2012 at 10:46 PM, Charles R Harris >> wrote: >> > >> > >> > On Wed, Mar 7, 2012 at 7:39 PM, Peter Cimerman?i? >> > wrote: >> >> >> >> Hi, >> >> >> >> I'd like to linearly fit the data that were NOT sampled independently. >> >> I >> >> came across generalized least square method: >> >> >> >> b=(X'*V^(-1)*X)^(-1)*X'*V^(-1)*Y >> >> >> >> >> >> X and Y are coordinates of the data points, and V is a "variance >> >> matrix". >> >> >> >> The equation is Matlab format - I've tried solving problem there too, >> >> bit >> >> it didn't work - but eventually I'd like to be able to solve problems >> >> like >> >> that in python. The problem is that due to its size (1000 rows and >> >> columns), >> >> the V matrix becomes singular, thus un-invertable. Any suggestions for >> >> how >> >> to get around this problem? Maybe using a way of solving generalized >> >> linear >> >> regression problem other than GLS? >> >> >> > >> > Plain old least squares will probably do a decent job for the fit, where >> > you >> > will run into trouble is if you want to estimate the covariance. >> >> side question: >> Are heteroscedasticity and (auto)correlation robust standard errors >> popular in any field outside of economics/econometrics, so called >> sandwich estimators of covariance matrix? >> (estimate with OLS ignoring non-independent and non-identical noise, >> but correct the covariance matrix) >> >> I recently expanded this in statsmodels, and would like to start soon >> some advertising in favor of sandwiches. >> > > I'm not familiar with them, but I can't speak for many. Indeed, there seems > to be the most rudimentary understanding of statistics in many fields, > basically reducible to root sum of squares for the more sophisticated ;) > > But I think I was contemplating something similar to what you mention. > Sounds interesting. Basic idea in an example: Suppose you have a large sample where the noise is very highly autocorrelated. OLS assumes you have a lot of independent observation and the standard errors will be small. The real standard errors are much larger because observations close to each other are almost the same. Robust standard errors correct for this without assuming much about the actual correlations. 
A (a bit cryptic) example https://groups.google.com/forum/#!topic/pystatsmodels/GKaQrfyyN7c/discussion that turned into a discussion about how to write a blog ;) Josef > > > > Chuck > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Thu Mar 8 00:36:35 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 8 Mar 2012 00:36:35 -0500 Subject: [SciPy-User] Generalized least square on large dataset In-Reply-To: References: Message-ID: On Wed, Mar 7, 2012 at 11:25 PM, Peter Cimerman?i? wrote: > To describe my problem into more details, I have a list of ~1000 bacterial > genome lengths and number of certain genes for each one of them. I'd like to > see if there is any correlation between genome lengths and number of the > genes. It may look like an easy linear regression problem; however, one has > to be a bit more careful as the measurements aren't sampled independently. > Bacteria, whose genomes are similar, tend to also contain similar number of > the genes. Bacterial similarity is what is described with matrix V - it > contains similarity values for each pair of bacteria, ranging from 0 to 1. > > Anybody encountered similar problem already? The closest I can think of is spatial econometrics where V is the spatial similarity http://pysal.geodacenter.org/1.3/library/spreg/ols.html But I never looked at the details of the spatial specification, and I don't know pysal covers this case. (The V is also similar to the correlation in Gaussian Processes, but they do only local estimation as far as I have looked at it.) (I have only vague ideas about this case, If you have the distances/similarity, then you need to estimate the correlation as a function of the similarity. If the correlation matrix is not invertible, then it should be possible to just use the generalized inverse, pinv, of V. 1000x1000 doesn't sound too big to pinv or to use an svd. But I don't see any reason that the covariance matrix should be singular.) But as Chuck said, OLS would still be a consistent estimator, but for standard errors a correction will be necessary. (If it's not in pysal, then it might not be trivial to work out the correction of the standard error.) interesting problem Josef (...) == maybe on topic > > > > On Wed, Mar 7, 2012 at 8:00 PM, Charles R Harris > wrote: >> >> >> >> On Wed, Mar 7, 2012 at 8:46 PM, Charles R Harris >> wrote: >>> >>> >>> >>> On Wed, Mar 7, 2012 at 7:39 PM, Peter Cimerman?i? >>> wrote: >>>> >>>> Hi, >>>> >>>> I'd like to linearly fit the data that were NOT sampled independently. I >>>> came across generalized least square method: >>>> >>>> b=(X'*V^(-1)*X)^(-1)*X'*V^(-1)*Y >>>> >>>> >>>> X and Y are coordinates of the data points, and V is a "variance >>>> matrix". >>>> >>>> The equation is Matlab format - I've tried solving problem there too, >>>> bit it didn't work - but eventually I'd like to be able to solve problems >>>> like that in python. The problem is that due to its size (1000 rows and >>>> columns), the V matrix becomes singular, thus un-invertable. Any suggestions >>>> for how to get around this problem? Maybe using a way of solving generalized >>>> linear regression problem other than GLS? >>>> >>> >>> Plain old least squares will probably do a decent job for the fit, where >>> you will run into trouble is if you want to estimate the covariance. 
The >>> idea of using the variance matrix is to transform the data set into >>> independent observations of equal variance, but except in extreme cases that >>> shouldn't really be necessary if you have sufficient data points. Weighting >>> the data is a simple case of this that merely equalizes the variance, and it >>> often doesn't make that much difference. >>> >> >> To expand a bit, if it is simply the case that the measurement errors >> aren't independent and you know their covariance, then you want to minimize >> (y - Ax)^T * cov^-1 * (y - ax) and if you factor cov^-1 into U^T * U, then >> you can solve the ordinary least squares problem U*A*x = U*y. I can't really >> tell what your data/problem is like without more details. >> >> Chuck >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From charlesr.harris at gmail.com Thu Mar 8 08:35:23 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 8 Mar 2012 06:35:23 -0700 Subject: [SciPy-User] Generalized least square on large dataset In-Reply-To: References: Message-ID: On Wed, Mar 7, 2012 at 9:25 PM, Peter Cimerman?i? < peter.cimermancic at gmail.com> wrote: > To describe my problem into more details, I have a list of ~1000 bacterial > genome lengths and number of certain genes for each one of them. I'd like > to see if there is any correlation between genome lengths and number of the > genes. It may look like an easy linear regression problem; however, one has > to be a bit more careful as the measurements aren't sampled independently. > Bacteria, whose genomes are similar, tend to also contain similar number of > the genes. Bacterial similarity is what is described with matrix V - it > contains similarity values for each pair of bacteria, ranging from 0 to 1. > > Anybody encountered similar problem already? > > Ah, that sounds like a fairly common sort of thing to deal with, separating the effect of two variables, but it is out of the area of my experience. The statisticians around here should be able to say something useful about it. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Mar 8 10:04:05 2012 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 8 Mar 2012 15:04:05 +0000 Subject: [SciPy-User] Generalized least square on large dataset In-Reply-To: References: Message-ID: On Thu, Mar 8, 2012 at 4:25 AM, Peter Cimerman?i? wrote: > To describe my problem into more details, I have a list of ~1000 bacterial > genome lengths and number of certain genes for each one of them. I'd like to > see if there is any correlation between genome lengths and number of the > genes. It may look like an easy linear regression problem; however, one has > to be a bit more careful as the measurements aren't sampled independently. > Bacteria, whose genomes are similar, tend to also contain similar number of > the genes. Bacterial similarity is what is described with matrix V - it > contains similarity values for each pair of bacteria, ranging from 0 to 1. > > Anybody encountered similar problem already? I agree with Josef, the first thing that comes to mind is controlling for spatial effects (which happens in various fields; ecology folks worry about this a lot too). 
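The whitening recipe quoted from Chuck above (factor cov^-1 into U'U and solve the transformed ordinary least squares problem U*X*b = U*y) can be sketched in a few lines. This is only an illustration with hypothetical names, and it requires an invertible, or slightly regularized, V; for the singular case a pinv or eigenvalue decomposition would have to take the place of inv and cholesky.

import numpy as np

def gls_by_whitening(X, y, V):
    # V^-1 = L L' with L lower triangular, so U = L' satisfies U'U = V^-1
    # and the weighted problem becomes plain OLS on (U X, U y).
    U = np.linalg.cholesky(np.linalg.inv(V)).T
    Xw, yw = U.dot(X), U.dot(y)
    return np.linalg.lstsq(Xw, yw)[0]
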
In this case, though, I think you may need to think more carefully about whether your similarity measure is really appropriate. If your matrix is uninvertible, then IIUC that means you think that you effectively have less than 1000 distinct genomes -- some of your genomes are "so similar" to other ones that they can be predicted *exactly*. In terms of the underlying probabilistic model: you have some population of bacteria genomes, and you picked 1000 of them to study. Each genome you picked has some length, and it also has a number of genes. The number of genes is determined probabilistically by taking some linear function of the length, and then adding some Gaussian noise. Your goal is to figure out what that linear function looks like. In OLS, we assume that each of those Gaussian noise terms is IID. In GLS, we assume that they're correlated. The way to think about this is that we take 1000 uncorrelated IID gaussian samples, let's call this vector "g", and then we mix them together by multiplying by a matrix chol(V), chol(V)*g. (cholV) is the cholesky decomposition; it's triangular, and chol(V)*chol(V)' = V.) So the noise added to each measurement is a mixture of these underlying IID gaussian terms, and bacteria that are more similar have noise terms that overlap more. If V is singular, then this means that the last k rows of chol(V) are all-zero, which means that when you compute chol(V)*g, there are some elements of g that get ignored entirely -- they don't mixed in at all, and don't effect any bacteria. So, your choice of V is encoding an assumption that there are really only, like, 900 noise terms and 1000 bacteria, and 'g' effectively only has 900 entries. So if you make the measurements for the first 900 bacteria, you should be able to reconstruct the full vector 'g', and then you can use it to compute *exactly* what measurements you will see for the last 100 bacteria. And also you can compute the linear relationship exactly. No need to do any hypothesis tests on the result (and in fact you can't do any hypothesis tests on the result, the math won't work), because you know The Truth! Of course none of these assumptions are actually true. Your bacteria are less similar to each other -- and your measurements more noisy -- than your V matrix claims. So you need a better way to compute V. The nice thing about the above derivation -- and the reason I bothered to go through it -- is that it tells you what entries in V mean, numerically. Ideally you should figure out how to re-calibrate your similarity score so that bacteria which are 0.5 similar have a covariance of 0.5 in their noise -- perhaps by calculating empirical covariances on other measures or something, figuring out the best way to do this calibration will take domain knowledge. The ecology folks might have some practical ideas on how to calibrate such things. Or, you could just replace V with V+\lambda*I. That'd solve the numerical problem, but you should be very suspicious of any p-values you get out, since they are based on lies ;-). -- Nathaniel From peter.cimermancic at gmail.com Thu Mar 8 11:09:04 2012 From: peter.cimermancic at gmail.com (=?UTF-8?Q?Peter_Cimerman=C4=8Di=C4=8D?=) Date: Thu, 8 Mar 2012 08:09:04 -0800 Subject: [SciPy-User] Generalized least square on large dataset In-Reply-To: References: Message-ID: > > > > I agree with Josef, the first thing that comes to mind is controlling > for spatial effects (which happens in various fields; ecology folks > worry about this a lot too). 
> > In this case, though, I think you may need to think more carefully > about whether your similarity measure is really appropriate. If your > matrix is uninvertible, then IIUC that means you think that you > effectively have less than 1000 distinct genomes -- some of your > genomes are "so similar" to other ones that they can be predicted > *exactly*. > That's exactly true - some of the bacteria are almost identical (I'll try filtering those and see if it changes anything). > > In terms of the underlying probabilistic model: you have some > population of bacteria genomes, and you picked 1000 of them to study. > Each genome you picked has some length, and it also has a number of > genes. The number of genes is determined probabilistically by taking > some linear function of the length, and then adding some Gaussian > noise. Your goal is to figure out what that linear function looks > like. > > In OLS, we assume that each of those Gaussian noise terms is IID. In > GLS, we assume that they're correlated. The way to think about this is > that we take 1000 uncorrelated IID gaussian samples, let's call this > vector "g", and then we mix them together by multiplying by a matrix > chol(V), chol(V)*g. (cholV) is the cholesky decomposition; it's > triangular, and chol(V)*chol(V)' = V.) So the noise added to each > measurement is a mixture of these underlying IID gaussian terms, and > bacteria that are more similar have noise terms that overlap more. > > I'm also unable to calculate chol of my V matrix, because it doesn't appear to be a positive definite. Any suggestion here? > If V is singular, then this means that the last k rows of chol(V) are > all-zero, which means that when you compute chol(V)*g, there are some > elements of g that get ignored entirely -- they don't mixed in at all, > and don't effect any bacteria. So, your choice of V is encoding an > assumption that there are really only, like, 900 noise terms and 1000 > bacteria, and 'g' effectively only has 900 entries. So if you make the > measurements for the first 900 bacteria, you should be able to > reconstruct the full vector 'g', and then you can use it to compute > *exactly* what measurements you will see for the last 100 bacteria. > And also you can compute the linear relationship exactly. No need to > do any hypothesis tests on the result (and in fact you can't do any > hypothesis tests on the result, the math won't work), because you know > The Truth! > > Of course none of these assumptions are actually true. Your bacteria > are less similar to each other -- and your measurements more noisy -- > than your V matrix claims. > > So you need a better way to compute V. The nice thing about the above > derivation -- and the reason I bothered to go through it -- is that it > tells you what entries in V mean, numerically. Ideally you should > figure out how to re-calibrate your similarity score so that bacteria > which are 0.5 similar have a covariance of 0.5 in their noise -- > perhaps by calculating empirical covariances on other measures or > something, figuring out the best way to do this calibration will take > domain knowledge. The ecology folks might have some practical ideas on > how to calibrate such things. > > Or, you could just replace V with V+\lambda*I. That'd solve the > numerical problem, but you should be very suspicious of any p-values > you get out, since they are based on lies ;-). > Yes, p-values are something I'd eventually like to come close to. Thank you for your answer! 
Peter -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Mar 8 11:31:52 2012 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 8 Mar 2012 16:31:52 +0000 Subject: [SciPy-User] Generalized least square on large dataset In-Reply-To: References: Message-ID: On Thu, Mar 8, 2012 at 4:09 PM, Peter Cimerman?i? wrote: >> >> >> I agree with Josef, the first thing that comes to mind is controlling >> for spatial effects (which happens in various fields; ecology folks >> worry about this a lot too). >> >> In this case, though, I think you may need to think more carefully >> about whether your similarity measure is really appropriate. If your >> matrix is uninvertible, then IIUC that means you think that you >> effectively have less than 1000 distinct genomes -- some of your >> genomes are "so similar" to other ones that they can be predicted >> *exactly*. > > That's exactly true - some of the bacteria are almost identical (I'll try > filtering those and see if it changes anything). You aren't just telling the computer that they're almost identical -- that would be fine, the model would just mostly-but-not-entirely ignore the near-duplicates. You're telling the computer that they are exactly identical and you had no reason to even collect the data because you knew ahead of time exactly what it would be. This is the sort of thing that really confuses statistical programs :-). >> In terms of the underlying probabilistic model: you have some >> population of bacteria genomes, and you picked 1000 of them to study. >> Each genome you picked has some length, and it also has a number of >> genes. The number of genes is determined probabilistically by taking >> some linear function of the length, and then adding some Gaussian >> noise. Your goal is to figure out what that linear function looks >> like. >> >> In OLS, we assume that each of those Gaussian noise terms is IID. In >> GLS, we assume that they're correlated. The way to think about this is >> that we take 1000 uncorrelated IID gaussian samples, let's call this >> vector "g", and then we mix them together by multiplying by a matrix >> chol(V), chol(V)*g. (cholV) is the cholesky decomposition; it's >> triangular, and chol(V)*chol(V)' = V.) So the noise added to each >> measurement is a mixture of these underlying IID gaussian terms, and >> bacteria that are more similar have noise terms that overlap more. >> > > I'm also unable to calculate chol of my V matrix, because it doesn't appear > to be a positive definite. Any suggestion here? Singular matrices can't be positive definite, by definition. They can be positive semi-definite. (The analogy is numbers -- a number that is zero cannot be greater than zero, by definition. But it can be >= zero.) Any well-defined covariance matrix is necessarily positive semi-definite. If your covariance matrix isn't positive semi-definite, then that's like claiming that you have three random variables where A and B have a correlation of 0.99, and B and C have a correlation of 0.99, but A and C are uncorrelated. That's impossible. ([[1, 0.99, 0], [0.99, 1, 0.99], [0, 0.99, 1]] is not a positive-definite matrix.) Singular, positive semi-definite matrices do *have* Cholesky decompositions, but your average off-the-shelf Cholesky routine can't compute them. 
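For a quick numerical illustration of those definiteness checks, using the 3x3 example from the previous paragraph (eigvalsh is used because the matrix is symmetric):

import numpy as np

V = np.array([[1.0, 0.99, 0.0],
              [0.99, 1.0, 0.99],
              [0.0, 0.99, 1.0]])

print(np.linalg.eigvalsh(V))                 # smallest eigenvalue is about -0.4
print(np.all(np.linalg.eigvalsh(V) >= 0))    # False: not even positive semi-definite

try:
    np.linalg.cholesky(V)                    # off-the-shelf Cholesky refuses such a matrix
except np.linalg.LinAlgError as err:
    print("Cholesky failed:", err)
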
(Again by analogy -- in theory you can compute the square root of zero, but in practice you can't reliably with floating point, because your "zero" may turn out to actually be represented as "-2.2e-16" or something, and an off-the-shelf square root routine will blow up on this because it looks negative.) You can look around for a "rank-revealing Cholesky", perhaps. Anyway, the question is whether your matrix is positive semi-definite. If it is, then this is all expected, and your problem is just that you need to fix your covariances to be more realistic, as discussed. If it isn't, then you don't even have a covariance matrix, and again you need to figure out how to get one :-). You can check for positive (semi-)definiteness by looking at the eigenvalues -- they should be all >= 0 for semi-definite, > 0 for definite. The easiest way to manufacture a positive-definite matrix on command is to take a non-singular matrix A and compute A'A. HTH, - N From charlesr.harris at gmail.com Thu Mar 8 11:32:42 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 8 Mar 2012 09:32:42 -0700 Subject: [SciPy-User] Generalized least square on large dataset In-Reply-To: References: Message-ID: On Thu, Mar 8, 2012 at 9:09 AM, Peter Cimerman?i? < peter.cimermancic at gmail.com> wrote: > >> >> I agree with Josef, the first thing that comes to mind is controlling >> for spatial effects (which happens in various fields; ecology folks >> worry about this a lot too). >> >> In this case, though, I think you may need to think more carefully >> about whether your similarity measure is really appropriate. If your >> matrix is uninvertible, then IIUC that means you think that you >> effectively have less than 1000 distinct genomes -- some of your >> genomes are "so similar" to other ones that they can be predicted >> *exactly*. >> > > > That's exactly true - some of the bacteria are almost identical (I'll try > filtering those and see if it changes anything). > > > > >> >> In terms of the underlying probabilistic model: you have some >> population of bacteria genomes, and you picked 1000 of them to study. >> Each genome you picked has some length, and it also has a number of >> genes. The number of genes is determined probabilistically by taking >> some linear function of the length, and then adding some Gaussian >> noise. Your goal is to figure out what that linear function looks >> like. >> >> In OLS, we assume that each of those Gaussian noise terms is IID. In >> GLS, we assume that they're correlated. The way to think about this is >> that we take 1000 uncorrelated IID gaussian samples, let's call this >> vector "g", and then we mix them together by multiplying by a matrix >> chol(V), chol(V)*g. (cholV) is the cholesky decomposition; it's >> triangular, and chol(V)*chol(V)' = V.) So the noise added to each >> measurement is a mixture of these underlying IID gaussian terms, and >> bacteria that are more similar have noise terms that overlap more. >> >> > I'm also unable to calculate chol of my V matrix, because it doesn't > appear to be a positive definite. Any suggestion here? > > >> If V is singular, then this means that the last k rows of chol(V) are >> all-zero, which means that when you compute chol(V)*g, there are some >> elements of g that get ignored entirely -- they don't mixed in at all, >> and don't effect any bacteria. So, your choice of V is encoding an >> assumption that there are really only, like, 900 noise terms and 1000 >> bacteria, and 'g' effectively only has 900 entries. 
So if you make the >> measurements for the first 900 bacteria, you should be able to >> reconstruct the full vector 'g', and then you can use it to compute >> *exactly* what measurements you will see for the last 100 bacteria. >> And also you can compute the linear relationship exactly. No need to >> do any hypothesis tests on the result (and in fact you can't do any >> hypothesis tests on the result, the math won't work), because you know >> The Truth! >> >> Of course none of these assumptions are actually true. Your bacteria >> are less similar to each other -- and your measurements more noisy -- >> than your V matrix claims. >> >> So you need a better way to compute V. The nice thing about the above >> derivation -- and the reason I bothered to go through it -- is that it >> tells you what entries in V mean, numerically. Ideally you should >> figure out how to re-calibrate your similarity score so that bacteria >> which are 0.5 similar have a covariance of 0.5 in their noise -- >> perhaps by calculating empirical covariances on other measures or >> something, figuring out the best way to do this calibration will take >> domain knowledge. The ecology folks might have some practical ideas on >> how to calibrate such things. >> >> Or, you could just replace V with V+\lambda*I. That'd solve the >> numerical problem, but you should be very suspicious of any p-values >> you get out, since they are based on lies ;-). >> > > Yes, p-values are something I'd eventually like to come close to. > > I think Josef's suggestion would be a good place to start. The problem seems to be that the similarity matrix isn't a correlation matrix. What Josef suggests would give you some way of analysing what the actual correlations are. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu Mar 8 11:58:42 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 8 Mar 2012 11:58:42 -0500 Subject: [SciPy-User] Generalized least square on large dataset In-Reply-To: References: Message-ID: On Thu, Mar 8, 2012 at 11:31 AM, Nathaniel Smith wrote: > On Thu, Mar 8, 2012 at 4:09 PM, Peter Cimerman?i? > wrote: >>> >>> >>> I agree with Josef, the first thing that comes to mind is controlling >>> for spatial effects (which happens in various fields; ecology folks >>> worry about this a lot too). >>> >>> In this case, though, I think you may need to think more carefully >>> about whether your similarity measure is really appropriate. If your >>> matrix is uninvertible, then IIUC that means you think that you >>> effectively have less than 1000 distinct genomes -- some of your >>> genomes are "so similar" to other ones that they can be predicted >>> *exactly*. >> >> That's exactly true - some of the bacteria are almost identical (I'll try >> filtering those and see if it changes anything). > > You aren't just telling the computer that they're almost identical -- > that would be fine, the model would just mostly-but-not-entirely > ignore the near-duplicates. You're telling the computer that they are > exactly identical and you had no reason to even collect the data > because you knew ahead of time exactly what it would be. This is the > sort of thing that really confuses statistical programs :-). > >>> In terms of the underlying probabilistic model: you have some >>> population of bacteria genomes, and you picked 1000 of them to study. >>> Each genome you picked has some length, and it also has a number of >>> genes. 
The number of genes is determined probabilistically by taking >>> some linear function of the length, and then adding some Gaussian >>> noise. Your goal is to figure out what that linear function looks >>> like. >>> >>> In OLS, we assume that each of those Gaussian noise terms is IID. In >>> GLS, we assume that they're correlated. The way to think about this is >>> that we take 1000 uncorrelated IID gaussian samples, let's call this >>> vector "g", and then we mix them together by multiplying by a matrix >>> chol(V), chol(V)*g. (cholV) is the cholesky decomposition; it's >>> triangular, and chol(V)*chol(V)' = V.) So the noise added to each >>> measurement is a mixture of these underlying IID gaussian terms, and >>> bacteria that are more similar have noise terms that overlap more. >>> >> >> I'm also unable to calculate chol of my V matrix, because it doesn't appear >> to be a positive definite. Any suggestion here? > > Singular matrices can't be positive definite, by definition. They can > be positive semi-definite. (The analogy is numbers -- a number that is > zero cannot be greater than zero, by definition. But it can be >= > zero.) Any well-defined covariance matrix is necessarily positive > semi-definite. If your covariance matrix isn't positive semi-definite, > then that's like claiming that you have three random variables where A > and B have a correlation of 0.99, and B and C have a correlation of > 0.99, but A and C are uncorrelated. That's impossible. ([[1, 0.99, 0], > [0.99, 1, 0.99], [0, 0.99, 1]] is not a positive-definite matrix.) > > Singular, positive semi-definite matrices do *have* Cholesky > decompositions, but your average off-the-shelf Cholesky routine can't > compute them. (Again by analogy -- in theory you can compute the > square root of zero, but in practice you can't reliably with floating > point, because your "zero" may turn out to actually be represented as > "-2.2e-16" or something, and an off-the-shelf square root routine will > blow up on this because it looks negative.) You can look around for a > "rank-revealing Cholesky", perhaps. I would use SVD or eigenvalue decomposition to get the transformation matrix. With reduced rank and dropping zero eigenvalues, I think, the transformation will just drop some observations that are redundant. Or for normal equations, use X pinv(V) X beta = X pinv(V) y which uses SVD inside and requires less work writing the code. I'm reasonably sure that I have seen the pinv used this way before. That still leaves going from similarity matrix to covariance matrix. Josef > > Anyway, the question is whether your matrix is positive semi-definite. > If it is, then this is all expected, and your problem is just that you > need to fix your covariances to be more realistic, as discussed. If it > isn't, then you don't even have a covariance matrix, and again you > need to figure out how to get one :-). You can check for positive > (semi-)definiteness by looking at the eigenvalues -- they should be > all >= 0 for semi-definite, > 0 for definite. > > The easiest way to manufacture a positive-definite matrix on command > is to take a non-singular matrix A and compute A'A. 
> > HTH, > - N > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From peter.cimermancic at gmail.com Thu Mar 8 12:32:45 2012 From: peter.cimermancic at gmail.com (=?UTF-8?Q?Peter_Cimerman=C4=8Di=C4=8D?=) Date: Thu, 8 Mar 2012 09:32:45 -0800 Subject: [SciPy-User] Generalized least square on large dataset In-Reply-To: References: Message-ID: > > > > I would use SVD or eigenvalue decomposition to get the transformation > matrix. With reduced rank and dropping zero eigenvalues, I think, the > transformation will just drop some observations that are redundant. > > Or for normal equations, use X pinv(V) X beta = X pinv(V) y which > uses SVD inside and requires less work writing the code. > > I'm reasonably sure that I have seen the pinv used this way before. > > That still leaves going from similarity matrix to covariance matrix. > Yes, pinv() solved the compute problem (no errors anymore). I've also found some papers describing how to get from a similarity matrix to correlation. Do you maybe know, are p-values (from MSE calculation) fairly accurate this way? Peter -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu Mar 8 13:07:17 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 8 Mar 2012 13:07:17 -0500 Subject: [SciPy-User] Generalized least square on large dataset In-Reply-To: References: Message-ID: On Thu, Mar 8, 2012 at 12:32 PM, Peter Cimerman?i? wrote: >> >> >> I would use SVD or eigenvalue decomposition to get the transformation >> matrix. With reduced rank and dropping zero eigenvalues, I think, the >> transformation will just drop some observations that are redundant. >> >> Or for normal equations, use X pinv(V) X beta = X pinv(V) y ? ?which >> uses SVD inside and requires less work writing the code. >> >> I'm reasonably sure that I have seen the pinv used this way before. >> >> That still leaves going from similarity matrix to covariance matrix. > > > Yes, pinv() solved the compute problem (no errors anymore). I've also found > some papers describing how to get from a similarity matrix to correlation. > Do you maybe know, are p-values (from MSE calculation) fairly accurate this > way? While parameter estimates are pretty robust, the standard errors and the pvalues depend a lot on additional assumptions. If the assumptions are not satsified with a given datasets, then the pvalues can be pretty far off. For example if your error covariance matrix (from the V) is misspecified, then it could be the case that the pvalues are not very accurate. In a small sample assuming normal distribution might be a problem, but I would expect that for 1000 observations (or close to it) asymptotic normality will be accurate enough. With only one regressor (plus constant) multicollinearity cannot have a negative impact, so I wouldn't expect any other numerical problems. If your pvalue is 0.04 or 0.11 then I would do some additional specification checks. If the pvalue is 0.6 or 1.e-4, then I wouldn't worry about pvalue accuracy. Comparing the GLS standard errors with the (in this case incorrect) standard errors from OLS might give some idea about how much p-values can change with your data. I would be interested in hearing how you get from a similarity matrix to correlation matrix in your case. I would like to see if it is very difficult to include something like this in statsmodels. 
Josef > > > Peter > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From charlesr.harris at gmail.com Thu Mar 8 13:35:08 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 8 Mar 2012 11:35:08 -0700 Subject: [SciPy-User] Generalized least square on large dataset In-Reply-To: References: Message-ID: On Thu, Mar 8, 2012 at 11:07 AM, wrote: > On Thu, Mar 8, 2012 at 12:32 PM, Peter Cimerman?i? > wrote: > >> > >> > >> I would use SVD or eigenvalue decomposition to get the transformation > >> matrix. With reduced rank and dropping zero eigenvalues, I think, the > >> transformation will just drop some observations that are redundant. > >> > >> Or for normal equations, use X pinv(V) X beta = X pinv(V) y which > >> uses SVD inside and requires less work writing the code. > >> > >> I'm reasonably sure that I have seen the pinv used this way before. > >> > >> That still leaves going from similarity matrix to covariance matrix. > > > > > > Yes, pinv() solved the compute problem (no errors anymore). I've also > found > > some papers describing how to get from a similarity matrix to > correlation. > > Do you maybe know, are p-values (from MSE calculation) fairly accurate > this > > way? > > While parameter estimates are pretty robust, the standard errors and > the pvalues depend a lot on additional assumptions. > If the assumptions are not satsified with a given datasets, then the > pvalues can be pretty far off. For example if your error covariance > matrix (from the V) is misspecified, then it could be the case that > the pvalues are not very accurate. > In a small sample assuming normal distribution might be a problem, but > I would expect that for 1000 observations (or close to it) asymptotic > normality will be accurate enough. > > With only one regressor (plus constant) multicollinearity cannot have > a negative impact, so I wouldn't expect any other numerical problems. > > If your pvalue is 0.04 or 0.11 then I would do some additional > specification checks. If the pvalue is 0.6 or 1.e-4, then I wouldn't > worry about pvalue accuracy. > > Comparing the GLS standard errors with the (in this case incorrect) > standard errors from OLS might give some idea about how much p-values > can change with your data. > > I would be interested in hearing how you get from a similarity matrix > to correlation matrix in your case. I would like to see if it is very > difficult to include something like this in statsmodels. > > With a model this simple there are likely to be significant systematic errors, which would make it even more difficult to interpret significance. OTOH, this may be a case where the residuals are as interesting as the parameter values. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Mar 8 14:04:31 2012 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 8 Mar 2012 19:04:31 +0000 Subject: [SciPy-User] Generalized least square on large dataset In-Reply-To: References: Message-ID: On Thu, Mar 8, 2012 at 6:07 PM, wrote: > While parameter estimates are pretty robust, the standard errors and > the pvalues depend a lot on additional assumptions. > If the assumptions are not satsified with a given datasets, then the > pvalues can be pretty far off. For example if your error covariance > matrix (from the V) is misspecified, then it could be the case that > the pvalues are not very accurate. 
> In a small sample assuming normal distribution might be a problem, but > I would expect that for 1000 observations (or close to it) asymptotic > normality will be accurate enough. > > With only one regressor (plus constant) multicollinearity cannot have > a negative impact, so I wouldn't expect any other numerical problems. > > If your pvalue is 0.04 or 0.11 then I would do some additional > specification checks. If the pvalue is 0.6 or 1.e-4, then I wouldn't > worry about pvalue accuracy. > > Comparing the GLS standard errors with the (in this case incorrect) > standard errors from OLS might give some idea about how much p-values > can change with your data. These kinds of GLS models are one of the places where having the wrong model can give you arbitrarily spurious p values. To get an intuition, consider the case where your errors are all very highly correlated, so while you made N measurements, you really only effectively have 1. Without proper correction, as N increases, your p value will get arbitrarily small... even though you still only have 1 real data point. Most cases aren't so extreme, of course, but that's the kind of thing you have to be careful of -- underestimating your correlations = overestimating your significance. A good thing to do is check whether the resulting residuals "look uncorrelated" -- if you have corrected for similarity in the analysis, then bacteria that are similar to each other should not have similar residuals, overall. A coarse check of this would be to come up with some method for visualizing similarity spatially (like clustering your bacteria into a dendrogram, or using factor analysis to plot coarse similarity in 1 or 2 dimensions), and then using this to arrange your residuals. Then you'd want to check that you don't see any overall patterns, like one part of the plot has residuals that are systematically larger than another part. - N From eemselle at eso.org Thu Mar 8 15:29:22 2012 From: eemselle at eso.org (Eric Emsellem) Date: Thu, 08 Mar 2012 21:29:22 +0100 Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? Message-ID: <4F5916A2.2040604@eso.org> Dear all, I know the title looks a little provocative, but this was obviously done on purpose. I am very impressed by the capabilities of scipy (et al., numpy etc) and have been a fan since years! But one thing (in my opinion) seems to be missing (see below). If it exists, then great (and apologies)! What I didn't find in Scipy (or numpy or..) is *an efficient least-squares fitting routine which can include bounded, or fixed parameters*. This seems like something many people must be needing! I am right now using mpfit.py (from minpack then Craig B. Markwardt for idl and Mark Rivers for python), which I did integrate in the package I am developing. It is much faster than many other routines in scipy although Adam Ginsburg did mention some test-bench he conducted some time ago, showing that leastsq was quite efficient. It can include bounds, fixed parameters etc. And it works great! But this is probably not the best way to have such a stand-alone routine... and it is far from being optimised for the modern python. So: is there ANY plan for having such a module in Scipy?? I think (personally) that this is a MUST DO. This is typically the type of routines that I hear people use in e.g., idl etc. If this could be an optimised, fast (and easy to use) routine, all the better. Any input is welcome! Thanks. 
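For readers hitting the same limitation: one workaround, sketched here only as an illustration (it is not a substitute for a proper bounded Levenberg-Marquardt such as mpfit or lmfit), is to minimize the sum of squared residuals with one of the bounded minimizers scipy already ships, for example fmin_l_bfgs_b; a parameter can effectively be held fixed by simply not treating it as free. The model and data below are invented for the example.

import numpy as np
from scipy.optimize import fmin_l_bfgs_b

# Invented data from an exponential decay.
rng = np.random.RandomState(2)
xdata = np.linspace(0.0, 4.0, 60)
ydata = 2.5 * np.exp(-1.3 * xdata) + 0.05 * rng.randn(60)

def sum_of_squares(params):
    amplitude, rate = params
    resid = ydata - amplitude * np.exp(-rate * xdata)
    return np.sum(resid ** 2)

p0 = [1.0, 1.0]
bounds = [(0.0, 10.0), (0.0, 5.0)]   # box constraints on amplitude and rate
popt, fval, info = fmin_l_bfgs_b(sum_of_squares, p0, approx_grad=True, bounds=bounds)
print(popt)

The trade-off is that, unlike leastsq, this route does not hand back a covariance estimate for the parameters, so uncertainties have to be obtained separately.
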
Eric From adam.ginsburg at colorado.edu Thu Mar 8 15:40:31 2012 From: adam.ginsburg at colorado.edu (Adam Ginsburg) Date: Thu, 8 Mar 2012 13:40:31 -0700 Subject: [SciPy-User] scipy compiles, but importing interpolate fails Message-ID: Hi, I've recently (surprisingly) gotten scipy to compile by following these http://blog.hyperjeff.net/?p=160 instructions. However, if I try to import scipy.interpolate, it fails. I'm trying to install scipy into a virtualenv environment, though I don't think that's the issue because I have another install in a Framework that sees the same error. I'm using numpy 1.6.1, scipy 0.10.1, mac OS X 10.6.8. Can anyone help me understand the following error? $ ~/virtual-python/bin/python -c "import scipy, scipy.interpolate" Traceback (most recent call last): File "", line 1, in File "/Users/adam/virtual-python/lib/python2.7/site-packages/scipy/interpolate/__init__.py", line 156, in from ndgriddata import * File "/Users/adam/virtual-python/lib/python2.7/site-packages/scipy/interpolate/ndgriddata.py", line 9, in from interpnd import LinearNDInterpolator, NDInterpolatorBase, \ File "numpy.pxd", line 174, in init interpnd (scipy/interpolate/interpnd.c:7771) ValueError: numpy.ndarray has the wrong size, try recompiling Thanks, -- Adam Ginsburg Graduate Student Center for Astrophysics and Space Astronomy University of Colorado at Boulder http://casa.colorado.edu/~ginsbura/ From gael.varoquaux at normalesup.org Thu Mar 8 16:07:22 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 8 Mar 2012 22:07:22 +0100 Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: <4F5916A2.2040604@eso.org> References: <4F5916A2.2040604@eso.org> Message-ID: <20120308210722.GC12436@phare.normalesup.org> I am sorry I am going to react to the provocation. As some one who spends a fair amount of time working on open source software I hear such remarks quite often: 'why is feature foo not implemented in package bar?. I am finding it harder and harder not to react negatively to these emails. Now I cannot consider myself as a contributor to scipy, and thus I can claim that I am not taking your comment personally. Why isn't scipy not up to the task? Will, the answer is quite simple: because it's developed by volunteers that do it on their spare time, late at night too often, or companies that put some of their benefits in open source rather in locking down a market. 90% of the time the reason the feature isn't as good as you would want it is because of lack of time. I personally find that suggesting that somebody else should put more of the time and money they are already giving away in improving a feature that you need is almost insulting. I am aware that people do not realize how small the group of people that develop and maintain their toys is. Borrowing from Fernando Perez's talk at Euroscipy (http://www.euroscipy.org/file/6459?vid=download slide 80), the number of people that do 90% of the grunt work to get the core scientific Python ecosystem going is around two handfuls. I'd like to think that it's a problem of skill set: users that have the ability to contribute are just too rare. This is not entirely true, there are scores of skilled people on the mailing lists. You yourself mention that you are developing a package. Sorry for the rant, but if you want things to improve, you will have more successes sending in pull request than messages on mailing list that sound condescending to my ears. 
I hope that I haven't overreacted too badly. Ga?l From gael.varoquaux at normalesup.org Thu Mar 8 16:09:35 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 8 Mar 2012 22:09:35 +0100 Subject: [SciPy-User] scipy compiles, but importing interpolate fails In-Reply-To: References: Message-ID: <20120308210935.GD12436@phare.normalesup.org> On Thu, Mar 08, 2012 at 01:40:31PM -0700, Adam Ginsburg wrote: > from interpnd import LinearNDInterpolator, NDInterpolatorBase, \ > File "numpy.pxd", line 174, in init interpnd > (scipy/interpolate/interpnd.c:7771) > ValueError: numpy.ndarray has the wrong size, try recompiling My guess is that the numpy headers that were used during the compilation of scipy do no correspond to the numpy that is being imported when you are trying to import scipy. In other words, your compilation environment doesn't match well your run time environment. HTH, Gael From keflavich at gmail.com Thu Mar 8 16:59:44 2012 From: keflavich at gmail.com (Keflavich) Date: Thu, 8 Mar 2012 13:59:44 -0800 (PST) Subject: [SciPy-User] scipy compiles, but importing interpolate fails In-Reply-To: <20120308210935.GD12436@phare.normalesup.org> References: <20120308210935.GD12436@phare.normalesup.org> Message-ID: That's plausible. How do I specify which numpy is used when compiling scipy? On Mar 8, 2:09?pm, Gael Varoquaux wrote: > On Thu, Mar 08, 2012 at 01:40:31PM -0700, Adam Ginsburg wrote: > > ? ? from interpnd import LinearNDInterpolator, NDInterpolatorBase, \ > > ? File "numpy.pxd", line 174, in init interpnd > > (scipy/interpolate/interpnd.c:7771) > > ValueError: numpy.ndarray has the wrong size, try recompiling > > My guess is that the numpy headers that were used during the compilation > of scipy do no correspond to the numpy that is being imported when you > are trying to import scipy. In other words, your compilation environment > doesn't match well your run time environment. > > HTH, > > Gael > _______________________________________________ > SciPy-User mailing list > SciPy-U... at scipy.orghttp://mail.scipy.org/mailman/listinfo/scipy-user From gael.varoquaux at normalesup.org Thu Mar 8 17:01:40 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 8 Mar 2012 23:01:40 +0100 Subject: [SciPy-User] scipy compiles, but importing interpolate fails In-Reply-To: References: <20120308210935.GD12436@phare.normalesup.org> Message-ID: <20120308220140.GA19681@phare.normalesup.org> On Thu, Mar 08, 2012 at 01:59:44PM -0800, Keflavich wrote: > That's plausible. How do I specify which numpy is used when compiling > scipy? It should be the one that is imported by Python when you type 'import numpy'. Basically, in scipy's 'setup.py', the header are found using the 'numpy.get_include_folder()' function. Ga?l From eemselle at eso.org Thu Mar 8 18:12:17 2012 From: eemselle at eso.org (Eric Emsellem) Date: Fri, 09 Mar 2012 00:12:17 +0100 Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: <20120308210722.GC12436@phare.normalesup.org> References: <4F5916A2.2040604@eso.org> <20120308210722.GC12436@phare.normalesup.org> Message-ID: <4F593CD1.6010207@eso.org> Dear Gael thanks for the feedback. Well yes, I thought of the fact that an email (with all the drawbacks of such a medium) may not be the right way, and that my message may be misinterpreted. I obviously didn't mean to be offensive here. 
I am one of the biggest fan of python et al., open source in general etc, and I try to contribute whenever I can with my very limited expertise (I recently opened a github account and am still fighting to finalised my badly-written package to release it in case it could be useful to anyone). I believe it is fine to react as you did, because it does at least show that people care (seeing the bright side). I apologize: if I offended you, I probably offended others. This was definitely not my initial intent. And I was, for sure, not condescending, by very very far, as again I am impressed by what has been achieved (I did use python at a time when most modules were very buggy or hard to handle, and you had to tweak your system to make it work at all, not talking about early version of e.g. linux distribs). I am putting energy (at my own, very modest, level) to have more users of e.g., python. This means organising tutorials and providing advice when relevant. I try to do that as much as I can. The question about a leastsq I relayed is often what I hear from non-experts when looking at what is available. I always first try to look for an easy solution, to convince them that something like that exists, or that you can solve the problem anyway. In the specific case I describe, I didn't find a way out (besides using the very useful implementation of mpfit!). I do believe that, considering the amazing amount of extensive development in scipy and related packages, such a routine should be available and directly linked with the main, big, packages. This was basically my two cents (probably badly toned) to trigger a reaction. I did for sure get one, but not the one I expected. So once more: apologies to all I may have offended. Let's see where this goes now. I see that there is a lmfit package (thanks Matt for the input!). I'll have a look at this asap and try to test it on my own minimisation problem. I do think that having a well-tested module integrated within scipy would be a big plus. Apart from testing these on my specific problems, I cannot offer much considering my limited expertise (in programming and maths). cheers Eric On 03/08/2012 10:07 PM, Gael Varoquaux wrote: > I am sorry I am going to react to the provocation. > > As some one who spends a fair amount of time working on open source > software I hear such remarks quite often: 'why is feature foo not > implemented in package bar?. I am finding it harder and harder not to > react negatively to these emails. Now I cannot consider myself as a > contributor to scipy, and thus I can claim that I am not taking your > comment personally. > From eemselle at eso.org Thu Mar 8 18:17:19 2012 From: eemselle at eso.org (Eric Emsellem) Date: Fri, 09 Mar 2012 00:17:19 +0100 Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: <18284571.1324.1331244589247.JavaMail.geo-discussion-forums@ynca15> References: <4F5916A2.2040604@eso.org> <18284571.1324.1331244589247.JavaMail.geo-discussion-forums@ynca15> Message-ID: <4F593DFF.90101@eso.org> > Yes, see https://github.com/newville/lmfit-py, which does everything > you ask for, and a bit more, with the possible exception of "being > included in scipy". For what its worth, I work with Mark Rivers > (who's no longer actively developing Python), and our group is full of > IDL users who are very familiar with Markwardt's implementation. 
> > The lmfit-py version uses scipy.optimize.leastsq(), which uses MINPACK > directly, so has the advantage of not being implemented in pure IDL or > Python. It is definitely faster than mpfit.py. > > With lmfit-py, one writes a python function-to-minimize that takes a > list of Parameters instead of the array of floating point variables > that scipy.optimize.leastsq() uses. Each Parameter can be freely > varied of fixed, have upper and/or lower bounds placed on them, or be > written as algebraic expressions of other Parameters. Uncertainties > in varied Parameters and correlations between Parameters are estimated > using the same "scaled covariance" method as used in > scipy.optimize.curve_fit(). There is limited support for > optimization methods other than scipy.optimize.leastsq(), but I don't > find these methods to be very useful for the kind of fitting problems > I normally see, so support for them may not be perfect. > > Whether this gets included into scipy is up to the scipy developers. > I'd be happy to support this module within scipy or outside scipy. > I have no doubt that improvements could be made to lmfit.py. If you > have suggestion, I'd be happy to hear them. looks great! I'll have a go at this, as mentioned in my previous post. I believe that leastsq is probably the fastest anyway (according to the test Adam mentioned to me today) so this could be it. I'll make a test and compare it with mpfit (for the specific case I am thinking of, I am optimising over ~10^5-6 points with ~90 parameters...). thanks again for this, and I'll try to report on this (if relevant) asap. Eric From keflavich at gmail.com Thu Mar 8 18:27:48 2012 From: keflavich at gmail.com (Keflavich) Date: Thu, 8 Mar 2012 15:27:48 -0800 (PST) Subject: [SciPy-User] scipy compiles, but importing interpolate fails In-Reply-To: <20120308220140.GA19681@phare.normalesup.org> References: <20120308210935.GD12436@phare.normalesup.org> <20120308220140.GA19681@phare.normalesup.org> Message-ID: <1dfb3f19-81a1-42b7-bb6e-16ec2084e964@q18g2000yqh.googlegroups.com> Well, it took another half-dozen clean rebuilds, but I got it working. Thanks! (clarification: it's numpy.get_include(), not numpy.get_include_folder(), I think) On Mar 8, 3:01?pm, Gael Varoquaux wrote: > On Thu, Mar 08, 2012 at 01:59:44PM -0800, Keflavich wrote: > > That's plausible. ?How do I specify which numpy is used when compiling > > scipy? > > It should be the one that is imported by Python when you type 'import > numpy'. Basically, in scipy's 'setup.py', the header are found using the > 'numpy.get_include_folder()' function. > > Ga?l > _______________________________________________ > SciPy-User mailing list > SciPy-U... at scipy.orghttp://mail.scipy.org/mailman/listinfo/scipy-user From pav at iki.fi Thu Mar 8 19:00:12 2012 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 09 Mar 2012 01:00:12 +0100 Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: <4F5916A2.2040604@eso.org> References: <4F5916A2.2040604@eso.org> Message-ID: 08.03.2012 21:29, Eric Emsellem kirjoitti: [clip] > What I didn't find in Scipy (or numpy or..) is *an efficient > least-squares fitting routine which can include bounded, or fixed > parameters*. This seems like something many people must be needing! I am > right now using mpfit.py (from minpack then Craig B. Markwardt for idl > and Mark Rivers for python), which I did integrate in the package I am > developing. mpfit is a Fortran-to-Python translation of a MINPACK routine. 
Scipy's leastsq uses the original MINPACK Fortran code, so it's probably more efficient than mpfit.py. However, the bounded parameters seems to be a more recent addition that are not in the original. The good news is that mpfit license seems at first sight compatible with Scipy's. There's also an existing pull request for reimplementation of Levenberg-Marquardt which might also work as a base for further work, although IIRC it didn't implement bound limits. The only thing missing is someone who needs this stuff and is not averse for a little bit of dirty work, combining the existing pieces and making sure that the API makes sense. -- Pauli Virtanen From david_baddeley at yahoo.com.au Thu Mar 8 21:37:48 2012 From: david_baddeley at yahoo.com.au (David Baddeley) Date: Thu, 8 Mar 2012 18:37:48 -0800 (PST) Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: <4F5916A2.2040604@eso.org> Message-ID: <1331260668.7917.YahooMailNeo@web113405.mail.gq1.yahoo.com> You guys beat me too it - I just wanted to add that support for fixed parameters is already available in leastsq (albeit with an interface which means you have to decide which parameters are fixed when you're writing your objective function), and it's not too hard to kludge bounds by performing variable substitution (using,?for example, the square of the variable if you want a one-ended constraint, or a sigmoidal function of the variable such as erf or the logistic function for an interval). This may in fact be preferable to the approach taken by mpfit in which parameters are "pegged" at the boundary as soon as they touch it. cheers, David? ________________________________ From: Pauli Virtanen To: scipy-user at scipy.org Sent: Friday, 9 March 2012 1:00 PM Subject: Re: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? 08.03.2012 21:29, Eric Emsellem kirjoitti: [clip] > What I didn't find in Scipy (or numpy or..) is *an efficient > least-squares fitting routine which can include bounded, or fixed > parameters*. This seems like something many people must be needing! I am > right now using mpfit.py (from minpack then Craig B. Markwardt for idl > and Mark Rivers for python), which I did integrate in the package I am > developing. mpfit is a Fortran-to-Python translation of a MINPACK routine. Scipy's leastsq uses the original MINPACK Fortran code, so it's probably more efficient than mpfit.py. However, the bounded parameters seems to be a more recent addition that are not in the original. The good news is that mpfit license seems at first sight compatible with Scipy's. There's also an existing pull request for reimplementation of Levenberg-Marquardt which might also work as a base for further work, although IIRC it didn't implement bound limits. The only thing missing is someone who needs this stuff and is not averse for a little bit of dirty work, combining the existing pieces and making sure that the API makes sense. -- Pauli Virtanen _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From david_baddeley at yahoo.com.au Thu Mar 8 22:14:23 2012 From: david_baddeley at yahoo.com.au (David Baddeley) Date: Thu, 8 Mar 2012 19:14:23 -0800 (PST) Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? 
In-Reply-To: <4F593DFF.90101@eso.org> References: <4F5916A2.2040604@eso.org> <18284571.1324.1331244589247.JavaMail.geo-discussion-forums@ynca15> <4F593DFF.90101@eso.org> Message-ID: <1331262863.55138.YahooMailNeo@web113402.mail.gq1.yahoo.com> >From a pure performance perspective, you're probably going to be best setting your bounds by variable substitution (particularly if they're only single-ended - x**2 is cheap) - you really don't want to have the for loops, dictionary lookups and conditionals that lmfit introduces for it's bounds checking inside your objective function. I think a high level wrapper that permitted bounds, an unadulterated goal function, and setting which parameters to fit, but also retained much of the raw speed of leastsq could be accomplished with some clever on the fly code generation (maybe also using Sympy to automatically derive the Jacobian). Would make an interesting project ... David ________________________________ From: Eric Emsellem To: Matthew Newville Cc: scipy-user at scipy.org; scipy-user at googlegroups.com Sent: Friday, 9 March 2012 12:17 PM Subject: Re: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? > Yes, see https://github.com/newville/lmfit-py,? which does everything > you ask for, and a bit more, with the possible exception of "being > included in scipy".? For what its worth, I work with Mark Rivers > (who's no longer actively developing Python), and our group is full of > IDL users who are very familiar with Markwardt's implementation. > > The lmfit-py version uses scipy.optimize.leastsq(), which uses MINPACK > directly, so has the advantage of not being implemented in pure IDL or > Python. It is definitely faster than mpfit.py. > > With lmfit-py, one writes a python function-to-minimize that takes a > list of Parameters instead of the array of floating point variables > that scipy.optimize.leastsq() uses. Each Parameter can be freely > varied of fixed, have upper and/or lower bounds placed on them, or be > written as algebraic expressions of other Parameters.? Uncertainties > in varied Parameters and correlations between Parameters are estimated > using the same "scaled covariance" method as used in > scipy.optimize.curve_fit().? There is limited support for > optimization methods other than scipy.optimize.leastsq(), but I don't > find these methods to be very useful for the kind of fitting? problems > I normally see, so support for them may not be perfect. > > Whether this gets included into scipy is up to the scipy developers. > I'd be happy to support this module within scipy or outside scipy. > I have no doubt that improvements could be made to lmfit.py.? If you > have suggestion, I'd be happy to hear them. looks great! I'll have a go at this, as mentioned in my previous post. I believe that leastsq is probably the fastest anyway (according to the test Adam mentioned to me today) so this could be it. I'll make a test and compare it with mpfit (for the specific case I am thinking of, I am optimising over ~10^5-6 points with ~90 parameters...). thanks again for this, and I'll try to report on this (if relevant) asap. Eric _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mdekauwe at gmail.com Thu Mar 8 22:31:10 2012 From: mdekauwe at gmail.com (mdekauwe) Date: Thu, 8 Mar 2012 19:31:10 -0800 (PST) Subject: [SciPy-User] [SciPy-user] Is there a better way to read a CSV file and store for processing? Message-ID: <33469432.post@talk.nabble.com> Hi, So I was wondering if there might be a "better" way to go about reading a CSV file and storing it for later post-processing. What I have written does the job fine, but I think there might be a better way as I seem to be duplicating some steps to get around things I don't know. For example I guess ideally I would like to read the CSV file into a numpy array one could access by variable names but I couldn't work that out. Any thoughts welcome. Thanks... CSV file looks a bit like this Year,Day of the year,NPP, etc... --,--,some units, etc... YEAR,DOY,NPP, etc... 1996.0,1.0,10.09, etc... etc etc Code... #!/usr/bin/env python """ Example of reading CSV file and some simple processing... 1. Read CSV file into a python dictionary/list 2. Save the data to a pickle object, to speed up reading back in 3. Read the object back in to test everything is fine 4. Get the timeseries of one of the variables, print it and plot it... """ __author__ = "Martin De Kauwe" __version__ = "1.0 (09.03.2012)" __email__ = "mdekauwe at gmail.com" import numpy as np import sys import glob import csv import cPickle as pickle def main(): for fname in glob.glob("*.csv"): data = read_csv_file(fname, head_length=3, delim=",") # save the data to the hard disk for quick access later pkl_fname = "test_model_data.pkl" save_dictionary(data, pkl_fname) # read the data back in to check it worked... f = open(pkl_fname, 'rb') data = pickle.load(f) npp = get_var(data, "NPP") for i in xrange(len(npp)): print npp[i] import matplotlib.pyplot as plt plt.plot(npp, "ro-") plt.show() def read_csv_file(fname, head_length=None, delim=None): """ read the csv file into a dictionary """ f = open(fname, "rb") # read the correct header keys... f = find_header_keys(f, line_with_keys=2) # read the data into a nice big dictionary...and return as a list reader = csv.DictReader(f, delimiter=',') data = [row for row in reader] return data def find_header_keys(fp, line_with_keys=None): """ Incase the csv file doesn't have the header keys on the first line, advanced the pointer until the line we desire """ dialect = csv.Sniffer().sniff(fp.read(1024)) fp.seek(0) for i in xrange(line_with_keys): next(fp) return fp def save_dictionary(data, outfname): """ save dictionary to disk, i.e. pickle it """ out_dict = open(outfname, 'wb') pickle.dump(data, out_dict, pickle.HIGHEST_PROTOCOL) out_dict.close() def get_var(data, var): """ return the entire time series for a given variable """ return np.asarray([data[i][var] for i in xrange(len(data))]) if __name__ == "__main__": main() -- View this message in context: http://old.nabble.com/Is-there-a-better-way-to-read-a-CSV-file-and-store-for-processing--tp33469432p33469432.html Sent from the Scipy-User mailing list archive at Nabble.com. From william.ratcliff at gmail.com Fri Mar 9 01:20:37 2012 From: william.ratcliff at gmail.com (william ratcliff) Date: Fri, 9 Mar 2012 01:20:37 -0500 Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? 
In-Reply-To: <1331262863.55138.YahooMailNeo@web113402.mail.gq1.yahoo.com> References: <4F5916A2.2040604@eso.org> <18284571.1324.1331244589247.JavaMail.geo-discussion-forums@ynca15> <4F593DFF.90101@eso.org> <1331262863.55138.YahooMailNeo@web113402.mail.gq1.yahoo.com> Message-ID: A response to Gael: I don't think the problem is just a question of motivation/effort being put in by people who are asking for new features. Perhaps I'm overly optimistic, but I think that most people are aware of the effort put in by very busy people to put together scipy and are rather grateful that the tool exists. Many of us would like to see python adopted more in our organizations and find that some tools that our users are used to are not available and would like to see them made available. However, I think that the barrier to inclusion in scipy seems high. A bit of history: We also used the IDL version of mpfit--in moving to python, I looked for an analogue and found mpfit.py that at the time, relied on Numeric. I made a port to numpy and found Sergei Koposov had made a similar port ( http://code.google.com/p/astrolibpy/wiki/AstrolibpyContents) in addition to fixing some bugs in the original and adding extensions. I talked with all of the stakeholders to receive licensing permission for inclusion into scipy. However, there were questions about the coding style, and it never made it in. (I'm happy to see the lmfit project on github) Also, related to bounds, in the past, the scipy implementation of simulated annealing would wander outside of bounds (which is not my expected behavior--or that of many people), so I made a patch, which if a guess would place the next point out of bounds, would stay at the same spot and guess again (it may not be optimal, but it preserves ergodicity). I was told that if someone actually wanted the function to stay in bounds, they could add a penalty function. The end result: I have my own version of anneal.py and mpfit.py. I would like to contribute. I have students that work on things that might be of general use. However, the process needs to be more streamlined if the community wants more participation. Sympy is a good example of this--if you have something that seems useful, they are very happy to take it--and clean it up later (or help you to). I've only watched sckit-learn from a far, but http://scikit-learn.org/stable/developers/index.html seems to provide rather clear instructions for contributing... Take the question of bounds for example--is it better to have no easy way of implementing bounds, or to have the cleanest/most efficient piece of code? What is the actual process of contributing these days? For example, for making a patch, now that the codebase is on github. Do we make a fork, patch, and point to the fork? Submit a patch? If so, where? What exactly are scikits? What determines if something belongs in scipy.optimize as compared to a scikit? What is the process for creating a scikit? The webpage is a bit vague. Do scikits share more than a namespace? Sorry that this is a bit disorganized, but the TL;DR is that I think scipy could do more to make it easier for people to contribute...I understand the need to have maintainable code in a large project, but in many cases, having a less than perfect implementation (with tests) would be better than having no implementation...Also, what may be easy for us, may not be easy to many users of scipy, so having convenience methods is worthwhile... 
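To make the simulated-annealing point above concrete, the stay-and-retry step
was roughly the following sketch (hypothetical helper name and toy bounds, not
the actual patch that was submitted):

import numpy as np

def bounded_guess(x_current, propose, lower, upper, max_tries=50):
    """Draw a new annealing candidate; if it falls outside the bounds,
    stay at the current point and draw again (keeps the chain ergodic)."""
    for _ in xrange(max_tries):
        x_new = propose(x_current)
        if np.all(x_new >= lower) and np.all(x_new <= upper):
            return x_new
    return x_current   # every draw landed out of bounds; keep the current point

# e.g. a Gaussian proposal whose width would shrink with temperature
step = lambda x: x + 0.5 * np.random.randn(len(x))
x_next = bounded_guess(np.array([0.2, 0.3]), step, lower=0.0, upper=1.0)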
Best, William On Thu, Mar 8, 2012 at 10:14 PM, David Baddeley wrote: > From a pure performance perspective, you're probably going to be best > setting your bounds by variable substitution (particularly if they're only > single-ended - x**2 is cheap) - you really don't want to have the for > loops, dictionary lookups and conditionals that lmfit introduces for it's > bounds checking inside your objective function. > > I think a high level wrapper that permitted bounds, an unadulterated goal > function, and setting which parameters to fit, but also retained much of > the raw speed of leastsq could be accomplished with some clever on the fly > code generation (maybe also using Sympy to automatically derive the > Jacobian). Would make an interesting project ... > > David > > ------------------------------ > *From:* Eric Emsellem > *To:* Matthew Newville > *Cc:* scipy-user at scipy.org; scipy-user at googlegroups.com > *Sent:* Friday, 9 March 2012 12:17 PM > > *Subject:* Re: [SciPy-User] Least-squares fittings with bounds: why is > scipy not up to the task? > > > > > Yes, see https://github.com/newville/lmfit-py, which does everything > > you ask for, and a bit more, with the possible exception of "being > > included in scipy". For what its worth, I work with Mark Rivers > > (who's no longer actively developing Python), and our group is full of > > IDL users who are very familiar with Markwardt's implementation. > > > > The lmfit-py version uses scipy.optimize.leastsq(), which uses MINPACK > > directly, so has the advantage of not being implemented in pure IDL or > > Python. It is definitely faster than mpfit.py. > > > > With lmfit-py, one writes a python function-to-minimize that takes a > > list of Parameters instead of the array of floating point variables > > that scipy.optimize.leastsq() uses. Each Parameter can be freely > > varied of fixed, have upper and/or lower bounds placed on them, or be > > written as algebraic expressions of other Parameters. Uncertainties > > in varied Parameters and correlations between Parameters are estimated > > using the same "scaled covariance" method as used in > > scipy.optimize.curve_fit(). There is limited support for > > optimization methods other than scipy.optimize.leastsq(), but I don't > > find these methods to be very useful for the kind of fitting problems > > I normally see, so support for them may not be perfect. > > > > Whether this gets included into scipy is up to the scipy developers. > > I'd be happy to support this module within scipy or outside scipy. > > I have no doubt that improvements could be made to lmfit.py. If you > > have suggestion, I'd be happy to hear them. > > looks great! I'll have a go at this, as mentioned in my previous post. I > believe that leastsq is probably the fastest anyway (according to the > test Adam mentioned to me today) so this could be it. I'll make a test > and compare it with mpfit (for the specific case I am thinking of, I am > optimising over ~10^5-6 points with ~90 parameters...). > > thanks again for this, and I'll try to report on this (if relevant) asap. > > Eric > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gael.varoquaux at normalesup.org Fri Mar 9 01:31:20 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Fri, 9 Mar 2012 07:31:20 +0100 Subject: [SciPy-User] scipy compiles, but importing interpolate fails In-Reply-To: <1dfb3f19-81a1-42b7-bb6e-16ec2084e964@q18g2000yqh.googlegroups.com> References: <20120308210935.GD12436@phare.normalesup.org> <20120308220140.GA19681@phare.normalesup.org> <1dfb3f19-81a1-42b7-bb6e-16ec2084e964@q18g2000yqh.googlegroups.com> Message-ID: <20120309063120.GC2046@phare.normalesup.org> On Thu, Mar 08, 2012 at 03:27:48PM -0800, Keflavich wrote: > Well, it took another half-dozen clean rebuilds, but I got it > working. Thanks! Well, good job on persisting: half a dozen rebuild is too much :). Do you have an idea what finally made the difference? > (clarification: it's numpy.get_include(), not > numpy.get_include_folder(), I think) Good point, I was writing this email in a rush, without checking my facts. Gael > On Mar 8, 3:01?pm, Gael Varoquaux > wrote: > > On Thu, Mar 08, 2012 at 01:59:44PM -0800, Keflavich wrote: > > > That's plausible. ?How do I specify which numpy is used when compiling > > > scipy? > > It should be the one that is imported by Python when you type 'import > > numpy'. Basically, in scipy's 'setup.py', the header are found using the > > 'numpy.get_include_folder()' function. > > Ga?l > > _______________________________________________ > > SciPy-User mailing list > > SciPy-U... at scipy.orghttp://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- Gael Varoquaux Researcher, INRIA Parietal Laboratoire de Neuro-Imagerie Assistee par Ordinateur NeuroSpin/CEA Saclay , Bat 145, 91191 Gif-sur-Yvette France Phone: ++ 33-1-69-08-79-68 http://gael-varoquaux.info From gael.varoquaux at normalesup.org Fri Mar 9 01:57:10 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Fri, 9 Mar 2012 07:57:10 +0100 Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: <4F5916A2.2040604@eso.org> <18284571.1324.1331244589247.JavaMail.geo-discussion-forums@ynca15> <4F593DFF.90101@eso.org> <1331262863.55138.YahooMailNeo@web113402.mail.gq1.yahoo.com> Message-ID: <20120309065710.GD2046@phare.normalesup.org> David, You raise good points. On the one hand, contributing to scipy may be a bit more technical than it should be for someone wanting to add simple things: it requires building the beast! On the other hand, a lot of the points you raise simply boil down to the fact that putting independent piece of code together in a larger, somewhat consistent project, is actually much more work than writing the individual pieces of code. There is a large overhead of big project. It is also much more value. With regards to the scikits, that you mention, I am a huge fan of the scikits approach, because it enables to break down a bit that friction of large project, to the cost of some fragmentation. Let us not fool us, the major scikits can easily grow fat and end up in the same situation, although if I believe the N^2 complexity increase ( http://www.computer.org/csdl/trans/ts/1979/02/01702600-abs.html ) means that the union of two projects will tend to be significantly harder to handle that each project separately. To answer your question about scikits: they are nothing much more than a brand name, and maybe a bit of a community trend. 
Both are good, because they help create dynamism, but they are not enough per se. With regards to how disorganized things are, you are definitely right. Organizing a community, writing contribution guidelines, keeping web pages up to date, takes a lot of time. Such work, also from volunteers, is also needed to keep a project alive, and even more a 'meta project', like scipy. Ga?l PS: As a side note, I am recruiting on a regular basis (say one young engineer everything year) people to work part time on the scipy ecosystem (mainly on scikit-learn, but side projects are encouraged). The salary doesn't compete with what the industry has to offer, and I've had a hard time finding good people. From eemselle at eso.org Fri Mar 9 03:46:37 2012 From: eemselle at eso.org (Eric Emsellem) Date: Fri, 09 Mar 2012 09:46:37 +0100 Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: <1331262863.55138.YahooMailNeo@web113402.mail.gq1.yahoo.com> References: <4F5916A2.2040604@eso.org> <18284571.1324.1331244589247.JavaMail.geo-discussion-forums@ynca15> <4F593DFF.90101@eso.org> <1331262863.55138.YahooMailNeo@web113402.mail.gq1.yahoo.com> Message-ID: <4F59C36D.3020607@eso.org> Thanks David for this. The main issue is not in fact to solve the pb for myself (with some variable substitution or..) as I can also think of e.g. interfacing C/fortran efficient codes with python via standard wrapping (I had used this with e.g. the amazing NAG library with the help of expert programmers). There are 3 issues here (which are closely related to each others): - to have such a module integrated in scipy means that new python users would find the module by default and do not need to install more and more modules. This is one of the problem many people encounter. In the early days of scipy (or python) things had to be installed, tuned, re-installed, etc. This was fun but does not allow a large community to join. There are efforts to coordinate, homogeneise, optimise all this. Scipy is one of these (and an impressive success). Astropy is another path specific to astronomy (my field). But for such complex routines, we need (I believe) things which are "simple" to use and already integrated. I acknowledge this is a huge effort, both to develop the module, and integrate it and I am not blaming anyone here (on the contrary, as mentioned, I am very impressed by what has been achieved!). I am just saying: I believe this is a "must have". People who will need such a module for their own goals could then use it transparently. - if the specifics of the bounds/fixed parameters are in the user-defined function itself, then we loose it I think. To me it is then nearly equivalent (although slightly better), for a new python user, as having to download and install several additional packages. You need to spend some time tuning your function, and cannot change it on the fly. On the long run, I would be surprised if the "non-advanced" users would really go for this. They would turn to e.g., idl or whatever is convenient for them. - When contributing to an effort like astropy (via e.g., github) and when you do post a new package, you would like to avoid requiring the installation of 2-3 more packages on top of the one you are proposing (even if their installation is automatised). At the moment, my package includes mpfit.py as a sub-module. 
This is bad practice (as various packages will have various versions of mpfit maybe, and mpfit is not optimised) but this guarantees that the person who downloads the package can just rely on that. In astropy, the guideline is that APART from matplotlib, scipy/numpy, you shouldn't have to download more if you wish to have a specific piece of software work on your computer. This ensures that the community reacts positively to this coordinating effort (which is very significant) and that it will attract more and more people around these beautiful developments, namely numpy, scipy et al. Of course, this is just a biased opinion from a non-expert python user! :-) cheers Eric On 03/09/2012 04:14 AM, David Baddeley wrote: > From a pure performance perspective, you're probably going to be best setting > your bounds by variable substitution (particularly if they're only single-ended > - x**2 is cheap) - you really don't want to have the for loops, dictionary > lookups and conditionals that lmfit introduces for it's bounds checking inside > your objective function. > > I think a high level wrapper that permitted bounds, an unadulterated goal > function, and setting which parameters to fit, but also retained much of the raw > speed of leastsq could be accomplished with some clever on the fly code > generation (maybe also using Sympy to automatically derive the Jacobian). Would > make an interesting project ... From adnothing at gmail.com Fri Mar 9 03:47:24 2012 From: adnothing at gmail.com (Adrien Gaidon) Date: Fri, 9 Mar 2012 09:47:24 +0100 Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: <4F5916A2.2040604@eso.org> <18284571.1324.1331244589247.JavaMail.geo-discussion-forums@ynca15> <4F593DFF.90101@eso.org> <1331262863.55138.YahooMailNeo@web113402.mail.gq1.yahoo.com> Message-ID: 2012/3/9 william ratcliff > > Take the question of bounds for example--is it better to have no easy way > of implementing bounds, or to have the cleanest/most efficient piece of > code? What is the actual process of contributing these days? For > example, for making a patch, now that the codebase is on github. Do we > make a fork, patch, and point to the fork? Submit a patch? If so, where? > > > What exactly are scikits? What determines if something belongs in > scipy.optimize as compared to a scikit? What is the process for creating a > scikit? The webpage is a bit vague. Do scikits share more than a > namespace? > > Sorry that this is a bit disorganized, but the TL;DR is that I think scipy > could do more to make it easier for people to contribute...I understand the > need to have maintainable code in a large project, but in many cases, > having a less than perfect implementation (with tests) would be better than > having no implementation... > I 100% agree with William here and think *how* to contribute is at the heart of the problem. I think many users in the scipy multiverse have their own `utils.py` or other home-made modules, which may contain code useful to a wide audience and absent of the numpy/scipy or related well-known libraries (e.g. the excellent scikit-learn). As all pythonistas have a good heart, I'm sure they would like to share it, but, as William said, the road along that path is unclear, bumpy and sometimes not super friendly. For instance, I wrote a simple multi-dimensional digitize function and posted a gist (https://gist.github.com/1509853) to this list (or numpy or some other relevant mailing list...). 
Before doing that, I really pondered: "is it useful enough? is it not trivial? where and how should I contribute it?" etc. All these metaphysical questions are a barrier to the wannabe-contributor, that, IMHO, filters out a lot of useful code. Especially for such small contributions, the hassle becomes superior to the expected gain for the community and the code ends up self-censored or forgotten (we're all always very busy). That's problematic, because I believe in the "emergence" property of open source projects: a sum of small contributions can make a powerful library. Furthermore, it seems that large projects tend to have API zealots that don't even want to see code unless it can be directly merged in master (caricature). I totally understand that, and think it's in the nature of open source projects in order to not grow anarchistically. However, this also prevents small "diamonds in the rough" to be discovered, or useful temporary hole-filling solutions to be proposed until a proper one is available. To me, this is a false problem due to the fact that the only advertised way to contribute is by forking + pull request. But not everybody is a scipy source code guru! Therefore, I think it's necessary for the community to discuss this issue, get a consensus on the desired ways to contribute with respect to the contribution type (very important), write a small tutorial or document explaining this, and, most importantly, publicly advertise it on the website. So far, the ways to contribute that I know of are: - fork + pull request: high barrier of entry - mailing list + gist: quick & easy, but one must be willing to "spam" everyone - scipy central If I have forgotten any, feel free to add them! To conclude this long rant, I think that http://scipy-central.org/ is a great idea with lots of potential, especially for sharing small snippets. To me, it can be some kind of "social network for code snippets", and the comments / voting / popularity system can allow "true scipy contributors" to peek at the best contributions and clean / test / integrate their selection. That way, users can just "dump" their code, without worrying about the difficult engineering issues of integration into scipy! But it needs more advertising to gain visibility to every scipy user. For now, there is not even a link on http://www.scipy.org/ and Google results are not looking either... Cheers, Adrien -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Fri Mar 9 06:04:00 2012 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 09 Mar 2012 12:04:00 +0100 Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: <4F5916A2.2040604@eso.org> <18284571.1324.1331244589247.JavaMail.geo-discussion-forums@ynca15> <4F593DFF.90101@eso.org> <1331262863.55138.YahooMailNeo@web113402.mail.gq1.yahoo.com> Message-ID: Hi, 09.03.2012 07:20, william ratcliff kirjoitti: [clip] > We also used the IDL version of mpfit--in moving to python, I looked > for an analogue and found mpfit.py that at the time, relied on Numeric. > I made a port to numpy and found Sergei Koposov had made a similar > port (http://code.google.com/p/astrolibpy/wiki/AstrolibpyContents) in > addition to fixing some bugs in the original and adding extensions. I > talked with all of the stakeholders to receive licensing permission for > inclusion into scipy. However, there were questions about the coding > style, and it never made it in. 
(I'm happy to see the lmfit project on > github) Sorry, I completely forgot you had already done work for inclusion of mpfit, and I guess it was my reply that halted it: http://mail.scipy.org/pipermail/scipy-dev/2009-May/011947.html My intent here was honestly not to be a total API zealot and say that *everything* needs to be fixed before checkin --- just that errors should raise exceptions, and some minor stylistic cleanup should be made --- the rest could be cleaned up later. Though, I can understand that a long laundry list of things to correct is not the nicest first response to code contributions. There's also a second issue here, which is more organizational --- since there was no procedure for the contributions, I lost track of where this work was progressing, and eventually forgot about it. This is where Github's pull requests improve the situation by a large amount. In principle Trac could serve the same role, but in practice it turns out to work somewhat less well. [clip] > Sorry that this is a bit disorganized, but the TL;DR is that I think > scipy could do more to make it easier for people to contribute... I > understand the need to have maintainable code in a large project, but in > many cases, having a less than perfect implementation (with tests) would > be better than having no implementation... Also, what may be easy for us, > may not be easy to many users of scipy, so having convenience methods is > worthwhile... The fine line to walk here is that there must be some quality control for code contributions, to avoid ending up with a set of routines that are awkward to use or don't work as promised (except around the research problem for which it was first written). The point in doing as much as possible before accepting contributions is that if this is left for later, the contributor may be MIA and there's nobody around who understands the piece of code well, and you're committed to a clunky API which you cannot easily change anymore if there has been a release in between. The flip side is of course that the barrier to contributions is higher, and it should not be made too high. The scipy-central.org is a good solution for just sharing research code with minimum hassle, and definitely could use more advertisement. Pauli From josef.pktd at gmail.com Fri Mar 9 07:12:09 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 9 Mar 2012 07:12:09 -0500 Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: <4F5916A2.2040604@eso.org> <18284571.1324.1331244589247.JavaMail.geo-discussion-forums@ynca15> <4F593DFF.90101@eso.org> <1331262863.55138.YahooMailNeo@web113402.mail.gq1.yahoo.com> Message-ID: On Fri, Mar 9, 2012 at 6:04 AM, Pauli Virtanen wrote: > Hi, > > 09.03.2012 07:20, william ratcliff kirjoitti: > [clip] >> We also used the IDL version of ?mpfit--in moving to python, I looked >> for an analogue and found mpfit.py that at the time, relied on Numeric. >> ? I made a port to numpy and found Sergei Koposov had made a similar >> port (http://code.google.com/p/astrolibpy/wiki/AstrolibpyContents) in >> addition to fixing some bugs in the original and adding extensions. ? I >> talked with all of the stakeholders to receive licensing permission for >> inclusion into scipy. ? However, there were questions about the coding >> style, and it never made it in. 
?(I'm happy to see the lmfit project on >> github) > > Sorry, I completely forgot you had already done work for inclusion of > mpfit, and I guess it was my reply that halted it: > > ? ?http://mail.scipy.org/pipermail/scipy-dev/2009-May/011947.html > > My intent here was honestly not to be a total API zealot and say that > *everything* needs to be fixed before checkin --- just that errors > should raise exceptions, and some minor stylistic cleanup should be made > --- the rest could be cleaned up later. Though, I can understand that a > long laundry list of things to correct is not the nicest first response > to code contributions. > > There's also a second issue here, which is more organizational --- since > there was no procedure for the contributions, I lost track of where this > work was progressing, and eventually forgot about it. This is where > Github's pull requests improve the situation by a large amount. In > principle Trac could serve the same role, but in practice it turns out > to work somewhat less well. > > [clip] >> Sorry that this is a bit disorganized, but the TL;DR is that I think >> scipy could do more to make it easier for people to contribute... I >> understand the need to have maintainable code in a large project, but in >> many cases, having a less than perfect implementation (with tests) would >> be better than having no implementation... Also, what may be easy for us, >> may not be easy to many users of scipy, so having convenience methods is >> worthwhile... > > The fine line to walk here is that there must be some quality control > for code contributions, to avoid ending up with a set of routines that > are awkward to use or don't work as promised (except around the research > problem for which it was first written). The point in doing as much as > possible before accepting contributions is that if this is left for > later, the contributor may be MIA and there's nobody around who > understands the piece of code well, and you're committed to a clunky API > which you cannot easily change anymore if there has been a release in > between. The flip side is of course that the barrier to contributions is > higher, and it should not be made too high. taking anneal as an example None of the scipy "maintainers" knew the background for simulated annealing The function in scipy has several problems, and it might take some time for someone that understands this to make it work better. The proposed patch for bounds, *replaced* the bounds on the updating step with bounds on the parameters instead of adding an additional functionality. http://projects.scipy.org/scipy/ticket/1126 http://projects.scipy.org/scipy/ticket/875 http://article.gmane.org/gmane.comp.python.scientific.devel/10398 I didn't and don't think just changing the behavior was an appropriate patch. Essentially, maintenance for global optimizers in scipy is MIA because of the lack of someone (?) who can evaluate it and convert a patch into a tested improvement of the code. The advantage of a developers' own tools or utilities functions is that the developer can do as much testing and quality control as (s)he feels like or as much as is necessary for a specific use case. For inclusion in scipy the quality control should be higher, in my opinion. One advantage of scipy-central and a commenting system would be that code will get user testing and feedback, so it's easier to evaluate for inclusion in scipy whether the code works as promised, and is useful to a wider audience. 
scipy is missing linear programming, and quadratic programming in the scipy.optimize, and there is no work-around. Both are "must-haves", I think. Josef > > The scipy-central.org is a good solution for just sharing research code > with minimum hassle, and definitely could use more advertisement. > > ? ? ? ?Pauli > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From guyer at nist.gov Fri Mar 9 08:50:34 2012 From: guyer at nist.gov (Jonathan Guyer) Date: Fri, 9 Mar 2012 08:50:34 -0500 Subject: [SciPy-User] scipy compiles, but importing interpolate fails In-Reply-To: References: Message-ID: <9B08C4C4-7639-4996-AB7E-82D96ADCCF79@nist.gov> My instructions to FiPy users are at: http://www.matforge.org/fipy/wiki/InstallFiPy/MacOSX/SnowLeopard They're largely based on hyperjeff's, although his sudo mv /System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/numpy \ /System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/numpyX is evil and unnecessary if you pass --no-site-packages to mkvirtualenv. I'm still on SciPy 0.9.0, but I don't have any trouble importing scipy.interpolate. On Mar 8, 2012, at 3:40 PM, Adam Ginsburg wrote: > Hi, I've recently (surprisingly) gotten scipy to compile by following > these http://blog.hyperjeff.net/?p=160 instructions. However, if I > try to import scipy.interpolate, it fails. I'm trying to install > scipy into a virtualenv environment, though I don't think that's the > issue because I have another install in a Framework that sees the same > error. > > I'm using numpy 1.6.1, scipy 0.10.1, mac OS X 10.6.8. > > Can anyone help me understand the following error? > > $ ~/virtual-python/bin/python -c "import scipy, scipy.interpolate" > Traceback (most recent call last): > File "", line 1, in > File "/Users/adam/virtual-python/lib/python2.7/site-packages/scipy/interpolate/__init__.py", > line 156, in > from ndgriddata import * > File "/Users/adam/virtual-python/lib/python2.7/site-packages/scipy/interpolate/ndgriddata.py", > line 9, in > from interpnd import LinearNDInterpolator, NDInterpolatorBase, \ > File "numpy.pxd", line 174, in init interpnd > (scipy/interpolate/interpnd.c:7771) > ValueError: numpy.ndarray has the wrong size, try recompiling > > Thanks, > -- > Adam Ginsburg > Graduate Student > Center for Astrophysics and Space Astronomy > University of Colorado at Boulder > http://casa.colorado.edu/~ginsbura/ > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From sturla at molden.no Fri Mar 9 10:47:03 2012 From: sturla at molden.no (Sturla Molden) Date: Fri, 09 Mar 2012 16:47:03 +0100 Subject: [SciPy-User] Generalized least square on large dataset In-Reply-To: References: Message-ID: <4F5A25F7.3020408@molden.no> On 08.03.2012 05:25, Peter Cimerman?i? wrote: > To describe my problem into more details, I have a list of ~1000 > bacterial genome lengths and number of certain genes for each one of > them. I'd like to see if there is any correlation between genome lengths > and number of the genes. > It may look like an easy linear regression > problem; No, it does not. If you are working with counts, the appropriate model would usually be Poisson regression. I.e. Generalized linear model with log-link function and Possion probability family. 
I have seen many examples of microbiologists using linear regression when they should actually use Poisson regression (e.g. counting genes) or logistic regression (e.g. dose-response and titration curves). This will do it for you: MATLAB: glmfit from the statistics toolbox R: glm SAS: PROC GLIM Python: statmodels scikit Another example of inappropriate use of linear regression in microbiology is the Lineweaver-Burk plot as substitute for non-linear least-squares (usually Levenberg-Marquardt) to fit a Michelis-Menten curve. Some microbiologists are bevare of this, but they seem to prefer all sorts of ad hoc trickeries like linearizations and variance-stabilizing transforms instead of "just doing it right". As for samples that are not independent, that will affect the final likelihood. If you want to optimize the log-likelhood yourself, to control for this, getting ML estimates by maximizing the log-likelhood is easy with fmin_powell or fmin_bgfs from scipy.optimize. (Powell's method does not even need the gradient.) And if you need the "p-value", you can either use the likelihood ratio or Monte Carlo (e.g. permutation test). Sturla P.S. I think biostatistics courses that biologists are tought do not cover the tools that are most commonly needed. Ronald Fisher (famous for multiple regression and ANOVA) worked with quantitative genetics (e.g. animal and plant breeding). But today most biologists work in molecular biology labs, and other methods for data analysis are often needed. That includes generalized linear models, non-linear regression, image processing (for microscopy), and general signal processing (e.g. electrophysiology). From matt.newville at gmail.com Thu Mar 8 17:09:49 2012 From: matt.newville at gmail.com (Matthew Newville) Date: Thu, 8 Mar 2012 14:09:49 -0800 (PST) Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: <4F5916A2.2040604@eso.org> References: <4F5916A2.2040604@eso.org> Message-ID: <18284571.1324.1331244589247.JavaMail.geo-discussion-forums@ynca15> Dear Eric, On Thursday, March 8, 2012 2:29:22 PM UTC-6, Eric Emsellem wrote: > Dear all, > > I know the title looks a little provocative, but this was obviously done > on purpose. I am very impressed by the capabilities of scipy (et al., > numpy etc) and have been a fan since years! But one thing (in my > opinion) seems to be missing (see below). If it exists, then great (and > apologies)! > > What I didn't find in Scipy (or numpy or..) is *an efficient > least-squares fitting routine which can include bounded, or fixed > parameters*. This seems like something many people must be needing! I am > right now using mpfit.py (from minpack then Craig B. Markwardt for idl > and Mark Rivers for python), which I did integrate in the package I am > developing. It is much faster than many other routines in scipy although > Adam Ginsburg did mention some test-bench he conducted some time ago, > showing that leastsq was quite efficient. It can include bounds, fixed > parameters etc. And it works great! But this is probably not the best > way to have such a stand-alone routine... and it is far from being > optimised for the modern python. > > So: > > is there ANY plan for having such a module in Scipy?? I think > (personally) that this is a MUST DO. This is typically the type of > routines that I hear people use in e.g., idl etc. If this could be an > optimised, fast (and easy to use) routine, all the better. 
Yes, see https://github.com/newville/lmfit-py, which does everything you ask for, and a bit more, with the possible exception of "being included in scipy". For what its worth, I work with Mark Rivers (who's no longer actively developing Python), and our group is full of IDL users who are very familiar with Markwardt's implementation. The lmfit-py version uses scipy.optimize.leastsq(), which uses MINPACK directly, so has the advantage of not being implemented in pure IDL or Python. It is definitely faster than mpfit.py. With lmfit-py, one writes a python function-to-minimize that takes a list of Parameters instead of the array of floating point variables that scipy.optimize.leastsq() uses. Each Parameter can be freely varied of fixed, have upper and/or lower bounds placed on them, or be written as algebraic expressions of other Parameters. Uncertainties in varied Parameters and correlations between Parameters are estimated using the same "scaled covariance" method as used in scipy.optimize.curve_fit(). There is limited support for optimization methods other than scipy.optimize.leastsq(), but I don't find these methods to be very useful for the kind of fitting problems I normally see, so support for them may not be perfect. Whether this gets included into scipy is up to the scipy developers. I'd be happy to support this module within scipy or outside scipy. I have no doubt that improvements could be made to lmfit.py. If you have suggestion, I'd be happy to hear them. Cheers, --Matt Newville -------------- next part -------------- An HTML attachment was scrubbed... URL: From matt.newville at gmail.com Thu Mar 8 19:55:49 2012 From: matt.newville at gmail.com (Matthew Newville) Date: Thu, 8 Mar 2012 16:55:49 -0800 (PST) Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: <20120308210722.GC12436@phare.normalesup.org> References: <4F5916A2.2040604@eso.org> <20120308210722.GC12436@phare.normalesup.org> Message-ID: <6942812.8.1331254549777.JavaMail.geo-discussion-forums@vbuf40> Gael, On Thursday, March 8, 2012 3:07:22 PM UTC-6, Gael Varoquaux wrote: > > I am sorry I am going to react to the provocation. And I am sorry that I am going to react to your message. I think your reaction is unfair. > As some one who spends a fair amount of time working on open source > software I hear such remarks quite often: 'why is feature foo not > implemented in package bar?. I am finding it harder and harder not to > react negatively to these emails. Now I cannot consider myself as a > contributor to scipy, and thus I can claim that I am not taking your > comment personally. Where I work (a large scientific user facility), there are lots of scientists in what I'll presume is Eric's position -- able and willing to work well with scientific programming tools, but unable to devote the extra time needed to develop core functionality or maintain much work outside of their own area of interest. There are a great many scientists interested in learning and using python. Several people there *are* writing scientific libraries with python. Similarly in the fields I work in, python is widely accepted as an important ecosystem. > Why isn't scipy not up to the task? Will, the answer is quite simple: > because it's developed by volunteers that do it on their spare time, late > at night too often, or companies that put some of their benefits in open > source rather in locking down a market. 
90% of the time the reason the > feature isn't as good as you would want it is because of lack of time. > > I personally find that suggesting that somebody else should put more of > the time and money they are already giving away in improving a feature > that you need is almost insulting. Well, in some sense, Eric's message is an expression of interest.... Perhaps you would prefer that nobody outside the core group of developers or mailing list subscribers asked for any new features or clarification of existing features. > I am aware that people do not realize how small the group of people that > develop and maintain their toys is. Borrowing from Fernando Perez's talk > at Euroscipy (http://www.euroscipy.org/file/6459?vid=download slide 80), > the number of people that do 90% of the grunt work to get the core > scientific Python ecosystem going is around two handfuls. Well, Fernando's slides indicate there is a small group that dominates commits to the projects, then explains, at least partially, why that it is. It is *NOT* because scientists expect this work to be done for them by volunteers who should just work harder. There are very good reasons for people to not be involved. The work is rarely funded, is generally a distraction from funded work, and hardly ever "counts" as scientific work. That's all on top of being a scientist, not a programmer. Now, if you'll allow me, I myself am one of the "lucky" scientific software developers, well-recognized in my own small community for open source analysis software, and also in a scientific position and in a group where building tools for better data collection and analysis can easily be interpreted as part of the job. In fact, I spend a very significant amount of my time writing open source software, and work nearly exclusively in python. So, just as as an example of what happens when someone might "contribute", I wrote some code (lmfit-py) that could go into scipy and posted it to this list several months ago. Many people have expressed interest in this module, and it has been discussed on this list a few times in the past few months. Though lmfit-py is older than Fernando's slides (it was inspired after being asked several times "Is there something like IDL's mpfit, only faster and in python?"), it actually follows his directions of "get involved" quite closely: it is BSD, at github, with decent documentation, and does not depend on packages other than scipy and numpy. Though it's been discussed on this list recently, two responses from frequent mailing-list responders (you, Paul V) was more along the lines of "yes, that could be done, in principle, if someone were up to doing the work" instead of "perhaps package xxx would work for you". At no point has anyone from the scipy team expressed an interest in putting this into scipy. OK, perhaps lmfit-py is not high enough quality. I can accept that. My point is that there *is* a contribution but one that would not show up on Fernando's graph as a lengthening of "the tail of contributors". There ARE a few developers out there who are interested in making contributions, and the scipy team is not doing everything it could be doing to either facilitate or even encourage such participation. In fact, especially given your response, it would be possible to conclude that contributions are actually discouraged. It's also possible to be more optimistic, and conclude that Fernando's statistics are accurate only for each project shown, but wildly underestimate the whole of the community. 
> I'd like to think that it's a problem of skill set: users that have the > ability to contribute are just too rare. This is not entirely true, there > are scores of skilled people on the mailing lists. You yourself mention > that you are developing a package. There are many kinds of skills. Sometimes, not insulting your customers, colleagues, and potential collaborators is the most important one. > Sorry for the rant, but if you want things to improve, you will have more > successes sending in pull request than messages on mailing list that > sound condescending to my ears. > > I hope that I haven't overreacted too badly. Sorry, but I think you have. I'm impressed that Eric was appreciative -- I know many who would not be. For myself, I find it quite discouraging that the scipy team is so insular. Cheers, --Matt Newville -------------- next part -------------- An HTML attachment was scrubbed... URL: From matt.newville at gmail.com Fri Mar 9 10:04:17 2012 From: matt.newville at gmail.com (Matthew Newville) Date: Fri, 9 Mar 2012 07:04:17 -0800 (PST) Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: <1331262863.55138.YahooMailNeo@web113402.mail.gq1.yahoo.com> References: <4F5916A2.2040604@eso.org> <18284571.1324.1331244589247.JavaMail.geo-discussion-forums@ynca15> <4F593DFF.90101@eso.org> <1331262863.55138.YahooMailNeo@web113402.mail.gq1.yahoo.com> Message-ID: <16397446.506.1331305457726.JavaMail.geo-discussion-forums@ynlw24> David, On Thursday, March 8, 2012 9:14:23 PM UTC-6, David Baddeley wrote: > > From a pure performance perspective, you're probably going to be best > setting your bounds by variable substitution (particularly if they're only > single-ended - x**2 is cheap) - you really don't want to have the for > loops, dictionary lookups and conditionals that lmfit introduces for it's > bounds checking inside your objective function. > >From a performance perspective in which dictionary lookups and additions in the wrapper that lmfit puts on an objective function are considered high, I think you would probably not want the objective function written in python to begin with, but just use Fortran (or C). Much of scipy presupposes that one must include development time (here, or writing and manipulating the objective function) into "performance". So, for some trivial cases, one can easily change the parameter from, say "x, with a minimum value of 0" to "x**2", but then one also has to change the objective function and re-map the estimated uncertainties in the parameters every time the bounds might be changed. These would be changes that the end-user would have to do..... Or they could use lmfit which does this automatically, if at a slight performance cost compared to having no bounds set. Is that performance hit important? I doubt it. The original question, and several follow-up messages, point to mpfit.py. As I'm sure you're aware, this implements the Levenberg-Marquardt algorithm **in python**. And people use it because it provides a convenient way to set bounds. So, like much of scipy and python, sometimes pure performance is not the main requirement. Now, mpfit.py is slow (and is a translation of MINPACK from fortran to IDL to python-with-Numeric), OTOH, lmfit uses scipy.optimize.leastsq(), which calls into the fortran version of MINPACK, and so does have improved performance compared to mpfit.py. 
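As a concrete sketch of the substitution trick discussed above (the model and data here are made up purely for illustration), a single-ended bound can be imposed with plain scipy.optimize.leastsq by fitting an internal parameter whose square is the physical one:

import numpy as np
from scipy.optimize import leastsq

np.random.seed(0)
x = np.linspace(0, 5, 50)
y = 3.0 * np.exp(-1.2 * x) + 0.05 * np.random.randn(50)   # made-up noisy decay data

def residual(p, x, y):
    amp = p[0] ** 2            # substitution: amplitude >= 0 by construction
    rate = p[1]
    return y - amp * np.exp(-rate * x)

pbest, ier = leastsq(residual, (1.0, 1.0), args=(x, y))
print("amp = %.3f, rate = %.3f" % (pbest[0] ** 2, pbest[1]))

The catch is exactly the one described above: the objective function has to be rewritten whenever a bound changes, and any uncertainty estimated for the internal parameter has to be propagated back through the substitution -- the bookkeeping that lmfit-py (or an mpfit-style parinfo list) automates.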
I think a high level wrapper that permitted bounds, an unadulterated goal > function, and setting which parameters to fit, but also retained much of > the raw speed of leastsq could be accomplished with some clever on the fly > code generation (maybe also using Sympy to automatically derive the > Jacobian). Would make an interesting project .. > Sounds great.. Let me know when it's ready and I'll be happy to give it a try. --Matt > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Mar 9 11:40:01 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 9 Mar 2012 11:40:01 -0500 Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: <6942812.8.1331254549777.JavaMail.geo-discussion-forums@vbuf40> References: <4F5916A2.2040604@eso.org> <20120308210722.GC12436@phare.normalesup.org> <6942812.8.1331254549777.JavaMail.geo-discussion-forums@vbuf40> Message-ID: On Thu, Mar 8, 2012 at 7:55 PM, Matthew Newville wrote: > Gael, > > > On Thursday, March 8, 2012 3:07:22 PM UTC-6, Gael Varoquaux wrote: >> >>?? I am sorry I am going to react to the provocation. > > And I am sorry that I am going to react to your message.? I think your > reaction is unfair. > > >>?? As some one who spends a fair amount of time working on open source >>?? software I hear such remarks quite often: 'why is feature foo not >>?? implemented in package bar?. I am finding it harder and harder not to >>?? react negatively to these emails. Now I cannot consider myself as a >>?? contributor to scipy, and thus I can claim that I am not taking your >>?? comment personally. > > Where I work (a large scientific user facility), there are lots of > scientists in what I'll presume is Eric's position -- able and willing to > work well with scientific programming tools, but unable to devote the extra > time needed to develop core functionality or maintain much work outside of > their own area of interest.? There are a great many scientists interested in > learning and using python.? Several people there *are* writing scientific > libraries with python.? Similarly in the fields I work in, python is widely > accepted as an important ecosystem. > > >>?? Why isn't scipy not up to the task? Will, the answer is quite simple: >>?? because it's developed by volunteers that do it on their spare time, >> late >>?? at night too often, or companies that put some of their benefits in open >>?? source rather in locking down a market. 90% of the time the reason the >>?? feature isn't as good as you would want it is because of lack of time. >> >>?? I personally find that suggesting that somebody else should put more of >>?? the time and money they are already giving away in improving a feature >>?? that you need is almost insulting. > > Well, in some sense, Eric's message is an expression of interest.... Perhaps > you would prefer that nobody outside the core group of developers or mailing > list subscribers asked for any new features or clarification of existing > features. > > >>?? I am aware that people do not realize how small the group of people that >>?? develop and maintain their toys is. Borrowing from Fernando Perez's talk >>?? at Euroscipy (http://www.euroscipy.org/file/6459?vid=download slide 80), >>?? the number of people that do 90% of the grunt work to get the core >>?? scientific Python ecosystem going is around two handfuls. 
> > Well, Fernando's slides indicate there is a small group that dominates > commits to the projects, then explains, at least partially, why that it is. > It is *NOT* because scientists expect this work to be done for them by > volunteers who should just work harder. > > There are very good reasons for people to not be involved.? The work is > rarely funded, is generally a distraction from funded work, and hardly ever > "counts" as scientific work.? That's all on top of being a scientist, not a > programmer.? Now, if you'll allow me, I myself am one of the "lucky" > scientific software developers, well-recognized in my own small community > for open source analysis software, and also in a scientific position and in > a group where building tools for better data collection and analysis can > easily be interpreted as part of the job.? In fact, I spend a very > significant amount of my time writing open source software, and work nearly > exclusively in python. > > So, just as as an example of what happens when someone might "contribute", > I wrote some code (lmfit-py) that could go into scipy and posted it to this > list several months ago.? Many people have expressed interest in this > module, and it has been discussed on this list a few times in the past few > months.? Though lmfit-py is older than Fernando's slides (it was inspired > after being asked several times "Is there something like IDL's mpfit, only > faster and in python?"), it actually follows his directions of "get > involved" quite closely: it is BSD, at github, with decent documentation, > and does not depend on packages other than scipy and numpy.?? Though it's > been discussed on this list recently, two responses from frequent > mailing-list responders (you, Paul V) was more along the lines of? "yes, > that could be done, in principle, if someone were up to doing the work" > instead of "perhaps package xxx would work for you". > > At no point has anyone from the scipy team expressed an interest in putting > this into scipy.? OK, perhaps lmfit-py is not high enough quality.? I can > accept that.? My point is that there *is* a contribution but one that would > not show up on Fernando's graph as a lengthening of "the tail of > contributors". There ARE a few developers out there who are interested in > making contributions, and the scipy team is not doing everything it could be > doing to either facilitate or even encourage such participation.? In fact, > especially given your response, it would be possible to conclude that > contributions are actually discouraged.? It's also possible to be more > optimistic, and conclude that Fernando's statistics are accurate only for > each project shown, but wildly underestimate the whole of the community. I think lmfit is a good project, it can be easy installed. You are able to maintain and develop it. So I don't think the need to have it in scipy is very urgent. On the other hand, for anyone not familiar with AST manipulation it feels to me like a possible maintenance nightmare. It doesn't mean it is, but as part of a community project it should be possible to maintain (or come with a maintainer). But maybe I have just seen to much stranded and broken code in scipy (that remained neglected for years). As an example for a contribution: fisher's exact test, a pretty important function, but didn't quite work for several cases. I spend several days trying to figure out how to fix it. I was not successfull since I was not familiar with the algorithm and the numerical problems it raised. 
A while later users or the original developer found ways to fix the corner cases. At that stage it was possible to include it in scipy. (There were a few additional edge cases afterwards, but those were minor fixes.)

As a positive example, Denis Laxalde became very active and is revamping and improving large parts of the scipy.optimize code.

Josef

> > >> I'd like to think that it's a problem of skill set: users that have the >> ability to contribute are just too rare. This is not entirely true, there >> are scores of skilled people on the mailing lists. You yourself mention >> that you are developing a package. > > There are many kinds of skills. Sometimes, not insulting your customers, > colleagues, and potential collaborators is the most important one. > > >> Sorry for the rant, but if you want things to improve, you will have more >> successes sending in pull request than messages on mailing list that >> sound condescending to my ears. >> >> I hope that I haven't overreacted too badly. > > Sorry, but I think you have. I'm impressed that Eric was appreciative -- I > know many who would not be. > > For myself, I find it quite discouraging that the scipy team is so insular. > Cheers, > > --Matt Newville > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user >

From keflavich at gmail.com  Fri Mar  9 11:46:24 2012
From: keflavich at gmail.com (Keflavich)
Date: Fri, 9 Mar 2012 08:46:24 -0800 (PST)
Subject: [SciPy-User] scipy compiles, but importing interpolate fails
In-Reply-To: <9B08C4C4-7639-4996-AB7E-82D96ADCCF79@nist.gov>
References: <9B08C4C4-7639-4996-AB7E-82D96ADCCF79@nist.gov>
Message-ID: <5b25d17c-3a93-4e6c-a5eb-df976557f6c1@j5g2000yqm.googlegroups.com>

re: Gael: I removed numpy and scipy completely from both my Frameworks
and virtualenv installs and removed the build directories from both.
However, I think that *still* didn't work. One part of the build that
probably caused problems was using this:

PYTHONPATH="/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/"

which some site (maybe hyperjeff? maybe elsewhere?) said was necessary
for building scipy. I'm pretty sure getting rid of that in the scipy
build was essential. My eventual successful build looked like this:

numpy:
CFLAGS="-arch i386 -arch x86_64" FFLAGS="-m32 -m64" LDFLAGS="-Wall -undefined dynamic_lookup -bundle -arch i386 -arch x86_64" MACOSX_DEPLOYMENT_TARGET=10.6 PYTHONPATH="/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/" ~/virtual-python/bin/python2.7 setup.py build --fcompiler=gnu95

scipy:
CFLAGS="-arch i386 -arch x86_64" FFLAGS="-m32 -m64" LDFLAGS="-Wall -undefined dynamic_lookup -bundle -arch i386 -arch x86_64" MACOSX_DEPLOYMENT_TARGET=10.6 PYTHONPATH="/Users/adam/virtual-python/lib/python2.7/site-packages/" ~/virtual-python/bin/python2.7 setup.py build --fcompiler=gnu95

Re: Jonathan - good to know. I don't think that affected me, though,
as I made my virtualenv from a /Library python, not a /System/Library
python. Why evil, though?

From william.ratcliff at gmail.com  Fri Mar  9 11:51:45 2012
From: william.ratcliff at gmail.com (william ratcliff)
Date: Fri, 9 Mar 2012 11:51:45 -0500
Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task?
In-Reply-To: References: <4F5916A2.2040604@eso.org> <20120308210722.GC12436@phare.normalesup.org> <6942812.8.1331254549777.JavaMail.geo-discussion-forums@vbuf40> Message-ID: But this is exactly the problem--I also work at a large user facility and will play with his package--but scipy is one of the first places new users will turn to--and fitting a function with bounds is a very common task. In this particular case, what are the exact steps needed to get it into scipy? Can they charge be listed as tickets somewhere so that others of us can help? Can we document the process to make it easier the next time? I realize everyone is busy, but if the barrier to contribution is lowered it will make life better in the long run. On Mar 9, 2012 11:40 AM, wrote: > On Thu, Mar 8, 2012 at 7:55 PM, Matthew Newville > wrote: > > Gael, > > > > > > On Thursday, March 8, 2012 3:07:22 PM UTC-6, Gael Varoquaux wrote: > >> > >> I am sorry I am going to react to the provocation. > > > > And I am sorry that I am going to react to your message. I think your > > reaction is unfair. > > > > > >> As some one who spends a fair amount of time working on open source > >> software I hear such remarks quite often: 'why is feature foo not > >> implemented in package bar?. I am finding it harder and harder not to > >> react negatively to these emails. Now I cannot consider myself as a > >> contributor to scipy, and thus I can claim that I am not taking your > >> comment personally. > > > > Where I work (a large scientific user facility), there are lots of > > scientists in what I'll presume is Eric's position -- able and willing to > > work well with scientific programming tools, but unable to devote the > extra > > time needed to develop core functionality or maintain much work outside > of > > their own area of interest. There are a great many scientists > interested in > > learning and using python. Several people there *are* writing scientific > > libraries with python. Similarly in the fields I work in, python is > widely > > accepted as an important ecosystem. > > > > > >> Why isn't scipy not up to the task? Will, the answer is quite simple: > >> because it's developed by volunteers that do it on their spare time, > >> late > >> at night too often, or companies that put some of their benefits in > open > >> source rather in locking down a market. 90% of the time the reason the > >> feature isn't as good as you would want it is because of lack of time. > >> > >> I personally find that suggesting that somebody else should put more > of > >> the time and money they are already giving away in improving a feature > >> that you need is almost insulting. > > > > Well, in some sense, Eric's message is an expression of interest.... > Perhaps > > you would prefer that nobody outside the core group of developers or > mailing > > list subscribers asked for any new features or clarification of existing > > features. > > > > > >> I am aware that people do not realize how small the group of people > that > >> develop and maintain their toys is. Borrowing from Fernando Perez's > talk > >> at Euroscipy (http://www.euroscipy.org/file/6459?vid=download slide > 80), > >> the number of people that do 90% of the grunt work to get the core > >> scientific Python ecosystem going is around two handfuls. > > > > Well, Fernando's slides indicate there is a small group that dominates > > commits to the projects, then explains, at least partially, why that it > is. 
> > It is *NOT* because scientists expect this work to be done for them by > > volunteers who should just work harder. > > > > There are very good reasons for people to not be involved. The work is > > rarely funded, is generally a distraction from funded work, and hardly > ever > > "counts" as scientific work. That's all on top of being a scientist, > not a > > programmer. Now, if you'll allow me, I myself am one of the "lucky" > > scientific software developers, well-recognized in my own small community > > for open source analysis software, and also in a scientific position and > in > > a group where building tools for better data collection and analysis can > > easily be interpreted as part of the job. In fact, I spend a very > > significant amount of my time writing open source software, and work > nearly > > exclusively in python. > > > > So, just as as an example of what happens when someone might > "contribute", > > I wrote some code (lmfit-py) that could go into scipy and posted it to > this > > list several months ago. Many people have expressed interest in this > > module, and it has been discussed on this list a few times in the past > few > > months. Though lmfit-py is older than Fernando's slides (it was inspired > > after being asked several times "Is there something like IDL's mpfit, > only > > faster and in python?"), it actually follows his directions of "get > > involved" quite closely: it is BSD, at github, with decent documentation, > > and does not depend on packages other than scipy and numpy. Though it's > > been discussed on this list recently, two responses from frequent > > mailing-list responders (you, Paul V) was more along the lines of "yes, > > that could be done, in principle, if someone were up to doing the work" > > instead of "perhaps package xxx would work for you". > > > > At no point has anyone from the scipy team expressed an interest in > putting > > this into scipy. OK, perhaps lmfit-py is not high enough quality. I can > > accept that. My point is that there *is* a contribution but one that > would > > not show up on Fernando's graph as a lengthening of "the tail of > > contributors". There ARE a few developers out there who are interested in > > making contributions, and the scipy team is not doing everything it > could be > > doing to either facilitate or even encourage such participation. In > fact, > > especially given your response, it would be possible to conclude that > > contributions are actually discouraged. It's also possible to be more > > optimistic, and conclude that Fernando's statistics are accurate only for > > each project shown, but wildly underestimate the whole of the community. > > I think lmfit is a good project, it can be easy installed. You are > able to maintain and develop it. > So I don't think the need to have it in scipy is very urgent. > > On the other hand, for anyone not familiar with AST manipulation it > feels to me like a possible maintenance nightmare. > It doesn't mean it is, but as part of a community project it should be > possible to maintain (or come with a maintainer). > > But maybe I have just seen to much stranded and broken code in scipy > (that remained neglected for years). > > As an example for a contribution: fisher's exact test, a pretty > important function, but didn't quite work for several cases. I spend > several days trying to figure out how to fix it. I was not successfull > since I was not familiar with the algorithm and the numerical problems > it raised. 
A while later users or the original developer found ways to > fix the corner cases. At that stage it was possible to include it in > scipy. (There were a few additional edge cases afterwards, but that > were minor fixes.) > > As a positive example, Denis Laxalde became very active and is > revamping and improving large parts of the scipy.optimize code. > > Josef > > > > > > >> I'd like to think that it's a problem of skill set: users that have the > >> ability to contribute are just too rare. This is not entirely true, > there > >> are scores of skilled people on the mailing lists. You yourself mention > >> that you are developing a package. > > > > There are many kinds of skills. Sometimes, not insulting your customers, > > colleagues, and potential collaborators is the most important one. > > > > > >> Sorry for the rant, but if you want things to improve, you will have > more > >> successes sending in pull request than messages on mailing list that > >> sound condescending to my ears. > >> > >> I hope that I haven't overreacted too badly. > > > > Sorry, but I think you have. I'm impressed that Eric was appreciative > -- I > > know many who would not be. > > > > For myself, I find it quite discouraging that the scipy team is so > insular. > > Cheers, > > > > --Matt Newville > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Fri Mar 9 11:54:43 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Fri, 9 Mar 2012 17:54:43 +0100 Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: <6942812.8.1331254549777.JavaMail.geo-discussion-forums@vbuf40> References: <4F5916A2.2040604@eso.org> <20120308210722.GC12436@phare.normalesup.org> <6942812.8.1331254549777.JavaMail.geo-discussion-forums@vbuf40> Message-ID: <20120309165443.GA26552@phare.normalesup.org> Hi Matt, I am not going to answer to the core of your message. I partly agree with it and partly disagree. I think that it is fair to have different points of view. In addition, I do share the opinion that the situation of developers in open source scientific software is not ideal. I've suffered from it personally. I just want to react to a couple of minor points > At no point has anyone from the scipy team expressed an interest in > putting this into scipy. Who is the scipy team? What is the scipy team? Who could or should express such an interest? These are people struggling to maintain a massive package on their free time. It actually takes a lot of time to monitor the mailing lists and pick up offers like yours to turn them into something that can be integrated. Had you submitted a pull request, with code ready to be merged, i.e. with no extra work in terms of documentation, API or tests, I think that it would be legitimate to blame the scipy developers for lack of interest. That said, I can easily understand how such a pull request would fall between the cracks. It's unfortunate, not excusable, but it does happen. Indeed, in the projects I maintain, I am kept busy full time with pure maintenance work (bug fixing, answering emails, improving documentation). 
When I review and merge pull requests, a lot of the time they are for features that I do not need, and I spend full week ends adding tests, fixing numerical instabilities, completing the docs so that they can be merged. You have to realize that most contributions to open source projects actually add up to the workload of the core developers. Thankfully, not all of them. Teams do build upon people unexpectedly fixing bugs, contributing flawless code that can be merged in without any additional work. I personally have seen my time invested in maintenance of open source project go up and up for the last few years, until it was to a point where I was spending a major part of my free time on it. It ended up giving me a nasty back pain, and I started not answering bug reports, pull requests and support emails to preserve my health: it is not sane to spend all onces time in front of a computer. > There are many kinds of skills.? Sometimes, not insulting your customers, > colleagues, and potential collaborators is the most important one. Maybe I went over the top. I didn't want to sound insulting. I felt insulted, as an open source develop (even thought I am not a scipy developer). I am sorry that I ignited a flame. Getting worked out about email is never a good thing, and discussion pushing blame certainly don't help building a community. Maybe I shouldn't have sent this email, or I should have worded it differently. I apologize for the harsh tone. I certainly did feel bad when I received the original email, and I wanted to express it. > For myself, I find it quite discouraging that the scipy team is so > insular. Firstly, I would like to stress that I cannot consider myself as part of the scipy team. I contribute very little code to scipy. As a consequence I do not feel that I have much legitimacy in making decisions or comments on the codebase. Thus you shouldn't take my reaction as a reaction coming from the scipy team, but rather as coming from myself. Second, can I ask you what makes you think that the scipy team is insular? Scipy is a big project with a lot of history. As such it is harder to contribute to it than a small and light project. But I don't feel any dogmatism or clique attitude from the developers. And, by the way, if we are going to talk about the scipy developers, I encourage everybody to find out who they are, i.e. who has been contributing lately [1]. I don't think that the handful of people that come on top of the list have an insular behavior. I do think that they are on an island, in the sens that they are pretty much left alone to do the grunt work. None of these people reacted badly to any mail on this mailing list about the state of scipy. I raise my hat to them! Ga?l [1]:: $ git clone https://github.com/scipy/scipy.git $ git shortlog -sn v0.7.0.. From gael.varoquaux at normalesup.org Fri Mar 9 11:57:36 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Fri, 9 Mar 2012 17:57:36 +0100 Subject: [SciPy-User] scipy compiles, but importing interpolate fails In-Reply-To: <5b25d17c-3a93-4e6c-a5eb-df976557f6c1@j5g2000yqm.googlegroups.com> References: <9B08C4C4-7639-4996-AB7E-82D96ADCCF79@nist.gov> <5b25d17c-3a93-4e6c-a5eb-df976557f6c1@j5g2000yqm.googlegroups.com> Message-ID: <20120309165736.GB13064@phare.normalesup.org> On Fri, Mar 09, 2012 at 08:46:24AM -0800, Keflavich wrote: > re: Gael: I removed numpy and scipy completely from both my Frameworks > and virtualenv installs and removed the build directories from both. 
> However, I think that *still* didn't work. One part of the build that > probably caused problems was using this: > PYTHONPATH="/Library/Frameworks/Python.framework/Versions/2.7/lib/ > python2.7/site-packages/" > which some site (maybe hyperjeff? maybe elsewhere?) said was > necessary for building scipy. I'm pretty sure getting rid of that in > the scipy build was essential. Indeed, you story confirms my experience: a lot of build problem are related to having several installs of Python, and having a hard time controlling which one exactly is used when. I don't know a good solution to these problems :(. G From peter.cimermancic at gmail.com Fri Mar 9 12:43:45 2012 From: peter.cimermancic at gmail.com (=?UTF-8?Q?Peter_Cimerman=C4=8Di=C4=8D?=) Date: Fri, 9 Mar 2012 09:43:45 -0800 Subject: [SciPy-User] Generalized least square on large dataset In-Reply-To: <4F5A25F7.3020408@molden.no> References: <4F5A25F7.3020408@molden.no> Message-ID: > > > > No, it does not. If you are working with counts, the appropriate model > would usually be Poisson regression. I.e. Generalized linear model with > log-link function and Possion probability family. I have seen many > examples of microbiologists using linear regression when they should > actually use Poisson regression (e.g. counting genes) or logistic > regression (e.g. dose-response and titration curves). > > This will do it for you: > > MATLAB: glmfit from the statistics toolbox > R: glm > SAS: PROC GLIM > Python: statmodels scikit > > Another example of inappropriate use of linear regression in > microbiology is the Lineweaver-Burk plot as substitute for non-linear > least-squares (usually Levenberg-Marquardt) to fit a Michelis-Menten > curve. Some microbiologists are bevare of this, but they seem to prefer > all sorts of ad hoc trickeries like linearizations and > variance-stabilizing transforms instead of "just doing it right". > > As for samples that are not independent, that will affect the final > likelihood. If you want to optimize the log-likelhood yourself, to > control for this, getting ML estimates by maximizing the log-likelhood > is easy with fmin_powell or fmin_bgfs from scipy.optimize. (Powell's > method does not even need the gradient.) And if you need the "p-value", > you can either use the likelihood ratio or Monte Carlo (e.g. permutation > test). > > Sturla, could you be more specific here? I don't know much about (bio)statistics, but that doesn't mean I don't want to do the things right :). All I want to get out of this analysis is to be able to say whether the correlation between genome lengths and numbers of particular genes (which looks neat and obvious from the scatter plot) is statistically significant given that the data points are heavily phylogenetically biased. That's why I mentioned "p-values". Of course, I'm open to any better/more accurate way of getting there than initially planned. > > Sturla > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Mar 9 12:50:48 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 9 Mar 2012 10:50:48 -0700 Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? 
In-Reply-To: <20120309165443.GA26552@phare.normalesup.org> References: <4F5916A2.2040604@eso.org> <20120308210722.GC12436@phare.normalesup.org> <6942812.8.1331254549777.JavaMail.geo-discussion-forums@vbuf40> <20120309165443.GA26552@phare.normalesup.org> Message-ID: On Fri, Mar 9, 2012 at 9:54 AM, Gael Varoquaux < gael.varoquaux at normalesup.org> wrote: > Hi Matt, > > I am not going to answer to the core of your message. I partly agree with > it and partly disagree. I think that it is fair to have different points > of view. In addition, I do share the opinion that the situation of > developers in open source scientific software is not ideal. I've suffered > from it personally. > > I just want to react to a couple of minor points > > > > At no point has anyone from the scipy team expressed an interest in > > putting this into scipy. > > Who is the scipy team? What is the scipy team? Who could or should > express such an interest? These are people struggling to maintain a > massive package on their free time. It actually takes a lot of time to > monitor the mailing lists and pick up offers like yours to turn them into > something that can be integrated. > > Had you submitted a pull request, with code ready to be merged, i.e. with > no extra work in terms of documentation, API or tests, I think that it > would be legitimate to blame the scipy developers for lack of interest. > That said, I can easily understand how such a pull request would fall > between the cracks. It's unfortunate, not excusable, but it does happen. > Indeed, in the projects I maintain, I am kept busy full time with pure > maintenance work (bug fixing, answering emails, improving documentation). > When I review and merge pull requests, a lot of the time they are for > features that I do not need, and I spend full week ends adding tests, > fixing numerical instabilities, completing the docs so that they can be > merged. You have to realize that most contributions to open source > projects actually add up to the workload of the core developers. > Thankfully, not all of them. Teams do build upon people unexpectedly > fixing bugs, contributing flawless code that can be merged in without any > additional work. > > I personally have seen my time invested in maintenance of open source > project go up and up for the last few years, until it was to a point > where I was spending a major part of my free time on it. It ended up > giving me a nasty back pain, and I started not answering bug reports, > pull requests and support emails to preserve my health: it is not sane to > spend all onces time in front of a computer. > > > There are many kinds of skills. Sometimes, not insulting your > customers, > > colleagues, and potential collaborators is the most important one. > > Maybe I went over the top. I didn't want to sound insulting. I felt > insulted, as an open source develop (even thought I am not a scipy > developer). I am sorry that I ignited a flame. Getting worked out about > email is never a good thing, and discussion pushing blame certainly don't > help building a community. Maybe I shouldn't have sent this email, or I > should have worded it differently. I apologize for the harsh tone. I > certainly did feel bad when I received the original email, and I wanted > to express it. > > > For myself, I find it quite discouraging that the scipy team is so > > insular. > > Firstly, I would like to stress that I cannot consider myself as part of > the scipy team. I contribute very little code to scipy. 
As a consequence > I do not feel that I have much legitimacy in making decisions or > comments on the codebase. Thus you shouldn't take my reaction as a > reaction coming from the scipy team, but rather as coming from myself. > > Second, can I ask you what makes you think that the scipy team is > insular? Scipy is a big project with a lot of history. As such it is > harder to contribute to it than a small and light project. But I don't > feel any dogmatism or clique attitude from the developers. And, by the > way, if we are going to talk about the scipy developers, I encourage > everybody to find out who they are, i.e. who has been contributing lately > [1]. I don't think that the handful of people that come on top of the > list have an insular behavior. I do think that they are on an island, in > the sens that they are pretty much left alone to do the grunt work. None > of these people reacted badly to any mail on this mailing list about the > state of scipy. I raise my hat to them! > > Carefully stepping past the kerfluffle at the bar, I think this sort of functionality in scipy would be useful. If nothing else, I wouldn't have to keep implementing for myself ;) IIRC, Dennis Lexalde was going to do something similar and I think it would be good if some of the folks with implementations started a separate thread about getting it into scipy. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Mar 9 12:51:23 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 9 Mar 2012 12:51:23 -0500 Subject: [SciPy-User] Generalized least square on large dataset In-Reply-To: References: <4F5A25F7.3020408@molden.no> Message-ID: On Fri, Mar 9, 2012 at 12:43 PM, Peter Cimerman?i? wrote: >> >> >> No, it does not. If you are working with counts, the appropriate model >> would usually be Poisson regression. I.e. Generalized linear model with >> log-link function and Possion probability family. I have seen many >> examples of microbiologists using linear regression when they should >> actually use Poisson regression (e.g. counting genes) or logistic >> regression (e.g. dose-response and titration curves). >> >> This will do it for you: >> >> MATLAB: glmfit from the statistics toolbox >> R: glm >> SAS: PROC GLIM >> Python: statmodels scikit >> >> Another example of inappropriate use of linear regression in >> microbiology is the Lineweaver-Burk plot as substitute for non-linear >> least-squares (usually Levenberg-Marquardt) to fit a Michelis-Menten >> curve. Some microbiologists are bevare of this, but they seem to prefer >> all sorts of ad hoc trickeries like linearizations and >> variance-stabilizing transforms instead of "just doing it right". >> >> As for samples that are not independent, that will affect the final >> likelihood. If you want to optimize the log-likelhood yourself, to >> control for this, getting ML estimates by maximizing the log-likelhood >> is easy with fmin_powell or fmin_bgfs from scipy.optimize. (Powell's >> method does not even need the gradient.) And if you need the "p-value", >> you can either use the likelihood ratio or Monte Carlo (e.g. permutation >> test). >> > > Sturla,?could you be more specific here? I don't know much about > (bio)statistics, but that doesn't mean I don't want to do the things right > :). 
All I want to get out of this analysis is to be able to say whether the > correlation between genome lengths and numbers of particular genes (which > looks neat and obvious from the scatter plot) is statistically significant > given that the data points are?heavily?phylogenetically biased. That's why I > mentioned "p-values". Of course, I'm open to any better/more accurate way of > getting there than initially planned. Peter, Could you post a scatter plot of your data (with axis ticks and labels) so we get an idea what your data looks like? I have no idea at all about the bio topic. Josef > > > > >> >> >> Sturla >> >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From guyer at nist.gov Fri Mar 9 12:55:41 2012 From: guyer at nist.gov (Jonathan Guyer) Date: Fri, 9 Mar 2012 12:55:41 -0500 Subject: [SciPy-User] scipy compiles, but importing interpolate fails In-Reply-To: <5b25d17c-3a93-4e6c-a5eb-df976557f6c1@j5g2000yqm.googlegroups.com> References: <9B08C4C4-7639-4996-AB7E-82D96ADCCF79@nist.gov> <5b25d17c-3a93-4e6c-a5eb-df976557f6c1@j5g2000yqm.googlegroups.com> Message-ID: <79EFF91E-EDC5-40F2-8CC5-0E60F21B7588@nist.gov> On Mar 9, 2012, at 11:46 AM, Keflavich wrote: > Re: Jonathan - good to know. I don't think that affected me, though, > as I made my virtualenv from a /Library python, not a /System/Library > python. Why evil, though? Because /System belongs to Apple and should not be tampered with. Using mkvirtualenv --no-site-packages means you don't have to. It doesn't matter how stale Apple's packages are; they get to have what they expect and you get to use what you want/need. From pav at iki.fi Fri Mar 9 13:31:22 2012 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 09 Mar 2012 19:31:22 +0100 Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: <4F5916A2.2040604@eso.org> <20120308210722.GC12436@phare.normalesup.org> <6942812.8.1331254549777.JavaMail.geo-discussion-forums@vbuf40> <20120309165443.GA26552@phare.normalesup.org> Message-ID: Hi, 09.03.2012 18:50, Charles R Harris kirjoitti: [clip] > Carefully stepping past the kerfluffle at the bar, I think this sort of > functionality in scipy would be useful. If nothing else, I wouldn't have > to keep implementing for myself ;) IIRC, Dennis Lexalde was going to do > something similar and I think it would be good if some of the folks with > implementations started a separate thread about getting it into scipy. Dennis actually not only intended, but also implemented something similar. I wasn't too deeply involved in that, but it's already merged in Scipy's trunk. Now, based on a *very* quick look to lmfit (I did not look at it before now as I did not remember it existed), it seems to be quite similar in purpose. Hashing out if lmfit has something extra, or if the current implementation is missing something could be useful, however. Pauli From pav at iki.fi Fri Mar 9 13:47:20 2012 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 09 Mar 2012 19:47:20 +0100 Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: <4F5916A2.2040604@eso.org> <20120308210722.GC12436@phare.normalesup.org> <6942812.8.1331254549777.JavaMail.geo-discussion-forums@vbuf40> Message-ID: Hi, 09.03.2012 17:51, william ratcliff kirjoitti: [clip] > In this particular case, what are the exact steps needed to get it into > scipy? 
Can they charge be listed as tickets somewhere so that others of > us can help? Can we document the process to make it easier the next > time? I realize everyone is busy, but if the barrier to contribution is > lowered it will make life better in the long run. In general, basically two ways for contributions: 1. A pull request via Github. We have a writeup here with various tips: http://docs.scipy.org/doc/numpy/dev/gitwash/development_workflow.html Just replace "Numpy" by "Scipy" everywhere. 2. File a ticket in the Trac: http://projects.scipy.org/scipy/ Attach whetever you have (a patch, separate files) to the ticket, and tag it as "enhancement" and "needs_review". That's about it. *** However, to make it easier for someone to look at the work and verify it works properly: - Ensure your code is accompanied by tests that demonstrate it actually works as intended. You can look for examples how to write them in the Scipy source tree, in files named test_XXX.py - Ensure the behavior of the public functions is documented in the docstrings. - Prefer the Github way. Granted, there *is* a learning curve, but it saves work in the long run, and it is far less clunky to use. - The more finished the contribution is, the less work it is to merge, and gets in faster. If you get no response, shout on the scipy-devel mailing list. If there's still no response, shout louder and start accusing people ;) If the contribution is "controversial" --- duplicates existing functionality, breaks backwards compatibility, is very specialized for a particular research problem, relies on magic, etc. --- it's good to give an argument why the stuff should be included, as otherwise the motivation may be missed. Specific to mpfit: this can be regarded as a "just another optimization routine", and that doesn't seem too controversial to me. It would be nicer to subsume the functionality to leastsq, though, but I don't see anyone wanting to modify the MINPACK fortran code. Instead, perhaps this could be addressed on the level of the unified optimization interface. Cheers, Pauli From peter.cimermancic at gmail.com Fri Mar 9 14:30:35 2012 From: peter.cimermancic at gmail.com (=?UTF-8?Q?Peter_Cimerman=C4=8Di=C4=8D?=) Date: Fri, 9 Mar 2012 11:30:35 -0800 Subject: [SciPy-User] Generalized least square on large dataset In-Reply-To: References: <4F5A25F7.3020408@molden.no> Message-ID: Sure, please see attached. Bacteria.jpg is the plot we're talking about. As you can see there is a nice correlation in the graph, but I'm afraid there might something like in the second figure (ives.jpg) going on. The second figure is from Ives and Zhu; Statistics for correlated data: phylogenies, space and time (2006). Peter On Fri, Mar 9, 2012 at 9:51 AM, wrote: > On Fri, Mar 9, 2012 at 12:43 PM, Peter Cimerman?i? > wrote: > >> > >> > >> No, it does not. If you are working with counts, the appropriate model > >> would usually be Poisson regression. I.e. Generalized linear model with > >> log-link function and Possion probability family. I have seen many > >> examples of microbiologists using linear regression when they should > >> actually use Poisson regression (e.g. counting genes) or logistic > >> regression (e.g. dose-response and titration curves). 
> >> > >> This will do it for you: > >> > >> MATLAB: glmfit from the statistics toolbox > >> R: glm > >> SAS: PROC GLIM > >> Python: statmodels scikit > >> > >> Another example of inappropriate use of linear regression in > >> microbiology is the Lineweaver-Burk plot as substitute for non-linear > >> least-squares (usually Levenberg-Marquardt) to fit a Michelis-Menten > >> curve. Some microbiologists are bevare of this, but they seem to prefer > >> all sorts of ad hoc trickeries like linearizations and > >> variance-stabilizing transforms instead of "just doing it right". > >> > >> As for samples that are not independent, that will affect the final > >> likelihood. If you want to optimize the log-likelhood yourself, to > >> control for this, getting ML estimates by maximizing the log-likelhood > >> is easy with fmin_powell or fmin_bgfs from scipy.optimize. (Powell's > >> method does not even need the gradient.) And if you need the "p-value", > >> you can either use the likelihood ratio or Monte Carlo (e.g. permutation > >> test). > >> > > > > Sturla, could you be more specific here? I don't know much about > > (bio)statistics, but that doesn't mean I don't want to do the things > right > > :). All I want to get out of this analysis is to be able to say whether > the > > correlation between genome lengths and numbers of particular genes (which > > looks neat and obvious from the scatter plot) is statistically > significant > > given that the data points are heavily phylogenetically biased. That's > why I > > mentioned "p-values". Of course, I'm open to any better/more accurate > way of > > getting there than initially planned. > > Peter, Could you post a scatter plot of your data (with axis ticks and > labels) so we get an idea what your data looks like? > > I have no idea at all about the bio topic. > > Josef > > > > > > > > > > >> > >> > >> Sturla > >> > >> > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ives.tiff Type: image/tiff Size: 55288 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: bacteria.jpg Type: image/jpeg Size: 37352 bytes Desc: not available URL: From njs at pobox.com Fri Mar 9 14:46:07 2012 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 9 Mar 2012 19:46:07 +0000 Subject: [SciPy-User] Generalized least square on large dataset In-Reply-To: References: <4F5A25F7.3020408@molden.no> Message-ID: On Fri, Mar 9, 2012 at 7:30 PM, Peter Cimerman?i? wrote: > Sure, please see attached. Bacteria.jpg is the plot we're talking about. As > you can see there is a nice correlation in the graph, but I'm afraid there > might something like in the second figure (ives.jpg) going on. The second > figure is from Ives and Zhu; Statistics for correlated data: phylogenies, > space and time (2006). So in the figure from Ives and Zhu, the two variables do seem to be well-correlated across groups, but then within individual groups they aren't well-correlated. Is that what you're worried about -- that gene count and genome length might be correlated overall, but not within individual groups? 
Because GLS doesn't actually address that question. It lets you correct your p-values for the fact that similarity between bacteria means that you effectively have somewhat less data than it would otherwise appear, and thus your p-values should be larger than they would be in a naive analysis. But it'd still be a p-value on whether the two variables are correlated overall. (Which they obviously are...) -- Nathaniel From josef.pktd at gmail.com Fri Mar 9 15:13:37 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 9 Mar 2012 15:13:37 -0500 Subject: [SciPy-User] Generalized least square on large dataset In-Reply-To: References: <4F5A25F7.3020408@molden.no> Message-ID: On Fri, Mar 9, 2012 at 2:46 PM, Nathaniel Smith wrote: > On Fri, Mar 9, 2012 at 7:30 PM, Peter Cimerman?i? > wrote: >> Sure, please see attached. Bacteria.jpg is the plot we're talking about. As >> you can see there is a nice correlation in the graph, but I'm afraid there >> might something like in the second figure (ives.jpg) going on. The second >> figure is from Ives and Zhu; Statistics for correlated data: phylogenies, >> space and time (2006). > > So in the figure from Ives and Zhu, the two variables do seem to be > well-correlated across groups, but then within individual groups they > aren't well-correlated. Is that what you're worried about -- that gene > count and genome length might be correlated overall, but not within > individual groups? > > Because GLS doesn't actually address that question. It lets you > correct your p-values for the fact that similarity between bacteria > means that you effectively have somewhat less data than it would > otherwise appear, and thus your p-values should be larger than they > would be in a naive analysis. But it'd still be a p-value on whether > the two variables are correlated overall. (Which they obviously > are...) I don't think there would be any problem with p-values for the overall positive relationship. I would be surprised when any statistical methods wouldn't produce a large p-value for the slope. Although there is a bit of bunching of points I don't see any big clusters that would indicate that the linear relationship is different. In terms of size of the slope I would guess a robust estimator (statsmodels.RLM) would downweight the observations on the high part of the graph, large count/length ratio, outliers of shorties? I think Sturla has a point in that both count and length are positive. It doesn't look like it's relevant for length, but in the counts there is a bunching just above zero, this creates either a non-linearity or requires another distribution log-normal (?) or Poisson (without zeros, or loc=1)? Josef > > -- Nathaniel > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From ralf.gommers at googlemail.com Fri Mar 9 16:36:47 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Fri, 9 Mar 2012 22:36:47 +0100 Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: <6942812.8.1331254549777.JavaMail.geo-discussion-forums@vbuf40> References: <4F5916A2.2040604@eso.org> <20120308210722.GC12436@phare.normalesup.org> <6942812.8.1331254549777.JavaMail.geo-discussion-forums@vbuf40> Message-ID: On Fri, Mar 9, 2012 at 1:55 AM, Matthew Newville wrote: > Gael, > > > On Thursday, March 8, 2012 3:07:22 PM UTC-6, Gael Varoquaux wrote: > > > > I am sorry I am going to react to the provocation. 
> > And I am sorry that I am going to react to your message. I think your > reaction is unfair. > > > > As some one who spends a fair amount of time working on open source > > software I hear such remarks quite often: 'why is feature foo not > > implemented in package bar?. I am finding it harder and harder not to > > react negatively to these emails. Now I cannot consider myself as a > > contributor to scipy, and thus I can claim that I am not taking your > > comment personally. > > Where I work (a large scientific user facility), there are lots of > scientists in what I'll presume is Eric's position -- able and willing to > work well with scientific programming tools, but unable to devote the extra > time needed to develop core functionality or maintain much work outside of > their own area of interest. There are a great many scientists interested > in learning and using python. Several people there *are* writing > scientific libraries with python. Similarly in the fields I work in, > python is widely accepted as an important ecosystem. > > > > Why isn't scipy not up to the task? Will, the answer is quite simple: > > because it's developed by volunteers that do it on their spare time, > late > > at night too often, or companies that put some of their benefits in > open > > source rather in locking down a market. 90% of the time the reason the > > feature isn't as good as you would want it is because of lack of time. > > > > I personally find that suggesting that somebody else should put more of > > the time and money they are already giving away in improving a feature > > that you need is almost insulting. > > Well, in some sense, Eric's message is an expression of interest.... > Perhaps you would prefer that nobody outside the core group of developers > or mailing list subscribers asked for any new features or clarification of > existing features. > > > > I am aware that people do not realize how small the group of people > that > > develop and maintain their toys is. Borrowing from Fernando Perez's > talk > > at Euroscipy (http://www.euroscipy.org/file/6459?vid=download slide > 80), > > the number of people that do 90% of the grunt work to get the core > > scientific Python ecosystem going is around two handfuls. > > Well, Fernando's slides indicate there is a small group that dominates > commits to the projects, then explains, at least partially, why that it > is. It is *NOT* because scientists expect this work to be done for them by > volunteers who should just work harder. > > There are very good reasons for people to not be involved. The work is > rarely funded, is generally a distraction from funded work, and hardly ever > "counts" as scientific work. That's all on top of being a scientist, not a > programmer. Now, if you'll allow me, I myself am one of the "lucky" > scientific software developers, well-recognized in my own small community > for open source analysis software, and also in a scientific position and in > a group where building tools for better data collection and analysis can > easily be interpreted as part of the job. In fact, I spend a very > significant amount of my time writing open source software, and work nearly > exclusively in python. > > So, just as as an example of what happens when someone might > "contribute", I wrote some code (lmfit-py) that could go into scipy and > posted it to this list several months ago. Many people have expressed > interest in this module, and it has been discussed on this list a few times > in the past few months. 
Though lmfit-py is older than Fernando's slides > (it was inspired after being asked several times "Is there something like > IDL's mpfit, only faster and in python?"), it actually follows his > directions of "get involved" quite closely: it is BSD, at github, with > decent documentation, and does not depend on packages other than scipy and > numpy. Though it's been discussed on this list recently, two responses > from frequent mailing-list responders (you, Paul V) was more along the > lines of "yes, that could be done, in principle, if someone were up to > doing the work" instead of "perhaps package xxx would work for you". > > At no point has anyone from the scipy team expressed an interest in > putting this into scipy. OK, perhaps lmfit-py is not high enough quality. > I can accept that. I don't think anyone has doubts about the quality of lmfit. On the contrary, I've asked you to list it on http://scipy.org/Topical_Software(which you did) because I thought it looked interesting, and have directed some users towards your package. The documentation is excellent, certainly better than that of many parts of scipy. The worry with your code is that the maintenance burden may be relatively high, simply because very few developers are familiar with AST. The same for merging it in scipy - one of the core developers will have to invest a significant amount of time wrapping his head around your work. The ideal scenario from my point of view would be this: - lmfit keeps being maintained by you as a separate package for a while (say six months to a year) - it gains more users, who can discover potential flaws and provide feedback. The API can still be changed if necessary. - once it's stabilized a bit more, you propose it again (and more explicitly) for inclusion in scipy - one of the developers does a thorough review and merges it into scipy.optimize - you get commit rights and maintain the code within scipy - bonus points: if you would be interested in improving and reviewing PRs for related code in optimize. Scipy is a very good place to add functionality that's of use in many different fields of science and engineering, but it needs many more active developers. I think this thread is another reminder of that. Some of the criticism in this thread about how hard it is to contribute is certainly justified. I've had the plan for a while (since Fernando's EuroScipy talk actually) to write a more accessible "how to contribute" document than the one Pauli linked to. Besides the mechanics (git, Trac, etc.) it should at least provide some guidance on what belongs in scipy vs. in a scikit, how to get help, how to move a contribution that doesn't get a response forward, etc. I'll try to get a first draft ready within the next week or so. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Mar 9 16:46:29 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 9 Mar 2012 14:46:29 -0700 Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: <4F5916A2.2040604@eso.org> <20120308210722.GC12436@phare.normalesup.org> <6942812.8.1331254549777.JavaMail.geo-discussion-forums@vbuf40> Message-ID: On Fri, Mar 9, 2012 at 2:36 PM, Ralf Gommers wrote: > > > On Fri, Mar 9, 2012 at 1:55 AM, Matthew Newville wrote: > >> Gael, >> >> >> On Thursday, March 8, 2012 3:07:22 PM UTC-6, Gael Varoquaux wrote: >> > >> > I am sorry I am going to react to the provocation. 
>> >> And I am sorry that I am going to react to your message. I think your >> reaction is unfair. >> >> >> > As some one who spends a fair amount of time working on open source >> > software I hear such remarks quite often: 'why is feature foo not >> > implemented in package bar?. I am finding it harder and harder not to >> > react negatively to these emails. Now I cannot consider myself as a >> > contributor to scipy, and thus I can claim that I am not taking your >> > comment personally. >> >> Where I work (a large scientific user facility), there are lots of >> scientists in what I'll presume is Eric's position -- able and willing to >> work well with scientific programming tools, but unable to devote the extra >> time needed to develop core functionality or maintain much work outside of >> their own area of interest. There are a great many scientists interested >> in learning and using python. Several people there *are* writing >> scientific libraries with python. Similarly in the fields I work in, >> python is widely accepted as an important ecosystem. >> >> >> > Why isn't scipy not up to the task? Will, the answer is quite simple: >> > because it's developed by volunteers that do it on their spare time, >> late >> > at night too often, or companies that put some of their benefits in >> open >> > source rather in locking down a market. 90% of the time the reason the >> > feature isn't as good as you would want it is because of lack of time. >> > >> > I personally find that suggesting that somebody else should put more >> of >> > the time and money they are already giving away in improving a feature >> > that you need is almost insulting. >> >> Well, in some sense, Eric's message is an expression of interest.... >> Perhaps you would prefer that nobody outside the core group of developers >> or mailing list subscribers asked for any new features or clarification of >> existing features. >> >> >> > I am aware that people do not realize how small the group of people >> that >> > develop and maintain their toys is. Borrowing from Fernando Perez's >> talk >> > at Euroscipy (http://www.euroscipy.org/file/6459?vid=download slide >> 80), >> > the number of people that do 90% of the grunt work to get the core >> > scientific Python ecosystem going is around two handfuls. >> >> Well, Fernando's slides indicate there is a small group that dominates >> commits to the projects, then explains, at least partially, why that it >> is. It is *NOT* because scientists expect this work to be done for them by >> volunteers who should just work harder. >> >> There are very good reasons for people to not be involved. The work is >> rarely funded, is generally a distraction from funded work, and hardly ever >> "counts" as scientific work. That's all on top of being a scientist, not a >> programmer. Now, if you'll allow me, I myself am one of the "lucky" >> scientific software developers, well-recognized in my own small community >> for open source analysis software, and also in a scientific position and in >> a group where building tools for better data collection and analysis can >> easily be interpreted as part of the job. In fact, I spend a very >> significant amount of my time writing open source software, and work nearly >> exclusively in python. >> >> So, just as as an example of what happens when someone might >> "contribute", I wrote some code (lmfit-py) that could go into scipy and >> posted it to this list several months ago. 
Many people have expressed >> interest in this module, and it has been discussed on this list a few times >> in the past few months. Though lmfit-py is older than Fernando's slides >> (it was inspired after being asked several times "Is there something like >> IDL's mpfit, only faster and in python?"), it actually follows his >> directions of "get involved" quite closely: it is BSD, at github, with >> decent documentation, and does not depend on packages other than scipy and >> numpy. Though it's been discussed on this list recently, two responses >> from frequent mailing-list responders (you, Paul V) was more along the >> lines of "yes, that could be done, in principle, if someone were up to >> doing the work" instead of "perhaps package xxx would work for you". >> >> At no point has anyone from the scipy team expressed an interest in >> putting this into scipy. OK, perhaps lmfit-py is not high enough quality. >> I can accept that. > > > I don't think anyone has doubts about the quality of lmfit. On the > contrary, I've asked you to list it on http://scipy.org/Topical_Software(which you did) because I thought it looked interesting, and have directed > some users towards your package. The documentation is excellent, certainly > better than that of many parts of scipy. The worry with your code is that > the maintenance burden may be relatively high, simply because very few > developers are familiar with AST. The same for merging it in scipy - one of > the core developers will have to invest a significant amount of time > wrapping his head around your work. > > The ideal scenario from my point of view would be this: > - lmfit keeps being maintained by you as a separate package for a while > (say six months to a year) > - it gains more users, who can discover potential flaws and provide > feedback. The API can still be changed if necessary. > - once it's stabilized a bit more, you propose it again (and more > explicitly) for inclusion in scipy > - one of the developers does a thorough review and merges it into > scipy.optimize > - you get commit rights and maintain the code within scipy > - bonus points: if you would be interested in improving and reviewing PRs > for related code in optimize. > > > Scipy is a very good place to add functionality that's of use in many > different fields of science and engineering, but it needs many more active > developers. I think this thread is another reminder of that. Some of the > criticism in this thread about how hard it is to contribute is certainly > justified. I've had the plan for a while (since Fernando's EuroScipy talk > actually) to write a more accessible "how to contribute" document than the > one Pauli linked to. Besides the mechanics (git, Trac, etc.) it should at > least provide some guidance on what belongs in scipy vs. in a scikit, how > to get help, how to move a contribution that doesn't get a response > forward, etc. I'll try to get a first draft ready within the next week or > so. > > I wonder if it would be useful to put a reference to lmfit in the leastsq documentation? I know that would need to be temporary and that referencing something outside scipy is unusual, but it might help increase the number of users and help it on it's way. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ralf.gommers at googlemail.com Fri Mar 9 16:49:29 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Fri, 9 Mar 2012 22:49:29 +0100 Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: <4F5916A2.2040604@eso.org> <20120308210722.GC12436@phare.normalesup.org> <6942812.8.1331254549777.JavaMail.geo-discussion-forums@vbuf40> Message-ID: On Fri, Mar 9, 2012 at 10:46 PM, Charles R Harris wrote: > > > On Fri, Mar 9, 2012 at 2:36 PM, Ralf Gommers wrote: > >> >> >> On Fri, Mar 9, 2012 at 1:55 AM, Matthew Newville > > wrote: >> >>> Gael, >>> >>> >>> On Thursday, March 8, 2012 3:07:22 PM UTC-6, Gael Varoquaux wrote: >>> > >>> > I am sorry I am going to react to the provocation. >>> >>> And I am sorry that I am going to react to your message. I think your >>> reaction is unfair. >>> >>> >>> > As some one who spends a fair amount of time working on open source >>> > software I hear such remarks quite often: 'why is feature foo not >>> > implemented in package bar?. I am finding it harder and harder not to >>> > react negatively to these emails. Now I cannot consider myself as a >>> > contributor to scipy, and thus I can claim that I am not taking your >>> > comment personally. >>> >>> Where I work (a large scientific user facility), there are lots of >>> scientists in what I'll presume is Eric's position -- able and willing to >>> work well with scientific programming tools, but unable to devote the extra >>> time needed to develop core functionality or maintain much work outside of >>> their own area of interest. There are a great many scientists interested >>> in learning and using python. Several people there *are* writing >>> scientific libraries with python. Similarly in the fields I work in, >>> python is widely accepted as an important ecosystem. >>> >>> >>> > Why isn't scipy not up to the task? Will, the answer is quite simple: >>> > because it's developed by volunteers that do it on their spare time, >>> late >>> > at night too often, or companies that put some of their benefits in >>> open >>> > source rather in locking down a market. 90% of the time the reason >>> the >>> > feature isn't as good as you would want it is because of lack of >>> time. >>> > >>> > I personally find that suggesting that somebody else should put more >>> of >>> > the time and money they are already giving away in improving a >>> feature >>> > that you need is almost insulting. >>> >>> Well, in some sense, Eric's message is an expression of interest.... >>> Perhaps you would prefer that nobody outside the core group of developers >>> or mailing list subscribers asked for any new features or clarification of >>> existing features. >>> >>> >>> > I am aware that people do not realize how small the group of people >>> that >>> > develop and maintain their toys is. Borrowing from Fernando Perez's >>> talk >>> > at Euroscipy (http://www.euroscipy.org/file/6459?vid=download slide >>> 80), >>> > the number of people that do 90% of the grunt work to get the core >>> > scientific Python ecosystem going is around two handfuls. >>> >>> Well, Fernando's slides indicate there is a small group that dominates >>> commits to the projects, then explains, at least partially, why that it >>> is. It is *NOT* because scientists expect this work to be done for them by >>> volunteers who should just work harder. >>> >>> There are very good reasons for people to not be involved. 
The work is >>> rarely funded, is generally a distraction from funded work, and hardly ever >>> "counts" as scientific work. That's all on top of being a scientist, not a >>> programmer. Now, if you'll allow me, I myself am one of the "lucky" >>> scientific software developers, well-recognized in my own small community >>> for open source analysis software, and also in a scientific position and in >>> a group where building tools for better data collection and analysis can >>> easily be interpreted as part of the job. In fact, I spend a very >>> significant amount of my time writing open source software, and work nearly >>> exclusively in python. >>> >>> So, just as as an example of what happens when someone might >>> "contribute", I wrote some code (lmfit-py) that could go into scipy and >>> posted it to this list several months ago. Many people have expressed >>> interest in this module, and it has been discussed on this list a few times >>> in the past few months. Though lmfit-py is older than Fernando's slides >>> (it was inspired after being asked several times "Is there something like >>> IDL's mpfit, only faster and in python?"), it actually follows his >>> directions of "get involved" quite closely: it is BSD, at github, with >>> decent documentation, and does not depend on packages other than scipy and >>> numpy. Though it's been discussed on this list recently, two responses >>> from frequent mailing-list responders (you, Paul V) was more along the >>> lines of "yes, that could be done, in principle, if someone were up to >>> doing the work" instead of "perhaps package xxx would work for you". >>> >>> At no point has anyone from the scipy team expressed an interest in >>> putting this into scipy. OK, perhaps lmfit-py is not high enough quality. >>> I can accept that. >> >> >> I don't think anyone has doubts about the quality of lmfit. On the >> contrary, I've asked you to list it on http://scipy.org/Topical_Software(which you did) because I thought it looked interesting, and have directed >> some users towards your package. The documentation is excellent, certainly >> better than that of many parts of scipy. The worry with your code is that >> the maintenance burden may be relatively high, simply because very few >> developers are familiar with AST. The same for merging it in scipy - one of >> the core developers will have to invest a significant amount of time >> wrapping his head around your work. >> >> The ideal scenario from my point of view would be this: >> - lmfit keeps being maintained by you as a separate package for a while >> (say six months to a year) >> - it gains more users, who can discover potential flaws and provide >> feedback. The API can still be changed if necessary. >> - once it's stabilized a bit more, you propose it again (and more >> explicitly) for inclusion in scipy >> - one of the developers does a thorough review and merges it into >> scipy.optimize >> - you get commit rights and maintain the code within scipy >> - bonus points: if you would be interested in improving and reviewing PRs >> for related code in optimize. >> >> >> Scipy is a very good place to add functionality that's of use in many >> different fields of science and engineering, but it needs many more active >> developers. I think this thread is another reminder of that. Some of the >> criticism in this thread about how hard it is to contribute is certainly >> justified. 
I've had the plan for a while (since Fernando's EuroScipy talk >> actually) to write a more accessible "how to contribute" document than the >> one Pauli linked to. Besides the mechanics (git, Trac, etc.) it should at >> least provide some guidance on what belongs in scipy vs. in a scikit, how >> to get help, how to move a contribution that doesn't get a response >> forward, etc. I'll try to get a first draft ready within the next week or >> so. >> >> > I wonder if it would be useful to put a reference to lmfit in the leastsq > documentation? I know that would need to be temporary and that referencing > something outside scipy is unusual, but it might help increase the number > of users and help it on it's way. > Fine with me. I actually think we can do this more often, both for packages that may be included in scipy later and for pacakges like scikits.image/statsmodels/learn. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla at molden.no Sat Mar 10 08:45:11 2012 From: sturla at molden.no (Sturla Molden) Date: Sat, 10 Mar 2012 14:45:11 +0100 Subject: [SciPy-User] Generalized least square on large dataset In-Reply-To: References: <4F5A25F7.3020408@molden.no> Message-ID: <4F5B5AE7.1020109@molden.no> Den 09.03.2012 21:13, skrev josef.pktd at gmail.com: > I think Sturla has a point in that both count and length are positive. > It doesn't look like it's relevant for length, but in the counts there > is a bunching just above zero, this creates either a non-linearity or > requires another distribution log-normal (?) or Poisson (without > zeros, or loc=1)? Josef You can see that the dependent variable is counts with most of them below 10. So I maintain that appropriate model is Poisson regression. That is, COX_count ~ Poission(lambda) with log(lambda) = b0 + b1 * genome_length Or if there are N groups of bacteria, log(lambda) = b[0] + b[1] * genome_length + np.dot(b[2:N+1], group[0:N-1]) with N-1 dummy indicator variables in the vector "group". One could of course consider even more complicated models, such as interaction terms between bacterial group and genome length. It's just a matter of adding in the appropriate predictor variables. Normally, the p-value of a Poisson regression model can be inferred from the likelihood ratio against a reduced model if samples are independent. But if samples are not independent, one cannot assume that the total log-likelihood for the whole data is the sum of log-likelihoods for each data point. So Peter would need to derive a correction for this. I cannot be more specific because I don't know the specifics about how this between-sample dependency is generated. Perhaps Peter could explain it? Sturla From josef.pktd at gmail.com Sat Mar 10 08:57:01 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 10 Mar 2012 08:57:01 -0500 Subject: [SciPy-User] Generalized least square on large dataset In-Reply-To: <4F5B5AE7.1020109@molden.no> References: <4F5A25F7.3020408@molden.no> <4F5B5AE7.1020109@molden.no> Message-ID: On Sat, Mar 10, 2012 at 8:45 AM, Sturla Molden wrote: > Den 09.03.2012 21:13, skrev josef.pktd at gmail.com: >> I think Sturla has a point in that both count and length are positive. >> It doesn't look like it's relevant for length, but in the counts there >> is a bunching just above zero, this creates either a non-linearity or >> requires another distribution log-normal (?) or Poisson (without >> zeros, or loc=1)? 
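(A minimal sketch of the Poisson-regression idea being discussed for these count data; the model itself is spelled out in Sturla's reply just below. Assumes statsmodels; counts, lengths and group are hypothetical arrays, and the group dummies and deviance comparison only illustrate the shape of such an analysis.)

    import numpy as np
    import statsmodels.api as sm

    # Hypothetical data: a count, a genome length (Mb) and a group label per bacterium.
    lengths = np.random.uniform(1.0, 9.0, size=200)
    group = np.random.randint(0, 3, size=200)
    counts = np.random.poisson(np.exp(0.3 + 0.2 * lengths))

    # Poisson GLM with log link: log(lambda) = b0 + b1*length + group dummies.
    dummies = (group[:, None] == np.arange(1, 3)).astype(float)  # 2 dummies, group 0 is baseline
    full_exog = sm.add_constant(np.column_stack([lengths, dummies]))
    full = sm.GLM(counts, full_exog, family=sm.families.Poisson()).fit()

    # Reduced model without genome length; the deviance difference is a
    # likelihood-ratio-type statistic (1 degree of freedom here).
    reduced = sm.GLM(counts, sm.add_constant(dummies), family=sm.families.Poisson()).fit()
    print(full.params, reduced.deviance - full.deviance)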
Josef > > You can see that the dependent variable is counts with most of them > below 10. So I maintain that appropriate model is Poisson regression. > > That is, > > ? ?COX_count ~ Poission(lambda) > > with > > ? ?log(lambda) = b0 + b1 * genome_length > > Or if there are N groups of bacteria, > > ? ?log(lambda) = b[0] + b[1] * genome_length > ? ? ? ? ?+ np.dot(b[2:N+1], group[0:N-1]) > > with N-1 dummy indicator variables in the vector "group". > > One could of course consider even more complicated models, such as > interaction terms between bacterial group and genome length. It's just a > matter of adding in the appropriate predictor variables. > > Normally, the p-value of a Poisson regression model can be inferred from > the likelihood ratio against a reduced model if samples are independent. > > But if samples are not independent, one cannot assume that the total > log-likelihood for the whole data is the sum of log-likelihoods for each > data point. So Peter would need to derive a correction for this. I > cannot be more specific because I don't know the specifics about how > this between-sample dependency is generated. Perhaps Peter could explain it? He explained the between sample correlation with the similarity (my analogy autocorrelation in time series, or spatial correlation). The main problem I see with using Poisson is that I wouldn't know how to include the correlation. I never looked at this, and statsmodels doesn't implement it. (I looked a bit at count processes for time series with serial dependence, but not much.) My guess is that log-linear or something like that would be easier Is there a multivariate version of Poisson with correlated observations similar to GLS for the linear model? Josef > > > Sturla > > > > > > > > > > > > > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From njs at pobox.com Sat Mar 10 09:01:38 2012 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 10 Mar 2012 14:01:38 +0000 Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: <4F5916A2.2040604@eso.org> <20120308210722.GC12436@phare.normalesup.org> <6942812.8.1331254549777.JavaMail.geo-discussion-forums@vbuf40> Message-ID: On Fri, Mar 9, 2012 at 9:36 PM, Ralf Gommers wrote: > I don't think anyone has doubts about the quality of lmfit. On the contrary, > I've asked you to list it on http://scipy.org/Topical_Software (which you > did) because I thought it looked interesting, and have directed some users > towards your package. The documentation is excellent, certainly better than > that of many parts of scipy. The worry with your code is that the > maintenance burden may be relatively high, simply because very few > developers are familiar with AST. The same for merging it in scipy - one of > the core developers will have to invest a significant amount of time > wrapping his head around your work. Out of curiosity (and apropos an earlier thread), would it affect your reservations if lmfit's ad-hoc AST usage and python interpreter were replaced by a simple call to 'eval'? 
-- N From sturla at molden.no Sat Mar 10 09:34:54 2012 From: sturla at molden.no (Sturla Molden) Date: Sat, 10 Mar 2012 15:34:54 +0100 Subject: [SciPy-User] Generalized least square on large dataset In-Reply-To: References: <4F5A25F7.3020408@molden.no> <4F5B5AE7.1020109@molden.no> Message-ID: <4F5B668E.4000100@molden.no> Den 10.03.2012 14:57, skrev josef.pktd at gmail.com: > My guess is that log-linear or something like that would be easier Log-linear models are to Poisson regression what ANOVA is to linear regression. There is covariate (genome length), so he cannot just use categorical predictors. > Is there a multivariate version of Poisson with correlated > observations similar to GLS for the linear model? Yes there is, but I am not sure it would help. I have to think about this again. Sturla From ralf.gommers at googlemail.com Sat Mar 10 09:50:14 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sat, 10 Mar 2012 15:50:14 +0100 Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: <4F5916A2.2040604@eso.org> <20120308210722.GC12436@phare.normalesup.org> <6942812.8.1331254549777.JavaMail.geo-discussion-forums@vbuf40> Message-ID: On Sat, Mar 10, 2012 at 3:01 PM, Nathaniel Smith wrote: > On Fri, Mar 9, 2012 at 9:36 PM, Ralf Gommers > wrote: > > I don't think anyone has doubts about the quality of lmfit. On the > contrary, > > I've asked you to list it on http://scipy.org/Topical_Software (which > you > > did) because I thought it looked interesting, and have directed some > users > > towards your package. The documentation is excellent, certainly better > than > > that of many parts of scipy. The worry with your code is that the > > maintenance burden may be relatively high, simply because very few > > developers are familiar with AST. The same for merging it in scipy - one > of > > the core developers will have to invest a significant amount of time > > wrapping his head around your work. > > Out of curiosity (and apropos an earlier thread), would it affect your > reservations if lmfit's ad-hoc AST usage and python interpreter were > replaced by a simple call to 'eval'? I don't think using "eval" will make anyone happy. Also, my reservation isn't very strong. I personally think it would make sense for lmfit to remain a stand-alone package for a while, but if others disagree and there's one developer who invests the required time in reviewing/merging it, that's all it would take. This doesn't even have to be a core scipy developer - if you for example would do this work and indicate that you would be able to do some basic maintenance for it in case Matt can't / won't anymore, that would be perfectly fine. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Sat Mar 10 09:54:55 2012 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 10 Mar 2012 15:54:55 +0100 Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: <4F5916A2.2040604@eso.org> <20120308210722.GC12436@phare.normalesup.org> <6942812.8.1331254549777.JavaMail.geo-discussion-forums@vbuf40> Message-ID: 10.03.2012 15:01, Nathaniel Smith kirjoitti: [clip] > Out of curiosity (and apropos an earlier thread), would it affect your > reservations if lmfit's ad-hoc AST usage and python interpreter were > replaced by a simple call to 'eval'? As far as I see, the AST manipulation is used only to provide a restricted dialect of Python. 
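(For concreteness, the eval-based alternative Nathaniel asks about could look roughly like the sketch below: a user-supplied constraint expression evaluated against the current parameter values plus a small whitelist of math functions. This only illustrates the idea, it is not lmfit's actual implementation.)

    import math

    def eval_constraint(expr, params):
        # Evaluate e.g. "amp * sin(phase)" with only whitelisted names visible.
        namespace = {'__builtins__': {}}
        for name in ('sin', 'cos', 'tan', 'exp', 'log', 'sqrt', 'pi'):
            namespace[name] = getattr(math, name)
        namespace.update(params)
        return eval(expr, namespace)

    print(eval_constraint('amp * sin(phase)', {'amp': 2.0, 'phase': math.pi / 2}))  # -> 2.0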
The main question then seems to be whether there is a need to guard against malicious input (IMHO, not really). The AST pieces seem to be more or less abstracted away, so this shouldn't matter so much. The bigger issue is that the interface provided by this package for specifying constraints etc. is completely different from the unified optimization interface that is currently in Scipy's Git. Just dumping lmfit into Scipy would IMHO not be a good idea --- having one interface for scalar minimizers, and then a completely different one for least squares does not seem like a good idea. Code and ideas could be reused, though, picking up the best parts of the two approaches that are there now, and producing a better one. -- Pauli Virtanen From sturla at molden.no Sat Mar 10 10:05:58 2012 From: sturla at molden.no (Sturla Molden) Date: Sat, 10 Mar 2012 16:05:58 +0100 Subject: [SciPy-User] Generalized least square on large dataset In-Reply-To: References: <4F5A25F7.3020408@molden.no> <4F5B5AE7.1020109@molden.no> Message-ID: <4F5B6DD6.7020500@molden.no> Den 10.03.2012 14:57, skrev josef.pktd at gmail.com: > > He explained the between sample correlation with the similarity (my > analogy autocorrelation in time series, or spatial correlation). > > Look at his attachment ives.tiff. If the categories are known in advance (right panel in ives.tiff), I think what he actually needs is computing the likelihood ratio between the model log(lambda) = b[0] + b[1] * genome_length + np.dot(b[2:N+1], group[0:N-1]) and a reduced model log(lambda) = b[0] + np.dot(b[1:N], group[0:N-1]) That is, adding genome length as a predictor should not improve the fit given that bacterial groups are already in the model. If he does not have groups, but some sort of dendrogram (left panel in ives.tiff), perhaps he could preprocess the data by clustering the bacteria based on his dendrogram? A full dendrogram (e.g. used as nested log-linear model) would overfit the data and explain it perfectly. So adding genome length would always give zero improvement. But if the dendrogram can be reduced into a few descrete categories, he could use a likelihood ratio test for the genome length. Sturla From sturla at molden.no Sat Mar 10 10:10:02 2012 From: sturla at molden.no (Sturla Molden) Date: Sat, 10 Mar 2012 16:10:02 +0100 Subject: [SciPy-User] Generalized least square on large dataset In-Reply-To: <4F5B6DD6.7020500@molden.no> References: <4F5A25F7.3020408@molden.no> <4F5B5AE7.1020109@molden.no> <4F5B6DD6.7020500@molden.no> Message-ID: <4F5B6ECA.1090307@molden.no> Den 10.03.2012 16:05, skrev Sturla Molden: > That is, adding genome length as a predictor should not > improve the fit given that bacterial groups are already in > the model... ... and the situation is as in the right panel of ives.tiff. Sturla From pav at iki.fi Sat Mar 10 10:46:34 2012 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 10 Mar 2012 16:46:34 +0100 Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: <4F5916A2.2040604@eso.org> <20120308210722.GC12436@phare.normalesup.org> <6942812.8.1331254549777.JavaMail.geo-discussion-forums@vbuf40> Message-ID: 10.03.2012 15:54, Pauli Virtanen kirjoitti: [clip] > Just dumping lmfit into Scipy would IMHO not be a good idea --- having > one interface for scalar minimizers, and then a completely different one > for least squares does not seem like a good idea. 
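(To make the comparison concrete: a rough sketch of how the unified interface in Scipy's development tree is meant to be called -- a single minimize() entry point that dispatches on method and accepts box bounds where the backend supports them. The exact return object was still settling at the time, so treat the details as approximate.)

    import numpy as np
    from scipy.optimize import minimize  # unified interface from the development branch

    def chi2(p, x, y):
        # Scalar objective: sum of squared residuals of a straight-line model.
        return np.sum((y - (p[0] + p[1] * x)) ** 2)

    x = np.linspace(0.0, 10.0, 50)
    y = 3.0 + 0.5 * x + 0.1 * np.random.randn(50)

    # One entry point; the bounds are handled by the chosen backend (L-BFGS-B here).
    res = minimize(chi2, [1.0, 1.0], args=(x, y), method='L-BFGS-B',
                   bounds=[(None, None), (0.0, 1.0)])
    print(res.x)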
Code and ideas could > be reused, though, picking up the best parts of the two approaches that > are there now, and producing a better one. ... although it seems that the interface in Scipy now is a much more lower-level one --- and does not have the parameter convenience abstraction, which seems to be the main point in lmfit, and is orthogonal to what's currently in. Now that would be a rather useful addition, one just would have to figure out how to make it work nicely also with the scalar optimizers. -- Pauli Virtanen From matt.newville at gmail.com Fri Mar 9 15:50:57 2012 From: matt.newville at gmail.com (Matthew Newville) Date: Fri, 9 Mar 2012 12:50:57 -0800 (PST) Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: <4F5916A2.2040604@eso.org> <20120308210722.GC12436@phare.normalesup.org> <6942812.8.1331254549777.JavaMail.geo-discussion-forums@vbuf40> <20120309165443.GA26552@phare.normalesup.org> Message-ID: <5272144.709.1331326257133.JavaMail.geo-discussion-forums@ynhs12> Pauli, On Friday, March 9, 2012 12:31:22 PM UTC-6, Pauli Virtanen wrote: > > Hi, > > 09.03.2012 18:50, Charles R Harris kirjoitti: > [clip] > > Carefully stepping past the kerfluffle at the bar, I think this sort of > > functionality in scipy would be useful. If nothing else, I wouldn't have > > to keep implementing for myself ;) IIRC, Dennis Lexalde was going to do > > something similar and I think it would be good if some of the folks with > > implementations started a separate thread about getting it into scipy. > > Dennis actually not only intended, but also implemented something > similar. I wasn't too deeply involved in that, but it's already merged > in Scipy's trunk. > > Now, based on a *very* quick look to lmfit (I did not look at it before > now as I did not remember it existed), it seems to be quite similar in > purpose. Hashing out if lmfit has something extra, or if the current > implementation is missing something could be useful, however. > > Pauli > If I understand, you are talking about scipy.optimize.minimize(), which can take many minimization methods, but only accepts bounds for the underlying methods (l-bfgs-b, coblya, slsqp, and tnc), and constraints only for coblya and slsqp. Thus, I would interpret minimize() to aim to be (and documented to be) a unification of the routines to minimize a scalar function of one or more variables. For the discussion here, minimize() does not support the Levenberg-Marquardt least-squares algorithm (leastsq) at all, as lmfit uses (and as mpfit uses). The constraint mechanism is entirely different between lmfit and the other constrained optimization methods. Cheers, --Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From matt.newville at gmail.com Fri Mar 9 16:13:35 2012 From: matt.newville at gmail.com (Matthew Newville) Date: Fri, 9 Mar 2012 13:13:35 -0800 (PST) Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: <20120309165443.GA26552@phare.normalesup.org> References: <4F5916A2.2040604@eso.org> <20120308210722.GC12436@phare.normalesup.org> <6942812.8.1331254549777.JavaMail.geo-discussion-forums@vbuf40> <20120309165443.GA26552@phare.normalesup.org> Message-ID: <32380089.1618.1331327615123.JavaMail.geo-discussion-forums@ynjc20> Hi Gael, On Friday, March 9, 2012 10:54:43 AM UTC-6, Gael Varoquaux wrote: > > Hi Matt, > > I am not going to answer to the core of your message. 
I partly agree with > it and partly disagree. I think that it is fair to have different points > of view. In addition, I do share the opinion that the situation of > developers in open source scientific software is not ideal. I've suffered > from it personally. > > I just want to react to a couple of minor points > > > At no point has anyone from the scipy team expressed an interest in > > putting this into scipy. > > Who is the scipy team? What is the scipy team? Who could or should > express such an interest? These are people struggling to maintain a > massive package on their free time. It actually takes a lot of time to > monitor the mailing lists and pick up offers like yours to turn them into > something that can be integrated. > OK, that's all fair. I am certainly not a scipy dev. But you do seem to want it both ways: chastise people for asking for features instead of writing those features themselves, and also saying you're far too busy to read and act on submissions of code unless it conforms exactly to your needs (this year, with github!). I would guess that this is not likely to work very well for you. For myself, I might be willing to contribute to scipy, but I have my own projects and mailing lists to worry about. Lmfit was something of a fun side-project for me -- I'm willing to support it, and can see that it *could* go into scipy. Since you originally brought it up, I did essentially everything in Fernando Perez's slides pleaded for how to contribute (except submit a pull request, but I did ask whether this was viewed as something that should be a scikit, or go into scipy, or retained as standalone). > Had you submitted a pull request, with code ready to be merged, i.e. with > no extra work in terms of documentation, API or tests, I think that it > would be legitimate to blame the scipy developers for lack of interest. > That said, I can easily understand how such a pull request would fall > between the cracks. It's unfortunate, not excusable, but it does happen. > Indeed, in the projects I maintain, I am kept busy full time with pure > maintenance work (bug fixing, answering emails, improving documentation). > When I review and merge pull requests, a lot of the time they are for > features that I do not need, and I spend full week ends adding tests, > fixing numerical instabilities, completing the docs so that they can be > merged. You have to realize that most contributions to open source > projects actually add up to the workload of the core developers. > Thankfully, not all of them. Teams do build upon people unexpectedly > fixing bugs, contributing flawless code that can be merged in without any > additional work. > Well, I understand you don't necessarily consider yourself to be a scipy dev, but I'll ask anyway: How many pull requests have been made, and how many have been accepted? Perhaps I am not reading github's Pull Request link correctly, but that seems to indicate that the numbers are 17 and 0. Surely, those cannot be correct. Unless I am counting wrong, 5 of the 17 (total? outstanding?) pull requests listed for scipy/scipy involve optimization. > I personally have seen my time invested in maintenance of open source > project go up and up for the last few years, until it was to a point > where I was spending a major part of my free time on it. It ended up > giving me a nasty back pain, and I started not answering bug reports, > pull requests and support emails to preserve my health: it is not sane to > spend all onces time in front of a computer. 
> > > There are many kinds of skills. Sometimes, not insulting your > customers, > > colleagues, and potential collaborators is the most important one. > > Maybe I went over the top. I didn't want to sound insulting. I felt > insulted, as an open source develop (even thought I am not a scipy > developer). I am sorry that I ignited a flame. Getting worked out about > email is never a good thing, and discussion pushing blame certainly don't > help building a community. Maybe I shouldn't have sent this email, or I > should have worded it differently. I apologize for the harsh tone. I > certainly did feel bad when I received the original email, and I wanted > to express it. > It is easy to feel like one's hard work on an open source project is under-appreciated. I understand that feeling, and have felt that way myself in the past. I am definitely quite appreciative of the work done on scipy and friends. > > For myself, I find it quite discouraging that the scipy team is so > > insular. > > Firstly, I would like to stress that I cannot consider myself as part of > the scipy team. I contribute very little code to scipy. As a consequence > I do not feel that I have much legitimacy in making decisions or > comments on the codebase. Thus you shouldn't take my reaction as a > reaction coming from the scipy team, but rather as coming from myself. > > Second, can I ask you what makes you think that the scipy team is > insular? Scipy is a big project with a lot of history. As such it is > harder to contribute to it than a small and light project. But I don't > feel any dogmatism or clique attitude from the developers. And, by the > way, if we are going to talk about the scipy developers, > Well, there was a discussion "Alternative to scipy.optimize" in the past two weeks on scipy-users that mentioned lmfit, and several over the past several months, and apparently requests about mpfit in the more distant past. And yet two scipy contributers (according to github's list, I am counting you) responded to the original request for features **exactly like lmfit** with something reading an awful lot like "Well, you'll have to write one". Perhaps insular is not a fair characterization -- how would you characterize that? I encourage everybody to find out who they are, i.e. who has been > contributing lately > [1]. I don't think that the handful of people that come on top of the > list have an insular behavior. I do think that they are on an island, in > the sens that they are pretty much left alone to do the grunt work. None > of these people reacted badly to any mail on this mailing list about the > state of scipy. I raise my hat to them! > And so do I. --Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Sat Mar 10 12:00:44 2012 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 10 Mar 2012 18:00:44 +0100 Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? 
In-Reply-To: <32380089.1618.1331327615123.JavaMail.geo-discussion-forums@ynjc20> References: <4F5916A2.2040604@eso.org> <20120308210722.GC12436@phare.normalesup.org> <6942812.8.1331254549777.JavaMail.geo-discussion-forums@vbuf40> <20120309165443.GA26552@phare.normalesup.org> <32380089.1618.1331327615123.JavaMail.geo-discussion-forums@ynjc20> Message-ID: 09.03.2012 22:13, Matthew Newville kirjoitti: [clip] > Well, I understand you don't necessarily consider yourself to be a scipy > dev, but I'll ask anyway: How many pull requests have been made, and > how many have been accepted? Perhaps I am not reading github's Pull > Request link correctly, but that seems to indicate that the numbers are > 17 and 0. Surely, those cannot be correct. Unless I am counting > wrong, 5 of the 17 (total? outstanding?) pull requests listed for > scipy/scipy involve optimization. Wrong: 17 open, 99 accepted. [clip] > Well, there was a discussion "Alternative to scipy.optimize" in the past > two weeks on scipy-users that mentioned lmfit, and several over the past > several months, and apparently requests about mpfit in the more distant > past. And yet two scipy contributers (according to github's list, I am > counting you) responded to the original request for features **exactly > like lmfit** with something reading an awful lot like "Well, you'll have > to write one". Perhaps insular is not a fair characterization -- how > would you characterize that? Come on. The correct characterization is just: "busy". I did not remember that your project existed as it was announced half a year ago with no proposal that it should be integrated, and did not read the recent thread on scipy-user. -- Pauli Virtanen From pav at iki.fi Sat Mar 10 12:02:13 2012 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 10 Mar 2012 18:02:13 +0100 Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: <4F5916A2.2040604@eso.org> <20120308210722.GC12436@phare.normalesup.org> <6942812.8.1331254549777.JavaMail.geo-discussion-forums@vbuf40> <20120309165443.GA26552@phare.normalesup.org> <32380089.1618.1331327615123.JavaMail.geo-discussion-forums@ynjc20> Message-ID: 10.03.2012 18:00, Pauli Virtanen kirjoitti: > 09.03.2012 22:13, Matthew Newville kirjoitti: > [clip] >> Well, I understand you don't necessarily consider yourself to be a scipy >> dev, but I'll ask anyway: How many pull requests have been made, and >> how many have been accepted? Perhaps I am not reading github's Pull >> Request link correctly, but that seems to indicate that the numbers are >> 17 and 0. Surely, those cannot be correct. Unless I am counting >> wrong, 5 of the 17 (total? outstanding?) pull requests listed for >> scipy/scipy involve optimization. > > Wrong: 17 open, 99 accepted Sorry, the actual number is 161 accepted. -- Pauli Virtanen From josef.pktd at gmail.com Sat Mar 10 12:12:56 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 10 Mar 2012 12:12:56 -0500 Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? 
In-Reply-To: References: <4F5916A2.2040604@eso.org> <20120308210722.GC12436@phare.normalesup.org> <6942812.8.1331254549777.JavaMail.geo-discussion-forums@vbuf40> <20120309165443.GA26552@phare.normalesup.org> <32380089.1618.1331327615123.JavaMail.geo-discussion-forums@ynjc20> Message-ID: On Sat, Mar 10, 2012 at 12:00 PM, Pauli Virtanen wrote: > 09.03.2012 22:13, Matthew Newville kirjoitti: > [clip] >> Well, I understand you don't necessarily consider yourself to be a scipy >> dev, but I'll ask anyway: ?How many pull requests have been made, and >> how many have been accepted? ?Perhaps I am not reading github's Pull >> Request link correctly, but that seems to indicate that the numbers are >> 17 and 0. ?Surely, those cannot be correct. ? ? ?Unless I am counting >> wrong, 5 of the 17 (total? outstanding?) pull requests listed for >> scipy/scipy involve optimization. > > Wrong: 17 open, 99 accepted. > > [clip] >> Well, there was a discussion "Alternative to scipy.optimize" in the past >> two weeks on scipy-users that mentioned lmfit, and several over the past >> several months, and apparently requests about mpfit in the more distant >> past. ?And yet two scipy contributers (according to github's list, I am >> counting you) responded to the original request for features **exactly >> like lmfit** with something reading an awful lot like "Well, you'll have >> to write one". ? ?Perhaps insular is not a fair characterization -- how >> would you characterize that? > > Come on. The correct characterization is just: "busy". I did not > remember that your project existed as it was announced half a year ago > with no proposal that it should be integrated, and did not read the > recent thread on scipy-user. just as a reminder there is also https://github.com/scipy/scipy/pull/90 openopt started out as a scikits. It doesn't look easy to me to come up with an interface to optimizers that satisfy "all" use cases. (and there is only one Pauli) Josef > > -- > Pauli Virtanen > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Sat Mar 10 13:16:09 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 10 Mar 2012 13:16:09 -0500 Subject: [SciPy-User] Contributing to SciPy was Re: Least-squares fittings with bounds: why is scipy not up to the task? Message-ID: I would like to extend the discussion a bit. (I thought of dropping the half finished message and go back to my corner but I think this might still be useful.) On Fri, Mar 9, 2012 at 1:47 PM, Pauli Virtanen wrote: > Hi, > > 09.03.2012 17:51, william ratcliff kirjoitti: > [clip] >> In this particular case, what are the exact steps needed to get it into >> scipy? ?Can they charge be listed as tickets somewhere so that others of >> us can help? ?Can we document the process to make it easier the next >> time? ?I realize everyone is busy, but if the barrier to contribution is >> lowered it will make life better in the long run. > > In general, basically two ways for contributions: > > 1. A pull request via Github. We have a writeup here with various tips: > > ? http://docs.scipy.org/doc/numpy/dev/gitwash/development_workflow.html > > ? Just replace "Numpy" by "Scipy" everywhere. > > 2. File a ticket in the Trac: http://projects.scipy.org/scipy/ > > ? Attach whetever you have (a patch, separate files) to the ticket, > ? and tag it as "enhancement" and "needs_review". > > That's about it. > > ? 
?*** > > However, to make it easier for someone to look at the work and verify it > works properly: > > - Ensure your code is accompanied by tests that demonstrate it actually > ?works as intended. You can look for examples how to write them in the > ?Scipy source tree, in files named test_XXX.py > > - Ensure the behavior of the public functions is documented in the > ?docstrings. > > - Prefer the Github way. Granted, there *is* a learning curve, but it > ?saves work in the long run, and it is far less clunky to use. > > - The more finished the contribution is, the less work it is to merge, > ?and gets in faster. > > If you get no response, shout on the scipy-devel mailing list. If > there's still no response, shout louder and start accusing people ;) > > If the contribution is "controversial" --- duplicates existing > functionality, breaks backwards compatibility, is very specialized for a > particular research problem, relies on magic, etc. --- it's good to give > an argument why the stuff should be included, as otherwise the > motivation may be missed. > Code that is not ready for Prime Time -------------------------------------------------- Stata has a large library of user contributed code http://ideas.repec.org/s/boc/bocode.html matlab has the file exchange several other statistical packages have user contributed code collections in forums SciPy has mailing lists, pull requests, gists, tickets and scipy-central and open source and pypi Pull request on github are very convenient for code that changes, improves, fixes, existing code, or code that can be included after some work. For new functions or modules intended for library code, I find them easier to try out when they are standalone. One of the bottlenecks in including code in established packages is the time it takes to review, refactor and test code that isn't quite right or doesn't quite fit. I think it would be useful to have a central location for publishing code and get earlier feedback from users. (Currently I bookmark, for example, gists or modules in a random package, that look interesting and I might not find again when I need them in future.) I'm still envious of the matlab fileexchange with a very useful commenting system where I look more often than at scipy-central, and I like the commenting system on stackoverflow that gives a much faster way of evaluating several ways of doing things. I also think Mathworks and Statacorp are very smart in supporting user code, since (I assume) that they look at the download statistic to see what is in high demand and might incorporate code if it's license compatible. A community supported scipy-central !? Cheers, Josef > > Cheers, > Pauli > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From pav at iki.fi Sat Mar 10 13:32:47 2012 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 10 Mar 2012 19:32:47 +0100 Subject: [SciPy-User] Contributing to SciPy was Re: Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: Message-ID: 10.03.2012 19:16, josef.pktd at gmail.com kirjoitti: [clip] > I think it would be useful to have a central location for publishing > code and get earlier feedback from users. (Currently I bookmark, for > example, gists or modules in a random package, that look interesting > and I might not find again when I need them in future.) 
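(A minimal sketch of what the test part of Pauli's checklist above looks like in practice: a file named test_something.py using numpy.testing, runnable on its own. The function under test here -- a plain leastsq line fit -- is just a placeholder.)

    # test_linefit.py
    import numpy as np
    from numpy.testing import assert_allclose, run_module_suite
    from scipy.optimize import leastsq

    def test_leastsq_recovers_line():
        # With noise-free data, leastsq should recover intercept and slope.
        x = np.linspace(0.0, 5.0, 20)
        y = 1.5 + 2.0 * x

        def residuals(p):
            return y - (p[0] + p[1] * x)

        popt, ier = leastsq(residuals, [0.0, 0.0])
        assert ier in (1, 2, 3, 4)          # leastsq convergence flags
        assert_allclose(popt, [1.5, 2.0], rtol=1e-6)

    if __name__ == "__main__":
        run_module_suite()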
> > I'm still envious of the matlab fileexchange with a very useful > commenting system where I look more often than at scipy-central, and I > like the commenting system on stackoverflow that gives a much faster > way of evaluating several ways of doing things. > > I also think Mathworks and Statacorp are very smart in supporting user > code, since (I assume) that they look at the download statistic to see > what is in high demand and might incorporate code if it's license > compatible. I think this is pretty much the position scipy-central tries to fill. As far as I understand, what you say that is missing is - more advertisement - comment system The former can be fixed. The latter requires someone to sit down and think how to implement it --- but it shouldn't be too difficult, as the app [1] is written with Django and seems pretty easy to work on. If someone familiar with web applications wants to give a hand here, this would be an opportunity to contribute. [1] https://github.com/kgdunn/SciPyCentral/ -- Pauli Virtanen From ralf.gommers at googlemail.com Sat Mar 10 14:48:11 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sat, 10 Mar 2012 20:48:11 +0100 Subject: [SciPy-User] Contributing to SciPy was Re: Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: Message-ID: On Sat, Mar 10, 2012 at 7:32 PM, Pauli Virtanen wrote: > 10.03.2012 19:16, josef.pktd at gmail.com kirjoitti: > [clip] > > I think it would be useful to have a central location for publishing > > code and get earlier feedback from users. (Currently I bookmark, for > > example, gists or modules in a random package, that look interesting > > and I might not find again when I need them in future.) > > > > I'm still envious of the matlab fileexchange with a very useful > > commenting system where I look more often than at scipy-central, and I > > like the commenting system on stackoverflow that gives a much faster > > way of evaluating several ways of doing things. > > > > I also think Mathworks and Statacorp are very smart in supporting user > > code, since (I assume) that they look at the download statistic to see > > what is in high demand and might incorporate code if it's license > > compatible. > > I think this is pretty much the position scipy-central tries to fill. As > far as I understand, what you say that is missing is > > - more advertisement > As a start, I added a link on the main scipy.org site. Adding it to http://new.scipy.org/ would also be good - SciPy Central needs a logo though. Ralf > > - comment system > > The former can be fixed. The latter requires someone to sit down and > think how to implement it --- but it shouldn't be too difficult, as the > app [1] is written with Django and seems pretty easy to work on. If > someone familiar with web applications wants to give a hand here, this > would be an opportunity to contribute. > > [1] https://github.com/kgdunn/SciPyCentral/ > > -- > Pauli Virtanen > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris at simplistix.co.uk Sat Mar 10 17:09:49 2012 From: chris at simplistix.co.uk (Chris Withers) Date: Sat, 10 Mar 2012 14:09:49 -0800 Subject: [SciPy-User] Layering a virtualenv over EPD Message-ID: <4F5BD12D.9090504@simplistix.co.uk> Hi All, I hope this is the right list. 
So, here's the problem I'm facing: I use EPD as my base python, but I have a bunch of projects that all have additional dependencies. For example, I may want to use a version of Pandas that's newer than the one that ships with my chosen version of EPD, and add some additional libraries that don't ship with EPD. Okay, so I worry that the advice may be "well just install stuff into EPD with pip or easy_install". I don't want to do that, just because I need a newer version of Pandas for one project, doesn't mean I want to have to make sure *all* my projects work with that new version, etc. So, I tried to wrap a virtualenv around epd: epd-python virtualenv.py mytestenv ...and then install new pandas and other stuff: mytestenv/bin/python easy_install -U pandas ...but now, how would I start ipython using that virutalenv? I tried just running "ipython", but of course, that doesn't include the virtualenv. I tried activating the virtualenv and then running ipython, but it still doesn't include anything installed in the virtualenv. So, how should I do this? Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk From robert.kern at gmail.com Sat Mar 10 17:22:52 2012 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 10 Mar 2012 22:22:52 +0000 Subject: [SciPy-User] Layering a virtualenv over EPD In-Reply-To: <4F5BD12D.9090504@simplistix.co.uk> References: <4F5BD12D.9090504@simplistix.co.uk> Message-ID: On Sat, Mar 10, 2012 at 22:09, Chris Withers wrote: > Hi All, > > I hope this is the right list. > > So, here's the problem I'm facing: I use EPD as my base python, but I > have a bunch of projects that all have additional dependencies. Questions about EPD should go to epd-users at enthought.com https://mail.enthought.com/mailman/listinfo/epd-users However, your question is about virtualenv, not EPD. > ...but now, how would I start ipython using that virutalenv? > > I tried just running "ipython", but of course, that doesn't include the > virtualenv. You need to install IPython in your virtualenv. -- Robert Kern From sturla at molden.no Sat Mar 10 17:24:46 2012 From: sturla at molden.no (Sturla Molden) Date: Sat, 10 Mar 2012 23:24:46 +0100 Subject: [SciPy-User] Layering a virtualenv over EPD In-Reply-To: <4F5BD12D.9090504@simplistix.co.uk> References: <4F5BD12D.9090504@simplistix.co.uk> Message-ID: <4F5BD4AE.5020703@molden.no> You might also want to ask on the EPD users' list: epd-users at the provider of EPD (written so to avoid spam). Sturla Den 10.03.2012 23:09, skrev Chris Withers: > Hi All, > > I hope this is the right list. > > So, here's the problem I'm facing: I use EPD as my base python, but I > have a bunch of projects that all have additional dependencies. For > example, I may want to use a version of Pandas that's newer than the one > that ships with my chosen version of EPD, and add some additional > libraries that don't ship with EPD. > > Okay, so I worry that the advice may be "well just install stuff into > EPD with pip or easy_install". I don't want to do that, just because I > need a newer version of Pandas for one project, doesn't mean I want to > have to make sure *all* my projects work with that new version, etc. > > So, I tried to wrap a virtualenv around epd: > > epd-python virtualenv.py mytestenv > > ...and then install new pandas and other stuff: > > mytestenv/bin/python easy_install -U pandas > > ...but now, how would I start ipython using that virutalenv? 
> > I tried just running "ipython", but of course, that doesn't include the > virtualenv. > > I tried activating the virtualenv and then running ipython, but it still > doesn't include anything installed in the virtualenv. > > So, how should I do this? > > Chris > From chris at simplistix.co.uk Sat Mar 10 17:40:33 2012 From: chris at simplistix.co.uk (Chris Withers) Date: Sat, 10 Mar 2012 14:40:33 -0800 Subject: [SciPy-User] Layering a virtualenv over EPD In-Reply-To: References: <4F5BD12D.9090504@simplistix.co.uk> Message-ID: <4F5BD861.8050205@simplistix.co.uk> On 10/03/2012 14:22, Robert Kern wrote: >> So, here's the problem I'm facing: I use EPD as my base python, but I >> have a bunch of projects that all have additional dependencies. > > Questions about EPD should go to epd-users at enthought.com Well, okay, but this is a more generic question that seems to face a lot of SciPy users: "I want to layer some stuff on top of a binary install of the scipy stuff, without poluting that base layer", that base layer being EPD, OS-installed packages, etc... >> ...but now, how would I start ipython using that virutalenv? >> >> I tried just running "ipython", but of course, that doesn't include the >> virtualenv. > > You need to install IPython in your virtualenv. Okay, but how do I do that without having to build the whole of ipython myself? How do I say "just let me run ipython (or any of the other binary tools that are in scipy) with a virtual env wrapped over it? cheers, Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk From sturla at molden.no Sat Mar 10 17:44:48 2012 From: sturla at molden.no (Sturla Molden) Date: Sat, 10 Mar 2012 23:44:48 +0100 Subject: [SciPy-User] Layering a virtualenv over EPD In-Reply-To: <4F5BD861.8050205@simplistix.co.uk> References: <4F5BD12D.9090504@simplistix.co.uk> <4F5BD861.8050205@simplistix.co.uk> Message-ID: <4F5BD960.3040902@molden.no> Den 10.03.2012 23:40, skrev Chris Withers: > > Well, okay, but this is a more generic question that seems to face a lot > of SciPy users: "I want to layer some stuff on top of a binary install > of the scipy stuff, without poluting that base layer", that base layer > being EPD, OS-installed packages, etc... > Install the package or module to a local folder instead of site_packages and make sure it is in PYTHONPATH. Sturla From robert.kern at gmail.com Sat Mar 10 17:45:41 2012 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 10 Mar 2012 22:45:41 +0000 Subject: [SciPy-User] Layering a virtualenv over EPD In-Reply-To: <4F5BD861.8050205@simplistix.co.uk> References: <4F5BD12D.9090504@simplistix.co.uk> <4F5BD861.8050205@simplistix.co.uk> Message-ID: On Sat, Mar 10, 2012 at 22:40, Chris Withers wrote: > On 10/03/2012 14:22, Robert Kern wrote: >>> >>> So, here's the problem I'm facing: I use EPD as my base python, but I >>> have a bunch of projects that all have additional dependencies. >> >> >> Questions about EPD should go to epd-users at enthought.com > > > Well, okay, but this is a more generic question that seems to face a lot of > SciPy users: "I want to layer some stuff on top of a binary install of the > scipy stuff, without poluting that base layer", that base layer being EPD, > OS-installed packages, etc... > > >>> ...but now, how would I start ipython using that virutalenv? >>> >>> I tried just running "ipython", but of course, that doesn't include the >>> virtualenv. >> >> >> You need to install IPython in your virtualenv. 
> > > Okay, but how do I do that without having to build the whole of ipython > myself? The best way to use IPython in your virtualenv is to install it in your virtualenv. It's easy. > How do I say "just let me run ipython (or any of the other binary > tools that are in scipy) There are no executable scripts in scipy. > with a virtual env wrapped over it? If you must, you can edit the ipython script that you already have installed. Change the #! line to be #!/usr/bin/env python instead of the full path to your EPD Python executable. This will make the ipython script use whichever "python" executable is first in your $PATH. If you have your virtualenv activated, this will be the virtualenv's "python" executable. -- Robert Kern From chris at simplistix.co.uk Sat Mar 10 17:57:47 2012 From: chris at simplistix.co.uk (Chris Withers) Date: Sat, 10 Mar 2012 14:57:47 -0800 Subject: [SciPy-User] Layering a virtualenv over EPD In-Reply-To: <4F5BD960.3040902@molden.no> References: <4F5BD12D.9090504@simplistix.co.uk> <4F5BD861.8050205@simplistix.co.uk> <4F5BD960.3040902@molden.no> Message-ID: <4F5BDC6B.6090102@simplistix.co.uk> On 10/03/2012 14:44, Sturla Molden wrote: > Den 10.03.2012 23:40, skrev Chris Withers: >> >> Well, okay, but this is a more generic question that seems to face a lot >> of SciPy users: "I want to layer some stuff on top of a binary install >> of the scipy stuff, without poluting that base layer", that base layer >> being EPD, OS-installed packages, etc... >> > > Install the package or module to a local folder instead of site_packages > and make sure it is in PYTHONPATH. Yeah, virtualenv is kinda the abstraction of that pattern ;-) Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk From sturla at molden.no Sat Mar 10 18:05:39 2012 From: sturla at molden.no (Sturla Molden) Date: Sun, 11 Mar 2012 00:05:39 +0100 Subject: [SciPy-User] Layering a virtualenv over EPD In-Reply-To: <4F5BDC6B.6090102@simplistix.co.uk> References: <4F5BD12D.9090504@simplistix.co.uk> <4F5BD861.8050205@simplistix.co.uk> <4F5BD960.3040902@molden.no> <4F5BDC6B.6090102@simplistix.co.uk> Message-ID: <4F5BDE43.1090401@molden.no> Den 10.03.2012 23:57, skrev Chris Withers: > > Yeah, virtualenv is kinda the abstraction of that pattern ;-) > It seems Python has introduced its' own version of DLL hell. Sorry for stating the obvious. Sturla From rex at nosyntax.net Sat Mar 10 18:44:42 2012 From: rex at nosyntax.net (rex) Date: Sat, 10 Mar 2012 15:44:42 -0800 Subject: [SciPy-User] Layering a virtualenv over EPD In-Reply-To: <4F5BD861.8050205@simplistix.co.uk> References: <4F5BD12D.9090504@simplistix.co.uk> <4F5BD861.8050205@simplistix.co.uk> Message-ID: <20120310234442.GN24301@ninja.nosyntax.net> Chris Withers [2012-03-10 14:40]: >On 10/03/2012 14:22, Robert Kern wrote: >>> So, here's the problem I'm facing: I use EPD as my base python, but I >>> have a bunch of projects that all have additional dependencies. >> >> Questions about EPD should go to epd-users at enthought.com > >Well, okay, but this is a more generic question that seems to face a lot >of SciPy users: "I want to layer some stuff on top of a binary install >of the scipy stuff, without poluting that base layer", that base layer >being EPD, OS-installed packages, etc... > >>> ...but now, how would I start ipython using that virutalenv? >>> >>> I tried just running "ipython", but of course, that doesn't include the >>> virtualenv. >> >> You need to install IPython in your virtualenv. 
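A minimal sketch of the approach Robert describes, reusing the names from Chris's first message -- the "mytestenv" directory and the "epd-python" command come from that message, not from EPD itself, so adjust paths to taste:

epd-python virtualenv.py mytestenv
mytestenv/bin/easy_install -U pandas
mytestenv/bin/easy_install ipython
mytestenv/bin/ipython

Once IPython has been installed inside the env, either calling mytestenv/bin/ipython directly or activating the env ("source mytestenv/bin/activate") and then running plain "ipython" picks up the packages installed there, which is the point Robert makes above.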
> >Okay, but how do I do that without having to build the whole of ipython >myself? How do I say "just let me run ipython (or any of the other >binary tools that are in scipy) with a virtual env wrapped over it? Simple answer: use R. I fought with Python+NumPy+SciPy+Matplotlib problems for years before I discovered R. Night and day change. One package that just works. Thousands of libraries that just work. Developers, I'm not denigrating your efforts. I like Python, and I really tried to make Python+NumPy+SciPy+Matplotlib my main tool for years, but as a mere user it was simply too difficult to maintain the parts -- every hour spent screwing with tool problems is an hour lost to creative work. Perhaps the NumPy+SciPy+Matplotlib community could learn something by looking at how the R community works? To this mere user who wants to get a job done, it's a night and day difference. I still use Python for GP programming, but there's a snowball's chance I'd ever use anything but R for my main interest, which is econometrics. -rex -- "In the real world, this would be a problem. But in mathematics, we can just define a place where this problem doesn't exist. So we'll go ahead and do that now..." From wardefar at iro.umontreal.ca Sat Mar 10 20:17:56 2012 From: wardefar at iro.umontreal.ca (David Warde-Farley) Date: Sat, 10 Mar 2012 20:17:56 -0500 Subject: [SciPy-User] Layering a virtualenv over EPD In-Reply-To: References: <4F5BD12D.9090504@simplistix.co.uk> <4F5BD861.8050205@simplistix.co.uk> Message-ID: On 2012-03-10, at 5:45 PM, Robert Kern wrote: >> Okay, but how do I do that without having to build the whole of ipython >> myself? > > The best way to use IPython in your virtualenv is to install it in > your virtualenv. It's easy. To add to what Robert said, there's not a lot to "build" in IPython, if anything at all. EPD provides all the compiled extensions that are dependencies/optional dependencies, like pyzmq. "easy_install ipython" or "python setup.py install" from a decompressed source package should take seconds. From telukpalu at gmail.com Sat Mar 10 21:01:45 2012 From: telukpalu at gmail.com (aa) Date: Sun, 11 Mar 2012 09:01:45 +0700 Subject: [SciPy-User] about ode Message-ID: <4F5C0789.1090800@gmail.com> can You give me alitle explanation using scipy escpecially abolut sc.integrate.ode, i found matlab code like this: g = @(t,x) [x(1); -2*x(2)] [t,x] = ode45(g,[0:1], [1.5,3]) plot(x(:,1),x(:,2)) --- how to get equivalent code in scipy...?? From lists at hilboll.de Sun Mar 11 04:55:46 2012 From: lists at hilboll.de (Andreas H.) Date: Sun, 11 Mar 2012 09:55:46 +0100 Subject: [SciPy-User] Layering a virtualenv over EPD In-Reply-To: <4F5BD12D.9090504@simplistix.co.uk> References: <4F5BD12D.9090504@simplistix.co.uk> Message-ID: <4F5C6892.3050606@hilboll.de> Am 10.03.2012 23:09, schrieb Chris Withers: > Hi All, > > I hope this is the right list. > > So, here's the problem I'm facing: I use EPD as my base python, but I > have a bunch of projects that all have additional dependencies. For > example, I may want to use a version of Pandas that's newer than the one > that ships with my chosen version of EPD, and add some additional > libraries that don't ship with EPD. > > Okay, so I worry that the advice may be "well just install stuff into > EPD with pip or easy_install". I don't want to do that, just because I > need a newer version of Pandas for one project, doesn't mean I want to > have to make sure *all* my projects work with that new version, etc. 
> > So, I tried to wrap a virtualenv around epd: > > epd-python virtualenv.py mytestenv > > ...and then install new pandas and other stuff: > > mytestenv/bin/python easy_install -U pandas > > ...but now, how would I start ipython using that virutalenv? > > I tried just running "ipython", but of course, that doesn't include the > virtualenv. > > I tried activating the virtualenv and then running ipython, but it still > doesn't include anything installed in the virtualenv. > > So, how should I do this? > > Chris > Chris, I just uploaded a quick log of what I did to accomplish exactly this to https://gist.github.com/2015652 I do have the problem that within the virtualenv, something with the console's not working right, as iPythons help doesn't work properly, and I cannot launch applications which open windows (except for ``ipython pylab=wx``) ... I hope it still helps ... Cheers, Andreas. From pav at iki.fi Sun Mar 11 07:14:30 2012 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 11 Mar 2012 12:14:30 +0100 Subject: [SciPy-User] about ode In-Reply-To: <4F5C0789.1090800@gmail.com> References: <4F5C0789.1090800@gmail.com> Message-ID: 11.03.2012 03:01, aa kirjoitti: > can You give me alitle explanation using scipy escpecially abolut > sc.integrate.ode, > i found matlab code like this: > g = @(t,x) [x(1); -2*x(2)] > [t,x] = ode45(g,[0:1], [1.5,3]) > plot(x(:,1),x(:,2)) > > --- > how to get equivalent code in scipy...?? import numpy as np from scipy.integrate import odeint def g(x, t): return [x[0], -2*x[1]] t = np.linspace(0, 1, 200) x = odeint(g, [1.5, 3], t) # LSODAR, not Runge-Kutta See also: http://stackoverflow.com/questions/9466046/how-to-make-odeint-successful ^ In that recipe, replace 'zvode' by 'dopri5' to get R-K http://www.scipy.org/NumPy_for_Matlab_Users From seb.haase at gmail.com Sun Mar 11 07:31:33 2012 From: seb.haase at gmail.com (Sebastian Haase) Date: Sun, 11 Mar 2012 12:31:33 +0100 Subject: [SciPy-User] Contributing to SciPy was Re: Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: Message-ID: On Sat, Mar 10, 2012 at 8:48 PM, Ralf Gommers wrote: > > > On Sat, Mar 10, 2012 at 7:32 PM, Pauli Virtanen wrote: >> >> 10.03.2012 19:16, josef.pktd at gmail.com kirjoitti: >> [clip] >> > I think it would be useful to have a central location for publishing >> > code and get earlier feedback from users. (Currently I bookmark, for >> > example, ?gists or modules in a random package, that look interesting >> > and I might not find again when I need them in future.) >> > >> > I'm still envious of the matlab fileexchange with a very useful >> > commenting system where I look more often than at scipy-central, and I >> > like the commenting system on stackoverflow that gives a much faster >> > way of evaluating several ways of doing things. >> > >> > I also think Mathworks and Statacorp are very smart in supporting user >> > code, since (I assume) that they look at the download statistic to see >> > what is in high demand and might incorporate code if it's license >> > compatible. >> >> I think this is pretty much the position scipy-central tries to fill. As >> far as I understand, what you say that is missing is >> >> - more advertisement > > > As a start, I added a link on the main scipy.org site. Adding it to > http://new.scipy.org/ would also be good - SciPy Central needs a logo > though. > > Ralf > FWIW, I didn't know (or must have forgotten) about scipy central - and a google search also did NOT really help !!! 
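Returning to the integrate.ode question above: for something closer to Matlab's ode45 than odeint's LSODA-based solver, the 'dopri5' integrator can be used through scipy.integrate.ode. A minimal sketch (note that ode expects f(t, y), the reverse of odeint's argument order):

import numpy as np
from scipy.integrate import ode

def g(t, x):
    return [x[0], -2*x[1]]

r = ode(g).set_integrator('dopri5')   # explicit Runge-Kutta, comparable to ode45
r.set_initial_value([1.5, 3.0], 0.0)
dt = 0.01
xs = [r.y.copy()]
while r.successful() and r.t < 1.0:
    r.integrate(r.t + dt)
    xs.append(r.y.copy())
xs = np.array(xs)                     # xs[:, 0] against xs[:, 1] reproduces the Matlab plot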
Only the 5th hit was "User profile: SciPy Central --- scipy-central.org/user/profile/scipy-central/" after "cookbook" on first place, followed by some mailing list post from Sept-2011 ... and so on ... How does SciPy central compare to the cookbook ? It sound's like I was kind-of meant to supersede it ... ? - Sebastian Haase From pav at iki.fi Sun Mar 11 08:39:09 2012 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 11 Mar 2012 13:39:09 +0100 Subject: [SciPy-User] Contributing to SciPy was Re: Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: Message-ID: 11.03.2012 12:31, Sebastian Haase kirjoitti: [clip] > How does SciPy central compare to the cookbook ? It sound's like I > was kind-of meant to supersede it ... ? The aim of the Scipy Central is more or less to supersede the Cookbook and the Topical Software wiki pages with something more friendly. I believe constructive suggestions on how to improve it would be welcome. > FWIW, I didn't know (or must have forgotten) about scipy central - and > a google search also did NOT really help !!! Only the 5th hit was > "User profile: SciPy Central --- > scipy-central.org/user/profile/scipy-central/" after "cookbook" on > first place, followed by some mailing list post from Sept-2011 ... > and so on ... Today, that link is the first hit, probably thanks to the Google juice from the scipy.org front page. I wonder why the leading hit is not the front page, though. Some additional Google-fu is maybe required. The user pages probably should have the noindex meta, as they're not so useful to have in search engines. -- Pauli Virtanen From pav at iki.fi Sun Mar 11 15:21:45 2012 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 11 Mar 2012 20:21:45 +0100 Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: <4F5916A2.2040604@eso.org> <18284571.1324.1331244589247.JavaMail.geo-discussion-forums@ynca15> <4F593DFF.90101@eso.org> <1331262863.55138.YahooMailNeo@web113402.mail.gq1.yahoo.com> Message-ID: Hi, 09.03.2012 09:47, Adrien Gaidon kirjoitti: [clip] > Furthermore, it seems that large projects tend to have API zealots that > don't even want to see code unless it can be directly merged in master > (caricature). I totally understand that, and think it's in the nature of > open source projects in order to not grow anarchistically. > However, this also prevents small "diamonds in the rough" to be > discovered, or useful temporary hole-filling solutions to be proposed > until a proper one is available. To me, this is a false problem due to > the fact that the only advertised way to contribute is by forking + pull > request. But not everybody is a scipy source code guru! Coming back to this: it is also a false dichotomy. A contribution is not either accepted or rejected. Rather, - contribution is proposed - it gets feedback - original contributor (or someone else) revises, if needed - accepted when it's good enough This is exactly the same process through which all Scipy development is done. (As it is now, no new feature lands in without review.) The distinction between the "scipy team" and "contributors" is blurry at best, and unproductive at worst. The failure modes are that the original contributor or the other side goes MIA. This is, however, not a real problem. If the code was listed in a pull request, or a Trac ticket, it is possible (for the people originally involved, or someone else) to get back to it later on. 
Sure, for low-priority things, the delay may be long in the worst case, but for things of broad interest, not so often.

Interestingly, in all of the concrete examples mentioned in this thread, the discussion was only done on the mailing list. On a mailing list, it's in practice not productive to read through the archives and pick up pending stuff. (I often tell people to open a ticket, but don't always remember to.)

Note that a contribution being just rejected (!= needs-work) does not occur so often. In my experience, with Scipy this only happens if there's something really wrong, or it is out of scope. I don't remember many actual cases.

-- 
Pauli Virtanen

From massimodisasha at gmail.com  Sun Mar 11 16:53:47 2012
From: massimodisasha at gmail.com (Massimo Di Stefano)
Date: Sun, 11 Mar 2012 16:53:47 -0400
Subject: [SciPy-User] numpy - scipy test failure on OSX (git version 10/3/2012)
Message-ID: <4C18CACA-0D26-4EEE-84B8-9CD8F82621C4@gmail.com>

Hi All,

i'm on OSX lion using the system python.

i just finished building numpy and scipy using the latest git version and i tried to run the test.
unluckily numpy failed and the scipy test gives me a segfault.
i hope something is wrong on my system .. have you any hints on how to debug this problem ?

attached a link to the test log [1].

[1] http://www.geofemengineering.it//numpy_scipy_log.txt

From wardefar at iro.umontreal.ca  Sun Mar 11 23:30:19 2012
From: wardefar at iro.umontreal.ca (David Warde-Farley)
Date: Sun, 11 Mar 2012 23:30:19 -0400
Subject: [SciPy-User] Contributing to SciPy was Re: Least-squares fittings with bounds: why is scipy not up to the task?
In-Reply-To: 
References: 
Message-ID: <3A573391-18C3-4732-A3CC-BFEFC33F3321@iro.umontreal.ca>

On 2012-03-10, at 2:48 PM, Ralf Gommers wrote:

> As a start, I added a link on the main scipy.org site. Adding it to http://new.scipy.org/ would also be good - SciPy Central needs a logo though.

Not to belittle anyone's efforts, but it needs some aesthetic/design work too besides that. I think a nontrivial element in StackOverflow's success has been that it's very easy on the eyes, very well designed, and everything is exactly where you expect it to be.

(That's one thing that I think new.scipy.org got relatively right; unfortunately there were workload vs. manpower problems, and problems with access to the hosting (that eventually got solved, albeit too late).)

David

From slasley at space.umd.edu  Mon Mar 12 01:29:38 2012
From: slasley at space.umd.edu (Scott Lasley)
Date: Mon, 12 Mar 2012 01:29:38 -0400
Subject: [SciPy-User] numpy - scipy test failure on OSX (git version 10/3/2012)
In-Reply-To: <4C18CACA-0D26-4EEE-84B8-9CD8F82621C4@gmail.com>
References: <4C18CACA-0D26-4EEE-84B8-9CD8F82621C4@gmail.com>
Message-ID: 

On Mar 11, 2012, at 4:53 PM, Massimo Di Stefano wrote:

> Hi All,
> 
> i'm on OSX lion using the system python.
> 
> i just finished building numpy and scipy using the latest git version and i tried to run the test.
> unluckily numpy failed and the scipy test gives me a segfault.
> i hope something is wrong on my system .. have you any hints on how to debug this problem ?
> 
> attached a link to the test log [1].
> 
> [1] http://www.geofemengineering.it//numpy_scipy_log.txt

Which version of Xcode do you have?  I got a segfault in the scipy tests after building with the llvm-gcc included with Xcode 4.2.  I have not tried building scipy with the latest llvm-gcc included with Xcode 4.3, but building with clang or gcc-4.2 from the cran.r-project worked.
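For reference, a sketch of what that looks like when building from source -- this assumes clang ships with the installed Xcode (or that the gcc-4.2 binaries from the cran.r-project tools are on the PATH), and relies on distutils honouring the CC/CXX environment overrides:

export CC=clang
export CXX=clang++
python setup.py build
python setup.py install

With the R Project compilers, substitute CC=gcc-4.2; for scipy a Fortran compiler such as the matching gfortran-4.2 is also needed (selected with the same --fcompiler flag used in the build commands quoted later in this thread).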
See this message for more information http://mail.scipy.org/pipermail/scipy-user/2012-February/031460.html hth, Scott From massimodisasha at gmail.com Mon Mar 12 04:15:54 2012 From: massimodisasha at gmail.com (Massimo Di Stefano) Date: Mon, 12 Mar 2012 04:15:54 -0400 Subject: [SciPy-User] numpy - scipy test failure on OSX (git version 10/3/2012) In-Reply-To: References: <4C18CACA-0D26-4EEE-84B8-9CD8F82621C4@gmail.com> Message-ID: <840B0189-6F93-400C-86AD-FF5B7F10A408@gmail.com> I tried to upgrade to Xcode 4.3 .. but seems that 'for free' there is only Xcore 4.2 on the apple dev center. i installed the gcc from the crane website and i exported the flags .. but numpy build is giving me word error about array with negative size. here [1] the full log using the gcc from the R website. [1] http://www.geofemengineering.it/numpy_build_log.txt i think it i is the cause of all the evil ? i should fix this before to go a had with scipy. frustrated i also tried to build gcc4.6 from source .. but no lucky the setup.py ends with : MacBook-Pro-di-Massimo:numpy epifanio$ export CC=/usr/local/bin/gcc MacBook-Pro-di-Massimo:numpy epifanio$ export CXX=/usr/local/bin/g++ MacBook-Pro-di-Massimo:numpy epifanio$ rm -rf build/ MacBook-Pro-di-Massimo:numpy epifanio$ python setup.py build_ext --fcompiler=/usr/local/bin/gfortran-4.2 Running from numpy source directory. non-existing path in 'numpy/distutils': 'site.cfg' F2PY Version 2 numpy/core/setup_common.py:86: MismatchCAPIWarning: API mismatch detected, the C API version numbers have to be updated. Current C api version is 6, with checksum eb54c77ff4149bab310324cd7c0cb176, but recorded checksum for C API version 6 in codegen_dir/cversions.txt is e61d5dc51fa1c6459328266e215d6987. If functions were added in the C API, you have to update C_API_VERSION in numpy/core/setup_common.pyc. 
MismatchCAPIWarning)
blas_opt_info:
  FOUND:
    extra_link_args = ['-Wl,-framework', '-Wl,Accelerate']
    define_macros = [('NO_ATLAS_INFO', 3)]
    extra_compile_args = ['-msse3', '-I/System/Library/Frameworks/vecLib.framework/Headers']

lapack_opt_info:
  FOUND:
    extra_link_args = ['-Wl,-framework', '-Wl,Accelerate']
    define_macros = [('NO_ATLAS_INFO', 3)]
    extra_compile_args = ['-msse3']

running build_ext
running build_src
build_src
building py_modules sources
creating build
creating build/src.macosx-10.7-intel-2.7
creating build/src.macosx-10.7-intel-2.7/numpy
creating build/src.macosx-10.7-intel-2.7/numpy/distutils
building library "npymath" sources
customize NAGFCompiler
Could not locate executable f95
customize AbsoftFCompiler
Could not locate executable f90
Could not locate executable f77
customize IBMFCompiler
Could not locate executable xlf90
Could not locate executable xlf
customize IntelFCompiler
Could not locate executable ifort
Could not locate executable ifc
customize GnuFCompiler
Could not locate executable g77
customize Gnu95FCompiler
Found executable /usr/local/bin/gfortran
customize Gnu95FCompiler
customize Gnu95FCompiler using config
C compiler: /usr/local/bin/gcc -fno-strict-aliasing -fno-common -dynamic -g -Os -pipe -fno-common -fno-strict-aliasing -fwrapv -mno-fused-madd -DENABLE_DTRACE -DMACOSX -DNDEBUG -Wall -Wstrict-prototypes -Wshorten-64-to-32 -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -DENABLE_DTRACE -arch i386 -arch x86_64 -pipe

compile options: '-Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/src/npysort -Inumpy/core/include -I/System/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c'
gcc: _configtest.c
gcc: warning: '-mfused-madd' is deprecated; use '-ffp-contract=' instead
gcc: error: i386: No such file or directory
gcc: error: x86_64: No such file or directory
gcc: error: unrecognized option '-arch'
gcc: error: unrecognized option '-arch'
gcc: warning: '-mfused-madd' is deprecated; use '-ffp-contract=' instead
gcc: error: i386: No such file or directory
gcc: error: x86_64: No such file or directory
gcc: error: unrecognized option '-arch'
gcc: error: unrecognized option '-arch'
failure.
removing: _configtest.c _configtest.o Traceback (most recent call last): File "setup.py", line 214, in setup_package() File "setup.py", line 207, in setup_package configuration=configuration ) File "/Users/epifanio/dev/src/numpy/numpy/distutils/core.py", line 186, in setup return old_setup(**new_attr) File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/core.py", line 152, in setup dist.run_commands() File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/dist.py", line 953, in run_commands self.run_command(cmd) File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/dist.py", line 972, in run_command cmd_obj.run() File "/Users/epifanio/dev/src/numpy/numpy/distutils/command/build_ext.py", line 57, in run self.run_command('build_src') File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/cmd.py", line 326, in run_command self.distribution.run_command(command) File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/dist.py", line 972, in run_command cmd_obj.run() File "/Users/epifanio/dev/src/numpy/numpy/distutils/command/build_src.py", line 152, in run self.build_sources() File "/Users/epifanio/dev/src/numpy/numpy/distutils/command/build_src.py", line 163, in build_sources self.build_library_sources(*libname_info) File "/Users/epifanio/dev/src/numpy/numpy/distutils/command/build_src.py", line 298, in build_library_sources sources = self.generate_sources(sources, (lib_name, build_info)) File "/Users/epifanio/dev/src/numpy/numpy/distutils/command/build_src.py", line 385, in generate_sources source = func(extension, build_dir) File "numpy/core/setup.py", line 648, in get_mathlib_info raise RuntimeError("Broken toolchain: cannot link a simple C program") RuntimeError: Broken toolchain: cannot link a simple C program MacBook-Pro-di-Massimo:numpy epifanio$ maybe i missed something during the gcc configure ??? Il giorno Mar 12, 2012, alle ore 1:29 AM, Scott Lasley ha scritto: > > On Mar 11, 2012, at 4:53 PM, Massimo Di Stefano wrote: > >> Hi All, >> >> i'm on OSX lion using the system python. >> >> i just finished to build numpy and scipy using the later gig version and i tried to run the test. >> unlucky bumpy failed ands scipy test gives me a segfault. >> i hope something is wrong on my system .. have you any hints on how to debug this problem ? >> >> attached a link to the test log [1]. >> >> [1] http://www.geofemengineering.it//numpy_scipy_log.txt > > Which version of Xcode do you have? I got a segfault in the scipy tests after building with the llvm-gcc included with Xcode 4.2. I have not tried building scipy with the latest llvm-gcc included with Xcode 4.3, but building with clang or gcc-4.2 from the cran.r-project worked. 
See this message for more information > http://mail.scipy.org/pipermail/scipy-user/2012-February/031460.html > > hth, > Scott > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From robert.kern at gmail.com Mon Mar 12 05:43:54 2012 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 12 Mar 2012 09:43:54 +0000 Subject: [SciPy-User] numpy - scipy test failure on OSX (git version 10/3/2012) In-Reply-To: <840B0189-6F93-400C-86AD-FF5B7F10A408@gmail.com> References: <4C18CACA-0D26-4EEE-84B8-9CD8F82621C4@gmail.com> <840B0189-6F93-400C-86AD-FF5B7F10A408@gmail.com> Message-ID: On Mon, Mar 12, 2012 at 08:15, Massimo Di Stefano wrote: > I tried to upgrade to Xcode 4.3 .. but seems that 'for free' there is only Xcore 4.2 on the apple dev center. > > i installed the gcc from the crane website and i exported the flags .. but numpy build is giving me word error about array with negative size. > > here [1] the full log using the gcc from the R website. > > [1] http://www.geofemengineering.it/numpy_build_log.txt > > > i think it i is the cause of all the evil ? i should fix this before to go a had with scipy. > > > > frustrated i also tried to build gcc4.6 from source ?.. but no lucky the setup.py ends with : > C compiler: /usr/local/bin/gcc -fno-strict-aliasing -fno-common -dynamic -g -Os -pipe -fno-common -fno-strict-aliasing -fwrapv -mno-fused-madd -DENABLE_DTRACE -DMACOSX -DNDEBUG -Wall -Wstrict-prototypes -Wshorten-64-to-32 -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -DENABLE_DTRACE -arch i386 -arch x86_64 -pipe > > compile options: '-Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/src/npysort -Inumpy/core/include -I/System/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c' > gcc: _configtest.c > gcc: warning: ?-mfused-madd? is deprecated; use ?-ffp-contract=? instead > gcc: error: i386: No such file or directory > gcc: error: x86_64: No such file or directory > gcc: error: unrecognized option ?-arch? > gcc: error: unrecognized option ?-arch? > gcc: warning: ?-mfused-madd? is deprecated; use ?-ffp-contract=? instead > gcc: error: i386: No such file or directory > gcc: error: x86_64: No such file or directory > gcc: error: unrecognized option ?-arch? > gcc: error: unrecognized option ?-arch? > failure. > maybe i missed something during the gcc configure ??? You need to apply Apple's patches to provide the "-arch" compile flag. I don't believe they have patches to gcc 4.6. This is probably the easiest way to get command-line installs of the official gcc: http://kennethreitz.com/xcode-gcc-and-homebrew.html -- Robert Kern From matt.newville at gmail.com Sat Mar 10 13:24:42 2012 From: matt.newville at gmail.com (Matthew Newville) Date: Sat, 10 Mar 2012 10:24:42 -0800 (PST) Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? 
In-Reply-To: References: <4F5916A2.2040604@eso.org> <20120308210722.GC12436@phare.normalesup.org> <6942812.8.1331254549777.JavaMail.geo-discussion-forums@vbuf40> <20120309165443.GA26552@phare.normalesup.org> <32380089.1618.1331327615123.JavaMail.geo-discussion-forums@ynjc20> Message-ID: <1111627.2833.1331403882535.JavaMail.geo-discussion-forums@ynnk21> On Saturday, March 10, 2012 11:00:44 AM UTC-6, Pauli Virtanen wrote: > > 09.03.2012 22:13, Matthew Newville kirjoitti: > [clip] > > Well, I understand you don't necessarily consider yourself to be a scipy > > dev, but I'll ask anyway: How many pull requests have been made, and > > how many have been accepted? Perhaps I am not reading github's Pull > > Request link correctly, but that seems to indicate that the numbers are > > 17 and 0. Surely, those cannot be correct. Unless I am counting > > wrong, 5 of the 17 (total? outstanding?) pull requests listed for > > scipy/scipy involve optimization. > > Wrong: 17 open, 99 accepted. > Yes, I see that (or more) now.... I've been using git for a while now, but still learning how to read github pages. I knew 17/0 couldn't be right.... Sorry for that. > [clip] > > Well, there was a discussion "Alternative to scipy.optimize" in the past > > two weeks on scipy-users that mentioned lmfit, and several over the past > > several months, and apparently requests about mpfit in the more distant > > past. And yet two scipy contributers (according to github's list, I am > > counting you) responded to the original request for features **exactly > > like lmfit** with something reading an awful lot like "Well, you'll have > > to write one". Perhaps insular is not a fair characterization -- how > > would you characterize that? > > Come on. The correct characterization is just: "busy". I did not > remember that your project existed as it was announced half a year ago > with no proposal that it should be integrated, and did not read the > recent thread on scipy-user. > OK, I can accept that. We're all busy. I still take exception with Gael's response to the original question. In this instance, there was browbeating of outside people for not contributing coupled with ignoring contributions from outside people. You'll probably forgive me for thinking that is not so encouraging for outside developers.... Scipy is fantastic project, and I've been relying on it for many years. But it is also very large and diverse, with lots of great code, and some mediocre code, and it's sometimes difficult to tell what is well supported and what is less so. It's also not clear to me what the strategy is for deciding what belongs in the core and what belongs in outside projects. The problems of how to organize the scientific python projects, and how to attract more developers, are challenging. I do not pretend to know how to do that, and I'm sure you've all thought about and discussed this at length, but it might be important to give more attention to these issues. But I also submit that the (self) perception that there is a small group of people doing all the work and a large group of people who do nothing but ask for more features is likely to be self-fulfilling. The way out of the cage is probably not by running harder or screaming louder, but by moving the boundaries. Cheers, -Matt Newville -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alec.kalinin at gmail.com Mon Mar 12 09:56:47 2012 From: alec.kalinin at gmail.com (Alexander Kalinin) Date: Mon, 12 Mar 2012 16:56:47 +0300 Subject: [SciPy-User] What is the best way to take elements from an array along an axis? Message-ID: Hello, When we use "fancy" indexing to take elements from an array along an axis, the error could appear. Look at the code import numpy as np a = np.random.rand(10, 3) b = np.random.rand(10, 3) c = b[:, 0] result = a * c We got an error: "ValueError: shape mismatch: objects cannot be broadcast to a single shape". We have the shape problem: >>> a.shape (10, 3) >>> c.shape (10,) >> I know two ways to overcome this problem: 1 way: c = b[:, 0].reshape(-1, 1) 2 way: c = b[:, 0, np.newaxis] But what is the best practice for this case? How I should take elements from an array? Sincerely, Alexander -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Mar 12 10:04:28 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 12 Mar 2012 10:04:28 -0400 Subject: [SciPy-User] What is the best way to take elements from an array along an axis? In-Reply-To: References: Message-ID: On Mon, Mar 12, 2012 at 9:56 AM, Alexander Kalinin wrote: > Hello, > > When we use "fancy" indexing to take elements from an array along an axis, > the error could appear. Look at the code > > import numpy as np > > a = np.random.rand(10, 3) > > b = np.random.rand(10, 3) > > c = b[:, 0] > > result = a * c > > > We got an error: "ValueError: shape mismatch: objects cannot be broadcast to > a single shape". We have the shape problem: > > >>>> a.shape > (10, 3) >>>> c.shape > (10,) >>> > > I know two ways to overcome this problem: > 1 way: > c = b[:, 0].reshape(-1, 1) > > > 2 way: > > c = b[:, 0, np.newaxis] > > > But what is the best practice for this case? How I should take elements from > an array? when I know I will need it right away with the extra axis, I usually use slices c = b[:, :1] or c = b[:, k:k+1] numpy also has a function to add a newaxis that I often use, but it has such an awful name that I don't find it anymore. :) Josef > > > Sincerely, > > Alexander > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Mon Mar 12 10:07:11 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 12 Mar 2012 10:07:11 -0400 Subject: [SciPy-User] What is the best way to take elements from an array along an axis? In-Reply-To: References: Message-ID: On Mon, Mar 12, 2012 at 10:04 AM, wrote: > On Mon, Mar 12, 2012 at 9:56 AM, Alexander Kalinin > wrote: >> Hello, >> >> When we use "fancy" indexing to take elements from an array along an axis, >> the error could appear. Look at the code >> >> import numpy as np >> >> a = np.random.rand(10, 3) >> >> b = np.random.rand(10, 3) >> >> c = b[:, 0] >> >> result = a * c >> >> >> We got an error: "ValueError: shape mismatch: objects cannot be broadcast to >> a single shape". We have the shape problem: >> >> >>>>> a.shape >> (10, 3) >>>>> c.shape >> (10,) >>>> >> >> I know two ways to overcome this problem: >> 1 way: >> c = b[:, 0].reshape(-1, 1) >> >> >> 2 way: >> >> c = b[:, 0, np.newaxis] >> >> >> But what is the best practice for this case? How I should take elements from >> an array? 
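Consolidating this sub-thread into one runnable sketch (plain numpy; the expand_dims form is the one josef names in the follow-up just below -- all three forms keep the second axis so that broadcasting succeeds):

import numpy as np

a = np.random.rand(10, 3)
b = np.random.rand(10, 3)

c1 = b[:, :1]                    # slicing keeps the second axis, shape (10, 1)
c2 = b[:, 0, np.newaxis]         # same result via newaxis
c3 = np.expand_dims(b[:, 0], 1)  # same result via expand_dims

result = a * c1                  # broadcasts (10, 3) * (10, 1) -> (10, 3)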
> > when I know I will need it right away with the extra axis, I usually use slices > > c = b[:, :1] > or > c = b[:, k:k+1] > > numpy also has a function to add a newaxis that I often use, but it > has such an awful name that I don't find it anymore. :) numpy.expand_dims(a, axis) which is nice if the axis is a parameter and not in a fixed position Josef > > Josef > > > >> >> >> Sincerely, >> >> Alexander >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> From alec.kalinin at gmail.com Mon Mar 12 10:10:06 2012 From: alec.kalinin at gmail.com (Alexander Kalinin) Date: Mon, 12 Mar 2012 17:10:06 +0300 Subject: [SciPy-User] What is the best way to take elements from an array along an axis? In-Reply-To: References: Message-ID: Thank you, Josef! Yes, I think c = b[:, :1] or c = b[:, k:k+1] is the most clean way to do what i want. Sincerely, Alexander On Mon, Mar 12, 2012 at 6:04 PM, wrote: > On Mon, Mar 12, 2012 at 9:56 AM, Alexander Kalinin > wrote: > > Hello, > > > > When we use "fancy" indexing to take elements from an array along an > axis, > > the error could appear. Look at the code > > > > import numpy as np > > > > a = np.random.rand(10, 3) > > > > b = np.random.rand(10, 3) > > > > c = b[:, 0] > > > > result = a * c > > > > > > We got an error: "ValueError: shape mismatch: objects cannot be > broadcast to > > a single shape". We have the shape problem: > > > > > >>>> a.shape > > (10, 3) > >>>> c.shape > > (10,) > >>> > > > > I know two ways to overcome this problem: > > 1 way: > > c = b[:, 0].reshape(-1, 1) > > > > > > 2 way: > > > > c = b[:, 0, np.newaxis] > > > > > > But what is the best practice for this case? How I should take elements > from > > an array? > > when I know I will need it right away with the extra axis, I usually use > slices > > c = b[:, :1] > or > c = b[:, k:k+1] > > numpy also has a function to add a newaxis that I often use, but it > has such an awful name that I don't find it anymore. :) > > Josef > > > > > > > > > Sincerely, > > > > Alexander > > > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From massimodisasha at gmail.com Mon Mar 12 10:36:07 2012 From: massimodisasha at gmail.com (Massimo Di Stefano) Date: Mon, 12 Mar 2012 10:36:07 -0400 Subject: [SciPy-User] numpy - scipy test failure on OSX (git version 10/3/2012) In-Reply-To: References: <4C18CACA-0D26-4EEE-84B8-9CD8F82621C4@gmail.com> <840B0189-6F93-400C-86AD-FF5B7F10A408@gmail.com> Message-ID: i removed the Xcode tools and I've re-installed the latest Xcode tools, the gcc compiler is : MacBook-Pro-di-Massimo:~ epifanio$ gcc --version i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.9.00) Copyright (C) 2007 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. it is the same version i had previously ? the build give the same errors about array of negative size and the test fails as before. so, i decided to give a chance to home brew. 
i installed it and python, then using the homebrew easy_install executable i installed nose. building nupy i noticed the same bad log about error array with negative size. running the test i got this : ............................................................. ====================================================================== FAIL: test_kind.TestKind.test_all ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/Cellar/python/2.7.2/lib/python2.7/site-packages/nose-1.1.2-py2.7.egg/nose/case.py", line 197, in runTest self.test(*self.arg) File "/usr/local/Cellar/python/2.7.2/lib/python2.7/site-packages/numpy/f2py/tests/test_kind.py", line 30, in test_all 'selectedrealkind(%s): expected %r but got %r' % (i, selected_real_kind(i), selectedrealkind(i))) File "/usr/local/Cellar/python/2.7.2/lib/python2.7/site-packages/numpy/testing/utils.py", line 34, in assert_ raise AssertionError(msg) AssertionError: selectedrealkind(19): expected -1 but got 16 ---------------------------------------------------------------------- Ran 3650 tests in 29.466s FAILED (KNOWNFAIL=3, SKIP=5, failures=1) >>> so far, .. no lucky. i'll try to build the 'stable' old bumpy and see what happens. Il giorno Mar 12, 2012, alle ore 5:43 AM, Robert Kern ha scritto: > On Mon, Mar 12, 2012 at 08:15, Massimo Di Stefano > wrote: >> I tried to upgrade to Xcode 4.3 .. but seems that 'for free' there is only Xcore 4.2 on the apple dev center. >> >> i installed the gcc from the crane website and i exported the flags .. but numpy build is giving me word error about array with negative size. >> >> here [1] the full log using the gcc from the R website. >> >> [1] http://www.geofemengineering.it/numpy_build_log.txt >> >> >> i think it i is the cause of all the evil ? i should fix this before to go a had with scipy. >> >> >> >> frustrated i also tried to build gcc4.6 from source .. but no lucky the setup.py ends with : > >> C compiler: /usr/local/bin/gcc -fno-strict-aliasing -fno-common -dynamic -g -Os -pipe -fno-common -fno-strict-aliasing -fwrapv -mno-fused-madd -DENABLE_DTRACE -DMACOSX -DNDEBUG -Wall -Wstrict-prototypes -Wshorten-64-to-32 -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -DENABLE_DTRACE -arch i386 -arch x86_64 -pipe >> >> compile options: '-Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/src/npysort -Inumpy/core/include -I/System/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c' >> gcc: _configtest.c >> gcc: warning: ?-mfused-madd? is deprecated; use ?-ffp-contract=? instead >> gcc: error: i386: No such file or directory >> gcc: error: x86_64: No such file or directory >> gcc: error: unrecognized option ?-arch? >> gcc: error: unrecognized option ?-arch? >> gcc: warning: ?-mfused-madd? is deprecated; use ?-ffp-contract=? instead >> gcc: error: i386: No such file or directory >> gcc: error: x86_64: No such file or directory >> gcc: error: unrecognized option ?-arch? >> gcc: error: unrecognized option ?-arch? >> failure. > >> maybe i missed something during the gcc configure ??? > > You need to apply Apple's patches to provide the "-arch" compile flag. > I don't believe they have patches to gcc 4.6. 
> > This is probably the easiest way to get command-line installs of the > official gcc: > > http://kennethreitz.com/xcode-gcc-and-homebrew.html > > -- > Robert Kern > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From efmgdj at yahoo.com Mon Mar 12 12:48:54 2012 From: efmgdj at yahoo.com (Eric) Date: Mon, 12 Mar 2012 16:48:54 +0000 (UTC) Subject: [SciPy-User] scipy sparse limits Message-ID: Hi, I'm trying to use large 10^5x10^5 sparse matrices but seem to be running up against a scipy limit: n=10**5, x=sp.rand(n,n,.001) gets "ValueError: Trying to generate a random sparse matrix such as the product of dimensions is greater than 2147483647 - this is not supported on this machine" Does anyone know why limit is there and if I can avoid it? (fyi, I'm using a macbook air with 4gb memory and the enthought distribution) thanks, Eric From pav at iki.fi Mon Mar 12 13:28:53 2012 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 12 Mar 2012 18:28:53 +0100 Subject: [SciPy-User] scipy sparse limits In-Reply-To: References: Message-ID: 12.03.2012 17:48, Eric kirjoitti: [clip] > n=10**5, x=sp.rand(n,n,.001) gets > "ValueError: Trying to generate a random sparse matrix such > as the product of dimensions is greater than > 2147483647 - this is not supported on this machine" > > Does anyone know why limit is there and if I can avoid it? The limit seems to be only in the rand() routine --- you can create larger random matrices otherwise. The limit probably shouldn't be there --- this is probably a bug. As a workaround, you can copy and paste the rand() routine to your own code and remove the size check: https://github.com/scipy/scipy/blob/master/scipy/sparse/construct.py#L573 From cweisiger at msg.ucsf.edu Mon Mar 12 13:32:06 2012 From: cweisiger at msg.ucsf.edu (Chris Weisiger) Date: Mon, 12 Mar 2012 10:32:06 -0700 Subject: [SciPy-User] scipy sparse limits In-Reply-To: References: Message-ID: On Mon, Mar 12, 2012 at 9:48 AM, Eric wrote: > Hi, ?I'm trying to use large 10^5x10^5 sparse > matrices but seem to > be running up against a scipy limit: > > n=10**5, x=sp.rand(n,n,.001) gets > "ValueError: Trying to generate a random sparse matrix such > as the product of dimensions is greater than > 2147483647?- this is not supported on this machine" > > Does anyone know why limit is there and if I can avoid it? log2(2147483648) = 31, so it sounds like you're running into a 32-bit limitation somewhere. If you can, using 64-bit Python/Numpy/Scipy would probably get you around this. -Chris From ralf.gommers at googlemail.com Mon Mar 12 17:41:47 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 12 Mar 2012 22:41:47 +0100 Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? 
In-Reply-To: <1111627.2833.1331403882535.JavaMail.geo-discussion-forums@ynnk21> References: <4F5916A2.2040604@eso.org> <20120308210722.GC12436@phare.normalesup.org> <6942812.8.1331254549777.JavaMail.geo-discussion-forums@vbuf40> <20120309165443.GA26552@phare.normalesup.org> <32380089.1618.1331327615123.JavaMail.geo-discussion-forums@ynjc20> <1111627.2833.1331403882535.JavaMail.geo-discussion-forums@ynnk21> Message-ID: On Sat, Mar 10, 2012 at 7:24 PM, Matthew Newville wrote: > > > On Saturday, March 10, 2012 11:00:44 AM UTC-6, Pauli Virtanen wrote: >> >> 09.03.2012 22:13, Matthew Newville kirjoitti: >> [clip] >> > Well, I understand you don't necessarily consider yourself to be a scipy >> > dev, but I'll ask anyway: How many pull requests have been made, and >> > how many have been accepted? Perhaps I am not reading github's Pull >> > Request link correctly, but that seems to indicate that the numbers are >> > 17 and 0. Surely, those cannot be correct. Unless I am counting >> > wrong, 5 of the 17 (total? outstanding?) pull requests listed for >> > scipy/scipy involve optimization. >> >> Wrong: 17 open, 99 accepted. >> > > Yes, I see that (or more) now.... I've been using git for a while now, > but still learning how to read github pages. I knew 17/0 couldn't be > right.... Sorry for that. > > >> [clip] >> > Well, there was a discussion "Alternative to scipy.optimize" in the past >> > two weeks on scipy-users that mentioned lmfit, and several over the past >> > several months, and apparently requests about mpfit in the more distant >> > past. And yet two scipy contributers (according to github's list, I am >> > counting you) responded to the original request for features **exactly >> > like lmfit** with something reading an awful lot like "Well, you'll have >> > to write one". Perhaps insular is not a fair characterization -- how >> > would you characterize that? >> >> Come on. The correct characterization is just: "busy". I did not >> remember that your project existed as it was announced half a year ago >> with no proposal that it should be integrated, and did not read the >> recent thread on scipy-user. >> > OK, I can accept that. We're all busy. > > I still take exception with Gael's response to the original question. In > this instance, there was browbeating of outside people for not contributing > coupled with ignoring contributions from outside people. You'll probably > forgive me for thinking that is not so encouraging for outside > developers.... > > Scipy is fantastic project, and I've been relying on it for many years. > But it is also very large and diverse, with lots of great code, and some > mediocre code, and it's sometimes difficult to tell what is well supported > and what is less so. It's also not clear to me what the strategy is for > deciding what belongs in the core and what belongs in outside projects. > This strategy was never really clear, we're trying to improve that. The recent "SciPy Goal" thread helped a lot there. Better descriptions of what has been discussed and how to make these decisions (in general, there's always a grey area) should be put up for review and discussion soon. Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From eddybarratt1 at yahoo.co.uk Mon Mar 12 17:46:07 2012 From: eddybarratt1 at yahoo.co.uk (Eddy Barratt) Date: Mon, 12 Mar 2012 21:46:07 +0000 (GMT) Subject: [SciPy-User] Building numpy/scipy for python3 on MacOS Lion Message-ID: <1331588767.69711.YahooMailNeo@web29505.mail.ird.yahoo.com> I can't get Numpy or Scipy to work with Python3 on Mac OSX Lion. I have used pip successfully to install numpy, scipy and matplotlib, and they work well with Python2.7, but in Python3 typing 'import numpy' brings up 'No module named numpy'. I've tried downloading the source code directly and then running 'python3 setup.py build', but I get various error warnings, some in red that have to do with fortran (e.g. 'Could not locate executable f95'). The error message that appears to fail in the end is 'RuntimeError: Broken toolchain: cannot link a simple C program', and appears to be related to the previous line 'sh: gcc-4.2: command not found'. The Scipy website (http://www.scipy.org/Installing_SciPy/Mac_OS_X) suggests that there may be issues with the c compiler, but the same problems didn't arise using pip to install for python2.7. I have followed the instructions on the website regarding changing the compiler but this has not made any difference. I have also tried installing from a virtual environment: >>> mkvirtualenv -p python3.2 test1 >>> pip install numpy But this fails with "Command python setup.py egg_info failed with error code 1 in /Users/Eddy/.virtualenvs/test1/build/numpy" I've considered making python3 default, and then I thought a pip install might work, but I don't know how to do that. Does anyone have any suggestions for how I might proceed? I'm relatively new to Python but it's something I feel I'm likely to become more involved in so I'd like to start using Python3 before I get too established with 2.7. Thanks for your help. Eddy From kgdunn at gmail.com Mon Mar 12 23:11:22 2012 From: kgdunn at gmail.com (Kevin Dunn) Date: Mon, 12 Mar 2012 23:11:22 -0400 Subject: [SciPy-User] Contributing to SciPy was Re: Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: Message-ID: On Sun, Mar 11, 2012 at 08:39, Pauli Virtanen wrote: > 11.03.2012 12:31, Sebastian Haase kirjoitti: > [clip] >> How does SciPy central compare to the cookbook ? ?It sound's like I >> was kind-of meant to supersede it ... ? > > The aim of the Scipy Central is more or less to supersede the Cookbook > and the Topical Software wiki pages with something more friendly. > > I believe constructive suggestions on how to improve it would be welcome. > >> FWIW, I didn't know (or must have forgotten) about scipy central - and >> a google search also did NOT really help !!! Only the 5th hit was >> "User profile: SciPy Central --- >> scipy-central.org/user/profile/scipy-central/" ?after "cookbook" on >> first place, ?followed by some mailing list post from Sept-2011 ... >> and so on ... > > Today, that link is the first hit, probably thanks to the Google juice > from the scipy.org front page. > > I wonder why the leading hit is not the front page, though. Some > additional Google-fu is maybe required. The user pages probably should > have the noindex meta, as they're not so useful to have in search engines. > > -- > Pauli Virtanen I agree that the SciPy Central site needs more work. I was hoping to slowly copy/paste the cookbook examples and some of the Topical Software wiki pages over to the site. I also agree the site needs to be visually improved. 
And, commenting can definitely be added to the site, but it will require a bit of thought on getting it right, so that it's useful (something like StackOverflow, as David mentioned).

However, I've not been able to get around to any of these issues due to personal time constraints, and will not be able to until May.

So perhaps a good start is if someone would like to get the cookbook examples copied over. I can share the credentials for the http://scipy-central.org/user/profile/scipy-central/ user, so that no one in particular gets credit for these submissions.

Ideas for visual changes to the site can usually be easily tested and incorporated, while suggestions for a commenting system might best be handled in a new thread, to keep discussion on track.

Kevin
(SciPy Central maintainer)

From fralosal at ei.upv.es  Tue Mar 13 09:02:11 2012
From: fralosal at ei.upv.es (javi)
Date: Tue, 13 Mar 2012 13:02:11 +0000 (UTC)
Subject: [SciPy-User] how to use properly the function fmin () to scipy.optimize
Message-ID: 

Hello, I have been trying to find the right way to use the function fmin() to use downhill simplex.

Mainly, my problem is getting the algorithm to converge to a good result, i.e. to a solution with a value next to zero.

To test the performance of the algorithm I used the following example:

def minimize(x):

    min = x[0] + x[1] + x[2] + x[3]
    return min

in which, given a vector x, I would want to obtain the values of its elements that when added give the minimum possible value.

To do this I use the following function call:

solution = fmin(minimize, x0=array([1, 2, 3, 4]), args="1", xtol=0.21, ftol=0.21, full_output=1)

print "value parameters", solution[0], "\n"

and I get the following results:

Optimization terminated successfully.
         Current function value: 10.000000
         Iterations: 1
         Function evaluations: 5

value of the parameters: [1. 2. 3. 4.]

As you can see the solution is VERY BAD, and I understand that, due to the large values of ftol and xtol that I gave it, it converges very quickly and gives a small value.

Now, for it to give a better result, i.e. better than the 10 found, I understand that I must decrease the ftol and xtol values, but in doing so I get:

"Warning: Maximum number of function evaluations has been exceeded."

where I understand the algorithm has made excessive calls to the function "minimize" before converging.

Could you tell me the correct use of the parameters ftol and xtol to find a good minimum next to 0? Should the same values of ftol and xtol generally be used in such cases, or should they differ?

A greeting and thank you very much.

From denis-bz-gg at t-online.de  Tue Mar 13 10:47:09 2012
From: denis-bz-gg at t-online.de (denis)
Date: Tue, 13 Mar 2012 07:47:09 -0700 (PDT)
Subject: [SciPy-User] Contributing to SciPy was Re: Least-squares fittings with bounds: why is scipy not up to the task?
In-Reply-To: 
References: 
Message-ID: <878586b0-b759-4c32-beec-b1628c4e3358@y27g2000yqy.googlegroups.com>

Folks,
a view from the peanut gallery of 3 somewhat different areas that
Scipy-central, cookbook, docs.scipy, stackoverflow cover well / not so well:

1) Q+As
2) code recipes, a few pages with some test and """ doc """
3) narrative overviews, slides, talks, FAQs on
3a) basics: indexing, plotting, linking to cython/c ...
3b) area overviews: cluster fft integrate interpolate io ...
Some of these 1) Q+As are I think well covered by stackoverflow, so don't reinvent that wheel (although I liked advice.mechanicalkern.com) 2) there are quite a few sites to put up recipes but 100 unsorted recipes do not make a cookbook even with a snazzy cover. Sure user feedback, comments, weeding, organizing are important but weeding and sorting scipy.org/Cookbook is difficult-to- impossible, not happening. (Don't see what copying the lot would gain us.) 3) narrative overviews are not the topic of this thread but seem to me a need, an opportunity. Are there pages on scipy.org that collect the best slides, talks, screencasts, FAQs, Wikis with expert comments and critical reviews ? cheers -- denis On Mar 10, 7:16?pm, josef.p... at gmail.com wrote: > I would like to extend the discussion a bit. From warren.weckesser at enthought.com Tue Mar 13 13:35:25 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Tue, 13 Mar 2012 12:35:25 -0500 Subject: [SciPy-User] how to use properly the function fmin () to scipy.optimize In-Reply-To: References: Message-ID: On Tue, Mar 13, 2012 at 8:02 AM, javi wrote: > Hello, I have been trying to find the right way to use the function fmin > () to > use downhill simplex. > > Mainly I have a problem with that is that the algorithm converges to good > effect, ie as a solution with a value next to zero. > > To test the performance of the algorithm I used the following example: > > def minimize (x): > > min = x [0] + x [1] + x [2] + x [3] > return min > > In which given a vector x would want to obtain the values of its elements > that > when added give the minimum possible value. > > To do this use the following function call: > > solution = fmin (minimize, x0 = array ([1, 2, 3, 4]), args = "1", xtol = > 0.21, = > 0.21 ftol, full_output = 1) > > print "value parameters", solution [0], "\ n" > > and I get the following results: > > Optimization terminated successfully. > Current function value: 10.000000 > Iterations: 1 > Function evaluations: 5 > > value of the parameters: [1. 2. 3. 4.] > > As you can see the solution is VERY BAD, and I understand that due to large > values of ftol and xtol that I gave it converges very quickly and gives a > small value. > > Now, for that is a better result, ie, better than the 10 found understand > that I > must decrease and ftol xtol values??, but in doing so I get: > > > "Warning: Maximum number of function evaluations exceeded Has Been." > > Where I understand the algorithm before converging has made excessive > calls to > the function "minimize". > > Could you tell me what the correct use of the parameters ftol and xtol to > find > a good minimum next to 0?. Sshould generally be used in subsequent cases > of ftol > and xtol values???, They differ?. > > A greeting and thank you very much. > > It looks like you want to solve a *constrained* minimization problem, in which all the components of x remain positive. The function fmin() is for unconstrained optimization, and your objective function has no (unconstrained) minimum. You can try fmin_cobyla or fmin_slsqp. Here's a short demonstration: ----- from scipy.optimize import fmin_slsqp, fmin_cobyla def objective(x): """The objective function to be minized.""" return x.sum() def all_positive_constr(x): """Component constraint function for fmin_slsqp.""" return x # The following are the component constraint functions for fmin_cobyla. 
def x0_positive(x): return x[0] def x1_positive(x): return x[1] def x2_positive(x): return x[2] def x3_positive(x): return x[3] if __name__ == "__main__": print "Using fmin_slsqp" result = fmin_slsqp(objective, [1,2,3,4], f_ieqcons=all_positive_constr) print result print print "Using fmin_cobyla" result = fmin_cobyla(objective, [1,2,3,4], [x0_positive, x1_positive, x2_positive, x3_positive]) print result print ----- Warren _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lists at hilboll.de Tue Mar 13 13:37:27 2012 From: lists at hilboll.de (Andreas H.) Date: Tue, 13 Mar 2012 18:37:27 +0100 Subject: [SciPy-User] [scipy-central] Site design Message-ID: <3742dcea3dd622da7c4069310e9574e6.squirrel@srv2.s4y.tournesol-consulting.eu> Hi all, I think everyone agrees that the webdesign of scipy-central.org needs some major enhancements in order to make the site appealing to users so that they want to stay, browse, and use it. I think it would make sense to make the site visually similar to the main SciPy site (new.scipy.org), so that users can already "feel" the connection. I'm mainly talking about colors and fonts here. Also, a logo would be good. For a start, maybe we could use the main SciPy logo, but eventually, scipy-central should have its own, similar logo. Then, a sidebar would be nice. Possible blocks for the sidebar include 'links to core and related projects', 'what is SciPy', ... ideas welcome. If you agree, I could start playing around with the templates/css over the next weeks. Best, Andreas. From warren.weckesser at enthought.com Tue Mar 13 13:50:52 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Tue, 13 Mar 2012 12:50:52 -0500 Subject: [SciPy-User] how to use properly the function fmin () to scipy.optimize In-Reply-To: References: Message-ID: On Tue, Mar 13, 2012 at 12:35 PM, Warren Weckesser < warren.weckesser at enthought.com> wrote: > > > On Tue, Mar 13, 2012 at 8:02 AM, javi wrote: > >> Hello, I have been trying to find the right way to use the function fmin >> () to >> use downhill simplex. >> >> Mainly I have a problem with that is that the algorithm converges to good >> effect, ie as a solution with a value next to zero. >> >> To test the performance of the algorithm I used the following example: >> >> def minimize (x): >> >> min = x [0] + x [1] + x [2] + x [3] >> return min >> >> In which given a vector x would want to obtain the values of its elements >> that >> when added give the minimum possible value. >> >> To do this use the following function call: >> >> solution = fmin (minimize, x0 = array ([1, 2, 3, 4]), args = "1", xtol = >> 0.21, = >> 0.21 ftol, full_output = 1) >> >> print "value parameters", solution [0], "\ n" >> >> and I get the following results: >> >> Optimization terminated successfully. >> Current function value: 10.000000 >> Iterations: 1 >> Function evaluations: 5 >> >> value of the parameters: [1. 2. 3. 4.] >> >> As you can see the solution is VERY BAD, and I understand that due to >> large >> values of ftol and xtol that I gave it converges very quickly and gives a >> small value. >> >> Now, for that is a better result, ie, better than the 10 found understand >> that I >> must decrease and ftol xtol values??, but in doing so I get: >> >> >> "Warning: Maximum number of function evaluations exceeded Has Been." 
>> >> Where I understand the algorithm before converging has made excessive >> calls to >> the function "minimize". >> >> Could you tell me what the correct use of the parameters ftol and xtol >> to find >> a good minimum next to 0?. Sshould generally be used in subsequent cases >> of ftol >> and xtol values???, They differ?. >> >> A greeting and thank you very much. >> >> > > It looks like you want to solve a *constrained* minimization problem, in > which all the components of x remain positive. The function fmin() is for > unconstrained optimization, and your objective function has no > (unconstrained) minimum. > > You can try fmin_cobyla or fmin_slsqp. > Or fmin_tnc or fmin_l_bfgs. See the docstrings of these functions for more information and examples. Warren > Here's a short demonstration: > > ----- > from scipy.optimize import fmin_slsqp, fmin_cobyla > > > def objective(x): > """The objective function to be minized.""" > return x.sum() > > def all_positive_constr(x): > """Component constraint function for fmin_slsqp.""" > return x > > > # The following are the component constraint functions for fmin_cobyla. > > def x0_positive(x): > return x[0] > > def x1_positive(x): > return x[1] > > def x2_positive(x): > return x[2] > > def x3_positive(x): > return x[3] > > > if __name__ == "__main__": > > print "Using fmin_slsqp" > result = fmin_slsqp(objective, [1,2,3,4], > f_ieqcons=all_positive_constr) > print result > print > > print "Using fmin_cobyla" > result = fmin_cobyla(objective, [1,2,3,4], [x0_positive, x1_positive, > x2_positive, x3_positive]) > print result > print > ----- > > Warren > > _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fralosal at ei.upv.es Tue Mar 13 19:00:50 2012 From: fralosal at ei.upv.es (Francisco Javier =?iso-8859-1?b?TPNwZXo=?= Salcedo) Date: Wed, 14 Mar 2012 00:00:50 +0100 Subject: [SciPy-User] how to use properly the function fmin () to scipy.optimize In-Reply-To: References: Message-ID: <20120314000050.71713y0pqmkwrxcy@wm.upv.es> Warren thank you very much for your help and your demonstration. This was only invented problem to learn to use fmin (), but perhaps not the best problem solving with fmin as you say. I mistakenly thinking about that the expected minimum value would be zero, but obviously this is not true because the problem has no a priori minimum, I apologize. I understand that the problem will not converge naturally never, so if you do not define convergence by high values ??of xtol and ftol fmin() threw the warning that I mentioned having to evaluate the function "minimize" excessive times. Chiefly my question about the algorithm were two things, that defines exactly the parameter ftol and xtol?, What is the difference between them?, If I wanted to stop the algorithm when the minimum is not differentiated from one iteration to the next in a given amount which of these two parameters should have occasion to modify its default value?. Again thank you very much and sorry. Warren Weckesser escribi?: > On Tue, Mar 13, 2012 at 8:02 AM, javi wrote: > >> Hello, I have been trying to find the right way to use the function fmin >> () to >> use downhill simplex. >> >> Mainly I have a problem with that is that the algorithm converges to good >> effect, ie as a solution with a value next to zero. 
>> >> To test the performance of the algorithm I used the following example: >> >> def minimize (x): >> >> min = x [0] + x [1] + x [2] + x [3] >> return min >> >> In which given a vector x would want to obtain the values of its elements >> that >> when added give the minimum possible value. >> >> To do this use the following function call: >> >> solution = fmin (minimize, x0 = array ([1, 2, 3, 4]), args = "1", xtol = >> 0.21, = >> 0.21 ftol, full_output = 1) >> >> print "value parameters", solution [0], "\ n" >> >> and I get the following results: >> >> Optimization terminated successfully. >> Current function value: 10.000000 >> Iterations: 1 >> Function evaluations: 5 >> >> value of the parameters: [1. 2. 3. 4.] >> >> As you can see the solution is VERY BAD, and I understand that due to large >> values of ftol and xtol that I gave it converges very quickly and gives a >> small value. >> >> Now, for that is a better result, ie, better than the 10 found understand >> that I >> must decrease and ftol xtol values??, but in doing so I get: >> >> >> "Warning: Maximum number of function evaluations exceeded Has Been." >> >> Where I understand the algorithm before converging has made excessive >> calls to >> the function "minimize". >> >> Could you tell me what the correct use of the parameters ftol and xtol to >> find >> a good minimum next to 0?. Sshould generally be used in subsequent cases >> of ftol >> and xtol values???, They differ?. >> >> A greeting and thank you very much. >> >> > > It looks like you want to solve a *constrained* minimization problem, in > which all the components of x remain positive. The function fmin() is for > unconstrained optimization, and your objective function has no > (unconstrained) minimum. > > You can try fmin_cobyla or fmin_slsqp. Here's a short demonstration: > > ----- > from scipy.optimize import fmin_slsqp, fmin_cobyla > > > def objective(x): > """The objective function to be minized.""" > return x.sum() > > def all_positive_constr(x): > """Component constraint function for fmin_slsqp.""" > return x > > > # The following are the component constraint functions for fmin_cobyla. > > def x0_positive(x): > return x[0] > > def x1_positive(x): > return x[1] > > def x2_positive(x): > return x[2] > > def x3_positive(x): > return x[3] > > > if __name__ == "__main__": > > print "Using fmin_slsqp" > result = fmin_slsqp(objective, [1,2,3,4], f_ieqcons=all_positive_constr) > print result > print > > print "Using fmin_cobyla" > result = fmin_cobyla(objective, [1,2,3,4], [x0_positive, x1_positive, > x2_positive, x3_positive]) > print result > print > ----- > > Warren > > _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > From tsyu80 at gmail.com Tue Mar 13 22:22:30 2012 From: tsyu80 at gmail.com (Tony Yu) Date: Tue, 13 Mar 2012 22:22:30 -0400 Subject: [SciPy-User] [scipy-central] Site design In-Reply-To: <3742dcea3dd622da7c4069310e9574e6.squirrel@srv2.s4y.tournesol-consulting.eu> References: <3742dcea3dd622da7c4069310e9574e6.squirrel@srv2.s4y.tournesol-consulting.eu> Message-ID: On Tue, Mar 13, 2012 at 1:37 PM, Andreas H. wrote: > Hi all, > > I think everyone agrees that the webdesign of scipy-central.org needs some > major enhancements in order to make the site appealing to users so that > they want to stay, browse, and use it. 
> > I think it would make sense to make the site visually similar to the main > SciPy site (new.scipy.org), so that users can already "feel" the > connection. I'm mainly talking about colors and fonts here. > > Also, a logo would be good. For a start, maybe we could use the main SciPy > logo, but eventually, scipy-central should have its own, similar logo. > > Then, a sidebar would be nice. Possible blocks for the sidebar include > 'links to core and related projects', 'what is SciPy', ... ideas welcome. > > If you agree, I could start playing around with the templates/css over the > next weeks. > > Best, > Andreas. > > Here's a logo concept. The concept is a bit literal: SciPy curve, with the background of a globe and arrows pointing from different locations to a "central" point. I posted the code on github, if someone wants to play around with it (it's not particularly pretty code): https://github.com/tonysyu/SciPy-Central-Logo Tools used: * numpy * matplotlib * basemap * scipy (although this was a bit forced) -Tony -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: scipy_central_logo.png Type: image/png Size: 124761 bytes Desc: not available URL: From matthew.brett at gmail.com Tue Mar 13 22:42:28 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 13 Mar 2012 19:42:28 -0700 Subject: [SciPy-User] [scipy-central] Site design In-Reply-To: References: <3742dcea3dd622da7c4069310e9574e6.squirrel@srv2.s4y.tournesol-consulting.eu> Message-ID: Hi, On Tue, Mar 13, 2012 at 7:22 PM, Tony Yu wrote: > > > On Tue, Mar 13, 2012 at 1:37 PM, Andreas H. wrote: >> >> Hi all, >> >> I think everyone agrees that the webdesign of scipy-central.org needs some >> major enhancements in order to make the site appealing to users so that >> they want to stay, browse, and use it. >> >> I think it would make sense to make the site visually similar to the main >> SciPy site (new.scipy.org), so that users can already "feel" the >> connection. I'm mainly talking about colors and fonts here. >> >> Also, a logo would be good. For a start, maybe we could use the main SciPy >> logo, but eventually, scipy-central should have its own, similar logo. >> >> Then, a sidebar would be nice. Possible blocks for the sidebar include >> 'links to core and related projects', 'what is SciPy', ... ideas welcome. >> >> If you agree, I could start playing around with the templates/css over the >> next weeks. >> >> Best, >> Andreas. >> > > Here's a logo concept. The concept is a bit literal: SciPy curve, with the > background of a globe and arrows pointing from different locations to a > "central" point. Thanks for doing that - it looks good. But - aren't the arrows pointing dangerously close to the North Atlantic Garbage Patch? http://en.wikipedia.org/wiki/North_Atlantic_Garbage_Patch Best, Matthew From tsyu80 at gmail.com Tue Mar 13 22:46:00 2012 From: tsyu80 at gmail.com (Tony Yu) Date: Tue, 13 Mar 2012 22:46:00 -0400 Subject: [SciPy-User] [scipy-central] Site design In-Reply-To: References: <3742dcea3dd622da7c4069310e9574e6.squirrel@srv2.s4y.tournesol-consulting.eu> Message-ID: On Tue, Mar 13, 2012 at 10:42 PM, Matthew Brett wrote: > Hi, > > On Tue, Mar 13, 2012 at 7:22 PM, Tony Yu wrote: > > > > > > On Tue, Mar 13, 2012 at 1:37 PM, Andreas H. 
wrote: > >> > >> Hi all, > >> > >> I think everyone agrees that the webdesign of scipy-central.org needs > some > >> major enhancements in order to make the site appealing to users so that > >> they want to stay, browse, and use it. > >> > >> I think it would make sense to make the site visually similar to the > main > >> SciPy site (new.scipy.org), so that users can already "feel" the > >> connection. I'm mainly talking about colors and fonts here. > >> > >> Also, a logo would be good. For a start, maybe we could use the main > SciPy > >> logo, but eventually, scipy-central should have its own, similar logo. > >> > >> Then, a sidebar would be nice. Possible blocks for the sidebar include > >> 'links to core and related projects', 'what is SciPy', ... ideas > welcome. > >> > >> If you agree, I could start playing around with the templates/css over > the > >> next weeks. > >> > >> Best, > >> Andreas. > >> > > > > Here's a logo concept. The concept is a bit literal: SciPy curve, with > the > > background of a globe and arrows pointing from different locations to a > > "central" point. > > Thanks for doing that - it looks good. > > But - aren't the arrows pointing dangerously close to the North > Atlantic Garbage Patch? > > http://en.wikipedia.org/wiki/North_Atlantic_Garbage_Patch > > Best, > > Matthew > I had no intention of predicting the code quality of SciPy Central submissions when designing this. ;) Best, -Tony -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue Mar 13 23:44:13 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 13 Mar 2012 23:44:13 -0400 Subject: [SciPy-User] [scipy-central] Site design In-Reply-To: References: <3742dcea3dd622da7c4069310e9574e6.squirrel@srv2.s4y.tournesol-consulting.eu> Message-ID: On Tue, Mar 13, 2012 at 10:46 PM, Tony Yu wrote: > > > On Tue, Mar 13, 2012 at 10:42 PM, Matthew Brett > wrote: >> >> Hi, >> >> On Tue, Mar 13, 2012 at 7:22 PM, Tony Yu wrote: >> > >> > >> > On Tue, Mar 13, 2012 at 1:37 PM, Andreas H. wrote: >> >> >> >> Hi all, >> >> >> >> I think everyone agrees that the webdesign of scipy-central.org needs >> >> some >> >> major enhancements in order to make the site appealing to users so that >> >> they want to stay, browse, and use it. >> >> >> >> I think it would make sense to make the site visually similar to the >> >> main >> >> SciPy site (new.scipy.org), so that users can already "feel" the >> >> connection. I'm mainly talking about colors and fonts here. >> >> >> >> Also, a logo would be good. For a start, maybe we could use the main >> >> SciPy >> >> logo, but eventually, scipy-central should have its own, similar logo. >> >> >> >> Then, a sidebar would be nice. Possible blocks for the sidebar include >> >> 'links to core and related projects', 'what is SciPy', ... ideas >> >> welcome. >> >> >> >> If you agree, I could start playing around with the templates/css over >> >> the >> >> next weeks. >> >> >> >> Best, >> >> Andreas. >> >> >> > >> > Here's a logo concept. The concept is a bit literal: SciPy curve, with >> > the >> > background of a globe and arrows pointing from different locations to a >> > "central" point. >> >> Thanks for doing that - it looks good. >> >> But - aren't the arrows pointing dangerously close to the North >> Atlantic Garbage Patch? >> >> http://en.wikipedia.org/wiki/North_Atlantic_Garbage_Patch Now you made me read about ocean pollution for more than half an hour ;( 200,000 invisible particles per square kilometer? 
Don't go swimming in the middle of any ocean. Would pointing the arrows to the central part of the snaky S work? Cheers, Josef >> >> Best, >> >> Matthew > > > I had no intention of predicting the code quality of SciPy Central > submissions when designing this. ;) > > Best, > -Tony > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From scott.sinclair.za at gmail.com Wed Mar 14 02:53:27 2012 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Wed, 14 Mar 2012 08:53:27 +0200 Subject: [SciPy-User] [scipy-central] Site design In-Reply-To: <3742dcea3dd622da7c4069310e9574e6.squirrel@srv2.s4y.tournesol-consulting.eu> References: <3742dcea3dd622da7c4069310e9574e6.squirrel@srv2.s4y.tournesol-consulting.eu> Message-ID: On 13 March 2012 19:37, Andreas H. wrote: > I think everyone agrees that the webdesign of scipy-central.org needs some > major enhancements in order to make the site appealing to users so that > they want to stay, browse, and use it. > > I think it would make sense to make the site visually similar to the main > SciPy site (new.scipy.org), so that users can already "feel" the > connection. I'm mainly talking about colors and fonts here. That sounds like a good idea to me. > Also, a logo would be good. For a start, maybe we could use the main SciPy > logo, but eventually, scipy-central should have its own, similar logo. Tony's logo seems like a great start. I like it, but now that I know about the garbage patch, I have a very minor aversion to the focal point of the inwardly pointing arrows. > Then, a sidebar would be nice. Possible blocks for the sidebar include > 'links to core and related projects', 'what is SciPy', ... ideas welcome. > > If you agree, I could start playing around with the templates/css over the > next weeks. Thanks for getting this rolling. I'd really like to see the scipy.org domain pointed at scipy.github.com, but the cookbook and topical software pages are the main sticking point since they contain lots of useful (and sometimes not so useful) content that needs to be organized in some manner. Scipy Central seems like a good candidate for this. Cheers, Scott From scott.sinclair.za at gmail.com Wed Mar 14 03:04:28 2012 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Wed, 14 Mar 2012 09:04:28 +0200 Subject: [SciPy-User] Contributing to SciPy was Re: Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: <878586b0-b759-4c32-beec-b1628c4e3358@y27g2000yqy.googlegroups.com> References: <878586b0-b759-4c32-beec-b1628c4e3358@y27g2000yqy.googlegroups.com> Message-ID: On 13 March 2012 16:47, denis wrote: > 1) Q+As are I think well covered by stackoverflow, so don't reinvent > that wheel > ? ?(although I liked advice.mechanicalkern.com) I've always liked the idea of advice.mechanicalkern.com or ask.scipy.org as a good way of hosting an FAQ (there is some good content on both sites), but I guess we could encourage using Stack Overflow instead. > 2) there are quite a few sites to put up recipes > ? ?but 100 unsorted recipes do not make a cookbook > ? ?even with a snazzy cover. > ? ?Sure user feedback, comments, weeding, organizing are important > ? ?but weeding and sorting scipy.org/Cookbook is difficult-to- > impossible, > ? ?not happening. (Don't see what copying the lot would gain us.) Selective copying could be useful, but that's still a lot of work and it doesn't look like there are (m)any volunteers at this stage. 
Cheers, Scott From ralf.gommers at googlemail.com Wed Mar 14 03:21:41 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 14 Mar 2012 08:21:41 +0100 Subject: [SciPy-User] Contributing to SciPy was Re: Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: <878586b0-b759-4c32-beec-b1628c4e3358@y27g2000yqy.googlegroups.com> Message-ID: On Wed, Mar 14, 2012 at 8:04 AM, Scott Sinclair wrote: > On 13 March 2012 16:47, denis wrote: > > 1) Q+As are I think well covered by stackoverflow, so don't reinvent > > that wheel > > (although I liked advice.mechanicalkern.com) > > I've always liked the idea of advice.mechanicalkern.com or > ask.scipy.org as a good way of hosting an FAQ (there is some good > content on both sites), but I guess we could encourage using Stack > Overflow instead. > > > 2) there are quite a few sites to put up recipes > > but 100 unsorted recipes do not make a cookbook > > even with a snazzy cover. > > Sure user feedback, comments, weeding, organizing are important > > but weeding and sorting scipy.org/Cookbook is difficult-to- > > impossible, > > not happening. (Don't see what copying the lot would gain us.) > > Selective copying could be useful, but that's still a lot of work and > it doesn't look like there are (m)any volunteers at this stage. > Can we start by removing recipes that aren't useful anymore, links to external sites (there are ~40 OpenOpt / FuncDesigner links for example) and the list of all pages? That would cut down the Cookbook page to a more manageable size immediately. Then there will be some things left that can land in the numpy/scipy tutorials, and some things for SciPy Central. Moving that content will still be a lot of work, but much less than what it looks like now. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From scott.sinclair.za at gmail.com Wed Mar 14 03:58:34 2012 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Wed, 14 Mar 2012 09:58:34 +0200 Subject: [SciPy-User] Contributing to SciPy was Re: Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: <878586b0-b759-4c32-beec-b1628c4e3358@y27g2000yqy.googlegroups.com> Message-ID: On 14 March 2012 09:21, Ralf Gommers wrote: > > > On Wed, Mar 14, 2012 at 8:04 AM, Scott Sinclair > wrote: >> >> On 13 March 2012 16:47, denis wrote: >> > 2) there are quite a few sites to put up recipes >> > ? ?but 100 unsorted recipes do not make a cookbook >> > ? ?even with a snazzy cover. >> > ? ?Sure user feedback, comments, weeding, organizing are important >> > ? ?but weeding and sorting scipy.org/Cookbook is difficult-to- >> > impossible, >> > ? ?not happening. (Don't see what copying the lot would gain us.) >> >> Selective copying could be useful, but that's still a lot of work and >> it doesn't look like there are (m)any volunteers at this stage. > > > Can we start by removing recipes that aren't useful anymore, links to > external sites (there are ~40 OpenOpt / FuncDesigner links for example) and > the list of all pages? That would cut down the Cookbook page to a more > manageable size immediately. There's also quite a lot related to Matplotlib, MayaVi etc. which might have a better home with those projects. 
Cheers, Scott From seb.haase at gmail.com Wed Mar 14 04:16:45 2012 From: seb.haase at gmail.com (Sebastian Haase) Date: Wed, 14 Mar 2012 09:16:45 +0100 Subject: [SciPy-User] Contributing to SciPy was Re: Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: <878586b0-b759-4c32-beec-b1628c4e3358@y27g2000yqy.googlegroups.com> Message-ID: On Wed, Mar 14, 2012 at 8:58 AM, Scott Sinclair wrote: > > On 14 March 2012 09:21, Ralf Gommers wrote: > > > > > > On Wed, Mar 14, 2012 at 8:04 AM, Scott Sinclair > > wrote: > >> > >> On 13 March 2012 16:47, denis wrote: > >> > 2) there are quite a few sites to put up recipes > >> > ? ?but 100 unsorted recipes do not make a cookbook > >> > ? ?even with a snazzy cover. > >> > ? ?Sure user feedback, comments, weeding, organizing are important > >> > ? ?but weeding and sorting scipy.org/Cookbook is difficult-to- > >> > impossible, > >> > ? ?not happening. (Don't see what copying the lot would gain us.) > >> > >> Selective copying could be useful, but that's still a lot of work and > >> it doesn't look like there are (m)any volunteers at this stage. > > > > > > Can we start by removing recipes that aren't useful anymore, links to > > external sites (there are ~40 OpenOpt / FuncDesigner links for example) and > > the list of all pages? That would cut down the Cookbook page to a more > > manageable size immediately. > > There's also quite a lot related to Matplotlib, MayaVi etc. which > might have a better home with those projects. > I find it quite interesting to see those examples. I'm not involved in those other projects, and this is the only time I would see "what's possible". So, "SciPy Central/Cookbook" could be understood in the broader sense of "Science with Python", rather than "only" how to use the scipy-package.... My 2 cents. Sebastian Haase From scott.sinclair.za at gmail.com Wed Mar 14 06:12:20 2012 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Wed, 14 Mar 2012 12:12:20 +0200 Subject: [SciPy-User] Contributing to SciPy was Re: Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: <878586b0-b759-4c32-beec-b1628c4e3358@y27g2000yqy.googlegroups.com> Message-ID: On 14 March 2012 10:16, Sebastian Haase wrote: > On Wed, Mar 14, 2012 at 8:58 AM, Scott Sinclair > wrote: >> >> On 14 March 2012 09:21, Ralf Gommers wrote: >> > >> > Can we start by removing recipes that aren't useful anymore, links to >> > external sites (there are ~40 OpenOpt / FuncDesigner links for example) and >> > the list of all pages? That would cut down the Cookbook page to a more >> > manageable size immediately. >> >> There's also quite a lot related to Matplotlib, MayaVi etc. which >> might have a better home with those projects. >> > > I find it quite interesting to see those examples. I'm not involved in > those other projects, > ?and this is the only time I would see "what's possible". > So, "SciPy Central/Cookbook" could be understood in > the broader sense of "Science with Python", rather than "only" how to > use the scipy-package.... Sure they're interesting, I'm not proposing to throw anything away (lot's of people have contributed their time to produce the recipes). I still think that recipes which tell me how to do task X, with package Y are better hosted in the documentation/online resources of package Y. Recipes that solve a specific problem primarily using Numpy/Scipy, but that might also use Matplotlib/MayaVi/Chaco/? 
for plotting or cython/f2py/SWIG to speed up or wrap compiled code feel like they have a better fit. Overall, navigating through the Scipy web presence is awfully convoluted and I'm wondering how we can start solving that. Cheers, Scott From lchaplin13 at gmail.com Wed Mar 14 06:16:09 2012 From: lchaplin13 at gmail.com (Lee) Date: Wed, 14 Mar 2012 03:16:09 -0700 (PDT) Subject: [SciPy-User] delete rows and columns Message-ID: <46726410-54bc-41cc-a1f9-9064d7a50055@x10g2000pbi.googlegroups.com> Hi all, first time here, sorry if I am not posting in the right group. I am trying to run the below example from numpy docs: import numpy as np print np.version.version #1.6.1 (win7-64, py2.6) a = np.array([0, 10, 20, 30, 40]) np.delete(a, [2,4]) # remove a[2] and a[4] print a a = np.arange(16).reshape(4,4) print a np.delete(a, np.s_[1:3], axis=0) # remove rows 1 and 2 print a np.delete(a, np.s_[1:3], axis=1) # remove columns 1 and 2 print a Basically I am trying to delete some column/rows from an array or a matrix. It seems that delete doesn't work I expect (and advertised). Am I missing something? Thanks, Lee From punchagan at gmail.com Wed Mar 14 06:25:16 2012 From: punchagan at gmail.com (Puneeth Chaganti) Date: Wed, 14 Mar 2012 15:55:16 +0530 Subject: [SciPy-User] delete rows and columns In-Reply-To: <46726410-54bc-41cc-a1f9-9064d7a50055@x10g2000pbi.googlegroups.com> References: <46726410-54bc-41cc-a1f9-9064d7a50055@x10g2000pbi.googlegroups.com> Message-ID: On Wed, Mar 14, 2012 at 3:46 PM, Lee wrote: > Hi all, > > first time here, sorry if I am not posting in the right group. > I am trying to run the below example from numpy docs: > > import numpy as np > print np.version.version #1.6.1 (win7-64, py2.6) > > a = np.array([0, 10, 20, 30, 40]) > np.delete(a, [2,4]) # remove a[2] and a[4] > print a > a = np.arange(16).reshape(4,4) > print a > np.delete(a, np.s_[1:3], axis=0) # remove rows 1 and 2 > print a > np.delete(a, np.s_[1:3], axis=1) # remove columns 1 and 2 > print a > > Basically I am trying to delete some column/rows from an array or a > matrix. > It seems that delete doesn't work I expect (and advertised). Am I > missing something? np.delete does not change the array in place. It does work as advertised, which says """ Return a new array with sub-arrays along an axis deleted. """ >>> arr = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]]) >>> arr array([[ 1, 2, 3, 4], [ 5, 6, 7, 8], [ 9, 10, 11, 12]]) >>> np.delete(arr, 1, 0) array([[ 1, 2, 3, 4], [ 9, 10, 11, 12]]) HTH, Puneeth From ernesto.adorio at gmail.com Wed Mar 14 07:51:06 2012 From: ernesto.adorio at gmail.com (Ernesto Adorio) Date: Wed, 14 Mar 2012 19:51:06 +0800 Subject: [SciPy-User] speed up a python function using scipy constructs Message-ID: Hi, The following function is a pure python implmentation of the multinomial logistic regression log likelihood function.
from math import exp, log

def negmloglik(Betas, X, Y, m, reflevel=0):
    """
    log likelihood for polytomous regression or mlogit.
    Betas - estimated coefficients, as a SINGLE array!
    Y values are coded from 0 to ncategories - 1

    Betas matrix
            b[0][0] + b[0][1] + b[0][2] + ... + b[0][D-1]
            b[1][0] + b[1][1] + b[1][2] + ... + b[1][D-1]
                        ...
            b[ncategories-1][0] + b[ncategories-1][1] + ... + b[ncategories-1][D-1]

            Stored in one array! The beta coefficients for each level
            are stored with indices in range(level*D, level*D + D),
            where D is the number of coefficients per level (n below).
    X,Y   data X matrix and integer response Y vector with values
            from 0 to maxlevel=ncategories-1
    m - number of categories in Y vector. Each value ylevel in Y must be
            in the interval [0, ncategories), i.e. 0 <= ylevel < m
    reflevel - reference level, default code: 0
    """

    n = len(X[0])  # number of coefficients per level
    L = 0
    for (xrow, ylevel) in zip(X, Y):
        h = [0.0] * m
        denom = 0.0
        for k in range(m):
            if k == reflevel:
                denom += 1
            else:
                sa = k * n
                # linear predictor for level k: dot(xrow, Betas[sa:sa + n])
                v = sum([(x * b) for (x, b) in zip(xrow, Betas[sa: sa + n])])
                h[k] = v
                denom += exp(v)
        deltaL = h[ylevel] - log(denom)
        L += deltaL
    return -2 * L
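
# For comparison, a rough NumPy-vectorized sketch of the same calculation
# (only a sketch, not verified against the loop version above; it assumes
# X is a 2-d array of shape (nobs, n), Y an integer array of levels, and
# Betas a flat array of length m*n; the name negmloglik_vec is made up):
import numpy as np

def negmloglik_vec(Betas, X, Y, m, reflevel=0):
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=int)
    nobs, n = X.shape
    B = np.asarray(Betas, dtype=float).reshape(m, n)  # one row of coefficients per level
    eta = np.dot(X, B.T)              # linear predictors, shape (nobs, m)
    eta[:, reflevel] = 0.0            # the reference level contributes exp(0) = 1
    logdenom = np.log(np.exp(eta).sum(axis=1))
    loglike = eta[np.arange(nobs), Y] - logdenom
    return -2 * loglike.sum()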
I am wondering if there are Scipy/Numpy constructs which can speed up the above Python implementation? Rewrite if necessary. Regards, Ernesto -------------- next part -------------- An HTML attachment was scrubbed... URL: From denis.laxalde at mcgill.ca Tue Mar 13 09:26:48 2012 From: denis.laxalde at mcgill.ca (Denis Laxalde) Date: Tue, 13 Mar 2012 09:26:48 -0400 Subject: [SciPy-User] how to use properly the function fmin () to scipy.optimize In-Reply-To: References: Message-ID: <20120313092648.1ab6d564@mcgill.ca> javi wrote: > To test the performance of the algorithm I used the following example: > > def minimize (x): > > min = x [0] + x [1] + x [2] + x [3] > return min This function does not have a minimum. From josef.pktd at gmail.com Wed Mar 14 08:20:47 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 14 Mar 2012 08:20:47 -0400 Subject: [SciPy-User] speed up a python function using scipy constructs In-Reply-To: References: Message-ID: On Wed, Mar 14, 2012 at 7:51 AM, Ernesto Adorio wrote: > Hi, > > The following function is a pure python implmentation of the multinomial > logistic regression > log likelihood function. > >
> def negmloglik(Betas, X, Y,  m, reflevel=0):
>     """
>     log likelihood for polytomous regression or mlogit.
>     Betas - estimated coefficients, as a SINGLE array!
>     Y values are coded from 0 to ncategories - 1
>
>     Betas matrix
>             b[0][0] + b[0][1]+ b[0][2]+ ... + b[[0][D-1]
>             b[1][0] + b[1][1]+ b[1][2]+ ... + b[[1][D-1]
>                         ...
>             b[ncategories-1][0] + b[ncategories-1][1]+ b[ncategories-1][2]
>              .... + ... + b[[ncategories - 1][D-1]
>
>             Stored in one array! The beta   coefficients for each level
>             are stored with indices in range(level*D , level *D + D)
>     X,Y   data X matrix and integer response Y vector with values
>             from 0 to maxlevel=ncategories-1
>     m - number of categories in Y vector. each value of ylevel in Y must be
>             in the interval [0, ncategories) or 0 <= ylevel < m
>     reflevel - reference level, default code: 0
>     """
>
>     n  = len(X[0]) # number of coefficients per level.
>     L  = 0
>     for (xrow, ylevel) in zip(X,Y):
>         h   = [0.0] * m
>         denom = 0.0
>         for k in range(m):
>             if k == reflevel:
>                 denom += 1
>             else:
>                 sa = k * n
>                 v = sum([(x * b) for (x,b) in zip(xrow, Betas[sa: sa + n])])
>                 h[k] = v
>                 denom += exp(v)
>         deltaL = h[ylevel] - log(denom)
>         L += deltaL
>     return -2 * L
> 
> > I am wondering if there are Scipy/Numpy constructs which can speed up the > above Python implementation? > Rewrite if necessary. Maybe it helps to look at our implementation in statsmodels https://github.com/statsmodels/statsmodels/blob/master/statsmodels/discrete/discrete_model.py#L1091 I didn't read your loop to see if it is the same. Cheers, Josef > > Regards, > Ernesto > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From jsseabold at gmail.com Wed Mar 14 08:38:24 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Wed, 14 Mar 2012 08:38:24 -0400 Subject: [SciPy-User] [scipy-central] Site design In-Reply-To: <3742dcea3dd622da7c4069310e9574e6.squirrel@srv2.s4y.tournesol-consulting.eu> References: <3742dcea3dd622da7c4069310e9574e6.squirrel@srv2.s4y.tournesol-consulting.eu> Message-ID: On Tue, Mar 13, 2012 at 1:37 PM, Andreas H. wrote: > Hi all, > > I think everyone agrees that the webdesign of scipy-central.org needs some > major enhancements in order to make the site appealing to users so that > they want to stay, browse, and use it. > > I think it would make sense to make the site visually similar to the main > SciPy site (new.scipy.org), so that users can already "feel" the > connection. I'm mainly talking about colors and fonts here. > > Also, a logo would be good. For a start, maybe we could use the main SciPy > logo, but eventually, scipy-central should have its own, similar logo. > > Then, a sidebar would be nice. Possible blocks for the sidebar include > 'links to core and related projects', 'what is SciPy', ... ideas welcome. > > If you agree, I could start playing around with the templates/css over the > next weeks. > A humble suggestion for the layout, if people don't think it's done to death, bootstrap may be appropriate here [1, 2]. I've had good luck with ideas from bootstrap at least if not the whole framework. E.g, I find a CSS grid system to be aesthetically pleasing [3, 4]. There are many more examples than the given links. I find it saves a lot of the work of design from scratch. Skipper [1] http://blog.baregit.com/2012/bootstrap-or-not-bootstrap [2] http://twitter.github.com/bootstrap/ [3] http://960.gs/ [4] http://cssgrid.net/ From paustin at eos.ubc.ca Wed Mar 14 09:09:04 2012 From: paustin at eos.ubc.ca (Phil Austin) Date: Wed, 14 Mar 2012 06:09:04 -0700 Subject: [SciPy-User] [scipy-central] Site design In-Reply-To: References: <3742dcea3dd622da7c4069310e9574e6.squirrel@srv2.s4y.tournesol-consulting.eu> Message-ID: <4F609870.4080106@eos.ubc.ca> On 12-03-14 05:38 AM, Skipper Seabold wrote: > > A humble suggestion for the layout, if people don't think it's done to > death, bootstrap may be appropriate here [1, 2]. I've had good luck > with ideas from bootstrap at least if not the whole framework. E.g, I > find a CSS grid system to be aesthetically pleasing [3, 4]. There are > many more examples than the given links. I find it saves a lot of the > work of design from scratch. 
> and there's a sphinx theme that adds a few javascript functions to convert toc and localtoc formatting to be bootstrap-compatible https://github.com/ryan-roemer/sphinx-bootstrap-theme -- Phil From jeanluc.menut at free.fr Wed Mar 14 09:12:51 2012 From: jeanluc.menut at free.fr (Jean-Luc Menut) Date: Wed, 14 Mar 2012 14:12:51 +0100 Subject: [SciPy-User] [scipy-central] Site design In-Reply-To: <3742dcea3dd622da7c4069310e9574e6.squirrel@srv2.s4y.tournesol-consulting.eu> References: <3742dcea3dd622da7c4069310e9574e6.squirrel@srv2.s4y.tournesol-consulting.eu> Message-ID: <4F609953.2040900@free.fr> A hierarchical way to browse the packages could be interesting also. For example Physics->Fluid mechanics->Navier?Stokes equations. From wardefar at iro.umontreal.ca Wed Mar 14 10:14:03 2012 From: wardefar at iro.umontreal.ca (David Warde-Farley) Date: Wed, 14 Mar 2012 10:14:03 -0400 Subject: [SciPy-User] [scipy-central] Site design In-Reply-To: References: <3742dcea3dd622da7c4069310e9574e6.squirrel@srv2.s4y.tournesol-consulting.eu> Message-ID: On 2012-03-14, at 2:53 AM, Scott Sinclair wrote: > Thanks for getting this rolling. I'd really like to see the scipy.org > domain pointed at scipy.github.com, but the cookbook and topical > software pages are the main sticking point since they contain lots of > useful (and sometimes not so useful) content that needs to be > organized in some manner. Scipy Central seems like a good candidate > for this. +1. I think the logo looks great, Matthew's observation notwithstanding. :) David From wardefar at iro.umontreal.ca Wed Mar 14 10:17:55 2012 From: wardefar at iro.umontreal.ca (David Warde-Farley) Date: Wed, 14 Mar 2012 10:17:55 -0400 Subject: [SciPy-User] Contributing to SciPy was Re: Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: <878586b0-b759-4c32-beec-b1628c4e3358@y27g2000yqy.googlegroups.com> Message-ID: On 2012-03-14, at 6:12 AM, Scott Sinclair wrote: > Sure they're interesting, I'm not proposing to throw anything away > (lot's of people have contributed their time to produce the recipes). > > I still think that recipes which tell me how to do task X, with > package Y are better hosted in the documentation/online resources of > package Y. Recipes that solve a specific problem primarily using > Numpy/Scipy, but that might also use Matplotlib/MayaVi/Chaco/? for > plotting or cython/f2py/SWIG to speed up or wrap compiled code feel > like they have a better fit. I agree, certain sorts of recipes are a better fit than others. However, it would be nice if we had some clear and simple guidelines as to what belongs and what doesn't rather than making it a matter of subjective judgment; otherwise the only fair way forward seems to be accepting almost everything. David From cweisiger at msg.ucsf.edu Wed Mar 14 11:23:39 2012 From: cweisiger at msg.ucsf.edu (Chris Weisiger) Date: Wed, 14 Mar 2012 08:23:39 -0700 Subject: [SciPy-User] delete rows and columns In-Reply-To: <46726410-54bc-41cc-a1f9-9064d7a50055@x10g2000pbi.googlegroups.com> References: <46726410-54bc-41cc-a1f9-9064d7a50055@x10g2000pbi.googlegroups.com> Message-ID: On Wed, Mar 14, 2012 at 3:16 AM, Lee wrote: > Hi all, > > first time here, sorry if I am not posting in the right group. 
> I am trying to run the below example from numpy docs: > > import numpy as np > print np.version.version #1.6.1 (win7-64, py2.6) > > a = np.array([0, 10, 20, 30, 40]) > np.delete(a, [2,4]) # remove a[2] and a[4] > print a > a = np.arange(16).reshape(4,4) > print a > np.delete(a, np.s_[1:3], axis=0) # remove rows 1 and 2 > print a > np.delete(a, np.s_[1:3], axis=1) # remove columns 1 and 2 > print a > > Basically I am trying to delete some column/rows from an array or a > matrix. > It seems that delete doesn't work I expect (and advertised). Am I > missing something? Numpy arrays are continuous blocks of memory, so doing an in-place deletion would require allocating a new block and copying everything that isn't deleted over. numpy.delete does exactly that. It doesn't modify the original array; it creates a copy of the non-deleted portions and returns that. If you run the above program in Python's REPL, then you'd see this: >>> a = np.array([0, 10, 20, 30, 40]) >>> np.delete(a, [2, 4]) array([ 0, 10, 30]) >>> print a [ 0 10 20 30 40] Note how there's a result from running np.delete(), which gets printed by default in the REPL. Instead of allocating a new chunk of memory that's just a copy of most of the old chunk, you can accomplish a similar feat using array slices: >>> a[[0,1,3]] # Everything in a except columns 2 and 4 array([ 0, 10, 30]) You could assign that back to a, and then be able to treat a as if those two columns had been erased (since they'd be functionally inaccessible). I assume this would make lookups into the array a bit slower though, because now behind the scenes Numpy has to know to skip over those blocks of memory that you've elided. It's all a question of if CPU or RAM is more precious. -Chris > > Thanks, > Lee > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From denis-bz-gg at t-online.de Wed Mar 14 12:30:17 2012 From: denis-bz-gg at t-online.de (denis) Date: Wed, 14 Mar 2012 09:30:17 -0700 (PDT) Subject: [SciPy-User] Contributing to SciPy was Re: Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: <878586b0-b759-4c32-beec-b1628c4e3358@y27g2000yqy.googlegroups.com> Message-ID: <9732cd01-88e3-4cca-b88f-8f312a07afbf@p6g2000yqi.googlegroups.com> On Mar 14, 3:17?pm, David Warde-Farley wrote: > On 2012-03-14, at 6:12 AM, Scott Sinclair wrote: > I agree, certain sorts of recipes are a better fit than others. However, it would be nice if we had some clear and simple guidelines as to what belongs and what doesn't rather than making it a matter of subjective judgment; otherwise the only fair way forward seems to be accepting almost everything. "Has anyone used this recipe in living memory ?" would be a clear guideline. (SO etc. track that with member voting, up / down and when. Is there a simple off-the-shelf voting package that we could use for recipes ?) You're right, the tradeoff isn't easy: accept everything -- hodepodge -- or cut through the jungle. OT / beyond-topic, I think that each major area (cluster fft integrate interpolate io ...) should have an owner; all recipes with no owner go into "old/..." aka "nobodyknows/..." I'm sure that's been discussed, no volunteers ... 
cheers -- denis From josef.pktd at gmail.com Wed Mar 14 12:51:03 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 14 Mar 2012 12:51:03 -0400 Subject: [SciPy-User] Contributing to SciPy was Re: Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: <9732cd01-88e3-4cca-b88f-8f312a07afbf@p6g2000yqi.googlegroups.com> References: <878586b0-b759-4c32-beec-b1628c4e3358@y27g2000yqy.googlegroups.com> <9732cd01-88e3-4cca-b88f-8f312a07afbf@p6g2000yqi.googlegroups.com> Message-ID: On Wed, Mar 14, 2012 at 12:30 PM, denis wrote: > On Mar 14, 3:17?pm, David Warde-Farley > wrote: >> On 2012-03-14, at 6:12 AM, Scott Sinclair wrote: >> I agree, certain sorts of recipes are a better fit than others. However, it would be nice if we had some clear and simple guidelines as to what belongs and what doesn't rather than making it a matter of subjective judgment; otherwise the only fair way forward seems to be accepting almost everything. > > "Has anyone used this recipe in living memory ?" > would be a clear guideline. > (SO etc. track that with member voting, up / down and when. > Is there a simple off-the-shelf voting package that we could use for > recipes ?) > You're right, the tradeoff isn't easy: > accept everything -- hodepodge -- or cut through the jungle. > > OT / beyond-topic, I think that each major area > (cluster fft integrate interpolate io ...) should have an owner; > all recipes with no owner go into "old/..." aka "nobodyknows/..." > I'm sure that's been discussed, no volunteers ... I think, if a commenting system, download statistic and tagging or searching works, then there will be very little "moderation" required. (I don't think mathworks does on the file exchange, except maybe spam, clear copyright violation, ..) (When I was still watching Siskel and Ebert (movie critics) then I could tell from their comments whether I would like the movie, but thumbs up or down was often not very informative because of different tastes.) Cheers, Josef > > cheers > ?-- denis > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From pav at iki.fi Wed Mar 14 13:35:56 2012 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 14 Mar 2012 18:35:56 +0100 Subject: [SciPy-User] Contributing to SciPy was Re: Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: <878586b0-b759-4c32-beec-b1628c4e3358@y27g2000yqy.googlegroups.com> <9732cd01-88e3-4cca-b88f-8f312a07afbf@p6g2000yqi.googlegroups.com> Message-ID: 14.03.2012 17:51, josef.pktd at gmail.com kirjoitti: [clip] > I think, if a commenting system, download statistic and tagging or > searching works, then there will be very little "moderation" required. > (I don't think mathworks does on the file exchange, except maybe spam, > clear copyright violation, ..) In addition to commenting system, there could be a "I found this piece useful in real life" button for giving explicit endorsements. I'm not sure if there's a need for a "This is crap" button. For comments, one could add options for marking comments helpful or not, and an option for showing only "helpful" ones. Of course, it's probably not going to be Slashdot, so comments probably will work even without a relevance system in place. This leaves spam, but the email activation system that's currently in place is probably enough. One just needs suitable admin tools. Flag-as-spam feature could also be added. 
To make the download counts count something, one also needs to tell robots not to follow those links (some sites apparently don't do this). -- Pauli Virtanen From william.ratcliff at gmail.com Wed Mar 14 14:12:14 2012 From: william.ratcliff at gmail.com (william ratcliff) Date: Wed, 14 Mar 2012 14:12:14 -0400 Subject: [SciPy-User] Contributing to SciPy was Re: Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: <878586b0-b759-4c32-beec-b1628c4e3358@y27g2000yqy.googlegroups.com> <9732cd01-88e3-4cca-b88f-8f312a07afbf@p6g2000yqi.googlegroups.com> Message-ID: There is a django-reddit (is it developed in django) library that could be used for basic comments/upvoting. I have some other ideas, but would like to try to implement them first. William On Wed, Mar 14, 2012 at 1:35 PM, Pauli Virtanen wrote: > 14.03.2012 17:51, josef.pktd at gmail.com kirjoitti: > [clip] > > I think, if a commenting system, download statistic and tagging or > > searching works, then there will be very little "moderation" required. > > (I don't think mathworks does on the file exchange, except maybe spam, > > clear copyright violation, ..) > > In addition to commenting system, there could be a "I found this piece > useful in real life" button for giving explicit endorsements. I'm not > sure if there's a need for a "This is crap" button. > > For comments, one could add options for marking comments helpful or not, > and an option for showing only "helpful" ones. Of course, it's probably > not going to be Slashdot, so comments probably will work even without a > relevance system in place. > > This leaves spam, but the email activation system that's currently in > place is probably enough. One just needs suitable admin tools. > Flag-as-spam feature could also be added. > > To make the download counts count something, one also needs to tell > robots not to follow those links (some sites apparently don't do this). > > -- > Pauli Virtanen > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From denis-bz-gg at t-online.de Wed Mar 14 15:25:12 2012 From: denis-bz-gg at t-online.de (denis) Date: Wed, 14 Mar 2012 12:25:12 -0700 (PDT) Subject: [SciPy-User] Contributing to SciPy was Re: Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: <878586b0-b759-4c32-beec-b1628c4e3358@y27g2000yqy.googlegroups.com> <9732cd01-88e3-4cca-b88f-8f312a07afbf@p6g2000yqi.googlegroups.com> Message-ID: <39bc189c-023f-46c5-bb69-027a91ec7c87@db5g2000vbb.googlegroups.com> On Mar 14, 6:35?pm, Pauli Virtanen wrote: > 14.03.2012 17:51, josef.p... at gmail.com kirjoitti: > [clip] > > > I think, if a commenting system, download statistic and tagging or > > searching works, then there will be very little "moderation" required. May I suggest splitting this thread into a) new Cookbook (guidelines, all vs the best) b) voting / commenting system because they have such different timescales ? V/C could in theory tell us which recipes get used but may take a looooong time to discuss and implement cheers -- denis From ralf.gommers at googlemail.com Wed Mar 14 18:02:17 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 14 Mar 2012 23:02:17 +0100 Subject: [SciPy-User] Contributing to SciPy was Re: Least-squares fittings with bounds: why is scipy not up to the task? 
In-Reply-To: <9732cd01-88e3-4cca-b88f-8f312a07afbf@p6g2000yqi.googlegroups.com> References: <878586b0-b759-4c32-beec-b1628c4e3358@y27g2000yqy.googlegroups.com> <9732cd01-88e3-4cca-b88f-8f312a07afbf@p6g2000yqi.googlegroups.com> Message-ID: On Wed, Mar 14, 2012 at 5:30 PM, denis wrote: > On Mar 14, 3:17 pm, David Warde-Farley > wrote: > > On 2012-03-14, at 6:12 AM, Scott Sinclair wrote: > > I agree, certain sorts of recipes are a better fit than others. However, > it would be nice if we had some clear and simple guidelines as to what > belongs and what doesn't rather than making it a matter of subjective > judgment; otherwise the only fair way forward seems to be accepting almost > everything. > > "Has anyone used this recipe in living memory ?" > would be a clear guideline. > (SO etc. track that with member voting, up / down and when. > Is there a simple off-the-shelf voting package that we could use for > recipes ?) > You're right, the tradeoff isn't easy: > accept everything -- hodepodge -- or cut through the jungle. > Not everything is easy to judge, it would be great if someone could take a shot at drafting a procedure for doing so. But all I wanted to propose is to remove things like links to external sites, duplicate links and content that's clearly not useful anymore. Examples of the latter: http://www.scipy.org/Cookbook/PIL http://www.scipy.org/Cookbook/xplt http://www.scipy.org/Cookbook/Pyrex_and_NumPy Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From lchaplin13 at gmail.com Wed Mar 14 19:00:47 2012 From: lchaplin13 at gmail.com (Lee) Date: Wed, 14 Mar 2012 16:00:47 -0700 (PDT) Subject: [SciPy-User] delete rows and columns In-Reply-To: References: <46726410-54bc-41cc-a1f9-9064d7a50055@x10g2000pbi.googlegroups.com> Message-ID: <8f6e743b-ab6f-4942-82eb-5106a0d76606@pz2g2000pbc.googlegroups.com> Thanks Chris and Puneeth, This explains all. Lee On Mar 15, 4:23?am, Chris Weisiger wrote: > if CPU or RAM is more precious. > From tmp50 at ukr.net Thu Mar 15 06:48:04 2012 From: tmp50 at ukr.net (Dmitrey) Date: Thu, 15 Mar 2012 12:48:04 +0200 Subject: [SciPy-User] [ANN] new release 0.38 of OpenOpt, FuncDesigner, SpaceFuncs, DerApproximator Message-ID: <73423.1331808484.17417893843813335040@ffe8.ukr.net> Hi, I'm glad to inform you about new release 0.38 (2012-March-15): OpenOpt: interalg can handle discrete variables (see MINLP for examples) interalg can handle multiobjective problems (MOP) interalg can handle problems with parameters fixedVars/freeVars Many interalg improvements and some bugfixes Add another EIG solver: numpy.linalg.eig New LLSP solver pymls with box bounds handling FuncDesigner: Some improvements for sum() Add funcs tanh, arctanh, arcsinh, arccosh Can solve EIG built from derivatives of several functions, obtained by automatic differentiation by FuncDesigner SpaceFuncs: Add method point.symmetry(Point|Line|Plane) Add method LineSegment.middle Add method Point.rotate(Center, angle) DerApproximator: Minor changes See http://openopt.org for more details. Regards, D. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pav at iki.fi Thu Mar 15 08:12:45 2012 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 15 Mar 2012 13:12:45 +0100 Subject: [SciPy-User] [scipy-central] Site design In-Reply-To: References: <3742dcea3dd622da7c4069310e9574e6.squirrel@srv2.s4y.tournesol-consulting.eu> Message-ID: 14.03.2012 13:38, Skipper Seabold kirjoitti: [clip] > A humble suggestion for the layout, if people don't think it's done to > death, bootstrap may be appropriate here [1, 2]. I've had good luck > with ideas from bootstrap at least if not the whole framework. E.g, I > find a CSS grid system to be aesthetically pleasing [3, 4]. There are > many more examples than the given links. I find it saves a lot of the > work of design from scratch. +1 This is a very good suggestion. Bootstrap seems quite promising to me. Could use some tweaking in the styling, though, as the out-of-box appearance seems very generic Web 2.0 marketing-ishy. (To customize colors etc. in it, you need to modify and rebuild it, which requires Node.js). To fix the issues of navigation on the Scipy.org and related sites, what should be done is: design a base template layout with a place for navigation tools, with at least the link back to the main site fixed. Then use that for everything. Basing this on bootstrap would probably make styling in scipy-central quite a bit easier. -- Pauli Virtanen From bacmsantos at gmail.com Wed Mar 14 13:05:03 2012 From: bacmsantos at gmail.com (Bruno Santos) Date: Wed, 14 Mar 2012 17:05:03 +0000 Subject: [SciPy-User] rv_frozen when using gamma function Message-ID: I am trying to write a script to do some maximum likelihood parameter estimation of a function. But when I try to use the gamma function I get: gamma(5) Out[5]: I thought it might have been a problem solved already on the new distribution but even after installing the last scipy version I get the same problem. The test() after installation is also failing with the following information: Running unit tests for scipy NumPy version 1.5.1 NumPy is installed in /usr/lib/pymodules/python2.7/numpy SciPy version 0.10.1 SciPy is installed in /usr/local/lib/python2.7/dist-packages/scipy Python version 2.7.2+ (default, Oct 4 2011, 20:06:09) [GCC 4.6.1] nose version 1.1.2 ... ... ... 
AssertionError: Arrays are not almost equal ACTUAL: 0.0 DESIRED: 0.5 ====================================================================== FAIL: Regression test for #651: better handling of badly conditioned ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.7/dist-packages/scipy/signal/tests/test_filter_design.py", line 34, in test_bad_filter assert_raises(BadCoefficients, tf2zpk, [1e-15], [1.0, 1.0]) File "/usr/lib/pymodules/python2.7/numpy/testing/utils.py", line 982, in assert_raises return nose.tools.assert_raises(*args,**kwargs) AssertionError: BadCoefficients not raised ---------------------------------------------------------------------- Ran 5103 tests in 47.795s FAILED (KNOWNFAIL=13, SKIP=28, failures=3) Out[7]: My code is as follows: from numpy import array,log,sum,nan from scipy.stats import gamma from scipy import factorial, optimize #rinterface.initr() #IntSexpVector = rinterface.IntSexpVector #lgamma = rinterface.globalenv.get("lgamma") #Implementation for the Zero-inflated Negative Binomial function def alphabeta(params,x,dicerAcc): alpha = array(params[0]) beta = array(params[1]) if alpha<0 or beta<0:return nan return sum((alpha*log(beta)) + log(gamma(alpha+x)) + x * log(dicerAcc) - log(gamma(alpha)) - (alpha+x) * log(beta+dicerAcc) - log(factorial(x))) if __name__=='__main__': x = array([123,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,104,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,1,24,1,0,0,0,0,0,0,0,2,0,0,4,0,0,0,0,0,0,0,0,12,0,0]) dicerAcc = array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.048750000000000002,0.90085000000000004, 0.0504, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0023, 0.089149999999999993, 0.81464999999999999, 0.091550000000000006, 0.0023500000000000001, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.00020000000000000001, 0.0061000000000000004, 0.12085, 0.7429, 0.12325, 0.0067000000000000002, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.00020000000000000001, 0.012500000000000001, 0.14255000000000001, 0.68159999999999998, 0.14979999999999999, 0.012999999999999999]) optimize.() Am I doing something wrong or is this a known problem? Best, Bruno -------------- next part -------------- An HTML attachment was scrubbed... URL: From cameron.hayne at dftmicrosystems.com Wed Mar 14 15:16:10 2012 From: cameron.hayne at dftmicrosystems.com (Cameron Hayne) Date: Wed, 14 Mar 2012 15:16:10 -0400 Subject: [SciPy-User] Contributing to SciPy was Re: Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: <878586b0-b759-4c32-beec-b1628c4e3358@y27g2000yqy.googlegroups.com> Message-ID: <0FEAABD1-291F-4F90-A7D0-6E38D7DE010B@sympatico.ca> On 14-Mar-12, at 6:12 AM, Scott Sinclair wrote: > I still think that recipes which tell me how to do task X, with > package Y are better hosted in the documentation/online resources of > package Y. But that is only useful when you know that you need to use package Y. The cookbook should answer the question "How do I do X ?" (without any reference to specific packages). For example, it should answer the question "How can I fit my data to a straight line?" - the answer would show several ways of doing that (with different scipy packages) and discuss the pros/cons of each way. 
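For instance, a minimal sketch of what such an entry might show (made-up data, purely to illustrate comparing approaches):

import numpy as np
from scipy import stats, optimize

x = np.arange(10.0)
y = 2.0 * x + 1.0 + np.random.normal(scale=0.1, size=x.size)

# numpy.polyfit: quick, returns just the coefficients
slope1, intercept1 = np.polyfit(x, y, 1)

# scipy.stats.linregress: also gives r-value, p-value and stderr
slope2, intercept2, r, p, stderr = stats.linregress(x, y)

# scipy.optimize.curve_fit: handles general models, returns the covariance
popt, pcov = optimize.curve_fit(lambda t, m, c: m * t + c, x, y)

The entry could then discuss when the extra statistics or the extra generality are worth the extra typing.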
-- Cameron Hayne hayne at sympatico.ca From dhondt.olivier at gmail.com Thu Mar 15 04:59:28 2012 From: dhondt.olivier at gmail.com (tyldurd) Date: Thu, 15 Mar 2012 01:59:28 -0700 (PDT) Subject: [SciPy-User] Numpy/Scipy: Avoiding nested loops to operate on matrix-valued images Message-ID: <29677913.980.1331801968906.JavaMail.geo-discussion-forums@ynkz21> Hello, I am a beginner at python and numpy and I need to compute the matrix logarithm for each "pixel" (i.e. x,y position) of a matrix-valued image of dimension MxNx3x3. 3x3 is the dimensions of the matrix at each pixel. The function I have written so far is the following: def logm_img(im): from scipy import linalg dimx = im.shape[0] dimy = im.shape[1] res = zeros_like(im) for x in range(dimx): for y in range(dimy): res[x, y, :, :] = linalg.logm(asmatrix(im[x,y,:,:])) return res Is it ok? Is there a way to avoid the two nested loops ? -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsseabold at gmail.com Thu Mar 15 09:39:14 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Thu, 15 Mar 2012 09:39:14 -0400 Subject: [SciPy-User] rv_frozen when using gamma function In-Reply-To: References: Message-ID: On Wed, Mar 14, 2012 at 1:05 PM, Bruno Santos wrote: > I am trying to write a script to do some maximum likelihood parameter > estimation of a function. But when I try to use the gamma function I get: > gamma(5) > Out[5]: > That's the Gamma distribution in scipy.stats. You want the Gamma function, it's in scipy.special [~/] [1]: from scipy import special [~/] [2]: special.gamma(5) [2]: 24.0 What kind of likelihood are you trying to maximize? Skipper From josef.pktd at gmail.com Thu Mar 15 09:40:13 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 15 Mar 2012 09:40:13 -0400 Subject: [SciPy-User] rv_frozen when using gamma function In-Reply-To: References: Message-ID: On Wed, Mar 14, 2012 at 1:05 PM, Bruno Santos wrote: > I am trying to write a script to do some maximum likelihood parameter > estimation of a function. But when I try to use the gamma function I get: > gamma(5) > Out[5]: > > I thought it might have been a problem solved already on the new > distribution but even after installing the last scipy version I get the same > problem. > The test() after installation is also failing with the following > information: > Running unit tests for scipy > NumPy version 1.5.1 > NumPy is installed in /usr/lib/pymodules/python2.7/numpy > SciPy version 0.10.1 > SciPy is installed in /usr/local/lib/python2.7/dist-packages/scipy > Python version 2.7.2+ (default, Oct ?4 2011, 20:06:09) [GCC 4.6.1] > nose version 1.1.2 > ... > ... > ... > AssertionError: > Arrays are not almost equal > ?ACTUAL: 0.0 > ?DESIRED: 0.5 > > ====================================================================== > FAIL: Regression test for #651: better handling of badly conditioned > ---------------------------------------------------------------------- > Traceback (most recent call last): > ? File > "/usr/local/lib/python2.7/dist-packages/scipy/signal/tests/test_filter_design.py", > line 34, in test_bad_filter > ? ? assert_raises(BadCoefficients, tf2zpk, [1e-15], [1.0, 1.0]) > ? File "/usr/lib/pymodules/python2.7/numpy/testing/utils.py", line 982, in > assert_raises > ? ? 
return nose.tools.assert_raises(*args,**kwargs) > AssertionError: BadCoefficients not raised > > ---------------------------------------------------------------------- > Ran 5103 tests in 47.795s > > FAILED (KNOWNFAIL=13, SKIP=28, failures=3) > Out[7]: > > > My code is as follows: > from numpy import array,log,sum,nan > from scipy.stats import gamma > from scipy import factorial, optimize > > #rinterface.initr() > #IntSexpVector = rinterface.IntSexpVector > #lgamma = rinterface.globalenv.get("lgamma") > > #Implementation for the Zero-inflated Negative Binomial function > def alphabeta(params,x,dicerAcc): > ? ? alpha = array(params[0]) > ? ? beta = array(params[1]) > ? ? if alpha<0 or beta<0:return nan > ? ? return sum((alpha*log(beta)) + log(gamma(alpha+x)) + x * log(dicerAcc) - > log(gamma(alpha)) - (alpha+x) * log(beta+dicerAcc) - log(factorial(x))) I guess what you want her is scipy.special.gamma which is the gamma function, not the gamma distribution loglikelihood of negative binomial is also in statsmodels.discrete if you want to compare notes Josef > > if __name__=='__main__': > ? ? x = > array([123,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,104,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,1,24,1,0,0,0,0,0,0,0,2,0,0,4,0,0,0,0,0,0,0,0,12,0,0]) > ? ? dicerAcc = array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, > 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, > 0.048750000000000002,0.90085000000000004, 0.0504, 0.0, 0.0, 0.0, 0.0, 0.0, > 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0023, > 0.089149999999999993, 0.81464999999999999, 0.091550000000000006, > 0.0023500000000000001, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, > 0.0, 0.0, 0.0, 0.0, 0.0, 0.00020000000000000001, 0.0061000000000000004, > 0.12085, 0.7429, 0.12325, 0.0067000000000000002, 0.0, 0.0, 0.0, 0.0, 0.0, > 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.00020000000000000001, > 0.012500000000000001, 0.14255000000000001, 0.68159999999999998, > 0.14979999999999999, 0.012999999999999999]) > ? ? optimize.() > > > Am I doing something wrong or is this a known problem? > > Best, > Bruno > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From jsseabold at gmail.com Thu Mar 15 09:53:02 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Thu, 15 Mar 2012 09:53:02 -0400 Subject: [SciPy-User] Contributing to SciPy was Re: Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: <0FEAABD1-291F-4F90-A7D0-6E38D7DE010B@sympatico.ca> References: <878586b0-b759-4c32-beec-b1628c4e3358@y27g2000yqy.googlegroups.com> <0FEAABD1-291F-4F90-A7D0-6E38D7DE010B@sympatico.ca> Message-ID: On Wed, Mar 14, 2012 at 3:16 PM, Cameron Hayne wrote: > > On 14-Mar-12, at 6:12 AM, Scott Sinclair wrote: > >> I still think that recipes which tell me how to do task X, with >> package Y are better hosted in the documentation/online resources of >> package Y. > > > But that is only useful when you know that you need to use package Y. I agree with this. > The cookbook should answer the question "How do I do X ?" (without any > reference to specific packages). Here's my problem with this. What if the question is, the fairly common, "how do I solve a non-linear programming problem?" The answer, with numpy/scipy, is that really you don't (or you drop in the entirety OpenOpt source code...). I don't think we should purge the cookbook of examples like these. 
Don't get me wrong, I don't think we should duplicate other package's documentation. But Dmitry (presumably) took the time to write up these examples, and they can be helpful in pointing someone towards topical software. Comes down to SciPy vs. scipy I think. And maybe I'm saying I'd prefer the former for the cookbook, and I think scikits, of which openopt was formerly one, and closely related packages fall under SciPy. This raises the question of what to do when the example becomes stale. Well, for the future, maybe you could link your e-mail with the posting of the original recipe and have a "Problem with this recipe" button. I don't know. That said, I'm also not doing any heavy lifting on this. Just my thoughts, Skipper From bacmsantos at gmail.com Thu Mar 15 11:07:25 2012 From: bacmsantos at gmail.com (Bruno Santos) Date: Thu, 15 Mar 2012 15:07:25 +0000 Subject: [SciPy-User] rv_frozen when using gamma function In-Reply-To: References: Message-ID: Thank you all very much for the replies that was exactly what I wanted. I am basically trying to get the parameters for a gamma-poisson distribution. I have the R code from a previous collaborator just trying to write a native function in python rather than using the R code or port it using rpy2. The function is the following: [image: Inline images 1] where f(b,d) is a function that gives me a probability of a certain position in the vector to be occupied and it depends on b (the position) and d (the likelihood of making an error). So the likelihood after a few transformations become: [image: Inline images 2] Which I then use the loglikelihood and try to maximise it using an optimization algorithm. [image: Inline images 3] The R code is as following: alphabeta<-function(alphabeta,x,dicerAcc) { alpha <-alphabeta[1] beta <-alphabeta[2] if (any(alphabeta<0)) return(NA) sum((alpha*log(beta) + lgamma(alpha + x) + x * log(dicerAcc) - lgamma(alpha) - (alpha + x) * log(beta+dicerAcc) - lfactorial(x))[dicerAcc > noiseT]) #sum((alpha*log(beta)+(lgamma(alpha+x)+log(dicerError^x))-(lgamma(alpha)+log((beta+dicerError)^(alpha+x))+lfactorial(x)))[dicerError != 0]) } x and dicerAcc are known so the I use the optim function in R ab <- optim(c(1,100), alphabeta, control=list(fnscale=-1), x = x, dicerAcc = dicerAcc)$par Is there any equivalent function in Scipy to the optim one? On 14 March 2012 17:05, Bruno Santos wrote: > I am trying to write a script to do some maximum likelihood parameter > estimation of a function. But when I try to use the gamma function I get: > gamma(5) > Out[5]: > > I thought it might have been a problem solved already on the new > distribution but even after installing the last scipy version I get the > same problem. > The test() after installation is also failing with the following > information: > Running unit tests for scipy > NumPy version 1.5.1 > NumPy is installed in /usr/lib/pymodules/python2.7/numpy > SciPy version 0.10.1 > SciPy is installed in /usr/local/lib/python2.7/dist-packages/scipy > Python version 2.7.2+ (default, Oct 4 2011, 20:06:09) [GCC 4.6.1] > nose version 1.1.2 > ... > ... > ... 
> AssertionError: > Arrays are not almost equal > ACTUAL: 0.0 > DESIRED: 0.5 > > ====================================================================== > FAIL: Regression test for #651: better handling of badly conditioned > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > "/usr/local/lib/python2.7/dist-packages/scipy/signal/tests/test_filter_design.py", > line 34, in test_bad_filter > assert_raises(BadCoefficients, tf2zpk, [1e-15], [1.0, 1.0]) > File "/usr/lib/pymodules/python2.7/numpy/testing/utils.py", line 982, in > assert_raises > return nose.tools.assert_raises(*args,**kwargs) > AssertionError: BadCoefficients not raised > > ---------------------------------------------------------------------- > Ran 5103 tests in 47.795s > > FAILED (KNOWNFAIL=13, SKIP=28, failures=3) > Out[7]: > > > My code is as follows: > from numpy import array,log,sum,nan > from scipy.stats import gamma > from scipy import factorial, optimize > > #rinterface.initr() > #IntSexpVector = rinterface.IntSexpVector > #lgamma = rinterface.globalenv.get("lgamma") > > #Implementation for the Zero-inflated Negative Binomial function > def alphabeta(params,x,dicerAcc): > alpha = array(params[0]) > beta = array(params[1]) > if alpha<0 or beta<0:return nan > return sum((alpha*log(beta)) + log(gamma(alpha+x)) + x * log(dicerAcc) > - log(gamma(alpha)) - (alpha+x) * log(beta+dicerAcc) - log(factorial(x))) > > if __name__=='__main__': > x = > array([123,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,104,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,1,24,1,0,0,0,0,0,0,0,2,0,0,4,0,0,0,0,0,0,0,0,12,0,0]) > dicerAcc = array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, > 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, > 0.048750000000000002,0.90085000000000004, 0.0504, 0.0, 0.0, 0.0, 0.0, 0.0, > 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0023, > 0.089149999999999993, 0.81464999999999999, 0.091550000000000006, > 0.0023500000000000001, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, > 0.0, 0.0, 0.0, 0.0, 0.0, 0.00020000000000000001, 0.0061000000000000004, > 0.12085, 0.7429, 0.12325, 0.0067000000000000002, 0.0, 0.0, 0.0, 0.0, 0.0, > 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.00020000000000000001, > 0.012500000000000001, 0.14255000000000001, 0.68159999999999998, > 0.14979999999999999, 0.012999999999999999]) > optimize.() > > > Am I doing something wrong or is this a known problem? > > Best, > Bruno > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 3193 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 1620 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 4401 bytes Desc: not available URL: From jsseabold at gmail.com Thu Mar 15 11:21:51 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Thu, 15 Mar 2012 11:21:51 -0400 Subject: [SciPy-User] rv_frozen when using gamma function In-Reply-To: References: Message-ID: On Thu, Mar 15, 2012 at 11:07 AM, Bruno Santos wrote: > Thank you all very much for the replies that was exactly what I wanted. I > am basically trying to get the parameters for a gamma-poisson distribution. 
> I have the R code from a previous collaborator just trying to write a > native function in python rather than using the R code or port it using > rpy2. Oh, fun. > The function is the following: > [image: Inline images 1] > where f(b,d) is a function that gives me a probability of a certain > position in the vector to be occupied and it depends on b (the position) > and d (the likelihood of making an error). > So the likelihood after a few transformations become: > > [image: Inline images 2] > Which I then use the loglikelihood and try to maximise it using an > optimization algorithm. > [image: Inline images 3] > The R code is as following: > alphabeta<-function(alphabeta,x,dicerAcc) > { > alpha <-alphabeta[1] > beta <-alphabeta[2] > if (any(alphabeta<0)) > return(NA) > sum((alpha*log(beta) + lgamma(alpha + x) + x * log(dicerAcc) - > lgamma(alpha) - (alpha + x) * log(beta+dicerAcc) - lfactorial(x))[dicerAcc > > noiseT]) > >From a quick (distracted) look (so I could be wrong) Should this be alpha^2*log(beta) ? +lgamma(alpha) ? And lfactorial(x) should still be +lgamma(alpha)*lfactorial(x) ? And dicerAcc a scalar integer I take it? > > #sum((alpha*log(beta)+(lgamma(alpha+x)+log(dicerError^x))-(lgamma(alpha)+log((beta+dicerError)^(alpha+x))+lfactorial(x)))[dicerError > != 0]) > } > x and dicerAcc are known so the I use the optim function in R > ab <- optim(c(1,100), alphabeta, control=list(fnscale=-1), x = x, dicerAcc > = dicerAcc)$par > > Is there any equivalent function in Scipy to the optim one? > > On 14 March 2012 17:05, Bruno Santos wrote: > >> I am trying to write a script to do some maximum likelihood parameter >> estimation of a function. But when I try to use the gamma function I get: >> gamma(5) >> Out[5]: >> >> I thought it might have been a problem solved already on the new >> distribution but even after installing the last scipy version I get the >> same problem. >> The test() after installation is also failing with the following >> information: >> Running unit tests for scipy >> NumPy version 1.5.1 >> NumPy is installed in /usr/lib/pymodules/python2.7/numpy >> SciPy version 0.10.1 >> SciPy is installed in /usr/local/lib/python2.7/dist-packages/scipy >> Python version 2.7.2+ (default, Oct 4 2011, 20:06:09) [GCC 4.6.1] >> nose version 1.1.2 >> ... >> ... >> ... 
>> AssertionError: >> Arrays are not almost equal >> ACTUAL: 0.0 >> DESIRED: 0.5 >> >> ====================================================================== >> FAIL: Regression test for #651: better handling of badly conditioned >> ---------------------------------------------------------------------- >> Traceback (most recent call last): >> File >> "/usr/local/lib/python2.7/dist-packages/scipy/signal/tests/test_filter_design.py", >> line 34, in test_bad_filter >> assert_raises(BadCoefficients, tf2zpk, [1e-15], [1.0, 1.0]) >> File "/usr/lib/pymodules/python2.7/numpy/testing/utils.py", line 982, >> in assert_raises >> return nose.tools.assert_raises(*args,**kwargs) >> AssertionError: BadCoefficients not raised >> >> ---------------------------------------------------------------------- >> Ran 5103 tests in 47.795s >> >> FAILED (KNOWNFAIL=13, SKIP=28, failures=3) >> Out[7]: >> >> >> My code is as follows: >> from numpy import array,log,sum,nan >> from scipy.stats import gamma >> from scipy import factorial, optimize >> >> #rinterface.initr() >> #IntSexpVector = rinterface.IntSexpVector >> #lgamma = rinterface.globalenv.get("lgamma") >> >> #Implementation for the Zero-inflated Negative Binomial function >> def alphabeta(params,x,dicerAcc): >> alpha = array(params[0]) >> beta = array(params[1]) >> if alpha<0 or beta<0:return nan >> return sum((alpha*log(beta)) + log(gamma(alpha+x)) + x * >> log(dicerAcc) - log(gamma(alpha)) - (alpha+x) * log(beta+dicerAcc) - >> log(factorial(x))) >> >> if __name__=='__main__': >> x = >> array([123,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,104,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,1,24,1,0,0,0,0,0,0,0,2,0,0,4,0,0,0,0,0,0,0,0,12,0,0]) >> dicerAcc = array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, >> 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, >> 0.048750000000000002,0.90085000000000004, 0.0504, 0.0, 0.0, 0.0, 0.0, 0.0, >> 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0023, >> 0.089149999999999993, 0.81464999999999999, 0.091550000000000006, >> 0.0023500000000000001, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, >> 0.0, 0.0, 0.0, 0.0, 0.0, 0.00020000000000000001, 0.0061000000000000004, >> 0.12085, 0.7429, 0.12325, 0.0067000000000000002, 0.0, 0.0, 0.0, 0.0, 0.0, >> 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.00020000000000000001, >> 0.012500000000000001, 0.14255000000000001, 0.68159999999999998, >> 0.14979999999999999, 0.012999999999999999]) >> optimize.() >> >> >> Am I doing something wrong or is this a known problem? >> >> Best, >> Bruno >> > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 1620 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 3193 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 4401 bytes Desc: not available URL: From lists at hilboll.de Thu Mar 15 13:13:04 2012 From: lists at hilboll.de (Andreas H.) 
Date: Thu, 15 Mar 2012 18:13:04 +0100 Subject: [SciPy-User] [scipy-central] Site design In-Reply-To: References: <3742dcea3dd622da7c4069310e9574e6.squirrel@srv2.s4y.tournesol-consulting.eu> Message-ID: <38d8cde2b1bd3c29f88b5032c26a1f1f.squirrel@srv2.s4y.tournesol-consulting.eu> > On Tue, Mar 13, 2012 at 1:37 PM, Andreas H. wrote: >> Hi all, >> >> I think everyone agrees that the webdesign of scipy-central.org needs >> some >> major enhancements in order to make the site appealing to users so that >> they want to stay, browse, and use it. >> >> I think it would make sense to make the site visually similar to the >> main >> SciPy site (new.scipy.org), so that users can already "feel" the >> connection. I'm mainly talking about colors and fonts here. >> >> Also, a logo would be good. For a start, maybe we could use the main >> SciPy >> logo, but eventually, scipy-central should have its own, similar logo. >> >> Then, a sidebar would be nice. Possible blocks for the sidebar include >> 'links to core and related projects', 'what is SciPy', ... ideas >> welcome. >> >> If you agree, I could start playing around with the templates/css over >> the >> next weeks. >> > > A humble suggestion for the layout, if people don't think it's done to > death, bootstrap may be appropriate here [1, 2]. I've had good luck > with ideas from bootstrap at least if not the whole framework. E.g, I > find a CSS grid system to be aesthetically pleasing [3, 4]. There are > many more examples than the given links. I find it saves a lot of the > work of design from scratch. > > Skipper > > [1] http://blog.baregit.com/2012/bootstrap-or-not-bootstrap > [2] http://twitter.github.com/bootstrap/ > [3] http://960.gs/ > [4] http://cssgrid.net/ Thanks for the pointer to bootstrap, Skipper! I'm working on a very first idea of the site layout, including Tony's excellent logo. News early next week. Andreas. From jjhelmus at gmail.com Thu Mar 15 15:50:33 2012 From: jjhelmus at gmail.com (Jonathan Helmus) Date: Thu, 15 Mar 2012 15:50:33 -0400 Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: <1331262863.55138.YahooMailNeo@web113402.mail.gq1.yahoo.com> References: <4F5916A2.2040604@eso.org> <18284571.1324.1331244589247.JavaMail.geo-discussion-forums@ynca15> <4F593DFF.90101@eso.org> <1331262863.55138.YahooMailNeo@web113402.mail.gq1.yahoo.com> Message-ID: <4F624809.3000102@gmail.com> I know I am jumping into this thread late and it has drifted into another topics but I have some code that others might be interested in. With all the discussion of bounded leastsq and variable substitution I recalled that I had a wrapped version of leastsq in a larger project that allows for min, max bound using the variable transformations that MINUIT uses. I pulled out the necessary functions, refactored the code and made a github repo in case anyone is interested (https://github.com/jjhelmus/leastsqbound-scipy). This might make a good jumping off point for a more complete bounded leastsq optimizer that David had in mind. - Jonathan Helmus David Baddeley wrote: > From a pure performance perspective, you're probably going to be best > setting your bounds by variable substitution (particularly if they're > only single-ended - x**2 is cheap) - you really don't want to have the > for loops, dictionary lookups and conditionals that lmfit introduces > for it's bounds checking inside your objective function. 
> > I think a high level wrapper that permitted bounds, an unadulterated > goal function, and setting which parameters to fit, but also retained > much of the raw speed of leastsq could be accomplished with some > clever on the fly code generation (maybe also using Sympy to > automatically derive the Jacobian). Would make an interesting project ... > > David > > ------------------------------------------------------------------------ > *From:* Eric Emsellem > *To:* Matthew Newville > *Cc:* scipy-user at scipy.org; scipy-user at googlegroups.com > *Sent:* Friday, 9 March 2012 12:17 PM > *Subject:* Re: [SciPy-User] Least-squares fittings with bounds: why is > scipy not up to the task? > > > > > Yes, see https://github.com/newville/lmfit-py, which does everything > > you ask for, and a bit more, with the possible exception of "being > > included in scipy". For what its worth, I work with Mark Rivers > > (who's no longer actively developing Python), and our group is full of > > IDL users who are very familiar with Markwardt's implementation. > > > > The lmfit-py version uses scipy.optimize.leastsq(), which uses MINPACK > > directly, so has the advantage of not being implemented in pure IDL or > > Python. It is definitely faster than mpfit.py. > > > > With lmfit-py, one writes a python function-to-minimize that takes a > > list of Parameters instead of the array of floating point variables > > that scipy.optimize.leastsq() uses. Each Parameter can be freely > > varied of fixed, have upper and/or lower bounds placed on them, or be > > written as algebraic expressions of other Parameters. Uncertainties > > in varied Parameters and correlations between Parameters are estimated > > using the same "scaled covariance" method as used in > > scipy.optimize.curve_fit(). There is limited support for > > optimization methods other than scipy.optimize.leastsq(), but I don't > > find these methods to be very useful for the kind of fitting problems > > I normally see, so support for them may not be perfect. > > > > Whether this gets included into scipy is up to the scipy developers. > > I'd be happy to support this module within scipy or outside scipy. > > I have no doubt that improvements could be made to lmfit.py. If you > > have suggestion, I'd be happy to hear them. > > looks great! I'll have a go at this, as mentioned in my previous post. I > believe that leastsq is probably the fastest anyway (according to the > test Adam mentioned to me today) so this could be it. I'll make a test > and compare it with mpfit (for the specific case I am thinking of, I am > optimising over ~10^5-6 points with ~90 parameters...). > > thanks again for this, and I'll try to report on this (if relevant) asap. > > Eric > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > ------------------------------------------------------------------------ > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From dtlussier at gmail.com Thu Mar 15 16:23:49 2012 From: dtlussier at gmail.com (Dan Lussier) Date: Thu, 15 Mar 2012 13:23:49 -0700 Subject: [SciPy-User] Numpy/Scipy: Avoiding nested loops to operate on matrix-valued images In-Reply-To: <29677913.980.1331801968906.JavaMail.geo-discussion-forums@ynkz21> References: <29677913.980.1331801968906.JavaMail.geo-discussion-forums@ynkz21> Message-ID: Have you tried numpy.frompyfunc? 
http://docs.scipy.org/doc/numpy/reference/generated/numpy.frompyfunc.html http://stackoverflow.com/questions/6126233/can-i-create-a-python-numpy-ufunc-from-an-unbound-member-method With this approach you may be able create a function which acts elementwise over your array to compute the matrix logarithm at each entry using Numpy's ufuncs. This would avoid the explicit iteration over the array using the for loops. As a rough outline try: from scipy import linalg import numpy as np # Assume im is the container array containing a 3x3 matrix at each pixel. # Composite function so get matrix log of array A as a matrix in one step def log_matrix(A): return linalg.logm(np.asmatrix(A)) # Creating function to operate over container array. Takes one argument and returns the result. log_ufunc = np.frompyfunc(log_matrix, 1, 1) # Using log_ufunc on container array, im res = log_ufunc(im) Dan On 2012-03-15, at 1:59 AM, tyldurd wrote: > Hello, > > I am a beginner at python and numpy and I need to compute the matrix logarithm for each "pixel" (i.e. x,y position) of a matrix-valued image of dimension MxNx3x3. 3x3 is the dimensions of the matrix at each pixel. > > The function I have written so far is the following: > > def logm_img(im): > from scipy import linalg > dimx = im.shape[0] > dimy = im.shape[1] > res = zeros_like(im) > for x in range(dimx): > for y in range(dimy): > res[x, y, :, :] = linalg.logm(asmatrix(im[x,y,:,:])) > return res > Is it ok? Is there a way to avoid the two nested loops ? > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From william.ratcliff at gmail.com Thu Mar 15 16:38:22 2012 From: william.ratcliff at gmail.com (william ratcliff) Date: Thu, 15 Mar 2012 16:38:22 -0400 Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: <4F624809.3000102@gmail.com> References: <4F5916A2.2040604@eso.org> <18284571.1324.1331244589247.JavaMail.geo-discussion-forums@ynca15> <4F593DFF.90101@eso.org> <1331262863.55138.YahooMailNeo@web113402.mail.gq1.yahoo.com> <4F624809.3000102@gmail.com> Message-ID: What is the license for MINUIT? On Thu, Mar 15, 2012 at 3:50 PM, Jonathan Helmus wrote: > I know I am jumping into this thread late and it has drifted into > another topics but I have some code that others might be interested in. > With all the discussion of bounded leastsq and variable substitution I > recalled that I had a wrapped version of leastsq in a larger project > that allows for min, max bound using the variable transformations that > MINUIT uses. I pulled out the necessary functions, refactored the code > and made a github repo in case anyone is interested > (https://github.com/jjhelmus/leastsqbound-scipy). This might make a > good jumping off point for a more complete bounded leastsq optimizer > that David had in mind. > > - Jonathan Helmus > > David Baddeley wrote: > > From a pure performance perspective, you're probably going to be best > > setting your bounds by variable substitution (particularly if they're > > only single-ended - x**2 is cheap) - you really don't want to have the > > for loops, dictionary lookups and conditionals that lmfit introduces > > for it's bounds checking inside your objective function. 
> > > > I think a high level wrapper that permitted bounds, an unadulterated > > goal function, and setting which parameters to fit, but also retained > > much of the raw speed of leastsq could be accomplished with some > > clever on the fly code generation (maybe also using Sympy to > > automatically derive the Jacobian). Would make an interesting project ... > > > > David > > > > ------------------------------------------------------------------------ > > *From:* Eric Emsellem > > *To:* Matthew Newville > > *Cc:* scipy-user at scipy.org; scipy-user at googlegroups.com > > *Sent:* Friday, 9 March 2012 12:17 PM > > *Subject:* Re: [SciPy-User] Least-squares fittings with bounds: why is > > scipy not up to the task? > > > > > > > > > Yes, see https://github.com/newville/lmfit-py, which does everything > > > you ask for, and a bit more, with the possible exception of "being > > > included in scipy". For what its worth, I work with Mark Rivers > > > (who's no longer actively developing Python), and our group is full of > > > IDL users who are very familiar with Markwardt's implementation. > > > > > > The lmfit-py version uses scipy.optimize.leastsq(), which uses MINPACK > > > directly, so has the advantage of not being implemented in pure IDL or > > > Python. It is definitely faster than mpfit.py. > > > > > > With lmfit-py, one writes a python function-to-minimize that takes a > > > list of Parameters instead of the array of floating point variables > > > that scipy.optimize.leastsq() uses. Each Parameter can be freely > > > varied of fixed, have upper and/or lower bounds placed on them, or be > > > written as algebraic expressions of other Parameters. Uncertainties > > > in varied Parameters and correlations between Parameters are estimated > > > using the same "scaled covariance" method as used in > > > scipy.optimize.curve_fit(). There is limited support for > > > optimization methods other than scipy.optimize.leastsq(), but I don't > > > find these methods to be very useful for the kind of fitting problems > > > I normally see, so support for them may not be perfect. > > > > > > Whether this gets included into scipy is up to the scipy developers. > > > I'd be happy to support this module within scipy or outside scipy. > > > I have no doubt that improvements could be made to lmfit.py. If you > > > have suggestion, I'd be happy to hear them. > > > > looks great! I'll have a go at this, as mentioned in my previous post. I > > believe that leastsq is probably the fastest anyway (according to the > > test Adam mentioned to me today) so this could be it. I'll make a test > > and compare it with mpfit (for the specific case I am thinking of, I am > > optimising over ~10^5-6 points with ~90 parameters...). > > > > thanks again for this, and I'll try to report on this (if relevant) asap. > > > > Eric > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > > ------------------------------------------------------------------------ > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jjhelmus at gmail.com Thu Mar 15 17:08:27 2012 From: jjhelmus at gmail.com (Jonathan Helmus) Date: Thu, 15 Mar 2012 17:08:27 -0400 Subject: [SciPy-User] Least-squares fittings with bounds: why is scipy not up to the task? In-Reply-To: References: <4F5916A2.2040604@eso.org> <18284571.1324.1331244589247.JavaMail.geo-discussion-forums@ynca15> <4F593DFF.90101@eso.org> <1331262863.55138.YahooMailNeo@web113402.mail.gq1.yahoo.com> <4F624809.3000102@gmail.com> Message-ID: <4F625A4B.9040905@gmail.com> MINUIT , or more precisely Minuet2 is part of the ROOT which is GPL version 2 licensed. There already exists a python wrapper for the package (http://code.google.com/p/pyminuit/), which is also GPL licensed. I expect the licensing would cause problems if one wanted to including the package in scipy. The code in my repo on the other hand is BSD licensed and isn't based off MINUIT. I merely used the same mathematical functions (sin, sqrt, arcsin, etc) for the variable transforms which are mentioned in MINUIT's User Manual. - Jonathan Helmus william ratcliff wrote: > What is the license for MINUIT? > > On Thu, Mar 15, 2012 at 3:50 PM, Jonathan Helmus > wrote: > > I know I am jumping into this thread late and it has drifted into > another topics but I have some code that others might be > interested in. > With all the discussion of bounded leastsq and variable substitution I > recalled that I had a wrapped version of leastsq in a larger project > that allows for min, max bound using the variable transformations that > MINUIT uses. I pulled out the necessary functions, refactored the > code > and made a github repo in case anyone is interested > (https://github.com/jjhelmus/leastsqbound-scipy). This might make a > good jumping off point for a more complete bounded leastsq optimizer > that David had in mind. > > - Jonathan Helmus > > David Baddeley wrote: > > From a pure performance perspective, you're probably going to be > best > > setting your bounds by variable substitution (particularly if > they're > > only single-ended - x**2 is cheap) - you really don't want to > have the > > for loops, dictionary lookups and conditionals that lmfit introduces > > for it's bounds checking inside your objective function. > > > > I think a high level wrapper that permitted bounds, an unadulterated > > goal function, and setting which parameters to fit, but also > retained > > much of the raw speed of leastsq could be accomplished with some > > clever on the fly code generation (maybe also using Sympy to > > automatically derive the Jacobian). Would make an interesting > project ... > > > > David > > > > > ------------------------------------------------------------------------ > > *From:* Eric Emsellem > > > *To:* Matthew Newville > > > *Cc:* scipy-user at scipy.org ; > scipy-user at googlegroups.com > > *Sent:* Friday, 9 March 2012 12:17 PM > > *Subject:* Re: [SciPy-User] Least-squares fittings with bounds: > why is > > scipy not up to the task? > > > > > > > > > Yes, see https://github.com/newville/lmfit-py, which does > everything > > > you ask for, and a bit more, with the possible exception of "being > > > included in scipy". For what its worth, I work with Mark Rivers > > > (who's no longer actively developing Python), and our group is > full of > > > IDL users who are very familiar with Markwardt's implementation. > > > > > > The lmfit-py version uses scipy.optimize.leastsq(), which uses > MINPACK > > > directly, so has the advantage of not being implemented in > pure IDL or > > > Python. 
It is definitely faster than mpfit.py. > > > > > > With lmfit-py, one writes a python function-to-minimize that > takes a > > > list of Parameters instead of the array of floating point > variables > > > that scipy.optimize.leastsq() uses. Each Parameter can be freely > > > varied of fixed, have upper and/or lower bounds placed on > them, or be > > > written as algebraic expressions of other Parameters. > Uncertainties > > > in varied Parameters and correlations between Parameters are > estimated > > > using the same "scaled covariance" method as used in > > > scipy.optimize.curve_fit(). There is limited support for > > > optimization methods other than scipy.optimize.leastsq(), but > I don't > > > find these methods to be very useful for the kind of fitting > problems > > > I normally see, so support for them may not be perfect. > > > > > > Whether this gets included into scipy is up to the scipy > developers. > > > I'd be happy to support this module within scipy or outside scipy. > > > I have no doubt that improvements could be made to lmfit.py. > If you > > > have suggestion, I'd be happy to hear them. > > > > looks great! I'll have a go at this, as mentioned in my previous > post. I > > believe that leastsq is probably the fastest anyway (according > to the > > test Adam mentioned to me today) so this could be it. I'll make > a test > > and compare it with mpfit (for the specific case I am thinking > of, I am > > optimising over ~10^5-6 points with ~90 parameters...). > > > > thanks again for this, and I'll try to report on this (if > relevant) asap. > > > > Eric > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > > > ------------------------------------------------------------------------ > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > ------------------------------------------------------------------------ > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From wesmckinn at gmail.com Thu Mar 15 18:35:43 2012 From: wesmckinn at gmail.com (Wes McKinney) Date: Thu, 15 Mar 2012 18:35:43 -0400 Subject: [SciPy-User] ANN: New PyData mailing list for pandas and other data-related projects Message-ID: hi all, Coming out of PyCon, there was a clear need to better organize & support the growing pandas userbase and broader "Python for Data" community. To that end, we've created a new Google Group which will become the more official home of pandas discussions and, we hope, broader data-related discussions. http://groups.google.com/group/pydata We've already begun to organize under the PyData banner for GitHub: https://github.com/pydata. I envision a rich ecosystem of projects in this namespace. There's also a new #pydata channel on irc.freenode.net. Join us! Many of you have also noticed the new pydata.org domain where the pandas project page is hosted: http://pandas.pydata.org/ The root domain currently points to the new NumFOCUS non-profit organization, but only as a placeholder (that's really at http://numfocus.org/). Thanks to them for the generous hosting support. 
This domain will soon become a portal for data scientists (analysts, hackers, whatever term floats your boat) who use, or want to use, Python. Plans for this website are nascent, but the intention is to provide better resources for new people entering the ecosystem (e.g.. which packages to use for each problem domain, cookbook-like examples for problem domains, access to open data sets, simpler installation & setup, conference/tutorial/meetup announcements, etc.) The specific mechanism for creating & curating the website are undetermined. A wiki is the right philosophy, but functionally has some real drawbacks. Regardless, the goal is to make the website community-driven. What do you need to take action on? If you're a pandas user, or are interested in data tools for Python, please join the new group. pandas mailing list traffic will be progressively shifted toward this new list. I also encourage you to start using the #pydata and #pystats hash tags on Twitter to help establish a community presence there. Looking forward to the rest of 2012-- lots of exciting things ahead! cheers, Wes From lists at hilboll.de Fri Mar 16 04:46:55 2012 From: lists at hilboll.de (Andreas H.) Date: Fri, 16 Mar 2012 09:46:55 +0100 Subject: [SciPy-User] ANN: New PyData mailing list for pandas and other data-related projects In-Reply-To: References: Message-ID: > This domain will soon become a portal for data scientists (analysts, > hackers, whatever term floats your boat) who use, or want to use, > Python. > > Plans for this website are nascent, but the intention is to provide > better resources for new people entering the ecosystem (e.g.. which > packages to use for each problem domain, cookbook-like examples for > problem domains, access to open data sets, simpler installation & > setup, conference/tutorial/meetup announcements, etc.) > > The specific mechanism for creating & curating the website are > undetermined. A wiki is the right philosophy, but functionally has > some real drawbacks. Regardless, the goal is to make the website > community-driven. I'm wondering if there are possible benefits of coordinating the scipy-central.org and pydata.org websites? It seems to me that we're beginning to develop two new portals for users, which are close enough in focus. As a users, I wouldn't necessarily know whether one or the other is better for my needs. Maybe the community could benefit from putting all effort into one site? Just my 2 cents ... Andreas. From josef.pktd at gmail.com Fri Mar 16 12:45:33 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 16 Mar 2012 12:45:33 -0400 Subject: [SciPy-User] quadratic programming with fmin_slsqp Message-ID: scipy is missing a fmin_quadprog http://en.wikipedia.org/wiki/Quadratic_programming#Problem_formulation Did anyone ever try to see if fmin_slsqp can be used for this? It looks flexible and targeted enough to be the base for a quadratic programming wrapper. So far I only got a quick experiment Josef ------ # -*- coding: utf-8 -*- """ Created on Thu Mar 15 20:25:27 2012 Author: Josef Perktold just a first try write a function fmin_quadprog(func, A,B,C,b,c) max xAx st Bx = b Cx >= c check if there is a standard notation for Matrices http://en.wikipedia.org/wiki/Quadratic_programming#Problem_formulation can work with f_eqcons and f_ieqcons I'm not sure how good the checking is I thought I had f_ieqcons that conflicted with eqcons, but got normal successful convergence. 
f_ineqcons was binding but didn't satisfy eqcons If f_ieqcons is defined, then ieqcons is ignored. see docstring """ from time import time import numpy as np from scipy.optimize import fmin_slsqp def func(x, args): A = np.eye(len(x)) A[0,0] = 2 x = np.atleast_2d(x) return np.dot(np.dot(x,A), x.T) #fprime=testfunc_deriv B = np.eye(2)[0] b = np.ones(2)[0] f_ieqcons = lambda x, *args: np.atleast_1d(np.dot(x, B) - b) t0 = time() xres = fmin_slsqp(func,[2.0,1.0], args=(-1.0,), eqcons=[lambda x, args: x[0]+x[1] + 4 ], # ieqcons=[lambda x, args: x[0]+.5, # lambda x, args: x[0]], f_ieqcons=f_ieqcons, iprint=2, full_output=1) print "Elapsed time:", 1000*(time()-t0), "ms" print "Results",xres print "\n\n" --- From glenjenness at gmail.com Fri Mar 16 16:34:58 2012 From: glenjenness at gmail.com (Glen Jenness) Date: Fri, 16 Mar 2012 15:34:58 -0500 Subject: [SciPy-User] problem running SciPy Message-ID: Dear users, I just recently installed SciPy, and when I went to run the tests, I got: [gjenness at pople tmp]$ python -c "import scipy; scipy.test()" Traceback (most recent call last): File "", line 1, in ? File "/home/gjenness/programs/scipy-0.10.1/scipy/__init__.py", line 128, in ? raise ImportError(msg) ImportError: Error importing scipy: you cannot import scipy while being in scipy source directory; please exit the scipy source tree first, and relaunch your python intepreter. Now this is fairly strange to me as I am not ANYWHERE near my SciPy source directory. I tried looking around to see if there was a solution (I had a similar problem with NumPy a couple months back, but sadly I don't recall what I did to fix it). My site.cfg is: [DEFAULT] library_dirs = /usr/lib include_dirs = /usr/include [fftw] libraries = fftw3 [mkl] library_dirs = /opt/intel/mkl/10.0.3.020/lib/em64t include_dirs = /opt/intel/mkl/10.0.3.020/include mkl_libs = mkl_intel_lp64,mkl_intel_thread,mkl_core and I configured/built SciPy with: python setup.py config --compiler=intelem --fcompiler=intelem build_clib --compiler=intelem --fcompiler=intelem build_ext --compiler=intelem --fcompiler=intelem install --prefix=/home/gjenness/programs/scipy-0.10.1/ If anyone can help me resolve this problem it'd be greatly appreciated. Thanks! Dr. Glen Jenness Schmidt Group Department of Chemistry University of Wisconsin - Madison -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Fri Mar 16 16:52:47 2012 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 16 Mar 2012 21:52:47 +0100 Subject: [SciPy-User] problem running SciPy In-Reply-To: References: Message-ID: Hi, 16.03.2012 21:34, Glen Jenness kirjoitti: [clip] > [gjenness at pople tmp]$ python -c "import scipy; scipy.test()" > Traceback (most recent call last): > File "", line 1, in ? > File "/home/gjenness/programs/scipy-0.10.1/scipy/__init__.py", line > 128, in ? > raise ImportError(msg) > ImportError: Error importing scipy: you cannot import scipy while > being in scipy source directory; please exit the scipy source > tree first, and relaunch your python intepreter. [clip] > python setup.py config --compiler=intelem --fcompiler=intelem build_clib > --compiler=intelem --fcompiler=intelem build_ext --compiler=intelem > --fcompiler=intelem install --prefix=/home/gjenness/programs/scipy-0.10.1/ > > If anyone can help me resolve this problem it'd be greatly appreciated. You probably have done the following: cd ~/programs/ tar xzf scipy-0.10.1.tar.gz cd scipy-0.10.1 python setup.py ..... 
--prefix=~/programs/scipy-0.10.1/ The correct way would be either cd ~/tmp tar xzf scipy-0.10.1.tar.gz cd scipy-0.10.1 python setup.py ..... --user This installs everything under ~/.local. Or, python setup.py ..... --prefix=~/programs/scipy-0.10.1/ but now you need to set your PYTHONPATH to point to ~/programs/scipy-0.10.1/lib/python2.X/site-packages and not to ~/programs/scipy-0.10.1 -- Pauli Virtanen From glenjenness at gmail.com Fri Mar 16 16:58:17 2012 From: glenjenness at gmail.com (Glen Jenness) Date: Fri, 16 Mar 2012 15:58:17 -0500 Subject: [SciPy-User] problem running SciPy In-Reply-To: References: Message-ID: Pauli, Ah that did the trick! I had it in that directory just for testing purposes, but once I had to go to my other python libraries it went away. I am currently having another problem related to importing in SciPy's optimizers. If I do: python -c "import scipy.optimize" I get: Traceback (most recent call last): File "", line 1, in ? File "/home/gjenness/lib/lib64/python2.4/site-packages/scipy/optimize/__init__.py", line 132, in ? from lbfgsb import fmin_l_bfgs_b File "/home/gjenness/lib/lib64/python2.4/site-packages/scipy/optimize/lbfgsb.py", line 28, in ? import _lbfgsb ImportError: /opt/intel/mkl/10.0.3.020/lib/em64t/libmkl_intel_thread.so: undefined symbol: omp_in_parallel I am currently trying to figure this out, but if anyone has any advice that'll save the amount of Google'ing I'll have to do, it'd be appreciated :) On Fri, Mar 16, 2012 at 3:52 PM, Pauli Virtanen wrote: > Hi, > > 16.03.2012 21:34, Glen Jenness kirjoitti: > [clip] > > [gjenness at pople tmp]$ python -c "import scipy; scipy.test()" > > Traceback (most recent call last): > > File "", line 1, in ? > > File "/home/gjenness/programs/scipy-0.10.1/scipy/__init__.py", line > > 128, in ? > > raise ImportError(msg) > > ImportError: Error importing scipy: you cannot import scipy while > > being in scipy source directory; please exit the scipy source > > tree first, and relaunch your python intepreter. > [clip] > > python setup.py config --compiler=intelem --fcompiler=intelem build_clib > > --compiler=intelem --fcompiler=intelem build_ext --compiler=intelem > > --fcompiler=intelem install > --prefix=/home/gjenness/programs/scipy-0.10.1/ > > > > If anyone can help me resolve this problem it'd be greatly appreciated. > > You probably have done the following: > > cd ~/programs/ > tar xzf scipy-0.10.1.tar.gz > cd scipy-0.10.1 > python setup.py ..... --prefix=~/programs/scipy-0.10.1/ > > The correct way would be either > > cd ~/tmp > tar xzf scipy-0.10.1.tar.gz > cd scipy-0.10.1 > python setup.py ..... --user > > This installs everything under ~/.local. Or, > > python setup.py ..... --prefix=~/programs/scipy-0.10.1/ > > but now you need to set your PYTHONPATH to point to > > ~/programs/scipy-0.10.1/lib/python2.X/site-packages > > and not to > > ~/programs/scipy-0.10.1 > > -- > Pauli Virtanen > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Dr. Glen Jenness Schmidt Group Department of Chemistry University of Wisconsin - Madison -------------- next part -------------- An HTML attachment was scrubbed... URL: From wesmckinn at gmail.com Fri Mar 16 17:46:10 2012 From: wesmckinn at gmail.com (Wes McKinney) Date: Fri, 16 Mar 2012 17:46:10 -0400 Subject: [SciPy-User] ANN: pandas 0.7.2 released Message-ID: hi all, I'm pleased to announce the pandas 0.7.2 release. 
Like prior releases this includes a handful of bug fixes from 0.7.1, some performance enhancements, and new features. This release is recommended for all pandas users. Major work is underway for pandas 0.8.0, hopefully to be released at the end of April. In particular, the time series capabilities are seeing significant work, incorporating the new NumPy datetime64 dtype and features which have been available in scikits.timeseries but not in pandas. See the issue tracker for a full of list planned new features and performance/infrastructural improvements. If you are interested in becoming more involved with the project, the issue tracker (which is really the TODO list!) is the best place to start. See Adam Klein's post http://blog.adamdklein.com/?p=582 for more on the ongoing time series work. Also, my video from PyCon may be of interest to some: http://pyvideo.org/video/696/pandas-powerful-data-analysis-tools-for-python Note that pandas has a new mailing list! http://groups.google.com/group/pydata . There is also a new #pydata channel on irc.freenode.net. Thanks to all who contributed to this release! - Wes What is it ========== pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with ?relational? or ?labeled? data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python. Links ===== Release Notes: http://github.com/pydata/pandas/blob/master/RELEASE.rst Documentation: http://pandas.pydata.org Installers: http://pypi.python.org/pypi/pandas Code Repository: http://github.com/pydata/pandas Mailing List: http://groups.google.com/group/pydata Blog: http://blog.wesmckinney.com From khoroshyy at gmail.com Fri Mar 16 10:26:53 2012 From: khoroshyy at gmail.com (Petro) Date: Fri, 16 Mar 2012 07:26:53 -0700 (PDT) Subject: [SciPy-User] load file with "header" in the bottom of the file. Message-ID: <2e8597ba-14ec-408f-afd7-471969cf6b08@gw9g2000vbb.googlegroups.com> Hi all. numpy.loadtxt allows to skip headers line. I have a lot of tab-delimited files were description is on the bottom. Does anybody know an easy way to read such file. Thanks in advance. Petro. From conny.kuehne at googlemail.com Fri Mar 16 15:34:49 2012 From: conny.kuehne at googlemail.com (=?iso-8859-1?Q?Conny_K=FChne?=) Date: Fri, 16 Mar 2012 20:34:49 +0100 Subject: [SciPy-User] non-existing path in 'scipy/io': 'docs' Message-ID: <74F753BB-984A-400E-82F9-2B95EE0CEE80@googlemail.com> Hello, I get the following error when trying to build scipy 0.10.1 from source blas_opt_info: FOUND: extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] define_macros = [('NO_ATLAS_INFO', 3)] extra_compile_args = ['-faltivec', '-I/System/Library/Frameworks/vecLib.framework/Headers'] non-existing path in 'scipy/io': 'docs' lapack_opt_info: FOUND: extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] define_macros = [('NO_ATLAS_INFO', 3)] extra_compile_args = ['-faltivec'] umfpack_info: libraries umfpack not found in /Library/Frameworks/Python.framework/Versions/2.7/lib libraries umfpack not found in /usr/local/lib libraries umfpack not found in /usr/lib amd_info: libraries amd not found in /Library/Frameworks/Python.framework/Versions/2.7/lib libraries amd not found in /usr/local/lib libraries amd not found in /usr/lib FOUND: libraries = ['amd'] library_dirs = ['/opt/local/lib'] FOUND: libraries = ['umfpack', 'amd'] library_dirs = ['/opt/local/lib'] I already tried different scipy version with no success. 
Installing with easy_install yields a similar result. I use Mac OS X 10.6.8 gcc version: i686-apple-darwin10-gcc-4.0.1 (GCC) 4.0.1 gfortran version: GNU Fortran (GCC) 4.2.3 Any ideas? Conny K?hne From andrew_giessel at hms.harvard.edu Sat Mar 17 12:17:57 2012 From: andrew_giessel at hms.harvard.edu (Andrew Giessel) Date: Sat, 17 Mar 2012 12:17:57 -0400 Subject: [SciPy-User] load file with "header" in the bottom of the file. In-Reply-To: <2e8597ba-14ec-408f-afd7-471969cf6b08@gw9g2000vbb.googlegroups.com> References: <2e8597ba-14ec-408f-afd7-471969cf6b08@gw9g2000vbb.googlegroups.com> Message-ID: I'd suggest perhaps using the unix utils 'tail' and 'head' and 'wc' to pre-process the files On Fri, Mar 16, 2012 at 10:26, Petro wrote: > Hi all. > numpy.loadtxt allows to skip headers line. > I have a lot of tab-delimited files were description is on the bottom. > Does anybody know an easy way to read such file. > Thanks in advance. > Petro. > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Andrew Giessel, PhD Department of Neurobiology, Harvard Medical School 220 Longwood Ave Boston, MA 02115 ph: 617.432.7971 email: andrew_giessel at hms.harvard.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at enthought.com Sat Mar 17 12:32:16 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Sat, 17 Mar 2012 11:32:16 -0500 Subject: [SciPy-User] load file with "header" in the bottom of the file. In-Reply-To: <2e8597ba-14ec-408f-afd7-471969cf6b08@gw9g2000vbb.googlegroups.com> References: <2e8597ba-14ec-408f-afd7-471969cf6b08@gw9g2000vbb.googlegroups.com> Message-ID: On Fri, Mar 16, 2012 at 9:26 AM, Petro wrote: > Hi all. > numpy.loadtxt allows to skip headers line. > I have a lot of tab-delimited files were description is on the bottom. > Does anybody know an easy way to read such file. > Thanks in advance. > Petro. > numpy.genfromtxt has a 'skip_footer' argument for ignoring lines at the end of the file. For example: In [5]: !cat test.tsv 100 200 300 400 500 600 This is a test. In [6]: a = genfromtxt('test.tsv', delimiter='\t', skip_footer=1) In [7]: a Out[7]: array([[ 100., 200., 300.], [ 400., 500., 600.]]) Warren > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From khoroshyy at gmail.com Sun Mar 18 06:04:18 2012 From: khoroshyy at gmail.com (Petro) Date: Sun, 18 Mar 2012 03:04:18 -0700 (PDT) Subject: [SciPy-User] load file with "header" in the bottom of the file. In-Reply-To: References: <2e8597ba-14ec-408f-afd7-471969cf6b08@gw9g2000vbb.googlegroups.com> Message-ID: Thanks. Genfromtxt solved my problem. On Mar 17, 5:32?pm, Warren Weckesser wrote: > On Fri, Mar 16, 2012 at 9:26 AM, Petro wrote: > > Hi all. > > numpy.loadtxt allows to skip headers line. > > I have a lot of tab-delimited files were description is on the bottom. > > Does anybody know an easy ?way ?to read such file. > > Thanks in advance. > > Petro. > > numpy.genfromtxt has a 'skip_footer' argument for ignoring lines at the end > of the file. ?For example: > > In [5]: !cat test.tsv > 100 ? ?200 ? ?300 > 400 ? ?500 ? ?600 > This is a test. > > In [6]: a = genfromtxt('test.tsv', delimiter='\t', skip_footer=1) > > In [7]: a > Out[7]: > array([[ 100., ?200., ?300.], > ? ? ? 
?[ 400., ?500., ?600.]]) > > Warren > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-U... at scipy.org > >http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-U... at scipy.orghttp://mail.scipy.org/mailman/listinfo/scipy-user From ralf.gommers at googlemail.com Sun Mar 18 17:42:11 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 18 Mar 2012 22:42:11 +0100 Subject: [SciPy-User] non-existing path in 'scipy/io': 'docs' In-Reply-To: <74F753BB-984A-400E-82F9-2B95EE0CEE80@googlemail.com> References: <74F753BB-984A-400E-82F9-2B95EE0CEE80@googlemail.com> Message-ID: 2012/3/16 Conny K?hne > Hello, > > I get the following error when trying to build scipy 0.10.1 from source > > blas_opt_info: > FOUND: > extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] > define_macros = [('NO_ATLAS_INFO', 3)] > extra_compile_args = ['-faltivec', > '-I/System/Library/Frameworks/vecLib.framework/Headers'] > > non-existing path in 'scipy/io': 'docs' > This can be cleaned up by removing the line "config.add_data_dir('docs')" in scipy/io/setup.py. Note that this is not an actual error though, just a harmless warning. Do you have an actual install issue? If so, can you post the full build log? Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From vaggi.federico at gmail.com Sun Mar 18 18:18:26 2012 From: vaggi.federico at gmail.com (federico vaggi) Date: Sun, 18 Mar 2012 23:18:26 +0100 Subject: [SciPy-User] Numpy/MATLAB difference in array indexing Message-ID: Hi everyone, I was trying to port some code from MATLAB to Scipy, and I noticed a slight bug in the functionality of numpy.tile vs repmat in matlab: For example: a = np.random.rand(10,2) b = tile(a[:,1],(1,5)) b.shape Out[86]: (1, 50) While MATLAB gives: >> a = rand(10,2); >> b = repmat(a(:,1),[1,5]); >> size(b) ans = 10 5 This is obviously trivial to fix**, but I'm wondering what causes the difference? If you take a vertical slice of an array in numpy that's seen as a row vector, while in MATLAB its seen as a column vector? Is it worth making a note in here: http://www.scipy.org/NumPy_for_Matlab_Users ? Federico ** The easiest way I found was: b = tile(a[:,1],(5,1)).T -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Sun Mar 18 18:37:34 2012 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 18 Mar 2012 23:37:34 +0100 Subject: [SciPy-User] Numpy/MATLAB difference in array indexing In-Reply-To: References: Message-ID: 18.03.2012 23:18, federico vaggi kirjoitti: > I was trying to port some code from MATLAB to Scipy, and I noticed a > slight bug in the functionality of numpy.tile vs repmat in matlab: > > For example: > > a = np.random.rand(10,2) > > b = tile(a[:,1],(1,5)) a[:,1] is an 1-d array, and therefore considered as a (1, N) vector in 2-d context. This is not a bug --- the Numpy constructs do not always map exactly to Matlab ones. 
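A minimal sketch making the shape difference concrete; it assumes the same 10x2 array `a` as in the example above, and the comments state the expected shapes:

```
import numpy as np

a = np.random.rand(10, 2)

# a[:, 1] drops an axis: it is a 1-d array of shape (10,), which tile
# treats as a single row when asked for a 2-d result.
print(a[:, 1].shape)                     # (10,)
print(np.tile(a[:, 1], (1, 5)).shape)    # (1, 50)

# Keeping the axis with a slice (or np.newaxis) preserves the column,
# which is what MATLAB's a(:, 1) gives.
print(a[:, 1:2].shape)                   # (10, 1)
print(np.tile(a[:, 1:2], (1, 5)).shape)  # (10, 5)

# With broadcasting the tile is often unnecessary: a (10, 1) column
# combines directly with a (1, 5) row.
b = a[:, 1:2] * np.ones((1, 5))
print(b.shape)                           # (10, 5)
```

The replies below show the same `a[:, 1:2]` fix inside tile itself, and the broadcasting idioms in more detail.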
-- Pauli Virtanen From wesmckinn at gmail.com Sun Mar 18 20:23:34 2012 From: wesmckinn at gmail.com (Wes McKinney) Date: Sun, 18 Mar 2012 20:23:34 -0400 Subject: [SciPy-User] Numpy/MATLAB difference in array indexing In-Reply-To: References: Message-ID: On Sun, Mar 18, 2012 at 6:37 PM, Pauli Virtanen wrote: > 18.03.2012 23:18, federico vaggi kirjoitti: >> I was trying to port some code from MATLAB to Scipy, and I noticed a >> slight bug in the functionality of numpy.tile vs repmat in matlab: >> >> For example: >> >> a = np.random.rand(10,2) >> >> b = tile(a[:,1],(1,5)) > > a[:,1] is an 1-d array, and therefore considered as a (1, N) vector in > 2-d context. This is not a bug --- the Numpy constructs do not always > map exactly to Matlab ones. > > -- > Pauli Virtanen > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user In [3]: b = tile(a[:,1:2],(1,5)) In [4]: b.shape Out[4]: (10, 5) From sturla at molden.no Mon Mar 19 06:51:38 2012 From: sturla at molden.no (Sturla Molden) Date: Mon, 19 Mar 2012 11:51:38 +0100 Subject: [SciPy-User] Numpy/MATLAB difference in array indexing In-Reply-To: References: Message-ID: <4F670FBA.7010806@molden.no> On 18.03.2012 23:37, Pauli Virtanen wrote: >> b = tile(a[:,1],(1,5)) > > a[:,1] is an 1-d array, and therefore considered as a (1, N) vector in > 2-d context. This is not a bug --- the Numpy constructs do not always > map exactly to Matlab ones. Yes. Also, the need for "repmats" (np.repeat, np.tile, np.hstack, np.vstack) and "reshapes" (np.reshape, np.ndarray.reshape) is less prominent in NumPy because of broadcasting. Using MATLAB idioms like reshape and repmat instead of broadcasting is a common mistake (or bad habit) when coming to NumPy for MATLAB. In my experience, 99% of cases for a .* reshape(b,m,n) a .* repmat(b,m,n) in MATLAB will just map to NumPy constructs like these: a * b a * b[:,np.newaxis] This, in addition to view arrays, make NumPy much more memory efficient. Not to mention that a.T is O(1) in NumPy whereas a' is O(N*M) in MATLAB. Sturla From bacmsantos at gmail.com Mon Mar 19 12:14:58 2012 From: bacmsantos at gmail.com (Bruno Santos) Date: Mon, 19 Mar 2012 16:14:58 +0000 Subject: [SciPy-User] rv_frozen when using gamma function In-Reply-To: References: Message-ID: I believe the formula I have is accurate I checked it personally and also have it checked by two mathematicians in the lab and they come up with the same results. I left my notebook where I performed the transformations home so don't completely remember but I believe you can simply things to get rid of some of the parameters. dicerAcc is a scalar as you mentioned. I managed to implement the function in python now and it is giving the same results as in R my question how to maximize it still remains though. Is it possibly to maximize a function rather than minimize it in Python? On 15 March 2012 15:21, Skipper Seabold wrote: > On Thu, Mar 15, 2012 at 11:07 AM, Bruno Santos wrote: > >> Thank you all very much for the replies that was exactly what I wanted. I >> am basically trying to get the parameters for a gamma-poisson distribution. >> I have the R code from a previous collaborator just trying to write a >> native function in python rather than using the R code or port it using >> rpy2. > > > Oh, fun. 
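A sketch, not taken from the thread, of the usual answer to the maximisation question above: scipy.optimize only minimises, so the log-likelihood is negated, which plays the role of R's optim(..., control=list(fnscale=-1)); scipy.special.gammaln stands in for R's lgamma and lfactorial. The x and dicerAcc arrays below are made-up values whose only purpose is to make the sketch run:

```
import numpy as np
from scipy.special import gammaln
from scipy import optimize

def alphabeta(params, x, dicerAcc):
    # log-likelihood in the form used in the thread; gammaln(x + 1)
    # plays the role of lfactorial(x)
    alpha, beta = params
    if alpha < 0 or beta < 0:
        return -np.inf
    return np.sum(alpha * np.log(beta) + gammaln(alpha + x)
                  + x * np.log(dicerAcc) - gammaln(alpha)
                  - (alpha + x) * np.log(beta + dicerAcc)
                  - gammaln(x + 1))

def negloglik(params, x, dicerAcc):
    # minimising the negative log-likelihood maximises the likelihood
    return -alphabeta(params, x, dicerAcc)

# made-up data; the thread's R code also drops entries where dicerAcc
# sits at the noise floor, which these strictly positive values avoid
x = np.array([12., 0., 3., 0., 41., 2., 0., 7., 0., 1.])
dicerAcc = np.array([0.9, 0.7, 0.85, 0.6, 0.95, 0.8, 0.75, 0.9, 0.65, 0.7])

ab = optimize.fmin(negloglik, [1.0, 100.0], args=(x, dicerAcc))
print(ab)
```

The replies below make the same two points: put a negative in front of the objective, and only a ratio of the two parameters may be identified unless one of them is held fixed.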
> > >> The function is the following: >> [image: Inline images 1] >> where f(b,d) is a function that gives me a probability of a certain >> position in the vector to be occupied and it depends on b (the position) >> and d (the likelihood of making an error). >> So the likelihood after a few transformations become: >> >> [image: Inline images 2] >> Which I then use the loglikelihood and try to maximise it using an >> optimization algorithm. >> [image: Inline images 3] >> The R code is as following: >> alphabeta<-function(alphabeta,x,dicerAcc) >> { >> alpha <-alphabeta[1] >> beta <-alphabeta[2] >> if (any(alphabeta<0)) >> return(NA) >> sum((alpha*log(beta) + lgamma(alpha + x) + x * log(dicerAcc) - >> lgamma(alpha) - (alpha + x) * log(beta+dicerAcc) - lfactorial(x))[dicerAcc >> > noiseT]) >> > > From a quick (distracted) look (so I could be wrong) > > Should this be alpha^2*log(beta) ? +lgamma(alpha) ? And lfactorial(x) > should still be +lgamma(alpha)*lfactorial(x) ? And dicerAcc a scalar > integer I take it? > > >> >> #sum((alpha*log(beta)+(lgamma(alpha+x)+log(dicerError^x))-(lgamma(alpha)+log((beta+dicerError)^(alpha+x))+lfactorial(x)))[dicerError >> != 0]) >> } >> x and dicerAcc are known so the I use the optim function in R >> ab <- optim(c(1,100), alphabeta, control=list(fnscale=-1), x = x, >> dicerAcc = dicerAcc)$par >> >> Is there any equivalent function in Scipy to the optim one? >> >> On 14 March 2012 17:05, Bruno Santos wrote: >> >>> I am trying to write a script to do some maximum likelihood parameter >>> estimation of a function. But when I try to use the gamma function I get: >>> gamma(5) >>> Out[5]: >>> >>> I thought it might have been a problem solved already on the new >>> distribution but even after installing the last scipy version I get the >>> same problem. >>> The test() after installation is also failing with the following >>> information: >>> Running unit tests for scipy >>> NumPy version 1.5.1 >>> NumPy is installed in /usr/lib/pymodules/python2.7/numpy >>> SciPy version 0.10.1 >>> SciPy is installed in /usr/local/lib/python2.7/dist-packages/scipy >>> Python version 2.7.2+ (default, Oct 4 2011, 20:06:09) [GCC 4.6.1] >>> nose version 1.1.2 >>> ... >>> ... >>> ... 
>>> AssertionError: >>> Arrays are not almost equal >>> ACTUAL: 0.0 >>> DESIRED: 0.5 >>> >>> ====================================================================== >>> FAIL: Regression test for #651: better handling of badly conditioned >>> ---------------------------------------------------------------------- >>> Traceback (most recent call last): >>> File >>> "/usr/local/lib/python2.7/dist-packages/scipy/signal/tests/test_filter_design.py", >>> line 34, in test_bad_filter >>> assert_raises(BadCoefficients, tf2zpk, [1e-15], [1.0, 1.0]) >>> File "/usr/lib/pymodules/python2.7/numpy/testing/utils.py", line 982, >>> in assert_raises >>> return nose.tools.assert_raises(*args,**kwargs) >>> AssertionError: BadCoefficients not raised >>> >>> ---------------------------------------------------------------------- >>> Ran 5103 tests in 47.795s >>> >>> FAILED (KNOWNFAIL=13, SKIP=28, failures=3) >>> Out[7]: >>> >>> >>> My code is as follows: >>> from numpy import array,log,sum,nan >>> from scipy.stats import gamma >>> from scipy import factorial, optimize >>> >>> #rinterface.initr() >>> #IntSexpVector = rinterface.IntSexpVector >>> #lgamma = rinterface.globalenv.get("lgamma") >>> >>> #Implementation for the Zero-inflated Negative Binomial function >>> def alphabeta(params,x,dicerAcc): >>> alpha = array(params[0]) >>> beta = array(params[1]) >>> if alpha<0 or beta<0:return nan >>> return sum((alpha*log(beta)) + log(gamma(alpha+x)) + x * >>> log(dicerAcc) - log(gamma(alpha)) - (alpha+x) * log(beta+dicerAcc) - >>> log(factorial(x))) >>> >>> if __name__=='__main__': >>> x = >>> array([123,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,104,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,1,24,1,0,0,0,0,0,0,0,2,0,0,4,0,0,0,0,0,0,0,0,12,0,0]) >>> dicerAcc = array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, >>> 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, >>> 0.048750000000000002,0.90085000000000004, 0.0504, 0.0, 0.0, 0.0, 0.0, 0.0, >>> 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0023, >>> 0.089149999999999993, 0.81464999999999999, 0.091550000000000006, >>> 0.0023500000000000001, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, >>> 0.0, 0.0, 0.0, 0.0, 0.0, 0.00020000000000000001, 0.0061000000000000004, >>> 0.12085, 0.7429, 0.12325, 0.0067000000000000002, 0.0, 0.0, 0.0, 0.0, 0.0, >>> 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.00020000000000000001, >>> 0.012500000000000001, 0.14255000000000001, 0.68159999999999998, >>> 0.14979999999999999, 0.012999999999999999]) >>> optimize.() >>> >>> >>> Am I doing something wrong or is this a known problem? >>> >>> Best, >>> Bruno >>> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 4401 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 1620 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image.png Type: image/png Size: 3193 bytes Desc: not available URL: From jsseabold at gmail.com Mon Mar 19 12:23:20 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 19 Mar 2012 12:23:20 -0400 Subject: [SciPy-User] rv_frozen when using gamma function In-Reply-To: References: Message-ID: On Mon, Mar 19, 2012 at 12:14 PM, Bruno Santos wrote: > I believe the formula I have is accurate I checked it personally and also > have it checked by two mathematicians in the lab and they come up with the > same results. I left my notebook where I performed the transformations home > so don't completely remember but I believe you can simply things to get rid > of some of the parameters. > dicerAcc is a scalar as you mentioned. > I managed to implement the function in python now and it is giving the > same results as in R my question how to maximize it still remains though. > Is it possibly to maximize a function rather than minimize it in Python? > Ok, then I guess my math is faulty. I only looked quickly and don't see the other close parens in the formula. To maximize put a negative in front of the function. > > > On 15 March 2012 15:21, Skipper Seabold wrote: > >> On Thu, Mar 15, 2012 at 11:07 AM, Bruno Santos wrote: >> >>> Thank you all very much for the replies that was exactly what I wanted. >>> I am basically trying to get the parameters for a >>> gamma-poisson distribution. I have the R code from a >>> previous collaborator just trying to write a native function in python >>> rather than using the R code or port it using rpy2. >> >> >> Oh, fun. >> >> >>> The function is the following: >>> [image: Inline images 1] >>> where f(b,d) is a function that gives me a probability of a certain >>> position in the vector to be occupied and it depends on b (the position) >>> and d (the likelihood of making an error). >>> So the likelihood after a few transformations become: >>> >>> [image: Inline images 2] >>> Which I then use the loglikelihood and try to maximise it using an >>> optimization algorithm. >>> [image: Inline images 3] >>> The R code is as following: >>> alphabeta<-function(alphabeta,x,dicerAcc) >>> { >>> alpha <-alphabeta[1] >>> beta <-alphabeta[2] >>> if (any(alphabeta<0)) >>> return(NA) >>> sum((alpha*log(beta) + lgamma(alpha + x) + x * log(dicerAcc) - >>> lgamma(alpha) - (alpha + x) * log(beta+dicerAcc) - lfactorial(x))[dicerAcc >>> > noiseT]) >>> >> >> From a quick (distracted) look (so I could be wrong) >> >> Should this be alpha^2*log(beta) ? +lgamma(alpha) ? And lfactorial(x) >> should still be +lgamma(alpha)*lfactorial(x) ? And dicerAcc a scalar >> integer I take it? >> >> >>> >>> #sum((alpha*log(beta)+(lgamma(alpha+x)+log(dicerError^x))-(lgamma(alpha)+log((beta+dicerError)^(alpha+x))+lfactorial(x)))[dicerError >>> != 0]) >>> } >>> x and dicerAcc are known so the I use the optim function in R >>> ab <- optim(c(1,100), alphabeta, control=list(fnscale=-1), x = x, >>> dicerAcc = dicerAcc)$par >>> >>> Is there any equivalent function in Scipy to the optim one? >>> >>> On 14 March 2012 17:05, Bruno Santos wrote: >>> >>>> I am trying to write a script to do some maximum likelihood parameter >>>> estimation of a function. But when I try to use the gamma function I get: >>>> gamma(5) >>>> Out[5]: >>>> >>>> I thought it might have been a problem solved already on the new >>>> distribution but even after installing the last scipy version I get the >>>> same problem. 
>>>> The test() after installation is also failing with the following >>>> information: >>>> Running unit tests for scipy >>>> NumPy version 1.5.1 >>>> NumPy is installed in /usr/lib/pymodules/python2.7/numpy >>>> SciPy version 0.10.1 >>>> SciPy is installed in /usr/local/lib/python2.7/dist-packages/scipy >>>> Python version 2.7.2+ (default, Oct 4 2011, 20:06:09) [GCC 4.6.1] >>>> nose version 1.1.2 >>>> ... >>>> ... >>>> ... >>>> AssertionError: >>>> Arrays are not almost equal >>>> ACTUAL: 0.0 >>>> DESIRED: 0.5 >>>> >>>> ====================================================================== >>>> FAIL: Regression test for #651: better handling of badly conditioned >>>> ---------------------------------------------------------------------- >>>> Traceback (most recent call last): >>>> File >>>> "/usr/local/lib/python2.7/dist-packages/scipy/signal/tests/test_filter_design.py", >>>> line 34, in test_bad_filter >>>> assert_raises(BadCoefficients, tf2zpk, [1e-15], [1.0, 1.0]) >>>> File "/usr/lib/pymodules/python2.7/numpy/testing/utils.py", line 982, >>>> in assert_raises >>>> return nose.tools.assert_raises(*args,**kwargs) >>>> AssertionError: BadCoefficients not raised >>>> >>>> ---------------------------------------------------------------------- >>>> Ran 5103 tests in 47.795s >>>> >>>> FAILED (KNOWNFAIL=13, SKIP=28, failures=3) >>>> Out[7]: >>>> >>>> >>>> My code is as follows: >>>> from numpy import array,log,sum,nan >>>> from scipy.stats import gamma >>>> from scipy import factorial, optimize >>>> >>>> #rinterface.initr() >>>> #IntSexpVector = rinterface.IntSexpVector >>>> #lgamma = rinterface.globalenv.get("lgamma") >>>> >>>> #Implementation for the Zero-inflated Negative Binomial function >>>> def alphabeta(params,x,dicerAcc): >>>> alpha = array(params[0]) >>>> beta = array(params[1]) >>>> if alpha<0 or beta<0:return nan >>>> return sum((alpha*log(beta)) + log(gamma(alpha+x)) + x * >>>> log(dicerAcc) - log(gamma(alpha)) - (alpha+x) * log(beta+dicerAcc) - >>>> log(factorial(x))) >>>> >>>> if __name__=='__main__': >>>> x = >>>> array([123,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,104,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,1,24,1,0,0,0,0,0,0,0,2,0,0,4,0,0,0,0,0,0,0,0,12,0,0]) >>>> dicerAcc = array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, >>>> 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, >>>> 0.048750000000000002,0.90085000000000004, 0.0504, 0.0, 0.0, 0.0, 0.0, 0.0, >>>> 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0023, >>>> 0.089149999999999993, 0.81464999999999999, 0.091550000000000006, >>>> 0.0023500000000000001, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, >>>> 0.0, 0.0, 0.0, 0.0, 0.0, 0.00020000000000000001, 0.0061000000000000004, >>>> 0.12085, 0.7429, 0.12325, 0.0067000000000000002, 0.0, 0.0, 0.0, 0.0, 0.0, >>>> 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.00020000000000000001, >>>> 0.012500000000000001, 0.14255000000000001, 0.68159999999999998, >>>> 0.14979999999999999, 0.012999999999999999]) >>>> optimize.() >>>> >>>> >>>> Am I doing something wrong or is this a known problem? 
>>>> >>>> Best, >>>> Bruno >>>> >>> >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 4401 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 1620 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 3193 bytes Desc: not available URL: From jsseabold at gmail.com Mon Mar 19 12:25:00 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 19 Mar 2012 12:25:00 -0400 Subject: [SciPy-User] rv_frozen when using gamma function In-Reply-To: References: Message-ID: On Mon, Mar 19, 2012 at 12:23 PM, Skipper Seabold wrote: > On Mon, Mar 19, 2012 at 12:14 PM, Bruno Santos wrote: > >> I believe the formula I have is accurate I checked it personally and also >> have it checked by two mathematicians in the lab and they come up with the >> same results. I left my notebook where I performed the transformations home >> so don't completely remember but I believe you can simply things to get rid >> of some of the parameters. >> > dicerAcc is a scalar as you mentioned. >> I managed to implement the function in python now and it is giving the >> same results as in R my question how to maximize it still remains though. >> Is it possibly to maximize a function rather than minimize it in Python? >> > > Ok, then I guess my math is faulty. I only looked quickly and don't see > the other close parens in the formula. > Oh, I see now. I was trying to go off the log-likelihood you provided instead of looking at the likelihood. > > To maximize put a negative in front of the function. > > > >> >> >> On 15 March 2012 15:21, Skipper Seabold wrote: >> >>> On Thu, Mar 15, 2012 at 11:07 AM, Bruno Santos wrote: >>> >>>> Thank you all very much for the replies that was exactly what I wanted. >>>> I am basically trying to get the parameters for a >>>> gamma-poisson distribution. I have the R code from a >>>> previous collaborator just trying to write a native function in python >>>> rather than using the R code or port it using rpy2. >>> >>> >>> Oh, fun. >>> >>> >>>> The function is the following: >>>> [image: Inline images 1] >>>> where f(b,d) is a function that gives me a probability of a certain >>>> position in the vector to be occupied and it depends on b (the position) >>>> and d (the likelihood of making an error). >>>> So the likelihood after a few transformations become: >>>> >>>> [image: Inline images 2] >>>> Which I then use the loglikelihood and try to maximise it using an >>>> optimization algorithm. 
>>>> [image: Inline images 3] >>>> The R code is as following: >>>> alphabeta<-function(alphabeta,x,dicerAcc) >>>> { >>>> alpha <-alphabeta[1] >>>> beta <-alphabeta[2] >>>> if (any(alphabeta<0)) >>>> return(NA) >>>> sum((alpha*log(beta) + lgamma(alpha + x) + x * log(dicerAcc) - >>>> lgamma(alpha) - (alpha + x) * log(beta+dicerAcc) - lfactorial(x))[dicerAcc >>>> > noiseT]) >>>> >>> >>> From a quick (distracted) look (so I could be wrong) >>> >>> Should this be alpha^2*log(beta) ? +lgamma(alpha) ? And lfactorial(x) >>> should still be +lgamma(alpha)*lfactorial(x) ? And dicerAcc a scalar >>> integer I take it? >>> >>> >>>> >>>> #sum((alpha*log(beta)+(lgamma(alpha+x)+log(dicerError^x))-(lgamma(alpha)+log((beta+dicerError)^(alpha+x))+lfactorial(x)))[dicerError >>>> != 0]) >>>> } >>>> x and dicerAcc are known so the I use the optim function in R >>>> ab <- optim(c(1,100), alphabeta, control=list(fnscale=-1), x = x, >>>> dicerAcc = dicerAcc)$par >>>> >>>> Is there any equivalent function in Scipy to the optim one? >>>> >>>> On 14 March 2012 17:05, Bruno Santos wrote: >>>> >>>>> I am trying to write a script to do some maximum likelihood parameter >>>>> estimation of a function. But when I try to use the gamma function I get: >>>>> gamma(5) >>>>> Out[5]: >>>>> >>>>> I thought it might have been a problem solved already on the new >>>>> distribution but even after installing the last scipy version I get the >>>>> same problem. >>>>> The test() after installation is also failing with the following >>>>> information: >>>>> Running unit tests for scipy >>>>> NumPy version 1.5.1 >>>>> NumPy is installed in /usr/lib/pymodules/python2.7/numpy >>>>> SciPy version 0.10.1 >>>>> SciPy is installed in /usr/local/lib/python2.7/dist-packages/scipy >>>>> Python version 2.7.2+ (default, Oct 4 2011, 20:06:09) [GCC 4.6.1] >>>>> nose version 1.1.2 >>>>> ... >>>>> ... >>>>> ... 
>>>>> AssertionError: >>>>> Arrays are not almost equal >>>>> ACTUAL: 0.0 >>>>> DESIRED: 0.5 >>>>> >>>>> ====================================================================== >>>>> FAIL: Regression test for #651: better handling of badly conditioned >>>>> ---------------------------------------------------------------------- >>>>> Traceback (most recent call last): >>>>> File >>>>> "/usr/local/lib/python2.7/dist-packages/scipy/signal/tests/test_filter_design.py", >>>>> line 34, in test_bad_filter >>>>> assert_raises(BadCoefficients, tf2zpk, [1e-15], [1.0, 1.0]) >>>>> File "/usr/lib/pymodules/python2.7/numpy/testing/utils.py", line >>>>> 982, in assert_raises >>>>> return nose.tools.assert_raises(*args,**kwargs) >>>>> AssertionError: BadCoefficients not raised >>>>> >>>>> ---------------------------------------------------------------------- >>>>> Ran 5103 tests in 47.795s >>>>> >>>>> FAILED (KNOWNFAIL=13, SKIP=28, failures=3) >>>>> Out[7]: >>>>> >>>>> >>>>> My code is as follows: >>>>> from numpy import array,log,sum,nan >>>>> from scipy.stats import gamma >>>>> from scipy import factorial, optimize >>>>> >>>>> #rinterface.initr() >>>>> #IntSexpVector = rinterface.IntSexpVector >>>>> #lgamma = rinterface.globalenv.get("lgamma") >>>>> >>>>> #Implementation for the Zero-inflated Negative Binomial function >>>>> def alphabeta(params,x,dicerAcc): >>>>> alpha = array(params[0]) >>>>> beta = array(params[1]) >>>>> if alpha<0 or beta<0:return nan >>>>> return sum((alpha*log(beta)) + log(gamma(alpha+x)) + x * >>>>> log(dicerAcc) - log(gamma(alpha)) - (alpha+x) * log(beta+dicerAcc) - >>>>> log(factorial(x))) >>>>> >>>>> if __name__=='__main__': >>>>> x = >>>>> array([123,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,104,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,1,24,1,0,0,0,0,0,0,0,2,0,0,4,0,0,0,0,0,0,0,0,12,0,0]) >>>>> dicerAcc = array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, >>>>> 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, >>>>> 0.048750000000000002,0.90085000000000004, 0.0504, 0.0, 0.0, 0.0, 0.0, 0.0, >>>>> 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0023, >>>>> 0.089149999999999993, 0.81464999999999999, 0.091550000000000006, >>>>> 0.0023500000000000001, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, >>>>> 0.0, 0.0, 0.0, 0.0, 0.0, 0.00020000000000000001, 0.0061000000000000004, >>>>> 0.12085, 0.7429, 0.12325, 0.0067000000000000002, 0.0, 0.0, 0.0, 0.0, 0.0, >>>>> 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.00020000000000000001, >>>>> 0.012500000000000001, 0.14255000000000001, 0.68159999999999998, >>>>> 0.14979999999999999, 0.012999999999999999]) >>>>> optimize.() >>>>> >>>>> >>>>> Am I doing something wrong or is this a known problem? >>>>> >>>>> Best, >>>>> Bruno >>>>> >>>> >>>> >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>>> >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image.png Type: image/png Size: 3193 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 4401 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 1620 bytes Desc: not available URL: From josef.pktd at gmail.com Mon Mar 19 12:28:14 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 19 Mar 2012 12:28:14 -0400 Subject: [SciPy-User] rv_frozen when using gamma function In-Reply-To: References: Message-ID: On Mon, Mar 19, 2012 at 12:23 PM, Skipper Seabold wrote: > On Mon, Mar 19, 2012 at 12:14 PM, Bruno Santos wrote: > >> I believe the formula I have is accurate I checked it personally and also >> have it checked by two mathematicians in the lab and they come up with the >> same results. I left my notebook where I performed the transformations home >> so don't completely remember but I believe you can simply things to get rid >> of some of the parameters. >> > dicerAcc is a scalar as you mentioned. >> I managed to implement the function in python now and it is giving the >> same results as in R my question how to maximize it still remains though. >> Is it possibly to maximize a function rather than minimize it in Python? >> > > Ok, then I guess my math is faulty. I only looked quickly and don't see > the other close parens in the formula. > I didn't check the parens, but to me it just like the negative binomial but I think only the ratio p = mu/(beta+mu) 1-p = beta/(beta+mu) are identified, unless a parameter is held fixed. negative binomial is a poisson-gamma mixture, but I only found the p, 1-p parameterization Josef > > To maximize put a negative in front of the function. > > > >> >> >> On 15 March 2012 15:21, Skipper Seabold wrote: >> >>> On Thu, Mar 15, 2012 at 11:07 AM, Bruno Santos wrote: >>> >>>> Thank you all very much for the replies that was exactly what I wanted. >>>> I am basically trying to get the parameters for a >>>> gamma-poisson distribution. I have the R code from a >>>> previous collaborator just trying to write a native function in python >>>> rather than using the R code or port it using rpy2. >>> >>> >>> Oh, fun. >>> >>> >>>> The function is the following: >>>> [image: Inline images 1] >>>> where f(b,d) is a function that gives me a probability of a certain >>>> position in the vector to be occupied and it depends on b (the position) >>>> and d (the likelihood of making an error). >>>> So the likelihood after a few transformations become: >>>> >>>> [image: Inline images 2] >>>> Which I then use the loglikelihood and try to maximise it using an >>>> optimization algorithm. >>>> [image: Inline images 3] >>>> The R code is as following: >>>> alphabeta<-function(alphabeta,x,dicerAcc) >>>> { >>>> alpha <-alphabeta[1] >>>> beta <-alphabeta[2] >>>> if (any(alphabeta<0)) >>>> return(NA) >>>> sum((alpha*log(beta) + lgamma(alpha + x) + x * log(dicerAcc) - >>>> lgamma(alpha) - (alpha + x) * log(beta+dicerAcc) - lfactorial(x))[dicerAcc >>>> > noiseT]) >>>> >>> >>> From a quick (distracted) look (so I could be wrong) >>> >>> Should this be alpha^2*log(beta) ? +lgamma(alpha) ? And lfactorial(x) >>> should still be +lgamma(alpha)*lfactorial(x) ? And dicerAcc a scalar >>> integer I take it? 
>>> >>> >>>> >>>> #sum((alpha*log(beta)+(lgamma(alpha+x)+log(dicerError^x))-(lgamma(alpha)+log((beta+dicerError)^(alpha+x))+lfactorial(x)))[dicerError >>>> != 0]) >>>> } >>>> x and dicerAcc are known so the I use the optim function in R >>>> ab <- optim(c(1,100), alphabeta, control=list(fnscale=-1), x = x, >>>> dicerAcc = dicerAcc)$par >>>> >>>> Is there any equivalent function in Scipy to the optim one? >>>> >>>> On 14 March 2012 17:05, Bruno Santos wrote: >>>> >>>>> I am trying to write a script to do some maximum likelihood parameter >>>>> estimation of a function. But when I try to use the gamma function I get: >>>>> gamma(5) >>>>> Out[5]: >>>>> >>>>> I thought it might have been a problem solved already on the new >>>>> distribution but even after installing the last scipy version I get the >>>>> same problem. >>>>> The test() after installation is also failing with the following >>>>> information: >>>>> Running unit tests for scipy >>>>> NumPy version 1.5.1 >>>>> NumPy is installed in /usr/lib/pymodules/python2.7/numpy >>>>> SciPy version 0.10.1 >>>>> SciPy is installed in /usr/local/lib/python2.7/dist-packages/scipy >>>>> Python version 2.7.2+ (default, Oct 4 2011, 20:06:09) [GCC 4.6.1] >>>>> nose version 1.1.2 >>>>> ... >>>>> ... >>>>> ... >>>>> AssertionError: >>>>> Arrays are not almost equal >>>>> ACTUAL: 0.0 >>>>> DESIRED: 0.5 >>>>> >>>>> ====================================================================== >>>>> FAIL: Regression test for #651: better handling of badly conditioned >>>>> ---------------------------------------------------------------------- >>>>> Traceback (most recent call last): >>>>> File >>>>> "/usr/local/lib/python2.7/dist-packages/scipy/signal/tests/test_filter_design.py", >>>>> line 34, in test_bad_filter >>>>> assert_raises(BadCoefficients, tf2zpk, [1e-15], [1.0, 1.0]) >>>>> File "/usr/lib/pymodules/python2.7/numpy/testing/utils.py", line >>>>> 982, in assert_raises >>>>> return nose.tools.assert_raises(*args,**kwargs) >>>>> AssertionError: BadCoefficients not raised >>>>> >>>>> ---------------------------------------------------------------------- >>>>> Ran 5103 tests in 47.795s >>>>> >>>>> FAILED (KNOWNFAIL=13, SKIP=28, failures=3) >>>>> Out[7]: >>>>> >>>>> >>>>> My code is as follows: >>>>> from numpy import array,log,sum,nan >>>>> from scipy.stats import gamma >>>>> from scipy import factorial, optimize >>>>> >>>>> #rinterface.initr() >>>>> #IntSexpVector = rinterface.IntSexpVector >>>>> #lgamma = rinterface.globalenv.get("lgamma") >>>>> >>>>> #Implementation for the Zero-inflated Negative Binomial function >>>>> def alphabeta(params,x,dicerAcc): >>>>> alpha = array(params[0]) >>>>> beta = array(params[1]) >>>>> if alpha<0 or beta<0:return nan >>>>> return sum((alpha*log(beta)) + log(gamma(alpha+x)) + x * >>>>> log(dicerAcc) - log(gamma(alpha)) - (alpha+x) * log(beta+dicerAcc) - >>>>> log(factorial(x))) >>>>> >>>>> if __name__=='__main__': >>>>> x = >>>>> array([123,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,104,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,1,24,1,0,0,0,0,0,0,0,2,0,0,4,0,0,0,0,0,0,0,0,12,0,0]) >>>>> dicerAcc = array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, >>>>> 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, >>>>> 0.048750000000000002,0.90085000000000004, 0.0504, 0.0, 0.0, 0.0, 0.0, 0.0, >>>>> 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0023, >>>>> 0.089149999999999993, 0.81464999999999999, 0.091550000000000006, >>>>> 0.0023500000000000001, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 
0.0, 0.0, 0.0, 0.0, >>>>> 0.0, 0.0, 0.0, 0.0, 0.0, 0.00020000000000000001, 0.0061000000000000004, >>>>> 0.12085, 0.7429, 0.12325, 0.0067000000000000002, 0.0, 0.0, 0.0, 0.0, 0.0, >>>>> 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.00020000000000000001, >>>>> 0.012500000000000001, 0.14255000000000001, 0.68159999999999998, >>>>> 0.14979999999999999, 0.012999999999999999]) >>>>> optimize.() >>>>> >>>>> >>>>> Am I doing something wrong or is this a known problem? >>>>> >>>>> Best, >>>>> Bruno >>>>> >>>> >>>> >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>>> >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 4401 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 1620 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 3193 bytes Desc: not available URL: From bacmsantos at gmail.com Mon Mar 19 12:34:48 2012 From: bacmsantos at gmail.com (Bruno Santos) Date: Mon, 19 Mar 2012 16:34:48 +0000 Subject: [SciPy-User] rv_frozen when using gamma function In-Reply-To: References: Message-ID: [quote]are identified, unless a parameter is held fixed. negative binomial is a poisson-gamma mixture, but I only found the p, 1-p parameterization Josef [/quote] Guys you are starting to overwhelm my knowledge :). But you are correct the only way I can solve it by assuming that d is given and fixed. It is not ideally but unfortunately I haven't found a way to get it experimentally so will have to pretend I know and use several values. And Skipper thank you very much some times the solution is so obvious I have problems in seeing it. The -1 did the trick. Thank you very much for all the help. I am really learning a lot from this thread. Bruno On 19 March 2012 16:28, wrote: > > > On Mon, Mar 19, 2012 at 12:23 PM, Skipper Seabold wrote: > >> On Mon, Mar 19, 2012 at 12:14 PM, Bruno Santos wrote: >> >>> I believe the formula I have is accurate I checked it personally and >>> also have it checked by two mathematicians in the lab and they come up with >>> the same results. I left my notebook where I performed the transformations >>> home so don't completely remember but I believe you can simply things to >>> get rid of some of the parameters. >>> >> dicerAcc is a scalar as you mentioned. >>> I managed to implement the function in python now and it is giving the >>> same results as in R my question how to maximize it still remains though. >>> Is it possibly to maximize a function rather than minimize it in Python? >>> >> >> Ok, then I guess my math is faulty. I only looked quickly and don't see >> the other close parens in the formula. 
>> > > I didn't check the parens, but to me it just like the negative binomial > but I think only the ratio > > p = mu/(beta+mu) > 1-p = beta/(beta+mu) > > are identified, unless a parameter is held fixed. > > negative binomial is a poisson-gamma mixture, but I only found the p, 1-p > parameterization > > Josef > > > >> >> To maximize put a negative in front of the function. >> >> >> >>> >>> >>> On 15 March 2012 15:21, Skipper Seabold wrote: >>> >>>> On Thu, Mar 15, 2012 at 11:07 AM, Bruno Santos wrote: >>>> >>>>> Thank you all very much for the replies that was exactly what I >>>>> wanted. I am basically trying to get the parameters for a >>>>> gamma-poisson distribution. I have the R code from a >>>>> previous collaborator just trying to write a native function in python >>>>> rather than using the R code or port it using rpy2. >>>> >>>> >>>> Oh, fun. >>>> >>>> >>>>> The function is the following: >>>>> [image: Inline images 1] >>>>> where f(b,d) is a function that gives me a probability of a certain >>>>> position in the vector to be occupied and it depends on b (the position) >>>>> and d (the likelihood of making an error). >>>>> So the likelihood after a few transformations become: >>>>> >>>>> [image: Inline images 2] >>>>> Which I then use the loglikelihood and try to maximise it using an >>>>> optimization algorithm. >>>>> [image: Inline images 3] >>>>> The R code is as following: >>>>> alphabeta<-function(alphabeta,x,dicerAcc) >>>>> { >>>>> alpha <-alphabeta[1] >>>>> beta <-alphabeta[2] >>>>> if (any(alphabeta<0)) >>>>> return(NA) >>>>> sum((alpha*log(beta) + lgamma(alpha + x) + x * log(dicerAcc) - >>>>> lgamma(alpha) - (alpha + x) * log(beta+dicerAcc) - lfactorial(x))[dicerAcc >>>>> > noiseT]) >>>>> >>>> >>>> From a quick (distracted) look (so I could be wrong) >>>> >>>> Should this be alpha^2*log(beta) ? +lgamma(alpha) ? And lfactorial(x) >>>> should still be +lgamma(alpha)*lfactorial(x) ? And dicerAcc a scalar >>>> integer I take it? >>>> >>>> >>>>> >>>>> #sum((alpha*log(beta)+(lgamma(alpha+x)+log(dicerError^x))-(lgamma(alpha)+log((beta+dicerError)^(alpha+x))+lfactorial(x)))[dicerError >>>>> != 0]) >>>>> } >>>>> x and dicerAcc are known so the I use the optim function in R >>>>> ab <- optim(c(1,100), alphabeta, control=list(fnscale=-1), x = x, >>>>> dicerAcc = dicerAcc)$par >>>>> >>>>> Is there any equivalent function in Scipy to the optim one? >>>>> >>>>> On 14 March 2012 17:05, Bruno Santos wrote: >>>>> >>>>>> I am trying to write a script to do some maximum likelihood parameter >>>>>> estimation of a function. But when I try to use the gamma function I get: >>>>>> gamma(5) >>>>>> Out[5]: >>>>>> >>>>>> I thought it might have been a problem solved already on the new >>>>>> distribution but even after installing the last scipy version I get the >>>>>> same problem. >>>>>> The test() after installation is also failing with the following >>>>>> information: >>>>>> Running unit tests for scipy >>>>>> NumPy version 1.5.1 >>>>>> NumPy is installed in /usr/lib/pymodules/python2.7/numpy >>>>>> SciPy version 0.10.1 >>>>>> SciPy is installed in /usr/local/lib/python2.7/dist-packages/scipy >>>>>> Python version 2.7.2+ (default, Oct 4 2011, 20:06:09) [GCC 4.6.1] >>>>>> nose version 1.1.2 >>>>>> ... >>>>>> ... >>>>>> ... 
>>>>>> AssertionError: >>>>>> Arrays are not almost equal >>>>>> ACTUAL: 0.0 >>>>>> DESIRED: 0.5 >>>>>> >>>>>> ====================================================================== >>>>>> FAIL: Regression test for #651: better handling of badly conditioned >>>>>> ---------------------------------------------------------------------- >>>>>> Traceback (most recent call last): >>>>>> File >>>>>> "/usr/local/lib/python2.7/dist-packages/scipy/signal/tests/test_filter_design.py", >>>>>> line 34, in test_bad_filter >>>>>> assert_raises(BadCoefficients, tf2zpk, [1e-15], [1.0, 1.0]) >>>>>> File "/usr/lib/pymodules/python2.7/numpy/testing/utils.py", line >>>>>> 982, in assert_raises >>>>>> return nose.tools.assert_raises(*args,**kwargs) >>>>>> AssertionError: BadCoefficients not raised >>>>>> >>>>>> ---------------------------------------------------------------------- >>>>>> Ran 5103 tests in 47.795s >>>>>> >>>>>> FAILED (KNOWNFAIL=13, SKIP=28, failures=3) >>>>>> Out[7]: >>>>>> >>>>>> >>>>>> My code is as follows: >>>>>> from numpy import array,log,sum,nan >>>>>> from scipy.stats import gamma >>>>>> from scipy import factorial, optimize >>>>>> >>>>>> #rinterface.initr() >>>>>> #IntSexpVector = rinterface.IntSexpVector >>>>>> #lgamma = rinterface.globalenv.get("lgamma") >>>>>> >>>>>> #Implementation for the Zero-inflated Negative Binomial function >>>>>> def alphabeta(params,x,dicerAcc): >>>>>> alpha = array(params[0]) >>>>>> beta = array(params[1]) >>>>>> if alpha<0 or beta<0:return nan >>>>>> return sum((alpha*log(beta)) + log(gamma(alpha+x)) + x * >>>>>> log(dicerAcc) - log(gamma(alpha)) - (alpha+x) * log(beta+dicerAcc) - >>>>>> log(factorial(x))) >>>>>> >>>>>> if __name__=='__main__': >>>>>> x = >>>>>> array([123,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,104,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,1,24,1,0,0,0,0,0,0,0,2,0,0,4,0,0,0,0,0,0,0,0,12,0,0]) >>>>>> dicerAcc = array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, >>>>>> 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, >>>>>> 0.048750000000000002,0.90085000000000004, 0.0504, 0.0, 0.0, 0.0, 0.0, 0.0, >>>>>> 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0023, >>>>>> 0.089149999999999993, 0.81464999999999999, 0.091550000000000006, >>>>>> 0.0023500000000000001, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, >>>>>> 0.0, 0.0, 0.0, 0.0, 0.0, 0.00020000000000000001, 0.0061000000000000004, >>>>>> 0.12085, 0.7429, 0.12325, 0.0067000000000000002, 0.0, 0.0, 0.0, 0.0, 0.0, >>>>>> 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.00020000000000000001, >>>>>> 0.012500000000000001, 0.14255000000000001, 0.68159999999999998, >>>>>> 0.14979999999999999, 0.012999999999999999]) >>>>>> optimize.() >>>>>> >>>>>> >>>>>> Am I doing something wrong or is this a known problem? 
>>>>>> >>>>>> Best, >>>>>> Bruno >>>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> SciPy-User mailing list >>>>> SciPy-User at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>> >>>>> >>>> >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>>> >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 4401 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 3193 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 1620 bytes Desc: not available URL: From greg.friedland at gmail.com Mon Mar 19 13:42:32 2012 From: greg.friedland at gmail.com (Greg Friedland) Date: Mon, 19 Mar 2012 10:42:32 -0700 Subject: [SciPy-User] Scipy.weave.inline and py2exe Message-ID: Hi All, Is it possible to use scipy.weave.inline to create a windows exe with py2exe? Perhaps this is too much to ask for because of the behind the scenes stuff that inline does but I thought I'd ask anyway. For the moment when I simply try to import scipy.weave, the exe bundles but when I run it I get: File "scipy\weave\__init__.pyc", line 26, in File "scipy\weave\inline_tools.pyc", line 5, in File "scipy\weave\ext_tools.pyc", line 6, in File "scipy\weave\build_tools.pyc", line 28, in File "scipy\weave\platform_info.pyc", line 15, in File "numpy\distutils\core.pyc", line 25, in File "numpy\distutils\command\build_ext.pyc", line 9, in File "distutils\command\build_ext.pyc", line 13, in File "site.pyc", line 73, in File "site.pyc", line 38, in __boot ImportError: Couldn't find the real 'site' module thanks, Greg From seb.haase at gmail.com Mon Mar 19 13:52:47 2012 From: seb.haase at gmail.com (Sebastian Haase) Date: Mon, 19 Mar 2012 18:52:47 +0100 Subject: [SciPy-User] Scipy.weave.inline and py2exe In-Reply-To: References: Message-ID: Hi, It should work, with some weave magic ... since weave should know at compile time (or at "import time") if a module has already been compiled - so would just have to run (or "import") the module once before you do the py2exe stuff.... This is just loud thinking, I would not know the details off hand ..... Hoping someone else here has more details - Sebastian Haase PS: Last time I used weave (some 5 or so years ago) it seemed quite orphaned... I would be happy to hear now otherwise.. On Mon, Mar 19, 2012 at 6:42 PM, Greg Friedland wrote: > Hi All, > Is it possible to use scipy.weave.inline to create a windows exe with > py2exe? Perhaps this is too much to ask for because of the behind the > scenes stuff that inline does but I thought I'd ask anyway. 
> > For the moment when I simply try to import scipy.weave, the exe > bundles but when I run it I get: > > ?File "scipy\weave\__init__.pyc", line 26, in > ?File "scipy\weave\inline_tools.pyc", line 5, in > ?File "scipy\weave\ext_tools.pyc", line 6, in > ?File "scipy\weave\build_tools.pyc", line 28, in > ?File "scipy\weave\platform_info.pyc", line 15, in > ?File "numpy\distutils\core.pyc", line 25, in > ?File "numpy\distutils\command\build_ext.pyc", line 9, in > ?File "distutils\command\build_ext.pyc", line 13, in > ?File "site.pyc", line 73, in > ?File "site.pyc", line 38, in __boot > ImportError: Couldn't find the real 'site' module > > > > thanks, > Greg > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From nicolas.pinto at gmail.com Mon Mar 19 14:24:33 2012 From: nicolas.pinto at gmail.com (Nicolas Pinto) Date: Mon, 19 Mar 2012 19:24:33 +0100 Subject: [SciPy-User] linalg.eigh hangs only after importing sparse module Message-ID: Hello, The following simple code hangs only when sparse has been imported: ``` from scipy import sparse # <<<<<<< BUG import numpy as np from scipy import linalg N = 1000 np.random.seed(42) X = np.random.random((N, N)) print X.mean() v, Q = linalg.eigh(X) print v.mean() ``` Do you think this may be related to other arpack/umfpack/etc. known failures ? Please let us know how can we help fix this issue. Thanks for your help. Regards, -- Nicolas Pinto http://web.mit.edu/pinto From ralf.gommers at googlemail.com Mon Mar 19 14:54:32 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 19 Mar 2012 19:54:32 +0100 Subject: [SciPy-User] Scipy.weave.inline and py2exe In-Reply-To: References: Message-ID: On Mon, Mar 19, 2012 at 6:52 PM, Sebastian Haase wrote: > Hi, > It should work, with some weave magic ... since weave should know at > compile time (or at "import time") if a module has already been > compiled - so would just have to run (or "import") the module once > before you do the py2exe stuff.... > > This is just loud thinking, I would not know the details off hand ..... > Hoping someone else here has more details > - Sebastian Haase > > PS: Last time I used weave (some 5 or so years ago) it seemed quite > orphaned... I would be happy to hear now otherwise.. > You won't hear otherwise. It's unmaintained and only kept for backwards compatibility. Use Cython for new code. Ralf > > On Mon, Mar 19, 2012 at 6:42 PM, Greg Friedland > wrote: > > Hi All, > > Is it possible to use scipy.weave.inline to create a windows exe with > > py2exe? Perhaps this is too much to ask for because of the behind the > > scenes stuff that inline does but I thought I'd ask anyway. 
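A rough sketch of the two Cython-based routes commonly used in place of weave.inline for run-time compilation; it assumes Cython and a working C compiler are installed, and it shows nothing about py2exe packaging itself:

```
# 1. cython.inline: compile a snippet the first time it is called,
#    then reuse the cached extension module.
import cython

def add(a, b):
    return cython.inline("return a + b", a=a, b=b)

print(add(2, 3))   # 5

# 2. pyximport: compile .pyx modules transparently at import time.
import pyximport
pyximport.install()
# import my_fast_module   # hypothetical .pyx module next to this script
```

Later in the thread Greg reports that, of the two, the pyximport route was the one that survived py2exe packaging in his case.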
> > > > For the moment when I simply try to import scipy.weave, the exe > > bundles but when I run it I get: > > > > File "scipy\weave\__init__.pyc", line 26, in > > File "scipy\weave\inline_tools.pyc", line 5, in > > File "scipy\weave\ext_tools.pyc", line 6, in > > File "scipy\weave\build_tools.pyc", line 28, in > > File "scipy\weave\platform_info.pyc", line 15, in > > File "numpy\distutils\core.pyc", line 25, in > > File "numpy\distutils\command\build_ext.pyc", line 9, in > > File "distutils\command\build_ext.pyc", line 13, in > > File "site.pyc", line 73, in > > File "site.pyc", line 38, in __boot > > ImportError: Couldn't find the real 'site' module > > > > > > > > thanks, > > Greg > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nwagner at iam.uni-stuttgart.de Mon Mar 19 15:15:52 2012 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Mon, 19 Mar 2012 20:15:52 +0100 Subject: [SciPy-User] linalg.eigh hangs only after importing sparse module In-Reply-To: References: Message-ID: On Mon, 19 Mar 2012 19:24:33 +0100 Nicolas Pinto wrote: > Hello, > > The following simple code hangs only when sparse has >been imported: > > ``` > from scipy import sparse # <<<<<<< BUG > import numpy as np > from scipy import linalg > > N = 1000 > np.random.seed(42) > X = np.random.random((N, N)) > print X.mean() > v, Q = linalg.eigh(X) > print v.mean() > ``` > > Do you think this may be related to other >arpack/umfpack/etc. known failures ? > > Please let us know how can we help fix this issue. > > Thanks for your help. > > Regards, > Your matrix X is not symmetric. >>> X-X.T array([[ 0. , 0.76558138, 0.47028826, ..., 0.0515565 , 0.19001774, 0.33171462], [-0.76558138, 0. , 0.62596704, ..., -0.0230795 , -0.90677174, 0.12238354], [-0.47028826, -0.62596704, 0. , ..., -0.38459427, 0.28527075, 0.04568694], ..., [-0.0515565 , 0.0230795 , 0.38459427, ..., 0. , 0.57859577, -0.24268277], [-0.19001774, 0.90677174, -0.28527075, ..., -0.57859577, 0. , 0.52747713], [-0.33171462, -0.12238354, -0.04568694, ..., 0.24268277, -0.52747713, 0. ]]) eigh assumes a symmetric or hermitian matrix. Nils From dyamins at gmail.com Mon Mar 19 15:20:44 2012 From: dyamins at gmail.com (Dan Yamins) Date: Mon, 19 Mar 2012 15:20:44 -0400 Subject: [SciPy-User] linalg.eigh hangs only after importing sparse module In-Reply-To: References: Message-ID: Your matrix X is not symmetric. > This is not the problem. (Even if that were the problem, it wouldn't cause a hang -- docs say "no error will be reported but results will be wrong".) In fact, the same hang happens on the installation which originally had this problem if you replace X with X = np.dot(X, X.T) so that the matrix is symmetric. -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Mon Mar 19 17:09:17 2012 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 19 Mar 2012 22:09:17 +0100 Subject: [SciPy-User] linalg.eigh hangs only after importing sparse module In-Reply-To: References: Message-ID: Hi, 19.03.2012 19:24, Nicolas Pinto kirjoitti: > The following simple code hangs only when sparse has been imported: [clip] It does not hang for me. So, first things first: - which platform? 
- which binaries? - which LAPACK? > Do you think this may be related to other arpack/umfpack/etc. known failures ? > > Please let us know how can we help fix this issue. ARPACK et al are probably not related, because they are not imported by ``from scipy import sparse``. A more likely candidate is the SWIG-wrapped `sparsetools` package: it is known to also cause some other weirdness: http://projects.scipy.org/scipy/ticket/1314 This unfortunately seems pretty difficult to debug. One thing I could imagine doing is minimizing the problem, by first stripping everything away from `scipy.sparse` except the sparsetools module, and then stripping down the sparsetools code until the failing part is found. -- Pauli Virtanen From pav at iki.fi Mon Mar 19 17:37:19 2012 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 19 Mar 2012 22:37:19 +0100 Subject: [SciPy-User] linalg.eigh hangs only after importing sparse module In-Reply-To: References: Message-ID: 19.03.2012 22:09, Pauli Virtanen kirjoitti: [clip] > This unfortunately seems pretty difficult to debug. One thing I could > imagine doing is minimizing the problem, by first stripping everything > away from `scipy.sparse` except the sparsetools module, and then > stripping down the sparsetools code until the failing part is found. Another possibility is that the problem comes just from the c++ runtime. There's another c++ module in Scipy, `scipy.interpolate._interpolate` -- could you check if importing it also causes the same issue? -- Pauli Virtanen From dg.gmane at thesamovar.net Mon Mar 19 17:49:45 2012 From: dg.gmane at thesamovar.net (Dan Goodman) Date: Mon, 19 Mar 2012 22:49:45 +0100 Subject: [SciPy-User] Scipy.weave.inline and py2exe In-Reply-To: References: Message-ID: On 19/03/2012 19:54, Ralf Gommers wrote: > On Mon, Mar 19, 2012 at 6:52 PM, Sebastian Haase > wrote: > PS: Last time I used weave (some 5 or so years ago) it seemed quite > orphaned... I would be happy to hear now otherwise.. > > You won't hear otherwise. It's unmaintained and only kept for backwards > compatibility. Use Cython for new code. Interesting. I use weave for runtime code generation. Is it possible/simple to do this with Cython? Dan From kwgoodman at gmail.com Mon Mar 19 18:07:21 2012 From: kwgoodman at gmail.com (Keith Goodman) Date: Mon, 19 Mar 2012 15:07:21 -0700 Subject: [SciPy-User] [ANN] la 0.6, labeled array Message-ID: The sixth release of la (labeled array) adds new functions, improves existing functions, and fixes bugs. The main class of the la package is a labeled array, larry. A larry consists of data and labels. The data is stored as a NumPy array and the labels as a list of lists (one list per dimension). 
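For illustration, a minimal sketch of that idea (treat the exact constructor call below as an assumption to be checked against the la docs linked at the end of this announcement):

import numpy as np
import la

# data: a 2x3 NumPy array; label: one list of labels per dimension
data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])
label = [['row0', 'row1'], ['a', 'b', 'c']]

lar = la.larry(data, label)   # a labeled array: data plus per-axis labels
print(lar.sum(0))             # reductions work much like their NumPy counterparts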
New functions - la.isaligned() returns True if two larrys are aligned along specified axis - la.sortby() sorts a larry by a row or column specified by its label - la.align_axis() aligns multiple larrys along (possibly) different axes - la.zeros(), la.ones(), la.empty() - la.lrange() similar to np.arange() but allows multi-dimensional output Enhancements - larry.lag() now accepts negative lags - datime.time and datetime.datetime labels can now be (HDF5) archived - la.align() can now skip the axes you do not wish to align - Upgrade numpydoc from 0.3.1 to 0.4 to support Sphinx 1.0.1 - la.farray.ranking() and larry ranking method support `axis=None` - Generate C code with Cython 0.15.1 instead of Cython 0.11 - Add makefile Faster - larry methods: merge, nan_replace, push, cumsum, cumprod, astype, __rdiv__ - larry function: cov - Numpy array functions: geometric_mean, correlation, covMissing Breakage from la 0.5 - optional parameter for larry creation renamed from integrity to validate Bugs fixes - #14 larry.lag() gives wrong output when nlag=0 - #20 Indexing chokes on lar[:,3:2] - #21 Merging two larrys chokes when one is empty - #22 Morphing an empty larry chokes lar.morph() - #31 la.panel() gives wrong output - #35 larry([1, 2]) == 'a' did not return a bool like numpy does - #38 Indexing single element of larry with object dtype - #39 move_func(myfunc) did not pass kwargs to myfunc when method='loop' - #49 setup.py does not install module to load yahoo finance data - #50 la.larry([], dtype=np.int).sum(0), and similar reductions, choke - #51 -la.larry([True, False]) returns wrong answer URLs download http://pypi.python.org/pypi/la docs http://berkeleyanalytics.com/la code https://github.com/kwgoodman/la mailing list http://groups.google.com/group/labeled-array issue tracker https://github.com/kwgoodman/la/issues From ralf.gommers at googlemail.com Mon Mar 19 18:12:28 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 19 Mar 2012 23:12:28 +0100 Subject: [SciPy-User] Scipy.weave.inline and py2exe In-Reply-To: References: Message-ID: On Mon, Mar 19, 2012 at 10:49 PM, Dan Goodman wrote: > On 19/03/2012 19:54, Ralf Gommers wrote: > > On Mon, Mar 19, 2012 at 6:52 PM, Sebastian Haase > > wrote: > > PS: Last time I used weave (some 5 or so years ago) it seemed quite > > orphaned... I would be happy to hear now otherwise.. > > > > You won't hear otherwise. It's unmaintained and only kept for backwards > > compatibility. Use Cython for new code. > > Interesting. I use weave for runtime code generation. Is it > possible/simple to do this with Cython? http://docs.cython.org/src/reference/compilation.html#compiling-with-cython-inline not sure how that exactly compares to weave.inline Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.friedland at gmail.com Mon Mar 19 19:41:05 2012 From: greg.friedland at gmail.com (Greg Friedland) Date: Mon, 19 Mar 2012 16:41:05 -0700 Subject: [SciPy-User] Scipy.weave.inline and py2exe In-Reply-To: References: Message-ID: Thanks, that's good to know. I hadn't realized it was so out of date. I will switch to cython then. I took a quick look and couldn't get cython.inline to work with py2exe but pyximport.install() did. Greg On Mon, Mar 19, 2012 at 11:54 AM, Ralf Gommers wrote: > > > On Mon, Mar 19, 2012 at 6:52 PM, Sebastian Haase > wrote: >> >> Hi, >> It should work, with some weave magic ... 
since weave should know at >> compile time (or at "import time") if a module has already been >> compiled - so would just have to run (or "import") the module once >> before you do the py2exe stuff.... >> >> This is just loud thinking, ?I would not know the details off hand ..... >> Hoping someone else here has more details >> - Sebastian Haase >> >> PS: Last time I used weave (some 5 or so years ago) ?it seemed quite >> orphaned... I would be happy to hear now otherwise.. > > > You won't hear otherwise. It's unmaintained and only kept for backwards > compatibility. Use Cython for new code. > > Ralf > > >> >> >> On Mon, Mar 19, 2012 at 6:42 PM, Greg Friedland >> wrote: >> > Hi All, >> > Is it possible to use scipy.weave.inline to create a windows exe with >> > py2exe? Perhaps this is too much to ask for because of the behind the >> > scenes stuff that inline does but I thought I'd ask anyway. >> > >> > For the moment when I simply try to import scipy.weave, the exe >> > bundles but when I run it I get: >> > >> > ?File "scipy\weave\__init__.pyc", line 26, in >> > ?File "scipy\weave\inline_tools.pyc", line 5, in >> > ?File "scipy\weave\ext_tools.pyc", line 6, in >> > ?File "scipy\weave\build_tools.pyc", line 28, in >> > ?File "scipy\weave\platform_info.pyc", line 15, in >> > ?File "numpy\distutils\core.pyc", line 25, in >> > ?File "numpy\distutils\command\build_ext.pyc", line 9, in >> > ?File "distutils\command\build_ext.pyc", line 13, in >> > ?File "site.pyc", line 73, in >> > ?File "site.pyc", line 38, in __boot >> > ImportError: Couldn't find the real 'site' module >> > >> > >> > >> > thanks, >> > Greg >> > _______________________________________________ >> > SciPy-User mailing list >> > SciPy-User at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From pujari.manisha at gmail.com Tue Mar 20 05:26:24 2012 From: pujari.manisha at gmail.com (manisha pujari) Date: Tue, 20 Mar 2012 10:26:24 +0100 Subject: [SciPy-User] Problem with Installation of Scipy on Macbook Message-ID: Hello everyone, This is the first time I am using scipy and I am having much trouble to just install it on my macbook. I am using Python 2.6 and numpy 1.6.0. I downloaded scipy0.9.0 tar.gz file from the web link http://sourceforge.net/projects/scipy/files/scipy/ and tried to install it from the source. 
But on giving the command scipy-0.9.0$ python setup.py build it is always giving the following problem : Warning: No configuration returned, assuming unavailable.blas_opt_info: FOUND: extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] define_macros = [('NO_ATLAS_INFO', 3)] extra_compile_args = ['-faltivec', '-I/System/Library/Frameworks/vecLib.framework/Headers'] non-existing path in 'scipy/io': 'docs' lapack_opt_info: FOUND: extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] define_macros = [('NO_ATLAS_INFO', 3)] extra_compile_args = ['-faltivec'] umfpack_info: libraries umfpack not found in /System/Library/Frameworks/Python.framework/Versions/2.6/lib libraries umfpack not found in /usr/local/lib libraries umfpack not found in /usr/lib /Library/Python/2.6/site-packages/numpy-1.6.0-py2.6-macosx-10.6-universal.egg/numpy/distutils/system_info.py:463: UserWarning: UMFPACK sparse solver (http://www.cise.ufl.edu/research/sparse/umfpack/) not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [umfpack]) or by setting the UMFPACK environment variable. warnings.warn(self.notfounderror.__doc__) NOT AVAILABLE Traceback (most recent call last): File "setup.py", line 181, in setup_package() File "setup.py", line 173, in setup_package configuration=configuration ) File "/Library/Python/2.6/site-packages/numpy-1.6.0-py2.6-macosx-10.6-universal.egg/numpy/distutils/core.py", line 152, in setup config = configuration() File "setup.py", line 122, in configuration config.add_subpackage('scipy') File "/Library/Python/2.6/site-packages/numpy-1.6.0-py2.6-macosx-10.6-universal.egg/numpy/distutils/misc_util.py", line 972, in add_subpackage caller_level = 2) File "/Library/Python/2.6/site-packages/numpy-1.6.0-py2.6-macosx-10.6-universal.egg/numpy/distutils/misc_util.py", line 941, in get_subpackage caller_level = caller_level + 1) File "/Library/Python/2.6/site-packages/numpy-1.6.0-py2.6-macosx-10.6-universal.egg/numpy/distutils/misc_util.py", line 878, in _get_configuration_from_setup_py config = setup_module.configuration(*args) File "scipy/setup.py", line 20, in configuration config.add_subpackage('special') File "/Library/Python/2.6/site-packages/numpy-1.6.0-py2.6-macosx-10.6-universal.egg/numpy/distutils/misc_util.py", line 972, in add_subpackage caller_level = 2) File "/Library/Python/2.6/site-packages/numpy-1.6.0-py2.6-macosx-10.6-universal.egg/numpy/distutils/misc_util.py", line 941, in get_subpackage caller_level = caller_level + 1) File "/Library/Python/2.6/site-packages/numpy-1.6.0-py2.6-macosx-10.6-universal.egg/numpy/distutils/misc_util.py", line 878, in _get_configuration_from_setup_py config = setup_module.configuration(*args) File "scipy/special/setup.py", line 54, in configuration extra_info=get_info("npymath") File "/Library/Python/2.6/site-packages/numpy-1.6.0-py2.6-macosx-10.6-universal.egg/numpy/distutils/misc_util.py", line 2184, in get_info pkg_info = get_pkg_info(pkgname, dirs) File "/Library/Python/2.6/site-packages/numpy-1.6.0-py2.6-macosx-10.6-universal.egg/numpy/distutils/misc_util.py", line 2136, in get_pkg_info return read_config(pkgname, dirs) File "/Library/Python/2.6/site-packages/numpy-1.6.0-py2.6-macosx-10.6-universal.egg/numpy/distutils/npy_pkg_config.py", line 390, in read_config v = _read_config_imp(pkg_to_filename(pkgname), dirs) File "/Library/Python/2.6/site-packages/numpy-1.6.0-py2.6-macosx-10.6-universal.egg/numpy/distutils/npy_pkg_config.py", line 326, in _read_config_imp meta, vars, sections, reqs = 
_read_config(filenames) File "/Library/Python/2.6/site-packages/numpy-1.6.0-py2.6-macosx-10.6-universal.egg/numpy/distutils/npy_pkg_config.py", line 309, in _read_config meta, vars, sections, reqs = parse_config(f, dirs) File "/Library/Python/2.6/site-packages/numpy-1.6.0-py2.6-macosx-10.6-universal.egg/numpy/distutils/npy_pkg_config.py", line 281, in parse_config raise PkgNotFound("Could not find file(s) %s" % str(filenames)) numpy.distutils.npy_pkg_config.PkgNotFound: Could not find file(s) ['/Library/Python/2.6/site-packages/numpy-1.6.0-py2.6-macosx-10.6-universal.egg/numpy/core/lib/npy-pkg-config/npymath.ini'] I am unable to understand where and what exactly the problem is. Can anyone please help me? I will be really thankful for a response. -- Regards, Manisha -------------- next part -------------- An HTML attachment was scrubbed... URL: From denis-bz-gg at t-online.de Tue Mar 20 07:58:49 2012 From: denis-bz-gg at t-online.de (denis) Date: Tue, 20 Mar 2012 04:58:49 -0700 (PDT) Subject: [SciPy-User] quadratic programming with fmin_slsqp In-Reply-To: References: Message-ID: <4b58200d-59e9-4cc8-bd7b-6beee0921fd8@w32g2000vbt.googlegroups.com> On Mar 16, 5:45?pm, josef.p... at gmail.com wrote: > scipy is missing a fmin_quadprog Josef, minmize() is a reasonable common interface to 10 or so optimizers, see http://docs.scipy.org/doc/scipy/reference/tutorial/optimize.html however - minimize.py is not in scipy-0.9.0.tar nor in scipy-0.10.1.tar (a test to see if anybody's using it ?) - only L-BFGS-B TNC COBYLA and SLSQP support bounds. One could supply a trivial box / penaltybox as outlined below (I use this playing around with Neldermead) but I'm not sure anybody would use it plus there's openopt pyomo mystic ... maybe more solvers than real testcases :- cheers -- denis class Funcbox: """ F = Funcbox( func, [box=(0,1), penalty=None, grid=0, *funcargs, **kwargs wraps a func() with a constraint box and grid Parameters ---------- func: a function of a numpy vector or array-like box: (low, high) to np.clip, default (0,1). These can be vectors; low_j == high_j freezes x_j at that value. penalty: e.g. (0, 1, 1000) adds a quadratic penalty to func() where xclip is outside (0, 1) 1000 * sum( max( 0 - x, 0 )**2 + max( x - 1, 0 )**2 ) = 1 4 9 16 ... at -.01 -.02 ... and 1.01 1.02 ... The default is None, no penalty. (The penalty box should be smaller than the clip box; x is first gridded if grid > 0, then clipped, then penalty computed.) grid: e.g. .01 snaps all x_j to multiples of .01 -- a simple noise smoother, recommended for noisy functions. The default is 0, no gridding. From dhondt.olivier at gmail.com Mon Mar 19 05:23:57 2012 From: dhondt.olivier at gmail.com (tyldurd) Date: Mon, 19 Mar 2012 02:23:57 -0700 (PDT) Subject: [SciPy-User] Numpy/Scipy: Avoiding nested loops to operate on matrix-valued images In-Reply-To: References: <29677913.980.1331801968906.JavaMail.geo-discussion-forums@ynkz21> Message-ID: <28284362.88.1332149037436.JavaMail.geo-discussion-forums@yneo2> Dan, Thanks for your answer. However, this solution does not work for me. First, it returns an array with dtype=object which is not the original type of the data. Besides, the values in the array are not equal to the ones given by the 'traditional' nested loops. I think the problem comes from the fact that ufuncs are functions that act over each element of an array, not over slices. I have done a lot of research on this topic but it seems it is not feasible in terms of slicing or vectorizing. 
The only solution I found would be with generalized ufuncs but from what I understand, they require to write C code, which I would like to avoid :-) Therefore, I am going to stick to nested loops at least for now. Regards, Olivier On Thursday, March 15, 2012 9:23:49 PM UTC+1, Dan Lussier wrote: > > Have you tried numpy.frompyfunc? > > http://docs.scipy.org/doc/numpy/reference/generated/numpy.frompyfunc.html > > http://stackoverflow.com/questions/6126233/can-i-create-a-python-numpy-ufunc-from-an-unbound-member-method > > With this approach you may be able create a function which acts > elementwise over your array to compute the matrix logarithm at each entry > using Numpy's ufuncs. This would avoid the explicit iteration over the > array using the for loops. > > As a rough outline try: > > from scipy import linalg > import numpy as np > > # Assume im is the container array containing a 3x3 matrix at each pixel. > > # Composite function so get matrix log of array A as a matrix in one step > def log_matrix(A): > return linalg.logm(np.asmatrix(A)) > > > # Creating function to operate over container array. Takes one argument > and returns the result. > log_ufunc = np.frompyfunc(log_matrix, 1, 1) > > # Using log_ufunc on container array, im > res = log_ufunc(im) > > Dan > > > On 2012-03-15, at 1:59 AM, tyldurd wrote: > > Hello, > > I am a beginner at python and numpy and I need to compute the matrix > logarithm for each "pixel" (i.e. x,y position) of a matrix-valued image of > dimension MxNx3x3. 3x3 is the dimensions of the matrix at each pixel. > > The function I have written so far is the following: > > def logm_img(im): > from scipy import linalg > dimx = im.shape[0] > dimy = im.shape[1] > res = zeros_like(im) > for x in range(dimx): > for y in range(dimy): > res[x, y, :, :] = linalg.logm(asmatrix(im[x,y,:,:])) > return res > > Is it ok? Is there a way to avoid the two nested loops ? > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From conny.kuehne at googlemail.com Tue Mar 20 05:43:08 2012 From: conny.kuehne at googlemail.com (=?iso-8859-1?Q?Conny_K=FChne?=) Date: Tue, 20 Mar 2012 10:43:08 +0100 Subject: [SciPy-User] non-existing path in 'scipy/io': 'docs' In-Reply-To: References: <74F753BB-984A-400E-82F9-2B95EE0CEE80@googlemail.com> Message-ID: Hi, thanks for the info. I actually got more install issues using the sources. So I removed everything and installed the binaries. This seems to have worked. Conny Am 18.03.2012 um 22:42 schrieb Ralf Gommers: > > > 2012/3/16 Conny K?hne > Hello, > > I get the following error when trying to build scipy 0.10.1 from source > > blas_opt_info: > FOUND: > extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] > define_macros = [('NO_ATLAS_INFO', 3)] > extra_compile_args = ['-faltivec', '-I/System/Library/Frameworks/vecLib.framework/Headers'] > > non-existing path in 'scipy/io': 'docs' > > This can be cleaned up by removing the line "config.add_data_dir('docs')" in scipy/io/setup.py. Note that this is not an actual error though, just a harmless warning. Do you have an actual install issue? If so, can you post the full build log? 
> > Ralf > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From njs at pobox.com Tue Mar 20 11:33:58 2012 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 20 Mar 2012 15:33:58 +0000 Subject: [SciPy-User] Numpy/Scipy: Avoiding nested loops to operate on matrix-valued images In-Reply-To: <28284362.88.1332149037436.JavaMail.geo-discussion-forums@yneo2> References: <29677913.980.1331801968906.JavaMail.geo-discussion-forums@ynkz21> <28284362.88.1332149037436.JavaMail.geo-discussion-forums@yneo2> Message-ID: On Mon, Mar 19, 2012 at 9:23 AM, tyldurd wrote: > I have done a lot of research on this topic but it seems it is not feasible > in terms of slicing or vectorizing. The only solution I found would be with > generalized ufuncs but from what I understand, they require to write C code, > which I would like to avoid :-) I think the idea of generalized ufuncs is that linalg.logm should be written as a generalized ufunc already out of the box, and then this would be straightforward. However: (1) it isn't, and (2) even if it were, I'm having trouble understanding from the available docs how you would actually use it -- maybe calling logm would just work for your case, but there don't seem to be any examples available of how it chooses which dimensions to apply to. (Are there any generalized ufuncs actually defined in the standard packages? For instance, is np.dot implemented as a generalized ufunc? Should it be?) > Therefore, I am going to stick to nested loops at least for now. That seems like the best option to me. Nothing immoral about using a loop when that's what you need :-). -- Nathaniel From josef.pktd at gmail.com Tue Mar 20 11:38:42 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 20 Mar 2012 11:38:42 -0400 Subject: [SciPy-User] Numpy/Scipy: Avoiding nested loops to operate on matrix-valued images In-Reply-To: References: <29677913.980.1331801968906.JavaMail.geo-discussion-forums@ynkz21> <28284362.88.1332149037436.JavaMail.geo-discussion-forums@yneo2> Message-ID: On Tue, Mar 20, 2012 at 11:33 AM, Nathaniel Smith wrote: > On Mon, Mar 19, 2012 at 9:23 AM, tyldurd wrote: >> I have done a lot of research on this topic but it seems it is not feasible >> in terms of slicing or vectorizing. The only solution I found would be with >> generalized ufuncs but from what I understand, they require to write C code, >> which I would like to avoid :-) > > I think the idea of generalized ufuncs is that linalg.logm should be > written as a generalized ufunc already out of the box, and then this > would be straightforward. However: (1) it isn't, and (2) even if it > were, I'm having trouble understanding from the available docs how you > would actually use it -- maybe calling logm would just work for your > case, but there don't seem to be any examples available of how it > chooses which dimensions to apply to. (Are there any generalized > ufuncs actually defined in the standard packages? For instance, is > np.dot implemented as a generalized ufunc? Should it be?) only in a test case, AFAIK from numpy.core.umath_tests import matrix_multiply Josef > >> Therefore, I am going to stick to nested loops at least for now. > > That seems like the best option to me. Nothing immoral about using a > loop when that's what you need :-). 
> > -- Nathaniel > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From borreguero at gmail.com Tue Mar 20 17:08:22 2012 From: borreguero at gmail.com (Jose Borreguero) Date: Tue, 20 Mar 2012 17:08:22 -0400 Subject: [SciPy-User] how to obtain I,J,V from sparse matrix (V,(I,J)) ? Message-ID: Dear Scipy users, Scipy docs state one can construct a matrix from three 1D arrays, A = sparse .coo_matrix((V,(I,J)),shape=(4,4)) However, given sparse matrix A, how can I obtain arrays V, I, and J? I could not find any methods of the sparse matrix that would return these arrays... - Jose -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Tue Mar 20 17:55:31 2012 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 20 Mar 2012 22:55:31 +0100 Subject: [SciPy-User] how to obtain I, J, V from sparse matrix (V, (I, J)) ? In-Reply-To: References: Message-ID: 20.03.2012 22:08, Jose Borreguero kirjoitti: > Scipy docs state one can construct a matrix from three 1D > arrays,A=sparse.coo_matrix((V,(I,J)),shape=(4,4)) > > However, given sparse matrix A, how can I obtain arrays V, I, and J? > I could not find any methods of the sparse matrix that would return these arrays... Check out the `data`, `row` and `col` attributes: http://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.coo_matrix.html From ralf.gommers at googlemail.com Tue Mar 20 18:06:04 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 20 Mar 2012 23:06:04 +0100 Subject: [SciPy-User] Problem with Installation of Scipy on Macbook In-Reply-To: References: Message-ID: On Tue, Mar 20, 2012 at 10:26 AM, manisha pujari wrote: > Hello everyone, > > This is the first time I am using scipy and I am having much trouble to > just install it on my macbook. I am using Python 2.6 and numpy 1.6.0. > I downloaded scipy0.9.0 tar.gz file from the web link > http://sourceforge.net/projects/scipy/files/scipy/ and tried to install > it from the source. > But on giving the command > > scipy-0.9.0$ python setup.py build > > it is always giving the following problem : > > Warning: No configuration returned, assuming unavailable.blas_opt_info: > FOUND: > extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] > define_macros = [('NO_ATLAS_INFO', 3)] > extra_compile_args = ['-faltivec', > '-I/System/Library/Frameworks/vecLib.framework/Headers'] > > non-existing path in 'scipy/io': 'docs' > lapack_opt_info: > FOUND: > extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] > define_macros = [('NO_ATLAS_INFO', 3)] > extra_compile_args = ['-faltivec'] > > umfpack_info: > libraries umfpack not found in > /System/Library/Frameworks/Python.framework/Versions/2.6/lib > libraries umfpack not found in /usr/local/lib > libraries umfpack not found in /usr/lib > /Library/Python/2.6/site-packages/numpy-1.6.0-py2.6-macosx-10.6-universal.egg/numpy/distutils/system_info.py:463: > UserWarning: > UMFPACK sparse solver ( > http://www.cise.ufl.edu/research/sparse/umfpack/) > not found. Directories to search for the libraries can be specified in > the > numpy/distutils/site.cfg file (section [umfpack]) or by setting > the UMFPACK environment variable. 
> warnings.warn(self.notfounderror.__doc__) > NOT AVAILABLE > > Traceback (most recent call last): > File "setup.py", line 181, in > setup_package() > File "setup.py", line 173, in setup_package > configuration=configuration ) > File > "/Library/Python/2.6/site-packages/numpy-1.6.0-py2.6-macosx-10.6-universal.egg/numpy/distutils/core.py", > line 152, in setup > config = configuration() > File "setup.py", line 122, in configuration > config.add_subpackage('scipy') > File > "/Library/Python/2.6/site-packages/numpy-1.6.0-py2.6-macosx-10.6-universal.egg/numpy/distutils/misc_util.py", > line 972, in add_subpackage > caller_level = 2) > File > "/Library/Python/2.6/site-packages/numpy-1.6.0-py2.6-macosx-10.6-universal.egg/numpy/distutils/misc_util.py", > line 941, in get_subpackage > caller_level = caller_level + 1) > File > "/Library/Python/2.6/site-packages/numpy-1.6.0-py2.6-macosx-10.6-universal.egg/numpy/distutils/misc_util.py", > line 878, in _get_configuration_from_setup_py > config = setup_module.configuration(*args) > File "scipy/setup.py", line 20, in configuration > config.add_subpackage('special') > File > "/Library/Python/2.6/site-packages/numpy-1.6.0-py2.6-macosx-10.6-universal.egg/numpy/distutils/misc_util.py", > line 972, in add_subpackage > caller_level = 2) > File > "/Library/Python/2.6/site-packages/numpy-1.6.0-py2.6-macosx-10.6-universal.egg/numpy/distutils/misc_util.py", > line 941, in get_subpackage > caller_level = caller_level + 1) > File > "/Library/Python/2.6/site-packages/numpy-1.6.0-py2.6-macosx-10.6-universal.egg/numpy/distutils/misc_util.py", > line 878, in _get_configuration_from_setup_py > config = setup_module.configuration(*args) > File "scipy/special/setup.py", line 54, in configuration > extra_info=get_info("npymath") > File > "/Library/Python/2.6/site-packages/numpy-1.6.0-py2.6-macosx-10.6-universal.egg/numpy/distutils/misc_util.py", > line 2184, in get_info > pkg_info = get_pkg_info(pkgname, dirs) > File > "/Library/Python/2.6/site-packages/numpy-1.6.0-py2.6-macosx-10.6-universal.egg/numpy/distutils/misc_util.py", > line 2136, in get_pkg_info > return read_config(pkgname, dirs) > File > "/Library/Python/2.6/site-packages/numpy-1.6.0-py2.6-macosx-10.6-universal.egg/numpy/distutils/npy_pkg_config.py", > line 390, in read_config > v = _read_config_imp(pkg_to_filename(pkgname), dirs) > File > "/Library/Python/2.6/site-packages/numpy-1.6.0-py2.6-macosx-10.6-universal.egg/numpy/distutils/npy_pkg_config.py", > line 326, in _read_config_imp > meta, vars, sections, reqs = _read_config(filenames) > File > "/Library/Python/2.6/site-packages/numpy-1.6.0-py2.6-macosx-10.6-universal.egg/numpy/distutils/npy_pkg_config.py", > line 309, in _read_config > meta, vars, sections, reqs = parse_config(f, dirs) > File > "/Library/Python/2.6/site-packages/numpy-1.6.0-py2.6-macosx-10.6-universal.egg/numpy/distutils/npy_pkg_config.py", > line 281, in parse_config > raise PkgNotFound("Could not find file(s) %s" % str(filenames)) > numpy.distutils.npy_pkg_config.PkgNotFound: Could not find file(s) > ['/Library/Python/2.6/site-packages/numpy-1.6.0-py2.6-macosx-10.6-universal.egg/numpy/core/lib/npy-pkg-config/npymath.ini'] > > I am unable to understand where and what exactly the problem is. > Can anyone please help me? I will be really thankful for a response. That doesn't look like a familiar error message to me. But before trying to get to the root of that a simple question: why don't you just use a binary installer? 
You can download dmg installers for python itself from python.organd for numpy and scipy from the Sourceforge download sites. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From borreguero at gmail.com Tue Mar 20 18:47:49 2012 From: borreguero at gmail.com (Jose Borreguero) Date: Tue, 20 Mar 2012 18:47:49 -0400 Subject: [SciPy-User] how to obtain I, J, V from sparse matrix (V, (I, J)) ? In-Reply-To: References: Message-ID: Thank you. I was working with csr_matrix which lacks row and col attributes. I'll just cast to coo_matrix. -Jose On Tue, Mar 20, 2012 at 5:55 PM, Pauli Virtanen wrote: > 20.03.2012 22:08, Jose Borreguero kirjoitti: > > Scipy docs state one can construct a matrix from three 1D > > arrays,A=sparse.coo_matrix((V,(I,J)),shape=(4,4)) > > > > However, given sparse matrix A, how can I obtain arrays V, I, and J? > > I could not find any methods of the sparse matrix that would return > these arrays... > > Check out the `data`, `row` and `col` attributes: > > > http://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.coo_matrix.html > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Mar 21 09:51:53 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 21 Mar 2012 09:51:53 -0400 Subject: [SciPy-User] quadratic programming with fmin_slsqp In-Reply-To: <4b58200d-59e9-4cc8-bd7b-6beee0921fd8@w32g2000vbt.googlegroups.com> References: <4b58200d-59e9-4cc8-bd7b-6beee0921fd8@w32g2000vbt.googlegroups.com> Message-ID: On Tue, Mar 20, 2012 at 7:58 AM, denis wrote: > > On Mar 16, 5:45?pm, josef.p... at gmail.com wrote: >> scipy is missing a fmin_quadprog > > Josef, > ?minmize() ?is a reasonable common interface to 10 or so optimizers, > see http://docs.scipy.org/doc/scipy/reference/tutorial/optimize.html > however > - minimize.py is not in scipy-0.9.0.tar nor in scipy-0.10.1.tar > ? ?(a test to see if anybody's using it ?) > - only L-BFGS-B TNC COBYLA and SLSQP support bounds. > > One could supply a trivial box / penaltybox as outlined below > (I use this playing around with Neldermead) > but I'm not sure anybody would use it > plus there's openopt pyomo mystic ... > maybe more solvers than real testcases :- > > cheers > ?-- denis > > > class Funcbox: > ? ?""" F = Funcbox( func, [box=(0,1), penalty=None, grid=0, > *funcargs, **kwargs > ? ? ? ?wraps a func() with a constraint box and grid > > ? ?Parameters > ? ?---------- > ? ?func: a function of a numpy vector or array-like > ? ?box: (low, high) to np.clip, default (0,1). > ? ? ? ?These can be vectors; low_j == high_j freezes x_j at that > value. > ? ?penalty: e.g. (0, 1, 1000) adds a quadratic penalty > ? ? ? ?to func() where xclip is outside (0, 1) > ? ? ? ? ? ?1000 * sum( max( 0 - x, 0 )**2 + max( x - 1, 0 )**2 ) > ? ? ? ? ? ?= 1 4 9 16 ... at -.01 -.02 ... and 1.01 1.02 ... > ? ? ? ?The default is None, no penalty. > ? ? ? ?(The penalty box should be smaller than the clip box; > ? ? ? ?x is first gridded if grid > 0, then clipped, then penalty > computed.) > ? ?grid: e.g. .01 snaps all x_j to multiples of .01 -- > ? ? ? ?a simple noise smoother, recommended for noisy functions. > ? ? ? ?The default is 0, no gridding. 
> _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user What I meant was, getting a high level interface that can be used as in other packages http://stats.stackexchange.com/questions/15741/matlabs-quadprog-equivalent-in-python http://www.mathworks.com/help/toolbox/optim/ug/quadprog.html http://svitsrv25.epfl.ch/R-doc/library/quadprog/html/solve.QP.html http://abel.ee.ucla.edu/cvxopt/userguide/coneprog.html#quadratic-programming scikits.datasmooth is using cvxopt for it fmin_slsqp "sounds" similar Josef From will at thearete.co.uk Wed Mar 21 13:48:38 2012 From: will at thearete.co.uk (William Furnass) Date: Wed, 21 Mar 2012 17:48:38 +0000 Subject: [SciPy-User] Joint distributions Message-ID: I am wanting to fit a parameterised model to a series of 15 datapoints, with each being a concentration C and time t. Within the objective function of the optimisation routine that I'm using for the model fitting I presently calculate fitness using the Bray Curtis distance between the data series and the prediction corresponding to a candidate solution. I would ideally like to calculate fitness in such a way as to account for uncertainty in each (C, t). I think I can achieve this for a given data series by a) modelling each data point using a bivariate Gaussian PDF (with static variances for both C and t) b) calculate a prediction using a small dt c) find the highest probability of all points in the prediction series for each of the 15 bivariate PDFs d) sum or average the probabilities to get a measure of the fit of the real data series to the prediction corresponding to the candidate solution. My question is is there an easy way of finding joint probabilities using scipy.stats? I thought I could construct a bivariate normal distribution using dens = scipy.stats.norm(loc=np.array([t[i], C[i]]), scale=np.array([t_stdev, C_stdev])) but dens.pdf(np.array([5,7])) returns an array when I thought it should return a scalar probability. Apologies if the above is not particularly clear or if I'm missing something obvious here. Regards, Will Furnass From josef.pktd at gmail.com Wed Mar 21 14:07:51 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 21 Mar 2012 14:07:51 -0400 Subject: [SciPy-User] Joint distributions In-Reply-To: References: Message-ID: On Wed, Mar 21, 2012 at 1:48 PM, William Furnass wrote: > I am wanting to fit a parameterised model to a series of 15 > datapoints, with each being a concentration C and time t. ?Within the > objective function of the optimisation routine that I'm using for the > model fitting I presently calculate fitness using the Bray Curtis > distance between the data series and the prediction corresponding to a > candidate solution. > > I would ideally like to calculate fitness in such a way as to account > for uncertainty in each (C, t). ?I think I can achieve this for a > given data series by > ?a) modelling each data point using a bivariate Gaussian PDF (with > static variances for both C and t) > ?b) calculate a prediction using a small dt > ?c) find the highest probability of all points in the prediction > series for each of the 15 bivariate PDFs > ?d) sum or average the probabilities to get a measure of the fit of > the real data series to the prediction corresponding to the candidate > solution. > > My question is is there an easy way of finding joint probabilities > using scipy.stats? 
?I thought I could construct a bivariate normal > distribution using > > dens = scipy.stats.norm(loc=np.array([t[i], C[i]]), > scale=np.array([t_stdev, C_stdev])) > > but > > dens.pdf(np.array([5,7])) > > returns an array when I thought it should return a scalar probability. scipy.stats only has univariate distributions, or to be exact it calculates it for many points independently. So the returned array is the pdf for each point separately calculated. If you want the pdf for the bivariate or multivariate normal distribution then it's just a few lines, ( I think the bivariate normal is also in matplotlib, in statsmodels ?) Your fitting problem sounds a bit like what scipy.odr does. Josef > > Apologies if the above is not particularly clear or if I'm missing > something obvious here. > > Regards, > > Will Furnass > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From hate_pod at yahoo.com Wed Mar 21 22:43:37 2012 From: hate_pod at yahoo.com (Odin Den) Date: Thu, 22 Mar 2012 02:43:37 +0000 (UTC) Subject: [SciPy-User] numpy array root operation Message-ID: Hi, 5th root of -32 can be computed correctly as follows: >>> -32**(1/5) >>> -2.0 However, when I try to perform same operation on numpy arrays I get the following: >>> array([-32])**(1/5) >>> array([ nan]) Is there anyway to compute roots of numpy arrays? I have a huge matrix which contains both negative and positive values. What is the easiest way of making python compute the "nth" roots of each element of this matrix without knowing the value of "n" a priory? From guziy.sasha at gmail.com Wed Mar 21 23:22:46 2012 From: guziy.sasha at gmail.com (Oleksandr Huziy) Date: Wed, 21 Mar 2012 23:22:46 -0400 Subject: [SciPy-User] numpy array root operation In-Reply-To: References: Message-ID: Maybe like this, >>> import numpy as np >>> x = np.array([-81,25]) >>> np.sign(x) * np.absolute(x) ** (1.0/5.0) array([-2.40822469, 1.90365394]) >>> np.sign(x) * np.absolute(x) ** (1.0/2.0) array([-9., 5.]) try this way and you'll be also in trouble (-32)**(1.0/5.0) Cheers -- Oleksandr Huziy 2012/3/21 Odin Den : > Hi, > 5th root of -32 can be computed correctly as follows: >>>> -32**(1/5) >>>> -2.0 > > However, when I try to perform same operation on numpy arrays I get > the following: >>>> array([-32])**(1/5) >>>> array([ nan]) > > Is there anyway to compute roots of numpy arrays? I have a huge matrix which > contains both negative and positive values. What is the easiest way of making > python compute the "nth" roots of each element of this matrix without knowing > the value of "n" a priory? > > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From wardefar at iro.umontreal.ca Wed Mar 21 23:26:21 2012 From: wardefar at iro.umontreal.ca (David Warde-Farley) Date: Wed, 21 Mar 2012 23:26:21 -0400 Subject: [SciPy-User] numpy array root operation In-Reply-To: References: Message-ID: <5E811AA7-0D07-4103-9369-6059C1158FF7@iro.umontreal.ca> On 2012-03-21, at 10:43 PM, Odin Den wrote: > Is there anyway to compute roots of numpy arrays? I have a huge matrix which > contains both negative and positive values. What is the easiest way of making > python compute the "nth" roots of each element of this matrix without knowing > the value of "n" a priory? I suspect not, not without writing your own function that handles negative numbers correctly in all cases. 
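A minimal sketch of what such a function could look like, assuming the exponent is 1/n for an odd integer n (so that a real root actually exists for negative inputs):

import numpy as np

def real_nth_root(a, n):
    # Elementwise real n-th root; only meaningful for odd integer n,
    # where every real number (including negatives) has one real root.
    a = np.asarray(a, dtype=float)
    return np.sign(a) * np.abs(a) ** (1.0 / n)

print(real_nth_root([-32, 27], 5))   # -> approximately [-2.      1.9332]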
Note that the C standard pow() doesn't support non-integer powers on negative numbers, nor does plain Python itself. You might have luck at the symbolic/multiprecision math packages like SymPy or mpmath, which might implement the correct algorithm, but if you want to operate on arrays you probably need to write a custom function borrowing one of their algorithms and making it operate on array data. From wardefar at iro.umontreal.ca Wed Mar 21 23:31:13 2012 From: wardefar at iro.umontreal.ca (David Warde-Farley) Date: Wed, 21 Mar 2012 23:31:13 -0400 Subject: [SciPy-User] numpy array root operation In-Reply-To: References: Message-ID: <46118E9F-1C06-4AB9-9D0D-317744B30C36@iro.umontreal.ca> On 2012-03-21, at 10:43 PM, Odin Den wrote: >>>> -32**(1/5) >>>> -2.0 I just noticed that this was probably a REPL prompt (my mail client shows it as quotation indentation, which failed to register in my mind). You have your order of operations wrong. Exponentiation is higher priority than multiplication (i.e. the unary -) so what you are getting is -1 * (32 ** (1/5)). >>> (-32)**(1/5.) Traceback (most recent call last): File "", line 1, in ValueError: negative number cannot be raised to a fractional power From wardefar at iro.umontreal.ca Wed Mar 21 23:34:48 2012 From: wardefar at iro.umontreal.ca (David Warde-Farley) Date: Wed, 21 Mar 2012 23:34:48 -0400 Subject: [SciPy-User] Numpy/Scipy: Avoiding nested loops to operate on matrix-valued images In-Reply-To: References: <29677913.980.1331801968906.JavaMail.geo-discussion-forums@ynkz21> <28284362.88.1332149037436.JavaMail.geo-discussion-forums@yneo2> Message-ID: On 2012-03-20, at 11:33 AM, Nathaniel Smith wrote: > (Are there any generalized > ufuncs actually defined in the standard packages? For instance, is > np.dot implemented as a generalized ufunc? Should it be?) Ideally, so long as it still made use of BLAS for the actual matrix products. I tried my hand at implementing a gufunc for log(sum(exp(...))), with the sum being the "generalized" part. Did not have much luck... David From jaakko.luttinen at aalto.fi Thu Mar 22 05:34:52 2012 From: jaakko.luttinen at aalto.fi (Jaakko Luttinen) Date: Thu, 22 Mar 2012 11:34:52 +0200 Subject: [SciPy-User] Sparse matrix multiply Message-ID: <4F6AF23C.9080104@aalto.fi> Hi! Why do I get two different results for the code below? import numpy as np import scipy.sparse as sp A = sp.rand(20,20,density=0.1) B = sp.rand(20,20,density=0.1) np.multiply(A,B).sum() # out: 21.058793740984925 A.multiply(B).sum() # out: 0.76482546226069481 Am I missing something? I think numpy.multiply should either return the correct answer or an error that it can't compute the correct answer. Regards, Jaakko From cmutel at gmail.com Thu Mar 22 06:03:29 2012 From: cmutel at gmail.com (Christopher Mutel) Date: Thu, 22 Mar 2012 11:03:29 +0100 Subject: [SciPy-User] Sparse matrix multiply In-Reply-To: <4F6AF23C.9080104@aalto.fi> References: <4F6AF23C.9080104@aalto.fi> Message-ID: On Thu, Mar 22, 2012 at 10:34 AM, Jaakko Luttinen wrote: > Hi! > > Why do I get two different results for the code below? > > import numpy as np > import scipy.sparse as sp > A = sp.rand(20,20,density=0.1) > B = sp.rand(20,20,density=0.1) > np.multiply(A,B).sum() > # out: 21.058793740984925 > A.multiply(B).sum() > # out: 0.76482546226069481 > > Am I missing something? > I think numpy.multiply should either return the correct answer or an > error that it can't compute the correct answer. 
np.multiply performs element-wise multiplication, while A.multiply is matrix multiplication. They are both "correct", but answer different questions. See: http://en.wikipedia.org/wiki/Matrix_multiplication http://docs.scipy.org/doc/numpy/reference/generated/numpy.multiply.html > Regards, > Jaakko > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- ############################ Chris Mutel ?kologisches Systemdesign - Ecological Systems Design Institut f.Umweltingenieurwissenschaften - Institute for Environmental Engineering ETH Z?rich - HIF C 44 - Schafmattstr. 6 8093 Z?rich Telefon: +41 44 633 71 45 - Fax: +41 44 633 10 61 ############################ From jaakko.luttinen at aalto.fi Thu Mar 22 06:07:23 2012 From: jaakko.luttinen at aalto.fi (Jaakko Luttinen) Date: Thu, 22 Mar 2012 12:07:23 +0200 Subject: [SciPy-User] Sparse matrix multiply In-Reply-To: References: <4F6AF23C.9080104@aalto.fi> Message-ID: <4F6AF9DB.4050201@aalto.fi> On 03/22/2012 12:03 PM, Christopher Mutel wrote: > On Thu, Mar 22, 2012 at 10:34 AM, Jaakko Luttinen > wrote: >> Hi! >> >> Why do I get two different results for the code below? >> >> import numpy as np >> import scipy.sparse as sp >> A = sp.rand(20,20,density=0.1) >> B = sp.rand(20,20,density=0.1) >> np.multiply(A,B).sum() >> # out: 21.058793740984925 >> A.multiply(B).sum() >> # out: 0.76482546226069481 >> >> Am I missing something? >> I think numpy.multiply should either return the correct answer or an >> error that it can't compute the correct answer. > > np.multiply performs element-wise multiplication, while A.multiply is > matrix multiplication. They are both "correct", but answer different > questions. Is it so..? http://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csc_matrix.multiply.html I don't know what "point-wise multiplication" means.. Anyway, I thought that dot computes matrix multiplication and multiply computes matrix multiplication. -Jaakko From jaakko.luttinen at aalto.fi Thu Mar 22 06:08:47 2012 From: jaakko.luttinen at aalto.fi (Jaakko Luttinen) Date: Thu, 22 Mar 2012 12:08:47 +0200 Subject: [SciPy-User] Sparse matrix multiply In-Reply-To: <4F6AF9DB.4050201@aalto.fi> References: <4F6AF23C.9080104@aalto.fi> <4F6AF9DB.4050201@aalto.fi> Message-ID: <4F6AFA2F.1030701@aalto.fi> On 03/22/2012 12:07 PM, Jaakko Luttinen wrote: > On 03/22/2012 12:03 PM, Christopher Mutel wrote: >> On Thu, Mar 22, 2012 at 10:34 AM, Jaakko Luttinen >> wrote: >>> Hi! >>> >>> Why do I get two different results for the code below? >>> >>> import numpy as np >>> import scipy.sparse as sp >>> A = sp.rand(20,20,density=0.1) >>> B = sp.rand(20,20,density=0.1) >>> np.multiply(A,B).sum() >>> # out: 21.058793740984925 >>> A.multiply(B).sum() >>> # out: 0.76482546226069481 >>> >>> Am I missing something? >>> I think numpy.multiply should either return the correct answer or an >>> error that it can't compute the correct answer. >> >> np.multiply performs element-wise multiplication, while A.multiply is >> matrix multiplication. They are both "correct", but answer different >> questions. > > Is it so..? > > http://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csc_matrix.multiply.html > > I don't know what "point-wise multiplication" means.. > > Anyway, I thought that dot computes matrix multiplication and multiply > computes matrix multiplication. TYPOFIX: I thought that multiply computes element-wise multiplication. 
-Jaakko From pav at iki.fi Thu Mar 22 06:18:40 2012 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 22 Mar 2012 11:18:40 +0100 Subject: [SciPy-User] Sparse matrix multiply In-Reply-To: <4F6AF23C.9080104@aalto.fi> References: <4F6AF23C.9080104@aalto.fi> Message-ID: 22.03.2012 10:34, Jaakko Luttinen kirjoitti: > import numpy as np > import scipy.sparse as sp > A = sp.rand(20,20,density=0.1) > B = sp.rand(20,20,density=0.1) > np.multiply(A,B).sum() > # out: 0.76482546226069481 > Am I missing something? > I think numpy.multiply should either return the correct answer or an > error that it can't compute the correct answer. The answer is the same as to your previous questions --- the Numpy ufuncs do not deal with sparse matrices in a reasonable way. This lack of integration between dense and sparse is a bug. Why it does not raise an error instead, is probably that as a consequence of the operation overloading rules defined, there is a (nonsensical) operation that matches the call. -- Pauli Virtanen From jaakko.luttinen at aalto.fi Thu Mar 22 06:18:56 2012 From: jaakko.luttinen at aalto.fi (Jaakko Luttinen) Date: Thu, 22 Mar 2012 12:18:56 +0200 Subject: [SciPy-User] Sparse matrix multiply In-Reply-To: References: <4F6AF23C.9080104@aalto.fi> Message-ID: <4F6AFC90.9040102@aalto.fi> On 03/22/2012 12:03 PM, Christopher Mutel wrote: > On Thu, Mar 22, 2012 at 10:34 AM, Jaakko Luttinen > wrote: >> Hi! >> >> Why do I get two different results for the code below? >> >> import numpy as np >> import scipy.sparse as sp >> A = sp.rand(20,20,density=0.1) >> B = sp.rand(20,20,density=0.1) >> np.multiply(A,B).sum() >> # out: 21.058793740984925 >> A.multiply(B).sum() >> # out: 0.76482546226069481 >> >> Am I missing something? >> I think numpy.multiply should either return the correct answer or an >> error that it can't compute the correct answer. > > np.multiply performs element-wise multiplication, while A.multiply is > matrix multiplication. They are both "correct", but answer different > questions. Actually it seems that np.multiply computes matrix multiplication in this case! Below, only A.multiply(B) computes element-wise multiplication. import numpy as np import scipy.sparse as sp A = sp.rand(20,20,density=0.1) B = sp.rand(20,20,density=0.1) np.multiply(A,B).sum() # out: 25.240683127057885 A.multiply(B).sum() # out: 2.6382118196920503 A.dot(B).sum() # out: 25.240683127057885 np.dot(A,B).sum() # out: 25.240683127057885 -Jaakko From cmutel at gmail.com Thu Mar 22 06:24:04 2012 From: cmutel at gmail.com (Christopher Mutel) Date: Thu, 22 Mar 2012 11:24:04 +0100 Subject: [SciPy-User] Sparse matrix multiply In-Reply-To: <4F6AFC90.9040102@aalto.fi> References: <4F6AF23C.9080104@aalto.fi> <4F6AFC90.9040102@aalto.fi> Message-ID: On Thu, Mar 22, 2012 at 11:18 AM, Jaakko Luttinen wrote: > On 03/22/2012 12:03 PM, Christopher Mutel wrote: >> On Thu, Mar 22, 2012 at 10:34 AM, Jaakko Luttinen >> wrote: >>> Hi! >>> >>> Why do I get two different results for the code below? >>> >>> import numpy as np >>> import scipy.sparse as sp >>> A = sp.rand(20,20,density=0.1) >>> B = sp.rand(20,20,density=0.1) >>> np.multiply(A,B).sum() >>> # out: 21.058793740984925 >>> A.multiply(B).sum() >>> # out: 0.76482546226069481 >>> >>> Am I missing something? >>> I think numpy.multiply should either return the correct answer or an >>> error that it can't compute the correct answer. >> >> np.multiply performs element-wise multiplication, while A.multiply is >> matrix multiplication. They are both "correct", but answer different >> questions. 
> > Actually it seems that np.multiply computes matrix multiplication in > this case! Below, only A.multiply(B) computes element-wise multiplication. > > import numpy as np > import scipy.sparse as sp > A = sp.rand(20,20,density=0.1) > B = sp.rand(20,20,density=0.1) > np.multiply(A,B).sum() > # out: 25.240683127057885 > A.multiply(B).sum() > # out: 2.6382118196920503 > A.dot(B).sum() > # out: 25.240683127057885 > np.dot(A,B).sum() > # out: 25.240683127057885 Indeed. Sorry for the confusion. From jaakko.luttinen at aalto.fi Thu Mar 22 09:42:51 2012 From: jaakko.luttinen at aalto.fi (Jaakko Luttinen) Date: Thu, 22 Mar 2012 15:42:51 +0200 Subject: [SciPy-User] Bug in scipy.io.mmread? Message-ID: <4F6B2C5B.1050408@aalto.fi> Hi! I am trying to read the following matrix market file: ftp://math.nist.gov/pub/MatrixMarket2/Harwell-Boeing/lsq/illc1033.mtx.gz However, it doesn't work with Python 3 and SciPy 0.11.0: ===================================================== Python 3.2.2 (default, Oct 27 2011, 13:08:00) [GCC 4.4.5] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import scipy >>> scipy.version.version '0.11.0.dev-Unknown' >>> from scipy.io import mmread >>> mmread('illc1033.mtx.gz') Traceback (most recent call last): File "", line 1, in File "/home/jluttine/.local/lib/python3.2/site-packages/scipy/io/mmio.py", line 68, in mmread return MMFile().read(source) File "/home/jluttine/.local/lib/python3.2/site-packages/scipy/io/mmio.py", line 302, in read return self._parse_body(stream) File "/home/jluttine/.local/lib/python3.2/site-packages/scipy/io/mmio.py", line 447, in _parse_body flat_data = flat_data.reshape(-1,3) ValueError: total size of new array must be unchanged ===================================================== It does work with Python 2 and SciPy 0.7.2: ===================================================== Python 2.6.6 (r266:84292, Sep 15 2010, 16:22:56) [GCC 4.4.5] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import scipy >>> scipy.version.version '0.7.2' >>> from scipy.io import mmread >>> mmread('illc1033.mtx.gz') <1033x320 sparse matrix of type '' with 4732 stored elements in COOrdinate format> ===================================================== Is there a bug in recent scipy.io.mmread or what could be the problem? Best, Jaakko From nwagner at iam.uni-stuttgart.de Thu Mar 22 13:15:43 2012 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Thu, 22 Mar 2012 18:15:43 +0100 Subject: [SciPy-User] Bug in scipy.io.mmread? In-Reply-To: <4F6B2C5B.1050408@aalto.fi> References: <4F6B2C5B.1050408@aalto.fi> Message-ID: On Thu, 22 Mar 2012 15:42:51 +0200 Jaakko Luttinen wrote: > Hi! > > I am trying to read the following matrix market file: > ftp://math.nist.gov/pub/MatrixMarket2/Harwell-Boeing/lsq/illc1033.mtx.gz > > However, it doesn't work with Python 3 and SciPy 0.11.0: > > ===================================================== > Python 3.2.2 (default, Oct 27 2011, 13:08:00) > [GCC 4.4.5] on linux2 > Type "help", "copyright", "credits" or "license" for >more information. 
>>>> import scipy
>>>> scipy.version.version
> '0.11.0.dev-Unknown'
>>>> from scipy.io import mmread
>>>> mmread('illc1033.mtx.gz')
> Traceback (most recent call last):
> File "", line 1, in
> File
> "/home/jluttine/.local/lib/python3.2/site-packages/scipy/io/mmio.py",
> line 68, in mmread
> return MMFile().read(source)
> File
> "/home/jluttine/.local/lib/python3.2/site-packages/scipy/io/mmio.py",
> line 302, in read
> return self._parse_body(stream)
> File
> "/home/jluttine/.local/lib/python3.2/site-packages/scipy/io/mmio.py",
> line 447, in _parse_body
> flat_data = flat_data.reshape(-1,3)
> ValueError: total size of new array must be unchanged
> =====================================================
>
> It does work with Python 2 and SciPy 0.7.2:
>
> =====================================================
> Python 2.6.6 (r266:84292, Sep 15 2010, 16:22:56)
> [GCC 4.4.5] on linux2
> Type "help", "copyright", "credits" or "license" for >more information.
>>>> import scipy
>>>> scipy.version.version
> '0.7.2'
>>>> from scipy.io import mmread
>>>> mmread('illc1033.mtx.gz')
> <1033x320 sparse matrix of type ''
> with 4732 stored elements in COOrdinate format>
> =====================================================
>
> Is there a bug in recent scipy.io.mmread or what could >be the problem?
>
> Best,
> Jaakko
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user

Hi Jaakko,

works fine for me with python 2.7.2 on opensuse12.1

>>> sp.__version__
'0.11.0.dev-e7d3e33'
>>> np.__version__
'1.7.0.dev-3503c5f'

Cheers,
Nils

From jaakko.luttinen at aalto.fi Thu Mar 22 13:46:02 2012
From: jaakko.luttinen at aalto.fi (Jaakko Luttinen)
Date: Thu, 22 Mar 2012 19:46:02 +0200
Subject: [SciPy-User] Bug in scipy.io.mmread?
In-Reply-To: 
References: <4F6B2C5B.1050408@aalto.fi>
Message-ID: <4F6B655A.6080000@aalto.fi>

>> I am trying to read the following matrix market file:
>> ftp://math.nist.gov/pub/MatrixMarket2/Harwell-Boeing/lsq/illc1033.mtx.gz
>>
>> However, it doesn't work with Python 3 and SciPy 0.11.0:
>
> works fine for me with python 2.7.2 on opensuse12.1
>
>>>> sp.__version__
> '0.11.0.dev-e7d3e33'
>>>> np.__version__
> '1.7.0.dev-3503c5f'

Hi! I also got it working with Python 2.6.6:
>>> scipy.version.version
'0.11.0.dev-0fbfdbc'

So, it seems like it is related to Python 3.2? I tried to diff mmio.py but didn't notice any relevant differences between the Python versions.. Here is the diff: http://pastebin.com/e2xm3CVx
-Jaakko

From friedrichromstedt at gmail.com Thu Mar 22 15:24:49 2012
From: friedrichromstedt at gmail.com (Friedrich Romstedt)
Date: Thu, 22 Mar 2012 20:24:49 +0100
Subject: [SciPy-User] numpy array root operation
In-Reply-To: 
References: 
Message-ID: <644F8AAD-69EF-43C2-AD64-514D568C103F@gmail.com>

On 22.03.2012, at 03:43, Odin Den wrote:
> Hi,
> 5th root of -32 can be computed correctly as follows:
>>>> -32**(1/5)
>>>> -2.0

Warning, mathematician (physicist to be precise) speaking. In addition to the operator precedence issue pointed out by David, notice that powers to non-integer numbers are cumbersome to define. In fact, let q be a rational number q = n/d, and v a complex number v = |v| exp(i phi), e.g. -42 with |v| = 42 and phi = pi. Then the root v^(1/d) is d-fold, and can be defined as the set of all (complex) numbers whose d-th power is v, namely the set {r exp(2 pi i f/d + i phi/d), f = 0...(d - 1)}, with a real positive number r such that r^d = |v|.
The d-th power of each of these numbers is v. v^q is then just (v^(1/d))^n. For real v, phi is either 0 or pi, so for positive v, phi/d in the set equation will be zero, so there's always a positive root, which we call just "root" in daily language and in Python. For negative v, phi = pi, and (2 pi)/d is just twice that large, so there's no longer a positive root. But for odd d, the term 2 pi f/d + phi/d will be, for f = (d - 1)/2, just pi, so there's then a negative root amongst all these roots. For even d, there's neither a positive nor a negative root of v, but only d complex ones. Notice that taking the numbers in the set to the n-th power multiplies their angle by n. You can calculate the first root (f = 0) in Python and numpy by taking a complex number to the power, e.g. in Python (-42 + 0j) ** (0.2). But this will never be real; it is always complex for negative v and a non-integer exponent. Notice that for numpy, the convention for phi is -pi < phi <= pi. Negative numbers have phi = pi. The "first" root can be defined for any kind of exponent, but for an irrational exponent there will be an infinite set of roots, AISI. But don't bet on that. It cannot be finite, because then the irrational number would have been rational. :-) As I warned you, mathematically inclined people can speak for an hour about the most obvious thing to CS people, but normally I found it worth doing once and then just keeping in mind that there is no such thing as natural intuition. cu Friedrich
> However, when I try to perform same operation on numpy arrays I get > the following: >>>> array([-32])**(1/5) >>>> array([ nan])
So ya, it's Not A Number, but A Set Of Numbers. :-) Maybe it could spit out the complex first root instead, as it will do for numpy.asarray(-42 + 0j) ** 0.2. But I'm not involved enough to be knowledgeable about the design here.
> Is there anyway to compute roots of numpy arrays? I have a huge matrix which > contains both negative and positive values. What is the easiest way of making > python compute the "nth" roots of each element of this matrix without knowing > the value of "n" a priory?
The other proposed approach, using just the modulus and keeping the sign, is, as I pointed out above, not mathematically valid in all cases. I would guess you screwed up your model if you ran across taking fractional powers of negative numbers.
From pav at iki.fi Thu Mar 22 15:50:04 2012 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 22 Mar 2012 20:50:04 +0100 Subject: [SciPy-User] Bug in scipy.io.mmread? In-Reply-To: <4F6B2C5B.1050408@aalto.fi> References: <4F6B2C5B.1050408@aalto.fi> Message-ID: 22.03.2012 14:42, Jaakko Luttinen wrote: [clip] > Python 3.2.2 (default, Oct 27 2011, 13:08:00) > [GCC 4.4.5] on linux2 > Type "help", "copyright", "credits" or "license" for more information. >>>> import scipy >>>> scipy.version.version > '0.11.0.dev-Unknown' >>>> from scipy.io import mmread >>>> mmread('illc1033.mtx.gz')
Note that this succeeds if you gunzip the file first. This is a bug in the gzip module in Python 3.x. The PyObject_AsFileDescriptor call on a GzipFile object succeeds on Python 3, although it should fail (there is no OS-level file handle giving the uncompressed stream). As a consequence, mmio ends up reading the compressed data stream, which of course does not work. It's possible to work around this in mmio.
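In the meantime, a simple user-side workaround is to decompress the file yourself before calling mmread. A rough sketch (untested; file names are only for illustration):

import gzip
import shutil
from scipy.io import mmread

# Write out a plain .mtx file so mmread never touches the GzipFile object.
with gzip.open('illc1033.mtx.gz', 'rb') as fin:
    with open('illc1033.mtx', 'wb') as fout:
        shutil.copyfileobj(fin, fout)

A = mmread('illc1033.mtx')  # the 1033x320 COO matrix, as in the Python 2 session above
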
-- Pauli Virtanen From njs at pobox.com Thu Mar 22 16:26:35 2012 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 22 Mar 2012 20:26:35 +0000 Subject: [SciPy-User] numpy array root operation In-Reply-To: References: Message-ID: On Thu, Mar 22, 2012 at 2:43 AM, Odin Den wrote: > Hi, > 5th root of -32 can be computed correctly as follows: >>>> -32**(1/5) >>>> -2.0 > > However, when I try to perform same operation on numpy arrays I get > the following: >>>> array([-32])**(1/5) >>>> array([ nan]) > > Is there anyway to compute roots of numpy arrays? I have a huge matrix which > contains both negative and positive values. What is the easiest way of making > python compute the "nth" roots of each element of this matrix without knowing > the value of "n" a priory? As long as you *know* that what you're computing is an odd root, and that what you want for negative numbers is the real root, then you could just work around this: roots = np.sign(a) * (a * np.sign(a))**(1./5) -- Nathaniel From fperez.net at gmail.com Thu Mar 22 17:11:14 2012 From: fperez.net at gmail.com (Fernando Perez) Date: Thu, 22 Mar 2012 14:11:14 -0700 Subject: [SciPy-User] [ANN] PyData workshop videos are up online, including panel with Guido Message-ID: Hi all, just to let you know that the videos from the PyData workshop we held at Google a couple of weeks ago are now online (not all talks are up yet, so watch the page over the next few days if a talk you wanted to see isn't posted yet): http://marakana.com/s/2012_pydata_workshop,1090/index.html The panel discussion with Guido that we talked about on these lists is in there; I hope to write up a short summary about it soon. Many thanks to Simeon Franklin and the rest of the Marakana team for doing all this work (for free)! Cheers, f From ryanlists at gmail.com Thu Mar 22 17:33:34 2012 From: ryanlists at gmail.com (Ryan Krauss) Date: Thu, 22 Mar 2012 16:33:34 -0500 Subject: [SciPy-User] problem with loading data from data_store Message-ID: I have some data sets stored using scipy.io.save_as_module. I recently upgrade to 0.10 and I can no longer open this module. Further, I tried to reprocess my data and resave it and I am still getting the same error message. Just a couple of lines are needed to recreate my problem: mydict = {'a':12.34} scipy.io.save_as_module('mymod',mydict) import mymod The response to the last command (import) is In [18]: import mymod --------------------------------------------------------------------------- AttributeError Traceback (most recent call last) /home/ryan/siue/Research/modeling/SFLR/system_ID/TMM/TMM_SFLR_model1.py in () ----> 1 2 3 4 5 /home/ryan/siue/Research/modeling/SFLR/system_ID/TMM/mymod.py in () 1 import scipy.io.data_store as data_store 2 import mymod ----> 3 data_store._load(mymod) 4 5 AttributeError: 'module' object has no attribute '_load' Can anyone help me with this? Thanks, Ryan From cpsmusic at yahoo.com Fri Mar 23 07:28:23 2012 From: cpsmusic at yahoo.com (Chris Share) Date: Fri, 23 Mar 2012 04:28:23 -0700 (PDT) Subject: [SciPy-User] OSX, Python 3.2.2 and NumPy/SciPy Message-ID: <1332502103.17367.YahooMailNeo@web161502.mail.bf1.yahoo.com> Hi, I'm new to Python however I have a reasonable amount of programming experience (C/C++). I'm currently working on OSX (10.6.8) and I've installed Python 3.2.2. OSX also comes with Python 2.6.6. I'm interested in scientific computing so I'd like to install Numpy/SciPy. I've managed to do this for the 2.6.6 version of Python however I'm unclear as to how I do this for the 3.2.2 version. 
According to the 3.2.2 installer ReadMe: The installer puts applications, an "Update Shell Profile" command, >and a link to the optionally installed Python Documentation into the >"Python 3.2" subfolder of the system Applications folder, >and puts the underlying machinery into the folder >/Library/Frameworks/Python.framework. It can >optionally place links to the command-line tools in /usr/local/bin as >well. Double-click on the "Update Shell Profile" command to add the >"bin" directory inside the framework to your shell's search path. How do install NumPy/SciPy so that the 3.2.2 version of IDLE can access them? Is this possible? Cheers, Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris at simplistix.co.uk Fri Mar 23 14:55:15 2012 From: chris at simplistix.co.uk (Chris Withers) Date: Fri, 23 Mar 2012 18:55:15 +0000 Subject: [SciPy-User] Layering a virtualenv over EPD In-Reply-To: <20120310234442.GN24301@ninja.nosyntax.net> References: <4F5BD12D.9090504@simplistix.co.uk> <4F5BD861.8050205@simplistix.co.uk> <20120310234442.GN24301@ninja.nosyntax.net> Message-ID: <4F6CC713.50704@simplistix.co.uk> On 10/03/2012 23:44, rex wrote: > Perhaps the NumPy+SciPy+Matplotlib community could learn something by > looking at how the R community works? To this mere user who wants to > get a job done, it's a night and day difference. I still use Python > for GP programming, but there's a snowball's chance I'd ever use > anything but R for my main interest, which is econometrics. You really should try EPD. Sorry you had a bad experience. Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk From chris at simplistix.co.uk Fri Mar 23 15:00:42 2012 From: chris at simplistix.co.uk (Chris Withers) Date: Fri, 23 Mar 2012 19:00:42 +0000 Subject: [SciPy-User] Layering a virtualenv over EPD In-Reply-To: <4F5C6892.3050606@hilboll.de> References: <4F5BD12D.9090504@simplistix.co.uk> <4F5C6892.3050606@hilboll.de> Message-ID: <4F6CC85A.5080208@simplistix.co.uk> On 11/03/2012 08:55, Andreas H. wrote: > I just uploaded a quick log of what I did to accomplish exactly this to > > https://gist.github.com/2015652 > > I do have the problem that within the virtualenv, something with the > console's not working right, as iPythons help doesn't work properly, and > I cannot launch applications which open windows (except for ``ipython > pylab=wx``) ... That sounds less than ideal ;-) I suspect you've ended up doing what I'm intent on avoiding: re-installing ipython just to get the launch script in the bin directory of the virtualenv. Now, you can just manually craft a script in there by copying the system-wide one and hacking the pling line, but you shouldn't have to. I've opened a bug on virtualenv for this: https://github.com/pypa/pip/issues/480 However, now that I'm CC'ing the ipython list, does ipython only provide distutils shell scripts? It would be great if it could also provide a setuptools-compatible console_scripts entry point? 
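Something along these lines in setup.py is what I mean; the callable path is only my guess at where IPython's launcher lives, so treat it as a sketch rather than a patch:

from setuptools import setup

setup(
    name='ipython',
    # ... the rest of the existing metadata ...
    entry_points={
        'console_scripts': [
            # hypothetical target; the real module path may differ
            'ipython = IPython.frontend.terminal.ipapp:launch_new_instance',
        ],
    },
)

With that, pip/easy_install would generate the console script in the virtualenv's bin directory automatically, instead of relying on a hand-copied distutils script.
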
cheers, Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk From takowl at gmail.com Sat Mar 24 06:59:04 2012 From: takowl at gmail.com (Thomas Kluyver) Date: Sat, 24 Mar 2012 10:59:04 +0000 Subject: [SciPy-User] [IPython-User] Layering a virtualenv over EPD In-Reply-To: <4F6CC85A.5080208@simplistix.co.uk> References: <4F5BD12D.9090504@simplistix.co.uk> <4F5C6892.3050606@hilboll.de> <4F6CC85A.5080208@simplistix.co.uk> Message-ID: On 23 March 2012 19:00, Chris Withers wrote: > I suspect you've ended up doing what I'm intent on avoiding: > re-installing ipython just to get the launch script in the bin directory > of the virtualenv. > > Now, you can just manually craft a script in there by copying the > system-wide one and hacking the pling line, but you shouldn't have to. Just to mention: the development version of IPython will detect if there's a virtualenv active when it starts and try to put its directories on sys.path. It's not flawless - it will always behave as though the virtualenv was created with --system-site-packages, but it's convenient for simple cases. Of course, that doesn't interfere if IPython is installed inside the virtualenv. Thomas From ralf.gommers at googlemail.com Sat Mar 24 18:24:04 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sat, 24 Mar 2012 23:24:04 +0100 Subject: [SciPy-User] OSX, Python 3.2.2 and NumPy/SciPy In-Reply-To: <1332502103.17367.YahooMailNeo@web161502.mail.bf1.yahoo.com> References: <1332502103.17367.YahooMailNeo@web161502.mail.bf1.yahoo.com> Message-ID: On Fri, Mar 23, 2012 at 12:28 PM, Chris Share wrote: > Hi, > > I'm new to Python however I have a reasonable amount of programming > experience (C/C++). > > I'm currently working on OSX (10.6.8) and I've installed Python 3.2.2. OSX > also comes with Python 2.6.6. > > I'm interested in scientific computing so I'd like to install Numpy/SciPy. > I've managed to do this for the 2.6.6 version of Python however I'm unclear > as to how I do this for the 3.2.2 version. > > According to the 3.2.2 installer ReadMe: > > The installer puts applications, an "Update Shell Profile" command, > and a link to the optionally installed Python Documentation into the > "Python 3.2" subfolder of the system Applications folder, > and puts the underlying machinery into the folder > /Library/Frameworks/Python.framework. It can > optionally place links to the command-line tools in /usr/local/bin as > well. Double-click on the "Update Shell Profile" command to add the > "bin" directory inside the framework to your shell's search path. > > > > How do install NumPy/SciPy so that the 3.2.2 version of IDLE can access > them? > There's no binary installer for Python 3.x on OS X yet, so you have to compile numpy/scipy. Assuming you have XCode installed and the correct gfortran compiler (linked at http://scipy.org/Installing_SciPy/Mac_OS_X), you simply type "python setup.py install" in the base dir of the numpy/scipy repos. It will first convert the source with 2to3 to python 3.x format, then compile and install it. You should then be able to import it in IDLE. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ralf.gommers at googlemail.com Sun Mar 25 16:55:22 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 25 Mar 2012 22:55:22 +0200 Subject: [SciPy-User] problem with loading data from data_store In-Reply-To: References: Message-ID: On Thu, Mar 22, 2012 at 10:33 PM, Ryan Krauss wrote: > I have some data sets stored using scipy.io.save_as_module. I > recently upgrade to 0.10 and I can no longer open this module. > Further, I tried to reprocess my data and resave it and I am still > getting the same error message. Just a couple of lines are needed to > recreate my problem: > > mydict = {'a':12.34} > scipy.io.save_as_module('mymod',mydict) > import mymod > > The response to the last command (import) is > > In [18]: import mymod > --------------------------------------------------------------------------- > AttributeError Traceback (most recent call last) > > /home/ryan/siue/Research/modeling/SFLR/system_ID/TMM/TMM_SFLR_model1.py > in () > ----> 1 > 2 > 3 > 4 > 5 > > /home/ryan/siue/Research/modeling/SFLR/system_ID/TMM/mymod.py in () > 1 import scipy.io.data_store as data_store > 2 import mymod > ----> 3 data_store._load(mymod) > 4 > 5 > > AttributeError: 'module' object has no attribute '_load' > > > Can anyone help me with this? > You can add this in scipy/io/data_store.py: def _load(module): """ Load data into module from a shelf with the same name as the module. """ dir,filename = os.path.split(module.__file__) filebase = filename.split('.')[0] fn = os.path.join(dir, filebase) f = dumb_shelve.open(fn, "r") #exec( 'import ' + module.__name__) for i in f.keys(): exec( 'import ' + module.__name__+ ';' + module.__name__+'.'+i + '=' + 'f["' + i + '"]') This was caused by an incorrect removal of deprecated code in https://github.com/scipy/scipy/commit/329a5e2713. So apparently save_as_module() has been completely broken for 2 years without anyone noticing..... Proposal: fix save_as_module now so it can load data again, deprecate it for 0.11 and remove it for 0.12. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From mdekauwe at gmail.com Sun Mar 25 18:39:47 2012 From: mdekauwe at gmail.com (mdekauwe) Date: Sun, 25 Mar 2012 15:39:47 -0700 (PDT) Subject: [SciPy-User] [SciPy-user] Problem with Installation of Scipy on Macbook In-Reply-To: References: Message-ID: <33544711.post@talk.nabble.com> Hi, I would recommend install via macports, I have had no issue that way -- View this message in context: http://old.nabble.com/Problem-with-Installation-of-Scipy-on-Macbook-tp33537462p33544711.html Sent from the Scipy-User mailing list archive at Nabble.com. From pengkui.luo at gmail.com Sun Mar 25 18:47:40 2012 From: pengkui.luo at gmail.com (Pengkui Luo) Date: Sun, 25 Mar 2012 17:47:40 -0500 Subject: [SciPy-User] How to get the index iterator of a scipy sparse matrix? Message-ID: e.g. suppose A is a scipy lil sparse matrix, and the result of print(A) is: (0, 1) 1.0 (0, 2) -1.0 (1, 0) 1.0 (1, 2) -1.0 (2, 1) 2.0 How can I get an iterator (or at least a list) of these (i, j) index pairs? Thanks! -- Pengkui -------------- next part -------------- An HTML attachment was scrubbed... URL: From JRadinger at gmx.at Mon Mar 26 06:11:24 2012 From: JRadinger at gmx.at (Johannes Radinger) Date: Mon, 26 Mar 2012 12:11:24 +0200 Subject: [SciPy-User] ier-integer in optimize.leastsq Message-ID: <20120326101124.70420@gmx.net> Hi, Some months ago I started already this topic,... Now, while writing my manuskript this topic comes up again. 
I am using the optimize.leastsq function and would like to describe my results especielly the "ier-level". If ier in my result is 1, then the solution which was found ensures a ftol-"quality", resp. the sum of squares for the relative errors are below the ftol value? As I didn't set any ftol value, the default value is used. Therefore, in the case of optimization results with ier=1, is it possible to state: "For all fitted solutions the relative errors in the sum of squares are below the desired standard value of 1.49012e-08" ??? Where does this standard value come from? Best regards, Johannes -- NEU: FreePhone 3-fach-Flat mit kostenlosem Smartphone! Jetzt informieren: http://mobile.1und1.de/?ac=OM.PW.PW003K20328T7073a From Philip_Bransford at vrtx.com Mon Mar 26 09:54:03 2012 From: Philip_Bransford at vrtx.com (Philip_Bransford at vrtx.com) Date: Mon, 26 Mar 2012 09:54:03 -0400 Subject: [SciPy-User] ier-integer in optimize.leastsq In-Reply-To: <20120326101124.70420@gmx.net> References: <20120326101124.70420@gmx.net> Message-ID: 1.49012e-08 = numpy.sqrt(numpy.finfo(float).eps) From: "Johannes Radinger" To: scipy-user at scipy.org Date: 03/26/2012 06:11 AM Subject: [SciPy-User] ier-integer in optimize.leastsq Sent by: scipy-user-bounces at scipy.org Hi, Some months ago I started already this topic,... Now, while writing my manuskript this topic comes up again. I am using the optimize.leastsq function and would like to describe my results especielly the "ier-level". If ier in my result is 1, then the solution which was found ensures a ftol-"quality", resp. the sum of squares for the relative errors are below the ftol value? As I didn't set any ftol value, the default value is used. Therefore, in the case of optimization results with ier=1, is it possible to state: "For all fitted solutions the relative errors in the sum of squares are below the desired standard value of 1.49012e-08" ??? Where does this standard value come from? Best regards, Johannes -- NEU: FreePhone 3-fach-Flat mit kostenlosem Smartphone! Jetzt informieren: http://mobile.1und1.de/?ac=OM.PW.PW003K20328T7073a _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From cweisiger at msg.ucsf.edu Mon Mar 26 11:59:52 2012 From: cweisiger at msg.ucsf.edu (Chris Weisiger) Date: Mon, 26 Mar 2012 08:59:52 -0700 Subject: [SciPy-User] OSX, Python 3.2.2 and NumPy/SciPy In-Reply-To: References: <1332502103.17367.YahooMailNeo@web161502.mail.bf1.yahoo.com> Message-ID: On Sat, Mar 24, 2012 at 3:24 PM, Ralf Gommers wrote: > > > On Fri, Mar 23, 2012 at 12:28 PM, Chris Share wrote: >> >> Hi, >> >> I'm new to Python however I have a reasonable amount of programming >> experience (C/C++). >> >> I'm currently working on OSX (10.6.8) and I've installed Python 3.2.2. OSX >> also comes with Python 2.6.6. >> >> I'm interested in scientific computing so I'd like to install Numpy/SciPy. >> I've managed to do this for the 2.6.6 version of Python however I'm unclear >> as to how I do this for the 3.2.2 version. If you aren't interested in compiling Numpy/Scipy yourself, you might also consider installing Python 2.7 and using the binary installers for Numpy/Scipy. Support for Python 3.x is still rather spotty despite it having been out for quite some time now. I wouldn't recommend installing anything using the system Python. 
Usually it works well, but you don't want to end up accidentally overwriting something the system is relying on, and if you ever find yourself wanting to make a standalone program with a bundled Python interpreter, you don't have the rights to distribute the system Python. Libraries like py2app for making standalone Python programs will refuse to bundle the system Python for that reason. Easier to just install another Python and then install all your libraries onto that. -Chris From andrew_giessel at hms.harvard.edu Mon Mar 26 12:10:32 2012 From: andrew_giessel at hms.harvard.edu (Andrew Giessel) Date: Mon, 26 Mar 2012 12:10:32 -0400 Subject: [SciPy-User] ier-integer in optimize.leastsq In-Reply-To: References: <20120326101124.70420@gmx.net> Message-ID: In other words, it is a function of the precision of the number type (float) you use in your fitting routines (specific to your computer architecture). hth, ag On Mon, Mar 26, 2012 at 09:54, wrote: > 1.49012e-08= numpy.sqrt(numpy.finfo(float).eps) > > > > From: "Johannes Radinger" > To: scipy-user at scipy.org > Date: 03/26/2012 06:11 AM > Subject: [SciPy-User] ier-integer in optimize.leastsq > Sent by: scipy-user-bounces at scipy.org > ------------------------------ > > > > Hi, > > Some months ago I started already this topic,... Now, while writing my > manuskript this topic comes up again. I am using the optimize.leastsq > function > and would like to describe my results especielly the "ier-level". > > If ier in my result is 1, then the solution which was found ensures a > ftol-"quality", resp. the sum of squares for the relative errors are below > the ftol value? > As I didn't set any ftol value, the default value is used. Therefore, in > the case of optimization > results with ier=1, is it possible to state: > > "For all fitted solutions the relative errors in the sum of squares are > below the desired standard value of 1.49012e-08" ??? > > Where does this standard value come from? > > Best regards, > > Johannes > -- > NEU: FreePhone 3-fach-Flat mit kostenlosem Smartphone! > > Jetzt informieren: http://mobile.1und1.de/?ac=OM.PW.PW003K20328T7073a > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -- Andrew Giessel, PhD Department of Neurobiology, Harvard Medical School 220 Longwood Ave Boston, MA 02115 ph: 617.432.7971 email: andrew_giessel at hms.harvard.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From ryanlists at gmail.com Mon Mar 26 13:28:13 2012 From: ryanlists at gmail.com (Ryan Krauss) Date: Mon, 26 Mar 2012 12:28:13 -0500 Subject: [SciPy-User] problem with loading data from data_store In-Reply-To: References: Message-ID: It seems like I am using code no one cares about or uses. I also worked around this by just using cPickle. Is there a better approach I should be using to save a collection of numpy arrays efficiently with little hassle? On Sun, Mar 25, 2012 at 3:55 PM, Ralf Gommers wrote: > > > On Thu, Mar 22, 2012 at 10:33 PM, Ryan Krauss wrote: >> >> I have some data sets stored using scipy.io.save_as_module. ?I >> recently upgrade to 0.10 and I can no longer open this module. >> Further, I tried to reprocess my data and resave it and I am still >> getting the same error message. 
?Just a couple of lines are needed to >> recreate my problem: >> >> mydict = {'a':12.34} >> scipy.io.save_as_module('mymod',mydict) >> import mymod >> >> The response to the last command (import) is >> >> In [18]: import mymod >> >> --------------------------------------------------------------------------- >> AttributeError ? ? ? ? ? ? ? ? ? ? ? ? ? ?Traceback (most recent call >> last) >> >> /home/ryan/siue/Research/modeling/SFLR/system_ID/TMM/TMM_SFLR_model1.py >> in () >> ----> 1 >> ? ? ?2 >> ? ? ?3 >> ? ? ?4 >> ? ? ?5 >> >> /home/ryan/siue/Research/modeling/SFLR/system_ID/TMM/mymod.py in >> () >> ? ? ?1 import scipy.io.data_store as data_store >> ? ? ?2 import mymod >> ----> 3 data_store._load(mymod) >> ? ? ?4 >> ? ? ?5 >> >> AttributeError: 'module' object has no attribute '_load' >> >> >> Can anyone help me with this? > > > You can add this in scipy/io/data_store.py: > > def _load(module): > ??? """ Load data into module from a shelf with > ??????? the same name as the module. > ??? """ > ??? dir,filename = os.path.split(module.__file__) > ??? filebase = filename.split('.')[0] > ??? fn = os.path.join(dir, filebase) > ??? f = dumb_shelve.open(fn, "r") > ??? #exec( 'import ' + module.__name__) > ??? for i in f.keys(): > ??????? exec( 'import ' + module.__name__+ ';' + > ????????????? module.__name__+'.'+i + '=' + 'f["' + i + '"]') > > This was caused by an incorrect removal of deprecated code in > https://github.com/scipy/scipy/commit/329a5e2713. So apparently > save_as_module() has been completely broken for 2 years without anyone > noticing..... > > Proposal: fix save_as_module now so it can load data again, deprecate it for > 0.11 and remove it for 0.12. > > Ralf > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From matthew.brett at gmail.com Mon Mar 26 13:47:52 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 26 Mar 2012 10:47:52 -0700 Subject: [SciPy-User] problem with loading data from data_store In-Reply-To: References: Message-ID: Hi, On Sun, Mar 25, 2012 at 1:55 PM, Ralf Gommers wrote: > > > On Thu, Mar 22, 2012 at 10:33 PM, Ryan Krauss wrote: >> >> I have some data sets stored using scipy.io.save_as_module. ?I >> recently upgrade to 0.10 and I can no longer open this module. >> Further, I tried to reprocess my data and resave it and I am still >> getting the same error message. ?Just a couple of lines are needed to >> recreate my problem: >> >> mydict = {'a':12.34} >> scipy.io.save_as_module('mymod',mydict) >> import mymod >> >> The response to the last command (import) is >> >> In [18]: import mymod >> >> --------------------------------------------------------------------------- >> AttributeError ? ? ? ? ? ? ? ? ? ? ? ? ? ?Traceback (most recent call >> last) >> >> /home/ryan/siue/Research/modeling/SFLR/system_ID/TMM/TMM_SFLR_model1.py >> in () >> ----> 1 >> ? ? ?2 >> ? ? ?3 >> ? ? ?4 >> ? ? ?5 >> >> /home/ryan/siue/Research/modeling/SFLR/system_ID/TMM/mymod.py in >> () >> ? ? ?1 import scipy.io.data_store as data_store >> ? ? ?2 import mymod >> ----> 3 data_store._load(mymod) >> ? ? ?4 >> ? ? ?5 >> >> AttributeError: 'module' object has no attribute '_load' >> >> >> Can anyone help me with this? > > > You can add this in scipy/io/data_store.py: > > def _load(module): > ??? """ Load data into module from a shelf with > ??????? the same name as the module. > ??? """ > ??? dir,filename = os.path.split(module.__file__) > ??? 
filebase = filename.split('.')[0] > ??? fn = os.path.join(dir, filebase) > ??? f = dumb_shelve.open(fn, "r") > ??? #exec( 'import ' + module.__name__) > ??? for i in f.keys(): > ??????? exec( 'import ' + module.__name__+ ';' + > ????????????? module.__name__+'.'+i + '=' + 'f["' + i + '"]') > > This was caused by an incorrect removal of deprecated code in > https://github.com/scipy/scipy/commit/329a5e2713. So apparently > save_as_module() has been completely broken for 2 years without anyone > noticing..... > > Proposal: fix save_as_module now so it can load data again, deprecate it for > 0.11 and remove it for 0.12. That sounds reasonable to me. See you, Matthew From ralf.gommers at googlemail.com Mon Mar 26 14:19:25 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 26 Mar 2012 20:19:25 +0200 Subject: [SciPy-User] problem with loading data from data_store In-Reply-To: References: Message-ID: On Mon, Mar 26, 2012 at 7:28 PM, Ryan Krauss wrote: > It seems like I am using code no one cares about or uses. I also > worked around this by just using cPickle. Is there a better approach > I should be using to save a collection of numpy arrays efficiently > with little hassle? > Yes: http://docs.scipy.org/doc/numpy/reference/generated/numpy.savez.html Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ryanlists at gmail.com Mon Mar 26 16:45:29 2012 From: ryanlists at gmail.com (Ryan Krauss) Date: Mon, 26 Mar 2012 15:45:29 -0500 Subject: [SciPy-User] problem with loading data from data_store In-Reply-To: References: Message-ID: > Yes: http://docs.scipy.org/doc/numpy/reference/generated/numpy.savez.html Thanks. If I am the only one who has tried to use save_as_module in a really long time, feel free to get rid of it sooner. I will either use savez or cPickle. On Mon, Mar 26, 2012 at 1:19 PM, Ralf Gommers wrote: > > > On Mon, Mar 26, 2012 at 7:28 PM, Ryan Krauss wrote: >> >> It seems like I am using code no one cares about or uses. ?I also >> worked around this by just using cPickle. ?Is there a better approach >> I should be using to save a collection of numpy arrays efficiently >> with little hassle? > > > Yes: http://docs.scipy.org/doc/numpy/reference/generated/numpy.savez.html > > Ralf > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From ralf.gommers at googlemail.com Mon Mar 26 16:55:31 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 26 Mar 2012 22:55:31 +0200 Subject: [SciPy-User] problem with loading data from data_store In-Reply-To: References: Message-ID: On Mon, Mar 26, 2012 at 10:45 PM, Ryan Krauss wrote: > > Yes: > http://docs.scipy.org/doc/numpy/reference/generated/numpy.savez.html > > Thanks. > > If I am the only one who has tried to use save_as_module in a really > long time, feel free to get rid of it sooner. I will either use savez > or cPickle. > Even if you're the only user (impossible to tell), there's no reason to skip the normal deprecation dance I think. Ralf > > On Mon, Mar 26, 2012 at 1:19 PM, Ralf Gommers > wrote: > > > > > > On Mon, Mar 26, 2012 at 7:28 PM, Ryan Krauss > wrote: > >> > >> It seems like I am using code no one cares about or uses. I also > >> worked around this by just using cPickle. Is there a better approach > >> I should be using to save a collection of numpy arrays efficiently > >> with little hassle? 
> > > > > > Yes: > http://docs.scipy.org/doc/numpy/reference/generated/numpy.savez.html > > > > Ralf > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at enthought.com Mon Mar 26 18:25:39 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Mon, 26 Mar 2012 17:25:39 -0500 Subject: [SciPy-User] How to get the index iterator of a scipy sparse matrix? In-Reply-To: References: Message-ID: On Sun, Mar 25, 2012 at 5:47 PM, Pengkui Luo wrote: > e.g. suppose A is a scipy lil sparse matrix, and the result of print(A) is: > > (0, 1) 1.0 > (0, 2) -1.0 > (1, 0) 1.0 > (1, 2) -1.0 > (2, 1) 2.0 > > How can I get an iterator (or at least a list) of these (i, j) index pairs? > > Thanks! > > You could convert the matrix to DOK format and get the keys: In [1]: from scipy.sparse import lil_matrix In [2]: a = lil_matrix([[0,0,0],[0,10,0],[20,0,30]]) In [3]: a.todok().keys() Out[3]: [(2, 0), (1, 1), (2, 2)] In [4]: a.todense() Out[4]: matrix([[ 0, 0, 0], [ 0, 10, 0], [20, 0, 30]]) That is not the most efficient method, but it is certainly easy to implement. Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.borghgraef.rma at gmail.com Tue Mar 27 10:03:31 2012 From: alexander.borghgraef.rma at gmail.com (Alexander Borghgraef) Date: Tue, 27 Mar 2012 16:03:31 +0200 Subject: [SciPy-User] Inverse of binary_repr Message-ID: Hi all, Is there an inverse function of binary_repr, which takes a binary string representation of a number ( like '100') and returns an integer? -- Alex Borghgraef -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at enthought.com Tue Mar 27 10:06:07 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Tue, 27 Mar 2012 09:06:07 -0500 Subject: [SciPy-User] Inverse of binary_repr In-Reply-To: References: Message-ID: On Tue, Mar 27, 2012 at 9:03 AM, Alexander Borghgraef < alexander.borghgraef.rma at gmail.com> wrote: > Hi all, > > Is there an inverse function of binary_repr, which takes a binary string > representation of a number ( like '100') and returns an integer? > > Use the optional second argument of int(), which is the base: In [1]: s = "1001" In [2]: int(s, 2) Out[2]: 9 Warren > -- > Alex Borghgraef > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Mar 27 10:06:40 2012 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 27 Mar 2012 15:06:40 +0100 Subject: [SciPy-User] Inverse of binary_repr In-Reply-To: References: Message-ID: On Tue, Mar 27, 2012 at 15:03, Alexander Borghgraef wrote: > Hi all, > > ?Is there an inverse function of binary_repr, which takes a binary string > representation of a number ( like '100') and returns an integer? 
[~] |1> int('100', 2) 4 -- Robert Kern From alexander.borghgraef.rma at gmail.com Tue Mar 27 10:18:00 2012 From: alexander.borghgraef.rma at gmail.com (Alexander Borghgraef) Date: Tue, 27 Mar 2012 16:18:00 +0200 Subject: [SciPy-User] Inverse of binary_repr In-Reply-To: References: Message-ID: Thanks, I knew I was looking in the wrong place :-) -- Alex Borghgraef -------------- next part -------------- An HTML attachment was scrubbed... URL: From ryanlists at gmail.com Tue Mar 27 14:48:14 2012 From: ryanlists at gmail.com (Ryan Krauss) Date: Tue, 27 Mar 2012 13:48:14 -0500 Subject: [SciPy-User] problem with dot for complex matrices Message-ID: I am loosing my mind while trying to debug some code. I am trying to find the cause of some differences between numpy analysis and analysis done first in maxima and then converted to python code. The maxima approach is more difficult to do, but seems to lead to the correct answers. The core issue seems to be one dot product of a 2x2 and a 2x1 that are both complex numbers: here is the 2x2: ipdb> submatinv array([[-0.22740265-1.63857451j, -0.07740957-0.55847886j], [-3.20602957-4.93959054j, -0.36746252-1.68352465j]]) here is the 2x1: ipdb> augcol array([[ -3.74729148e-05-0.0005937j ], [ 7.96025801e-04+0.01137658j]]) verifying their shapes and data types: ipdb> submatinv.shape (2, 2) ipdb> submatinv.dtype dtype('complex128') ipdb> augcol.shape (2, 1) ipdb> augcol.dtype dtype('complex128') I need to compute this result: ipdb> -1*numpy.dot(submatinv,augcol) array([[ 5.30985737e-05+0.00038316j], [ 1.72370377e-04+0.00115503j]]) If I hard code how to do the matrix multiplication, I get the correct answer (it agrees with Maxima): For the first element: ipdb> -1*(submatinv[0,0]*augcol[0,0]+submatinv[0,1]*augcol[1,0]) (-0.005327660633034575+0.0011288088216130766j) For the second ipdb> -1*(submatinv[1,0]*augcol[0,0]+submatinv[1,1]*augcol[1,0]) (-0.016047752110848554+0.003432076134378004j) What is the dot product doing if it isn't dotting row by column? I am not seeing something. Thanks, Ryan From josef.pktd at gmail.com Tue Mar 27 14:57:03 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 27 Mar 2012 14:57:03 -0400 Subject: [SciPy-User] problem with dot for complex matrices In-Reply-To: References: Message-ID: On Tue, Mar 27, 2012 at 2:48 PM, Ryan Krauss wrote: > I am loosing my mind while trying to debug some code. ?I am trying to > find the cause of some differences between numpy analysis and analysis > done first in maxima and then converted to python code. ?The maxima > approach is more difficult to do, but seems to lead to the correct > answers. ?The core issue seems to be one dot product of a 2x2 and a > 2x1 that are both complex numbers: > > here is the 2x2: > > ipdb> submatinv > array([[-0.22740265-1.63857451j, -0.07740957-0.55847886j], > ? ? ? [-3.20602957-4.93959054j, -0.36746252-1.68352465j]]) > > here is the 2x1: > > ipdb> augcol > array([[ -3.74729148e-05-0.0005937j ], > ? ? ? [ ?7.96025801e-04+0.01137658j]]) > > verifying their shapes and data types: > > ipdb> submatinv.shape > (2, 2) > ipdb> submatinv.dtype > dtype('complex128') > ipdb> augcol.shape > (2, 1) > ipdb> augcol.dtype > dtype('complex128') > > I need to compute this result: > > ipdb> -1*numpy.dot(submatinv,augcol) > array([[ ?5.30985737e-05+0.00038316j], > ? ? ? 
[ ?1.72370377e-04+0.00115503j]]) > > If I hard code how to do the matrix multiplication, I get the correct > answer (it agrees with Maxima): > > For the first element: > ipdb> -1*(submatinv[0,0]*augcol[0,0]+submatinv[0,1]*augcol[1,0]) > (-0.005327660633034575+0.0011288088216130766j) > > For the second > ipdb> -1*(submatinv[1,0]*augcol[0,0]+submatinv[1,1]*augcol[1,0]) > (-0.016047752110848554+0.003432076134378004j) > > What is the dot product doing if it isn't dotting row by column? > > I am not seeing something. with numpy 1.5.1, I get the results you want >>> m1 = np.array([[-0.22740265-1.63857451j, -0.07740957-0.55847886j], ... [-3.20602957-4.93959054j, -0.36746252-1.68352465j]]) >>> m2 = np.array([[ -3.74729148e-05-0.0005937j ], ... [ 7.96025801e-04+0.01137658j]]) >>> np.dot(m1, m2) array([[ 0.00532766-0.00112881j], [ 0.01604775-0.00343208j]]) Josef > > Thanks, > > Ryan > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From nwagner at iam.uni-stuttgart.de Tue Mar 27 14:58:26 2012 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Tue, 27 Mar 2012 20:58:26 +0200 Subject: [SciPy-User] problem with dot for complex matrices In-Reply-To: References: Message-ID: On Tue, 27 Mar 2012 13:48:14 -0500 Ryan Krauss wrote: > I am loosing my mind while trying to debug some code. I >am trying to > find the cause of some differences between numpy >analysis and analysis > done first in maxima and then converted to python code. > The maxima > approach is more difficult to do, but seems to lead to >the correct > answers. The core issue seems to be one dot product of >a 2x2 and a > 2x1 that are both complex numbers: > > here is the 2x2: > > ipdb> submatinv > array([[-0.22740265-1.63857451j, >-0.07740957-0.55847886j], > [-3.20602957-4.93959054j, >-0.36746252-1.68352465j]]) > > here is the 2x1: > > ipdb> augcol > array([[ -3.74729148e-05-0.0005937j ], > [ 7.96025801e-04+0.01137658j]]) > > verifying their shapes and data types: > > ipdb> submatinv.shape > (2, 2) > ipdb> submatinv.dtype > dtype('complex128') > ipdb> augcol.shape > (2, 1) > ipdb> augcol.dtype > dtype('complex128') > > I need to compute this result: > > ipdb> -1*numpy.dot(submatinv,augcol) > array([[ 5.30985737e-05+0.00038316j], > [ 1.72370377e-04+0.00115503j]]) > > If I hard code how to do the matrix multiplication, I >get the correct > answer (it agrees with Maxima): > >For the first element: > ipdb> >-1*(submatinv[0,0]*augcol[0,0]+submatinv[0,1]*augcol[1,0]) > (-0.005327660633034575+0.0011288088216130766j) > >For the second > ipdb> >-1*(submatinv[1,0]*augcol[0,0]+submatinv[1,1]*augcol[1,0]) > (-0.016047752110848554+0.003432076134378004j) > > What is the dot product doing if it isn't dotting row by >column? > > I am not seeing something. > > Thanks, > > Ryan > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user Hi Ryan, I cannot reproduce your np.dot results, here. python -i ryan_1.py [[-0.00532766+0.00112881j] [-0.01604775+0.00343208j]] 1.7.0.dev-3503c5f Nils -------------- next part -------------- A non-text attachment was scrubbed... 
Name: ryan_1.py Type: text/x-python Size: 315 bytes Desc: not available URL: From ryanlists at gmail.com Tue Mar 27 15:14:57 2012 From: ryanlists at gmail.com (Ryan Krauss) Date: Tue, 27 Mar 2012 14:14:57 -0500 Subject: [SciPy-User] problem with dot for complex matrices In-Reply-To: References: Message-ID: Thanks to Nils and Josef for responding so quickly. I don't know if I feel more or less confused: If I copy and paste the code from my email, I can't reproduce my own problem: In [5]: A = array([[-0.22740265-1.63857451j, -0.07740957-0.55847886j], ...: [-3.20602957-4.93959054j, -0.36746252-1.68352465j]]) In [6]: B = array([[ -3.74729148e-05-0.0005937j ], ...: [ 7.96025801e-04+0.01137658j]]) In [7]: -1*dot(A,B) Out[7]: array([[-0.00532766+0.00112881j], [-0.01604775+0.00343208j]]) But if I use the matrices returned by my function I get the wrong result: In [27]: -1*numpy.dot(submat_inv_num, augcol_num) Out[27]: array([[ 5.30985737e-05+0.00038316j], [ 1.72370377e-04+0.00115503j]]) Even though they seem to be very nearly the same arrays: In [11]: submat_inv_num Out[11]: array([[-0.22740265-1.63857451j, -0.07740957-0.55847886j], [-3.20602957-4.93959054j, -0.36746252-1.68352465j]]) In [12]: A Out[12]: array([[-0.22740265-1.63857451j, -0.07740957-0.55847886j], [-3.20602957-4.93959054j, -0.36746252-1.68352465j]]) In [13]: submat_inv_num.dtype Out[13]: dtype('complex128') In [14]: A.dtype Out[14]: dtype('complex128') In [15]: A.shape Out[15]: (2, 2) In [16]: submat_inv_num.shape Out[16]: (2, 2) In [17]: submat_inv_num - A Out[17]: array([[ 1.18593824e-09 +3.83908949e-09j, 1.45239888e-09 +4.30272740e-09j], [ -2.42228770e-09 -2.12942108e-09j, -4.36657455e-09 +2.14619789e-09j]]) In [20]: augcol_num.dtype Out[20]: dtype('complex128') In [21]: augcol_num.shape Out[21]: (2, 1) In [22]: B.dtype Out[22]: dtype('complex128') In [23]: B.shape Out[23]: (2, 1) In [18]: augcol_num - B Out[18]: array([[ -2.57355850e-14 -5.09694849e-11j], [ -1.51298895e-13 +2.85492891e-09j]]) Any ideas as to what might be going on? FYI, In [24]: numpy.__version__ Out[24]: '1.6.1' In [25]: scipy.__version__ Out[25]: '0.10.0' Thanks again, Ryan On Tue, Mar 27, 2012 at 1:58 PM, Nils Wagner wrote: > On Tue, 27 Mar 2012 13:48:14 -0500 > ?Ryan Krauss wrote: >> >> I am loosing my mind while trying to debug some code. ?I am trying to >> find the cause of some differences between numpy analysis and analysis >> done first in maxima and then converted to python code. The maxima >> approach is more difficult to do, but seems to lead to the correct >> answers. ?The core issue seems to be one dot product of a 2x2 and a >> 2x1 that are both complex numbers: >> >> here is the 2x2: >> >> ipdb> submatinv >> array([[-0.22740265-1.63857451j, -0.07740957-0.55847886j], >> ? ? ?[-3.20602957-4.93959054j, -0.36746252-1.68352465j]]) >> >> here is the 2x1: >> >> ipdb> augcol >> array([[ -3.74729148e-05-0.0005937j ], >> ? ? ?[ ?7.96025801e-04+0.01137658j]]) >> >> verifying their shapes and data types: >> >> ipdb> submatinv.shape >> (2, 2) >> ipdb> submatinv.dtype >> dtype('complex128') >> ipdb> augcol.shape >> (2, 1) >> ipdb> augcol.dtype >> dtype('complex128') >> >> I need to compute this result: >> >> ipdb> -1*numpy.dot(submatinv,augcol) >> array([[ ?5.30985737e-05+0.00038316j], >> ? ? 
?[ ?1.72370377e-04+0.00115503j]]) >> >> If I hard code how to do the matrix multiplication, I get the correct >> answer (it agrees with Maxima): >> >> For the first element: >> ipdb> -1*(submatinv[0,0]*augcol[0,0]+submatinv[0,1]*augcol[1,0]) >> (-0.005327660633034575+0.0011288088216130766j) >> >> For the second >> ipdb> -1*(submatinv[1,0]*augcol[0,0]+submatinv[1,1]*augcol[1,0]) >> (-0.016047752110848554+0.003432076134378004j) >> >> What is the dot product doing if it isn't dotting row by column? >> >> I am not seeing something. >> >> Thanks, >> >> Ryan >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > Hi Ryan, > > I cannot reproduce your np.dot results, here. > > python -i ryan_1.py > [[-0.00532766+0.00112881j] > ?[-0.01604775+0.00343208j]] > 1.7.0.dev-3503c5f > > Nils > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From e.antero.tammi at gmail.com Tue Mar 27 15:15:19 2012 From: e.antero.tammi at gmail.com (eat) Date: Tue, 27 Mar 2012 22:15:19 +0300 Subject: [SciPy-User] problem with dot for complex matrices In-Reply-To: References: Message-ID: Hi, On Tue, Mar 27, 2012 at 9:48 PM, Ryan Krauss wrote: > I am loosing my mind while trying to debug some code. I am trying to > find the cause of some differences between numpy analysis and analysis > done first in maxima and then converted to python code. The maxima > approach is more difficult to do, but seems to lead to the correct > answers. The core issue seems to be one dot product of a 2x2 and a > 2x1 that are both complex numbers: > > here is the 2x2: > > ipdb> submatinv > array([[-0.22740265-1.63857451j, -0.07740957-0.55847886j], > [-3.20602957-4.93959054j, -0.36746252-1.68352465j]]) > > here is the 2x1: > > ipdb> augcol > array([[ -3.74729148e-05-0.0005937j ], > [ 7.96025801e-04+0.01137658j]]) > > verifying their shapes and data types: > > ipdb> submatinv.shape > (2, 2) > ipdb> submatinv.dtype > dtype('complex128') > ipdb> augcol.shape > (2, 1) > ipdb> augcol.dtype > dtype('complex128') > > I need to compute this result: > > ipdb> -1*numpy.dot(submatinv,augcol) > array([[ 5.30985737e-05+0.00038316j], > [ 1.72370377e-04+0.00115503j]]) > > If I hard code how to do the matrix multiplication, I get the correct > answer (it agrees with Maxima): > > For the first element: > ipdb> -1*(submatinv[0,0]*augcol[0,0]+submatinv[0,1]*augcol[1,0]) > (-0.005327660633034575+0.0011288088216130766j) > > For the second > ipdb> -1*(submatinv[1,0]*augcol[0,0]+submatinv[1,1]*augcol[1,0]) > (-0.016047752110848554+0.003432076134378004j) > > What is the dot product doing if it isn't dotting row by column? > > I am not seeing something. 
> FWIIWO, I can't either reproduce your results: In []: sys.version Out[]: '2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)]' In []: np.version.version Out[]: '1.6.0' In []: s.round(7) Out[]: array([[-0.2274026-1.6385745j, -0.0774096-0.5584789j], [-3.2060296-4.9395905j, -0.3674625-1.6835246j]]) In []: a.round(7) Out[]: array([[ -3.75000000e-05-0.0005937j], [ 7.96000000e-04+0.0113766j]]) In []: -1* dot(s, a).round(7) Out[]: array([[-0.0053277+0.0011288j], [-0.0160477+0.0034321j]]) Regards, -eat > > Thanks, > > Ryan > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ryanlists at gmail.com Tue Mar 27 15:26:02 2012 From: ryanlists at gmail.com (Ryan Krauss) Date: Tue, 27 Mar 2012 14:26:02 -0500 Subject: [SciPy-User] problem with dot for complex matrices In-Reply-To: References: Message-ID: To further add to my own mystery, why does this fix the problem: In [37]: -1*numpy.dot(submat_inv_num, augcol_num) Out[37]: array([[ 5.30985737e-05+0.00038316j], [ 1.72370377e-04+0.00115503j]]) In [38]: A2 = copy.copy(submat_inv_num) In [39]: B2 = copy.copy(augcol_num) In [40]: -1*dot(A2,B2) Out[40]: array([[-0.00532766+0.00112881j], [-0.01604775+0.00343208j]]) On Tue, Mar 27, 2012 at 2:15 PM, eat wrote: > Hi, > > On Tue, Mar 27, 2012 at 9:48 PM, Ryan Krauss wrote: >> >> I am loosing my mind while trying to debug some code. ?I am trying to >> find the cause of some differences between numpy analysis and analysis >> done first in maxima and then converted to python code. ?The maxima >> approach is more difficult to do, but seems to lead to the correct >> answers. ?The core issue seems to be one dot product of a 2x2 and a >> 2x1 that are both complex numbers: >> >> here is the 2x2: >> >> ipdb> submatinv >> array([[-0.22740265-1.63857451j, -0.07740957-0.55847886j], >> ? ? ? [-3.20602957-4.93959054j, -0.36746252-1.68352465j]]) >> >> here is the 2x1: >> >> ipdb> augcol >> array([[ -3.74729148e-05-0.0005937j ], >> ? ? ? [ ?7.96025801e-04+0.01137658j]]) >> >> verifying their shapes and data types: >> >> ipdb> submatinv.shape >> (2, 2) >> ipdb> submatinv.dtype >> dtype('complex128') >> ipdb> augcol.shape >> (2, 1) >> ipdb> augcol.dtype >> dtype('complex128') >> >> I need to compute this result: >> >> ipdb> -1*numpy.dot(submatinv,augcol) >> array([[ ?5.30985737e-05+0.00038316j], >> ? ? ? [ ?1.72370377e-04+0.00115503j]]) >> >> If I hard code how to do the matrix multiplication, I get the correct >> answer (it agrees with Maxima): >> >> For the first element: >> ipdb> -1*(submatinv[0,0]*augcol[0,0]+submatinv[0,1]*augcol[1,0]) >> (-0.005327660633034575+0.0011288088216130766j) >> >> For the second >> ipdb> -1*(submatinv[1,0]*augcol[0,0]+submatinv[1,1]*augcol[1,0]) >> (-0.016047752110848554+0.003432076134378004j) >> >> What is the dot product doing if it isn't dotting row by column? >> >> I am not seeing something. > > FWIIWO, I can't either reproduce your results: > In []: sys.version > Out[]: '2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)]' > In []: np.version.version > Out[]: '1.6.0' > > In []: s.round(7) > Out[]: > array([[-0.2274026-1.6385745j, -0.0774096-0.5584789j], > ? ? ? ?[-3.2060296-4.9395905j, -0.3674625-1.6835246j]]) > In []: a.round(7) > Out[]: > array([[ -3.75000000e-05-0.0005937j], > ? ? ? 
?[ ?7.96000000e-04+0.0113766j]]) > > In []: -1* dot(s, a).round(7) > Out[]: > array([[-0.0053277+0.0011288j], > ? ? ? ?[-0.0160477+0.0034321j]]) > > Regards, > -eat >> >> >> Thanks, >> >> Ryan >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From aronne.merrelli at gmail.com Tue Mar 27 15:35:38 2012 From: aronne.merrelli at gmail.com (Aronne Merrelli) Date: Tue, 27 Mar 2012 14:35:38 -0500 Subject: [SciPy-User] problem with dot for complex matrices In-Reply-To: References: Message-ID: On Tue, Mar 27, 2012 at 2:26 PM, Ryan Krauss wrote: > To further add to my own mystery, why does this fix the problem: > > In [37]: -1*numpy.dot(submat_inv_num, augcol_num) > Out[37]: > array([[ ?5.30985737e-05+0.00038316j], > ? ? ? [ ?1.72370377e-04+0.00115503j]]) > It appears to be equal to: In [1]: M = array([[-0.22740265-1.63857451j, -0.07740957-0.55847886j], ...: [-3.20602957-4.93959054j, -0.36746252-1.68352465j]]) In [2]: x = array([[ -3.74729148e-05-0.0005937j ], ...: [ 7.96025801e-04+0.01137658j]]) In [16]: -1 * (dot(M.real,x.real) + 1j*dot(M.imag,x.real)) Out[16]: array([[ 5.30985748e-05+0.00038316j], [ 1.72370374e-04+0.00115503j]]) I don't have any idea why it is doing that. You've never posted what the type of those arrays are, though - is it possible it is a subclass of ndarray that is doing something strange to the dot method? I think the call to copy might put it back into a "plain" ndarray. From ryanlists at gmail.com Tue Mar 27 15:49:41 2012 From: ryanlists at gmail.com (Ryan Krauss) Date: Tue, 27 Mar 2012 14:49:41 -0500 Subject: [SciPy-User] problem with dot for complex matrices In-Reply-To: References: Message-ID: Thanks for digging further. I don't think I ever deliberately subclass ndarray....(let me look into it). On Tue, Mar 27, 2012 at 2:35 PM, Aronne Merrelli wrote: > On Tue, Mar 27, 2012 at 2:26 PM, Ryan Krauss wrote: >> To further add to my own mystery, why does this fix the problem: >> >> In [37]: -1*numpy.dot(submat_inv_num, augcol_num) >> Out[37]: >> array([[ ?5.30985737e-05+0.00038316j], >> ? ? ? [ ?1.72370377e-04+0.00115503j]]) >> > > > It appears to be equal to: > > In [1]: M = array([[-0.22740265-1.63857451j, -0.07740957-0.55847886j], > ? ...: ? ? ? [-3.20602957-4.93959054j, -0.36746252-1.68352465j]]) > > In [2]: x = array([[ -3.74729148e-05-0.0005937j ], > ? ...: ? ? ? [ ?7.96025801e-04+0.01137658j]]) > > In [16]: -1 * (dot(M.real,x.real) + 1j*dot(M.imag,x.real)) > Out[16]: > array([[ ?5.30985748e-05+0.00038316j], > ? ? ? [ ?1.72370374e-04+0.00115503j]]) > > > I don't have any idea why it is doing that. You've never posted what > the type of those arrays are, though - is it possible it is a subclass > of ndarray that is doing something strange to the dot method? I think > the call to copy might put it back into a "plain" ndarray. 
> _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From ryanlists at gmail.com Tue Mar 27 15:56:14 2012 From: ryanlists at gmail.com (Ryan Krauss) Date: Tue, 27 Mar 2012 14:56:14 -0500 Subject: [SciPy-User] problem with dot for complex matrices In-Reply-To: References: Message-ID: The matrices are initially created by these lines: matout=scipy.zeros((n,n),dtype=complex128)#+0j colout=scipy.zeros((n,1),dtype=complex128)#+0j They get assigned values from a matrix created using U=scipy.eye(self.maxsize+1,dtype=complex128) And when I ask for their types I get: In [15]: type(augcol_num) Out[15]: In [16]: type(submat_inv_num) Out[16]: So, I don't believe they are subtyped. On Tue, Mar 27, 2012 at 2:49 PM, Ryan Krauss wrote: > Thanks for digging further. ?I don't think I ever deliberately > subclass ndarray....(let me look into it). > > On Tue, Mar 27, 2012 at 2:35 PM, Aronne Merrelli > wrote: >> On Tue, Mar 27, 2012 at 2:26 PM, Ryan Krauss wrote: >>> To further add to my own mystery, why does this fix the problem: >>> >>> In [37]: -1*numpy.dot(submat_inv_num, augcol_num) >>> Out[37]: >>> array([[ ?5.30985737e-05+0.00038316j], >>> ? ? ? [ ?1.72370377e-04+0.00115503j]]) >>> >> >> >> It appears to be equal to: >> >> In [1]: M = array([[-0.22740265-1.63857451j, -0.07740957-0.55847886j], >> ? ...: ? ? ? [-3.20602957-4.93959054j, -0.36746252-1.68352465j]]) >> >> In [2]: x = array([[ -3.74729148e-05-0.0005937j ], >> ? ...: ? ? ? [ ?7.96025801e-04+0.01137658j]]) >> >> In [16]: -1 * (dot(M.real,x.real) + 1j*dot(M.imag,x.real)) >> Out[16]: >> array([[ ?5.30985748e-05+0.00038316j], >> ? ? ? [ ?1.72370374e-04+0.00115503j]]) >> >> >> I don't have any idea why it is doing that. You've never posted what >> the type of those arrays are, though - is it possible it is a subclass >> of ndarray that is doing something strange to the dot method? I think >> the call to copy might put it back into a "plain" ndarray. >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user From eddybarratt1 at yahoo.co.uk Tue Mar 27 18:31:18 2012 From: eddybarratt1 at yahoo.co.uk (Eddy Barratt) Date: Tue, 27 Mar 2012 23:31:18 +0100 (BST) Subject: [SciPy-User] Building numpy/scipy for python3 on MacOS Lion In-Reply-To: <1331588767.69711.YahooMailNeo@web29505.mail.ird.yahoo.com> References: <1331588767.69711.YahooMailNeo@web29505.mail.ird.yahoo.com> Message-ID: <1332887478.68161.YahooMailNeo@web29506.mail.ird.yahoo.com> I've made some progress with this problem, with much assistance from Ned Deily on the pythonmac mailing list. I can now build numpy for python3, but scipy still won't build. Too install numpy: The scipy website (http://www.scipy.org/Installing_SciPy/Mac_OS_X) suggests working around the C compiler problem with three typed commands, but these are insufficient, you need one more: $ export CC=clang $ export CXX=clang $ export FFLAGS=-ff2c $ export LDSHARED='clang -bundle -undefined dynamic_lookup \ ? ? -arch i386 -arch x86_64 -isysroot /Developer/SDKs/MacOSX10.6.sdk -g' After this building from source should work. See here for details: http://python.6.n6.nabble.com/Building-numpy-scipy-for-python3-on-MacOS-Lion-td4642828.html Problem building scipy: I don't know what the issue here is, something with the C compiler again though I think. Here are the error messages. I'd greatly appreciate any thoughts on this matter. 
Thanks, Eddy compiling C sources? C compiler: clang -fno-strict-aliasing -fno-common -dynamic -DNDEBUG -g -O3 -isysroot /Developer/SDKs/MacOSX10.6.sdk -arch i386 -arch x86_64 -isysroot /Developer/SDKs/MacOSX10.6.sdk? compile options: '-DNO_ATLAS_INFO=3 -DUSE_VENDOR_BLAS=1 -I/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/numpy/core/include -I/Library/Frameworks/Python.framework/Versions/3.2/include/python3.2m -c'? extra options: '-msse3'? clang: scipy/sparse/linalg/dsolve/_superlumodule.c? In file included from scipy/sparse/linalg/dsolve/_superlumodule.c:18:? In file included from /Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/numpy/core/include/numpy/arrayobject.h:15:? In file included from /Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/numpy/core/include/numpy/ndarrayobject.h:17:? In file included from /Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/numpy/core/include/numpy/ndarraytypes.h:1972:? /Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/numpy/core/include/numpy/npy_deprecated_api.h:11:2: warning: #warning "Using deprecated NumPy API, disable it by #defining NPY_NO_DEPRECATED_API" [-W#warnings]? #warning "Using deprecated NumPy API, disable it by #defining NPY_NO_DEPRECATED_API"? ?^? scipy/sparse/linalg/dsolve/_superlumodule.c:268:9: error: non-void function 'PyInit__superlu' should return a value [-Wreturn-type]? ? ? ? ? return;? ? ? ? ? ^? 1 warning and 1 error generated.? In file included from scipy/sparse/linalg/dsolve/_superlumodule.c:18:? In file included from /Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/numpy/core/include/numpy/arrayobject.h:15:? In file included from /Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/numpy/core/include/numpy/ndarrayobject.h:17:? In file included from /Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/numpy/core/include/numpy/ndarraytypes.h:1972:? /Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/numpy/core/include/numpy/npy_deprecated_api.h:11:2: warning: #warning "Using deprecated NumPy API, disable it by #defining NPY_NO_DEPRECATED_API" [-W#warnings]? #warning "Using deprecated NumPy API, disable it by #defining NPY_NO_DEPRECATED_API"? ?^? scipy/sparse/linalg/dsolve/_superlumodule.c:268:9: error: non-void function 'PyInit__superlu' should return a value [-Wreturn-type]? ? ? ? ? return;? ? ? ? ? ^? 1 warning and 1 error generated.? error: Command "clang -fno-strict-aliasing -fno-common -dynamic -DNDEBUG -g -O3 -isysroot /Developer/SDKs/MacOSX10.6.sdk -arch i386 -arch x86_64 -isysroot /Developer/SDKs/MacOSX10.6.sdk -DNO_ATLAS_INFO=3 -DUSE_VENDOR_BLAS=1 -I/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/numpy/core/include -I/Library/Frameworks/Python.framework/Versions/3.2/include/python3.2m -c scipy/sparse/linalg/dsolve/_superlumodule.c -o build/temp.macosx-10.6-intel-3.2/scipy/sparse/linalg/dsolve/_superlumodule.o -msse3" failed with exit status 1? ----- Original Message ----- From: Eddy Barratt To: "scipy-user at scipy.org" Cc: Sent: Monday, 12 March 2012, 15:46 Subject: Building numpy/scipy for python3 on MacOS Lion I can't get Numpy or Scipy to work with Python3 on Mac OSX Lion. 
I have used pip successfully to install numpy, scipy and matplotlib, and
they work well with Python2.7, but in Python3 typing 'import numpy' brings
up 'No module named numpy'.

I've tried downloading the source code directly and then running 'python3
setup.py build', but I get various error warnings, some in red that have to
do with fortran (e.g. 'Could not locate executable f95'). The error message
that appears to fail in the end is 'RuntimeError: Broken toolchain: cannot
link a simple C program', and appears to be related to the previous line
'sh: gcc-4.2: command not found'.

The Scipy website (http://www.scipy.org/Installing_SciPy/Mac_OS_X) suggests
that there may be issues with the c compiler, but the same problems didn't
arise using pip to install for python2.7. I have followed the instructions
on the website regarding changing the compiler but this has not made any
difference.

I have also tried installing from a virtual environment:

>>> mkvirtualenv -p python3.2 test1
>>> pip install numpy

But this fails with "Command python setup.py egg_info failed with error
code 1 in /Users/Eddy/.virtualenvs/test1/build/numpy"

I've considered making python3 default, and then I thought a pip install
might work, but I don't know how to do that.

Does anyone have any suggestions for how I might proceed? I'm relatively
new to Python but it's something I feel I'm likely to become more involved
in so I'd like to start using Python3 before I get too established with 2.7.

Thanks for your help.

Eddy

From mdekauwe at gmail.com Tue Mar 27 19:20:13 2012
From: mdekauwe at gmail.com (Martin De Kauwe)
Date: Tue, 27 Mar 2012 16:20:13 -0700 (PDT)
Subject: [SciPy-User] Numpy/Scipy: Avoiding nested loops to operate on matrix-valued images
In-Reply-To: <29677913.980.1331801968906.JavaMail.geo-discussion-forums@ynkz21>
References: <29677913.980.1331801968906.JavaMail.geo-discussion-forums@ynkz21>
Message-ID: <26902679.5.1332890413406.JavaMail.geo-discussion-forums@pbcwe9>

I didn't quite follow exactly what you were doing, but someone previously
showed me how to avoid inner loops and so perhaps this will help? Instead of...

tmp = np.arange(500000).reshape(1000,500)
nrows, ncols = tmp.shape[0], tmp.shape[1]
out = np.zeros((nrows, ncols))

for i in xrange(nrows):
    for j in xrange(ncols):
        out[i,j] = tmp[i,j] * 3.0

you might try...

tmp = np.arange(500000).reshape(1000,500)
nrows, ncols = tmp.shape[0], tmp.shape[1]
out = np.zeros((nrows, ncols))
r = np.arange(nrows)
c = np.arange(ncols)
out[r[:,None],c] = tmp[r[:,None],c] * 3.0

Assuming your arrays are large you would get a speed bump

On Thursday, March 15, 2012 7:59:28 PM UTC+11, tyldurd wrote:
>
> Hello,
>
> I am a beginner at python and numpy and I need to compute the matrix
> logarithm for each "pixel" (i.e. x,y position) of a matrix-valued image of
> dimension MxNx3x3. 3x3 is the dimensions of the matrix at each pixel.
>
> The function I have written so far is the following:
>
> def logm_img(im):
>     from scipy import linalg
>     dimx = im.shape[0]
>     dimy = im.shape[1]
>     res = zeros_like(im)
>     for x in range(dimx):
>         for y in range(dimy):
>             res[x, y, :, :] = linalg.logm(asmatrix(im[x,y,:,:]))
>     return res
>
> Is it ok? Is there a way to avoid the two nested loops ?

-------------- next part --------------
An HTML attachment was scrubbed...
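A short aside on the two approaches above (this sketch is not from the
original thread, and the arrays in it are made up): a purely elementwise
operation such as multiplying by 3 needs no loop or fancy indexing at all,
because NumPy broadcasts it directly, whereas scipy.linalg.logm only accepts
one matrix at a time, so some per-pixel loop remains, although it can be
collapsed to a single loop over a reshaped (M*N, 3, 3) stack.

    import numpy as np
    from scipy import linalg

    # elementwise scaling needs no loop at all
    tmp = np.arange(500000.0).reshape(1000, 500)
    out = tmp * 3.0

    def logm_img_flat(im):
        # im: (M, N, 3, 3) array of matrices; the output is complex in
        # general, and purely real when each block has a real logarithm
        # (e.g. symmetric positive-definite tensors)
        flat = im.reshape(-1, 3, 3)
        res = np.empty(flat.shape, dtype=complex)
        for k, mat in enumerate(flat):      # one loop instead of two
            res[k] = linalg.logm(mat)
        return res.reshape(im.shape)

    # made-up example input: a small stack of symmetric positive-definite blocks
    base = np.random.rand(4, 5, 3, 3)
    spd = base + base.transpose(0, 1, 3, 2) + 6.0 * np.eye(3)
    log_img = logm_img_flat(spd)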
URL: From nicolas.pinto at gmail.com Tue Mar 27 20:30:33 2012 From: nicolas.pinto at gmail.com (Nicolas Pinto) Date: Tue, 27 Mar 2012 20:30:33 -0400 Subject: [SciPy-User] linalg.eigh hangs only after importing sparse module In-Reply-To: References: Message-ID: Thanks for the answers. Sorry for the late answer, I've been out of the country. > Another possibility is that the problem comes just from the c++ runtime. > There's another c++ module in Scipy, `scipy.interpolate._interpolate` -- > could you check if importing it also causes the same issue? You are right, the same issue happens with `from scipy.interpolate import _interpolate`. Any advice on how to debug/fix from here? Thanks. N From aronne.merrelli at gmail.com Wed Mar 28 01:22:30 2012 From: aronne.merrelli at gmail.com (Aronne Merrelli) Date: Wed, 28 Mar 2012 00:22:30 -0500 Subject: [SciPy-User] problem with dot for complex matrices In-Reply-To: References: Message-ID: On Tue, Mar 27, 2012 at 2:56 PM, Ryan Krauss wrote: > The matrices are initially created by these lines: > > ? ? ? ?matout=scipy.zeros((n,n),dtype=complex128)#+0j > ? ? ? ?colout=scipy.zeros((n,1),dtype=complex128)#+0j > > They get assigned values from a matrix created using > > U=scipy.eye(self.maxsize+1,dtype=complex128) > > And when I ask for their types I get: > > In [15]: type(augcol_num) > Out[15]: > > In [16]: type(submat_inv_num) > Out[16]: > > So, I don't believe they are subtyped. > The only other idea I have is to check if you can save the "problem" arrays. Specifically, try this, with the arrays that give the incorrect dot product: In [6]: savez('testing.npz', submat_inv_num=submat_inv_num, augcol_num=augcol_num) Then load them into a new session: In [1]: d = load('testing.npz') In [2]: submat_inv_num = d['submat_inv_num']; augcol_num = d['augcol_num'] Do the reloaded variables give the same incorrect dot product? It is probably a long shot, since I would imagine the save/load would be similar to copy... but if it works then others might be able to inspect the object to see what might be different. One last detail - it looks like the augcol is getting cast to a real number - (this is a clearer example than what I showed earlier): In [17]: dot(submat_inv_num, augcol_num.real) Out[17]: array([[ -5.30985748e-05-0.00038316j], [ -1.72370374e-04-0.00115503j]]) That might be a clue that something is causing augcol_num to get cast into a "normal" float before the dot product is taken. From ryanlists at gmail.com Wed Mar 28 15:04:20 2012 From: ryanlists at gmail.com (Ryan Krauss) Date: Wed, 28 Mar 2012 14:04:20 -0500 Subject: [SciPy-User] problem with dot for complex matrices In-Reply-To: References: Message-ID: Saving and loading the arrays seems to lead to a reproducible error, at least on my machine: d = numpy.load('testing.npz') submat_inv_num = d['submat_inv_num']; augcol_num = d['augcol_num'] -1*dot(submat_inv_num, augcol_num) In [5]: -1*dot(submat_inv_num, augcol_num) Out[5]: array([[ 5.30985737e-05+0.00038316j], [ 1.72370377e-04+0.00115503j]]) On Wed, Mar 28, 2012 at 12:22 AM, Aronne Merrelli wrote: > On Tue, Mar 27, 2012 at 2:56 PM, Ryan Krauss wrote: >> The matrices are initially created by these lines: >> >> ? ? ? ?matout=scipy.zeros((n,n),dtype=complex128)#+0j >> ? ? ? 
?colout=scipy.zeros((n,1),dtype=complex128)#+0j >> >> They get assigned values from a matrix created using >> >> U=scipy.eye(self.maxsize+1,dtype=complex128) >> >> And when I ask for their types I get: >> >> In [15]: type(augcol_num) >> Out[15]: >> >> In [16]: type(submat_inv_num) >> Out[16]: >> >> So, I don't believe they are subtyped. >> > > The only other idea I have is to check if you can save the "problem" > arrays. Specifically, try this, with the arrays that give the > incorrect dot product: > > In [6]: savez('testing.npz', submat_inv_num=submat_inv_num, > augcol_num=augcol_num) > > Then load them into a new session: > > In [1]: d = load('testing.npz') > > In [2]: submat_inv_num = d['submat_inv_num']; augcol_num = d['augcol_num'] > > Do the reloaded variables give the same incorrect dot product? It is > probably a long shot, since I would imagine the save/load would be > similar to copy... but if it works then others might be able to > inspect the object to see what might be different. One last detail - > it looks like the augcol is getting cast to a real number - (this is a > clearer example than what I showed earlier): > > In [17]: dot(submat_inv_num, ?augcol_num.real) > Out[17]: > array([[ -5.30985748e-05-0.00038316j], > ? ? ? [ -1.72370374e-04-0.00115503j]]) > > That might be a clue that something is causing augcol_num to get cast > into a "normal" float before the dot product is taken. > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- A non-text attachment was scrubbed... Name: testing.npz Type: application/octet-stream Size: 494 bytes Desc: not available URL: From pav at iki.fi Wed Mar 28 15:38:16 2012 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 28 Mar 2012 19:38:16 +0000 (UTC) Subject: [SciPy-User] =?utf-8?q?linalg=2Eeigh_hangs_only_after_importing_s?= =?utf-8?q?parse=09module?= References: Message-ID: Nicolas Pinto gmail.com> writes: [clip] > You are right, the same issue happens with `from scipy.interpolate > import _interpolate`. Any advice on how to debug/fix from here? I think you should also verify this by copying `_interpolate.so` outside Scipy and importing it --- namely, `from scipy.interpolate import ...` import also `scipy.sparse` so you cannot pinpoint the problem to `_interpolate`. -- Pauli Virtanen From sebas0 at gmail.com Wed Mar 28 16:22:45 2012 From: sebas0 at gmail.com (Sebastian) Date: Wed, 28 Mar 2012 17:22:45 -0300 Subject: [SciPy-User] conotur plot Message-ID: Dear Folks, I'm using the python bin of epd-7.0-2-rh5-x86_64 on a Linux 2.6.38.8-32.fc15.x86_64 #1 SMP Mon Jun 13 19:49:05 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux system. I'm trying to make a contour plot of astronomical data binning data in a 2D grid. When I bin the data in square bins (eg 60 x 60), the contour plot works fine. But if I change the binning to (10 x 60), by changing the following line in the code, from: "resolucion_x1=(xc1.max()-xc1.min())/(nceldas)" to "resolucion_x1=3* (xc1.max()-xc1.min())/(nceldas)" then I produce three arrays (x1,y1,z1) with size 20,60,1200 instead of 60,60,3600 BUT when I try plot a contour map with pylab.contour I get the follow error: " TypeError: Length of x must be number of columns in z, and length of y must be number of rows." 
and I think this occurs because:

In [788]: shape(x1),shape(y1),shape(z1)
Out[788]: ((20,), (60,), (20, 60))

Any idea as to how to solve this so I can use rectangular binning
with the code?

I use the following code:

import numpy as N
from subprocess import *
import pyfits
import matplotlib
import pylab
import pickle
from itertools import izip

magk=N.loadtxt("mag_k.gz")
magj=N.loadtxt("mag_j.gz")

xc1=N.array(magj-magk)
yc1=N.array(magk)
print ("xc1 max",xc1.max())
print ("yc1 max",yc1.max())
print ("xc1 min",xc1.min())
print ("yc1 min",yc1.min())
print ("is nonnum",N.isnan(xc1).any())
nceldas=60.0
resolucion_x1=(xc1.max()-xc1.min())/(nceldas)
resolucion_y1=(yc1.max()-yc1.min())/(nceldas)
minix1=xc1.min();miniy1=yc1.min()
x1=N.arange(xc1.min(),xc1.max(),resolucion_x1,dtype=xc1.dtype)
y1=N.arange(yc1.min(),yc1.max(),resolucion_y1,dtype=yc1.dtype)
z1=N.zeros((x1.shape[0],y1.shape[0]),yc1.dtype)
print "x1 size" , x1.size
print "y1 size" , y1.size
print "z1 size" , z1.size

xc1=(xc1-xc1.min())/resolucion_x1
yc1=(yc1-yc1.min())/resolucion_y1
print xc1.max(), xc1.min(),yc1.max(),yc1.min()
for i,j in zip(xc1,yc1):
    try: z1[int(j),int(i)]+=1.0
    except: pass

figure=pylab.figure()
pylab.plot(magj-magk,magk,'b.',ms=2.3,alpha=0.70)
pylab.ylim(pylab.ylim()[::-1])
pylab.contour(x1,y1,z1*100/len(magk),30,alpha=1,linewidths=5)
pylab.show()

confused...
- Sebastian

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jsseabold at gmail.com Wed Mar 28 20:19:00 2012
From: jsseabold at gmail.com (Skipper Seabold)
Date: Wed, 28 Mar 2012 20:19:00 -0400
Subject: [SciPy-User] call signature for dgees f2py external user routine?
Message-ID:

Can someone explain the call signature for the select function used
from the gees routines? Or point me to a reference? I don't understand
the syntax. <_arg=...>

https://github.com/scipy/scipy/blob/master/scipy/linalg/flapack_user.pyf.src#L3

Thanks,

Skipper

From jsseabold at gmail.com Wed Mar 28 20:20:28 2012
From: jsseabold at gmail.com (Skipper Seabold)
Date: Wed, 28 Mar 2012 20:20:28 -0400
Subject: [SciPy-User] call signature for dgees f2py external user routine?
In-Reply-To:
References:
Message-ID:

On Wed, Mar 28, 2012 at 8:19 PM, Skipper Seabold wrote:
> Can someone explain the call signature for the select function used
> from the gees routines? Or point me to a reference? I don't understand
> the syntax. <_arg=...>
>
> https://github.com/scipy/scipy/blob/master/scipy/linalg/flapack_user.pyf.src#L3

Sorry meant to send this to the dev list. Please reply there.

Skipper

From aronne.merrelli at gmail.com Wed Mar 28 23:20:58 2012
From: aronne.merrelli at gmail.com (Aronne Merrelli)
Date: Wed, 28 Mar 2012 22:20:58 -0500
Subject: [SciPy-User] conotur plot
In-Reply-To:
References:
Message-ID:

On Wed, Mar 28, 2012 at 3:22 PM, Sebastian wrote:
> TypeError: Length of x must be number of columns in z,
> and length of y must be number of rows."
> and I think this occurs because:
>
> In [788]: shape(x1),shape(y1),shape(z1)
> Out[788]: ((20,), (60,), (20, 60))
>

The shape of a 2-D array is (number_of_rows, number_of_columns). For that
z1 array, the x-array should be 60, and the y should be 20. So you need to
transpose z, or swap the x/y arrays to get the correct dimensions. For
example:

In [15]: x1.shape, y1.shape, z.shape
Out[15]: ((20,), (60,), (20, 60))

In [16]: contour(x1,y1,z.T)
Out[16]:

In [17]: contour(y1,x1,z)
Out[17]:

In [18]: contour(x1,y1,z)
TypeError: Length of x must be number of columns in z,
and length of y must be number of rows.
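Not from the thread, but a self-contained sketch of the shape convention
described above, using made-up data: numpy.histogram2d does the same kind of
rectangular 2-D binning as the hand-written loop, and it returns counts with
shape (nx, ny), so the array has to be transposed before it is passed to
contour, which expects z with shape (len(y), len(x)).

    import numpy as np
    import matplotlib.pyplot as plt

    # made-up colour and magnitude arrays standing in for magj-magk and magk
    colour = np.random.randn(10000) * 0.5 + 1.0
    mag = np.random.randn(10000) * 1.5 + 14.0

    nx, ny = 20, 60                                   # rectangular binning
    counts, xedges, yedges = np.histogram2d(colour, mag, bins=[nx, ny])

    xcent = 0.5 * (xedges[:-1] + xedges[1:])          # bin centres, length nx
    ycent = 0.5 * (yedges[:-1] + yedges[1:])          # bin centres, length ny

    # counts has shape (nx, ny); contour wants (len(ycent), len(xcent)),
    # i.e. rows indexed by y, so pass the transpose
    plt.contour(xcent, ycent, counts.T * 100.0 / colour.size, 30)
    plt.gca().invert_yaxis()                          # magnitudes increase downward
    plt.show()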
From nicolas.pinto at gmail.com Thu Mar 29 11:28:17 2012
From: nicolas.pinto at gmail.com (Nicolas Pinto)
Date: Thu, 29 Mar 2012 11:28:17 -0400
Subject: [SciPy-User] linalg.eigh hangs only after importing sparse module
In-Reply-To:
References:
Message-ID:

> I think you should also verify this by copying `_interpolate.so` outside Scipy
> and importing it --- namely, `from scipy.interpolate import ...` import also
> `scipy.sparse` so you cannot pinpoint the problem to `_interpolate`.

Good point. I can still reproduce the bug by copying `_interpolate.so`.

>
> --
> Pauli Virtanen
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user

--
Nicolas Pinto
http://web.mit.edu/pinto

From sebas0 at gmail.com Thu Mar 29 13:59:42 2012
From: sebas0 at gmail.com (Sebastian)
Date: Thu, 29 Mar 2012 14:59:42 -0300
Subject: [SciPy-User] conotur plot
Message-ID:

Thanks for the help Aronne, just some feedback on post 6 of SciPy-User
Digest, Vol 103, Issue 65.

Even though both suggested fixes plot contours, both sets of contours come
out off-center (with the axes inverted), so they don't follow the density
of points as intended.

The solution that worked was proposed by Octavia Bruzzone: changing

z1=N.zeros((x1.shape[0],y1.shape[0]),yc1.dtype)

to

z1=N.zeros((y1.shape[0],x1.shape[0]),yc1.dtype)

and then plotting the contours as normal:

pylab.contour(x1,y1,z1*100/len(magk),30,alpha=0.5,linewidths=5)

best wishes,
- Sebastian

> I'm trying to make a contour plot of astronomical data binning data in a 2D
> grid. When I bin the data in square bins (eg 60 x 60), the contour plot
> works fine. But if I change the binning to (10 x 60), by changing the
> following line in the code, from:
>
> "resolucion_x1=(xc1.max()-xc1.min())/(nceldas)"
> to
> "resolucion_x1=3* (xc1.max()-xc1.min())/(nceldas)"
>
> then I produce three arrays (x1,y1,z1)
> with size 20,60,1200 instead of 60,60,3600
> BUT when I try plot a contour map with pylab.contour I get the follow
> error:
>
> "
> TypeError: Length of x must be number of columns in z,
> and length of y must be number of rows."
> and I think this occurs because:
>
> In [788]: shape(x1),shape(y1),shape(z1)
> Out[788]: ((20,), (60,), (20, 60))
>
> Any idea as to how to solve this so I can use rectangular binning
> with the code?
> > I use the following code: > > import numpy as N > from subprocess import * > import pyfits > import matplotlib > import pylab > import pickle > from itertools import izip > > magk=N.loadtxt("mag_k.gz") > magj=N.loadtxt("mag_j.gz") > > > xc1=N.array(magj-magk) > yc1=N.array(magk) > print ("xc1 max",xc1.max()) > print ("yc1 max",yc1.max()) > print ("xc1 min",xc1.min()) > print ("yc1 min",yc1.min()) > print ("is nonnum",N.isnan(xc1).any()) > nceldas=60.0 > resolucion_x1=(xc1.max()-xc1.min())/(nceldas) > resolucion_y1=(yc1.max()-yc1.min())/(nceldas) > minix1=xc1.min();miniy1=yc1.min() > x1=N.arange(xc1.min(),xc1.max(),resolucion_x1,dtype=xc1.dtype) > y1=N.arange(yc1.min(),yc1.max(),resolucion_y1,dtype=yc1.dtype) > z1=N.zeros((x1.shape[0],y1.shape[0]),yc1.dtype) > print "x1 size" , x1.size > print "y1 size" , y1.size > print "z1 size" , z1.size > > xc1=(xc1-xc1.min())/resolucion_x1 > yc1=(yc1-yc1.min())/resolucion_y1 > print xc1.max(), xc1.min(),yc1.max(),yc1.min() > for i,j in zip(xc1,yc1): > try: z1[int(j),int(i)]+=1.0 > except: pass > > figure=pylab.figure() > pylab.plot(magj-magk,magk,'b.',ms=2.3,alpha=0.70) > pylab.ylim(pylab.ylim()[::-1]) > pylab.contour(x1,y1,z1*100/len(magk),30,alpha=1,linewidths=5) > pylab.show() > > confused... > - Sebastian > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ptittmann at gmail.com Thu Mar 29 22:06:15 2012 From: ptittmann at gmail.com (Peter Tittmann) Date: Thu, 29 Mar 2012 19:06:15 -0700 Subject: [SciPy-User] scipy.spatial module import errors Message-ID: Hi, I'm using EPD 7.2 on mac OSX lion: In [8]: scipy.__version__ Out[8]: '0.10.0' When I attempt to load the spatial module. I am using spyder and with >>> scipy.spatial( I get the docstrings. When I try to load like this: >>> kd=scipy.spatial.cKDTree(block,1000) i get: --------------------------------------------------------------------------- AttributeError Traceback (most recent call last) /Users/peter/src/spyderlib/ in () ----> 1 kd=scipy.spatial.cKDTree(block,1000) AttributeError: 'module' object has no attribute 'spatial' Can anyone suggest what might be going on, and or a solution? Thanks! Peter -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at enthought.com Fri Mar 30 00:46:00 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Thu, 29 Mar 2012 23:46:00 -0500 Subject: [SciPy-User] scipy.spatial module import errors In-Reply-To: References: Message-ID: On Thu, Mar 29, 2012 at 9:06 PM, Peter Tittmann wrote: > Hi, > > I'm using EPD 7.2 on mac OSX lion: > > In [8]: scipy.__version__ > Out[8]: '0.10.0' > > When I attempt to load the spatial module. I am using spyder and with > >>> scipy.spatial( > > I get the docstrings. > > When I try to load like this: > > >>> kd=scipy.spatial.cKDTree(block,1000) > > i get: > > --------------------------------------------------------------------------- > AttributeError Traceback (most recent call last) > /Users/peter/src/spyderlib/ in () > ----> 1 kd=scipy.spatial.cKDTree(block,1000) > > AttributeError: 'module' object has no attribute 'spatial' > > Can anyone suggest what might be going on, and or a solution? > > Thanks! > > Peter > > 'scipy' is actually a collection of subpackages. The subpackages are not imported by executing 'import scipy'. You'll have to explicitly import the scipy.spatial package with 'import scipy.spatial'. Warren -------------- next part -------------- An HTML attachment was scrubbed... 
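Not part of the original exchange, but a minimal illustration of what is
going on here (the point data are made up): importing the top-level scipy
package does not import its subpackages, so scipy.spatial has to be imported
explicitly before cKDTree can be reached.

    import numpy as np
    import scipy
    import scipy.spatial        # without this line, scipy.spatial is not defined

    block = np.random.rand(1000, 3)              # made-up point cloud
    kd = scipy.spatial.cKDTree(block, 1000)      # second argument is the leafsize
    dist, idx = kd.query(block[:5])              # nearest neighbours of a few points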
URL: From jallikattu at googlemail.com Thu Mar 29 02:48:26 2012 From: jallikattu at googlemail.com (morovia morovia) Date: Thu, 29 Mar 2012 12:18:26 +0530 Subject: [SciPy-User] feature extraction from an image. Message-ID: Dear scipy users, I am facing difficulty in the extraction of specific feature from an image from a time series. There is a small speck which is irregular in shape within the circular region of the domain. I am trying to calculate the velocity at which the speck is moving based on this. A sample image is attached. Thanks Best regards Viswanath -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 02.jpeg Type: image/jpeg Size: 41813 bytes Desc: not available URL: From zachary.pincus at yale.edu Fri Mar 30 10:31:58 2012 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Fri, 30 Mar 2012 10:31:58 -0400 Subject: [SciPy-User] feature extraction from an image. In-Reply-To: References: Message-ID: <79518D61-C9AB-4C22-AEA7-59E7F4D9E03A@yale.edu> > Dear scipy users, > > I am facing difficulty in the extraction of specific feature from an > image from a time series. In general, you may want to investigate the tools in the scikits-image ("skimage") package: http://scikits-image.org/ > There is a small speck which is irregular in shape within the circular region of the domain. > > I am trying to calculate the velocity at which the speck is moving based on > this. A sample image is attached. This could be a very easy task or a very hard task, depending on a lot of information about the timeseries images that you haven't provided: (1) Is the circular region constant in size/position/coloration? (2) Is the "speck" constant in shape/coloration? (3) Are there other transient "specks" or other sources of noise? (4) Are there multiple "specks" at once? (5) How many time-series do you have? How many frames per time-series? (6) Are there variations (in any of the above features) between the different time-series you have that may not be present within each single time-series? (Such as differences in coloration or position of the circular region?) and so forth. Perhaps you could post a sample timeseries somewhere online and provide a link, so that it's easier to get a sense of the problem? Also, where is this data derived from? Perhaps that information would help formulate a good solution. Zach From Jerome.Kieffer at esrf.fr Fri Mar 30 10:55:39 2012 From: Jerome.Kieffer at esrf.fr (Jerome Kieffer) Date: Fri, 30 Mar 2012 16:55:39 +0200 Subject: [SciPy-User] feature extraction from an image. In-Reply-To: References: Message-ID: <20120330165539.8c15088c.Jerome.Kieffer@esrf.fr> On Thu, 29 Mar 2012 12:18:26 +0530 morovia morovia wrote: > I am facing difficulty in the extraction of specific feature from an > image from a time series. There is a small speck which is irregular in > shape within the circular region of the domain. > > I am trying to calculate the velocity at which the speck is moving based on > this. A sample image is attached. I made some bindings for "feature extraction" of images (like SIFT and SURF) for image alignment. 
The code is here: https://github.com/kif/imageAlignment cheers, -- J?r?me Kieffer On-Line Data analysis / Software Group ISDD / ESRF tel +33 476 882 445 From pav at iki.fi Fri Mar 30 14:03:54 2012 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 30 Mar 2012 20:03:54 +0200 Subject: [SciPy-User] linalg.eigh hangs only after importing sparse module In-Reply-To: References: Message-ID: 29.03.2012 17:28, Nicolas Pinto kirjoitti: >> I think you should also verify this by copying `_interpolate.so` outside Scipy >> and importing it --- namely, `from scipy.interpolate import ...` import also >> `scipy.sparse` so you cannot pinpoint the problem to `_interpolate`. > > Good point. I can still reproduce the bug by copying `_interpolate.so`. To get further, the following information is needed: - which platform? - which binaries? - which LAPACK? I'm assuming you're on 64-bit Windows 7. If so, I don't have good clues on how to fix or debug the issue. However, if it's really the C++ runtime that is causing the problems, then compiling Numpy/Scipy with a different compiler could fix the problem. -- Pauli Virtanen From klonuo at gmail.com Sat Mar 31 12:24:38 2012 From: klonuo at gmail.com (klo uo) Date: Sat, 31 Mar 2012 18:24:38 +0200 Subject: [SciPy-User] ndimage/morphology - binary dilation and erosion? Message-ID: While preparing some images for OCR, I usually discard those with low DPI, but as this happens often I thought to try some image processing and on suggestion (morphological operations) I tried ndimage.morph with idea to play around binary_dilation Images were G4 TIFFs which PIL/MPL can't decode, so I convert to 1bit PNG which I normalized after to 0 and 1. On sample img I applied: ndi.morphology.binary_dilation(img).astype(img.dtype) and ndi.morphology.binary_erosion(img).astype(img.dtype) I attached result images, and wanted to ask two question: 1. Is this result correct? From what I read today seems like what dilation does is erosion and vice versa, but I probably overlooked something 2. Does someone maybe know of better approach for enhancing original sample for OCR (except thresholding, for which I'm aware)? TIA [image: Inline image 1] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 11309 bytes Desc: not available URL: From klonuo at gmail.com Sat Mar 31 12:43:48 2012 From: klonuo at gmail.com (klo uo) Date: Sat, 31 Mar 2012 18:43:48 +0200 Subject: [SciPy-User] ndimage/morphology - binary dilation and erosion? In-Reply-To: References: Message-ID: On Sat, Mar 31, 2012 at 6:24 PM, klo uo wrote: > > (except thresholding, for which I'm aware)? > > I mean here upsample/blur/sharpen/threshold -------------- next part -------------- An HTML attachment was scrubbed... URL: From tsyu80 at gmail.com Sat Mar 31 12:48:40 2012 From: tsyu80 at gmail.com (Tony Yu) Date: Sat, 31 Mar 2012 12:48:40 -0400 Subject: [SciPy-User] ndimage/morphology - binary dilation and erosion? In-Reply-To: References: Message-ID: On Sat, Mar 31, 2012 at 12:24 PM, klo uo wrote: > While preparing some images for OCR, I usually discard those with low DPI, > but as this happens often I thought to try some image processing and > on suggestion (morphological operations) I tried ndimage.morph with idea > to play around binary_dilation > > Images were G4 TIFFs which PIL/MPL can't decode, so I convert to 1bit PNG > which I normalized after to 0 and 1. 
> > On sample img I applied: > > ndi.morphology.binary_dilation(img).astype(img.dtype) > > and > > ndi.morphology.binary_erosion(img).astype(img.dtype) > > I attached result images, and wanted to ask two question: > > 1. Is this result correct? From what I read today seems like what dilation > does is erosion and vice versa, but I probably overlooked something > This result looks correct to me. I think it depends on what you consider "object" and "background": Typically (I think), image-processing operators consider light regions to be objects and dark objects to be background. So dilation grows right regions and erosion shrinks bright regions. Obviously, in your images, definitions of object and background are reversed (black is object; white is background). > 2. Does someone maybe know of better approach for enhancing original > sample for OCR (except thresholding, for which I'm aware)? > Have you tried the `open` and `close` operators? A morphological opening is just an erosion followed by a dilation and the closing is just the reverse (see e.g., the scikits-image docstrings). For an opening, the erosion would remove some of "salt" (white pixels) in the letters, and the dilation would (more-or-less) restore the letters to their original thickness. The closing would do the same for black pixels on the background. There are other approaches of course, but since you're already thinking about erosion and dilation, these came to mind -Tony > TIA > > [image: Inline image 1] > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 11309 bytes Desc: not available URL: From klonuo at gmail.com Sat Mar 31 13:09:41 2012 From: klonuo at gmail.com (klo uo) Date: Sat, 31 Mar 2012 19:09:41 +0200 Subject: [SciPy-User] ndimage/morphology - binary dilation and erosion? In-Reply-To: References: Message-ID: On Sat, Mar 31, 2012 at 6:48 PM, Tony Yu wrote: > >> 1. Is this result correct? From what I read today seems like what >> dilation does is erosion and vice versa, but I probably overlooked something >> > > This result looks correct to me. I think it depends on what you consider > "object" and "background": Typically (I think), image-processing operators > consider light regions to be objects and dark objects to be background. So > dilation grows right regions and erosion shrinks bright regions. Obviously, > in your images, definitions of object and background are reversed (black is > object; white is background). > You are right. I thought on first it couldn't be flip logic, but thinking more about it and then backing with result from abs(img-1) shows it's just like that. > 2. Does someone maybe know of better approach for enhancing original >> sample for OCR (except thresholding, for which I'm aware)? >> > > Have you tried the `open` and `close` operators? A morphological opening > is just an erosion followed by a dilation and the closing is just the > reverse (see e.g., the scikits-image docstrings). > For an opening, the erosion would remove some of "salt" (white pixels) in > the letters, and the dilation would (more-or-less) restore the letters to > their original thickness. The closing would do the same for black pixels on > the background. > Thanks for suggestion. 
Is the morphology module in skimage a reflection of ndimage, or is it a
separate implementation?

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From klonuo at gmail.com Sat Mar 31 13:18:02 2012
From: klonuo at gmail.com (klo uo)
Date: Sat, 31 Mar 2012 19:18:02 +0200
Subject: [SciPy-User] ndimage/morphology - binary dilation and erosion?
In-Reply-To:
References:
Message-ID:

On Sat, Mar 31, 2012 at 7:09 PM, klo uo wrote:
>
> Is morphology module in skimage reflection of ndimage, or it's separate
> implementation?
>

Nevermind, it's clearly a different implementation.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From klonuo at gmail.com Sat Mar 31 14:08:41 2012
From: klonuo at gmail.com (klo uo)
Date: Sat, 31 Mar 2012 20:08:41 +0200
Subject: [SciPy-User] ndimage/morphology - binary dilation and erosion?
In-Reply-To:
References:
Message-ID:

On Sat, Mar 31, 2012 at 6:48 PM, Tony Yu wrote:
> Have you tried the `open` and `close` operators? A morphological opening
> is just an erosion followed by a dilation and the closing is just the
> reverse (see e.g., the scikits-image docstrings).
> For an opening, the erosion would remove some of "salt" (white pixels) in
> the letters, and the dilation would (more-or-less) restore the letters to
> their original thickness. The closing would do the same for black pixels on
> the background.
>

I tried grey opening on the sample image with both modules. The approach
seems good and the results are bit-identical for both modules
(footprint=square(3)), so I thought to comment on the differences between
the two modules:

- skimage requires converting the data type to 'uint8' and won't accept
  anything less
- ndimage grey opening is 3 times faster on my PC

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
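A minimal sketch of the open/close idea with scipy.ndimage (the input array
is only a stand-in for the normalized 1-bit page image, and the 3x3 footprint
is just a starting point): these operators treat bright pixels (1s) as the
object, so an opening removes small white specks inside the black letters,
while a closing removes small black specks on the white background; which one
helps more depends on which kind of speck dominates.

    import numpy as np
    import scipy.ndimage as ndi

    # stand-in for the normalized 1-bit page: black text (0) on white (1)
    img = np.ones((200, 600), dtype=np.uint8)
    img[50:150, 100:110] = 0                      # a fake letter stroke
    img[80:82, 104:106] = 1                       # "salt" inside the stroke
    img[20:22, 300:302] = 0                       # "pepper" on the background

    selem = np.ones((3, 3))                       # structuring element / footprint

    # opening = erosion then dilation: drops the white specks in the letters
    opened = ndi.binary_opening(img, structure=selem)

    # closing = dilation then erosion: drops the black specks on the background
    cleaned = ndi.binary_closing(opened, structure=selem).astype(img.dtype)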