From josef.pktd at gmail.com Sat Jan 3 00:29:14 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 3 Jan 2009 00:29:14 -0500 Subject: [SciPy-user] rewriting stats.spearmanr Message-ID: <1cd32cbb0901022129o5277e217k1911b93406fb4fc7@mail.gmail.com> spearmanr in scipy.stats does not handle ties correctly. I was looking at a way to fix it, and ended up instead with a complete rewrite. The main difference from the current version is that it can return a correlation matrix for several variables at the same time. This came pretty cheap because, instead of using the old shortcut formula for Spearman's rho, I just use np.corrcoef. Calculating the correlation matrix takes 3 lines, but as usual dimension handling and test scripts take several times more time and lines than the function itself. Results are verified against R (through rpy) and are the same to 15 or 16 digits, both for integer variables with ties and for continuous variables without ties, although R has more options and an exact test statistic. I could keep the API completely consistent with the current version, but I would also like to return the test statistic, not just the p-value; this would, however, require returning a 3-tuple instead of a 2-tuple. The new signature is: spearmanr(a, b=None, axis=0) Notes are below; the new function and test scripts are in the attachment. Comments? Josef Notes ----- main changes to existing stats.spearmanr * correct tie handling * calculates a correlation matrix instead of only a single correlation coefficient, similar to np.corrcoef but using keyword argument axis=0 (default) * also returns the t-statistic (can be dropped for backwards compatibility) * open question: zero division >>> stats.spearmanr([1,1,1,1],[2,2,2,2]) (1.0, 0.0) >>> spearmanr([1,1,1,1],[2,2,2,2]) (-1.#IND, -1.#IND, 0.0) >>> np.corrcoef([1,1,1,1],[2,2,2,2]) array([[ NaN, NaN], [ NaN, NaN]]) comparison to stats.mstats.spearmanr * both have correct tie handling * mstats.spearmanr - ravels if more than 1 variable per array - calculates only one correlation coefficient, no correlation matrix - uses masked arrays differences from np.corrcoef * uses keyword argument axis=0 (default) instead of rowvar=1 * returns one correlation coefficient for two variables, instead of a 2 by 2 matrix comparison to R * identical correlation matrix if only one array is given * if 2 arrays are given, then R returns only the cross-correlation * the p-value is the same as in R with exact=False -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: spearmanr_rewrite.py URL: From josef.pktd at gmail.com Sun Jan 4 00:00:19 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 4 Jan 2009 00:00:19 -0500 Subject: [SciPy-user] stats.gaussian_kde prevent oversmoothing Message-ID: <1cd32cbb0901032100j4f4eec57h8421a5f0a11ff456@mail.gmail.com> I was working on an example for stats.gaussian_kde. In one example I have a one-dimensional mixture of normal distributions, and the density estimated by stats.gaussian_kde is too smooth; the peaks are too small compared to the original distribution. What's the easiest way to reduce the bandwidth for stats.gaussian_kde? I didn't find any direct option. Is subclassing the only way? Josef
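A minimal sketch of the subclassing route that the replies below converge on: override covariance_factor in a gaussian_kde subclass, so that a narrower bandwidth is picked up when the covariance is computed. The bimodal sample and the factor 0.1 here are made up for illustration.

import numpy as np
from scipy import stats

class NarrowKDE(stats.gaussian_kde):
    """gaussian_kde with a hard-coded, smaller bandwidth factor."""
    def covariance_factor(self):
        # Scott's rule gives roughly 0.25 for a sample like this;
        # forcing a smaller value sharpens the estimated peaks.
        return 0.1

# synthetic two-peak sample, for illustration only
xn = np.concatenate([np.random.normal(-2.0, 0.5, size=200),
                     np.random.normal(3.0, 1.0, size=200)])
gkde = NarrowKDE(xn)              # covariance_factor is used during __init__
grid = np.linspace(-5.0, 7.0, 400)
density = gkde.evaluate(grid)     # recovers both peaks of the mixture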
From robert.kern at gmail.com Sun Jan 4 00:05:24 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 3 Jan 2009 23:05:24 -0600 Subject: [SciPy-user] stats.gaussian_kde prevent oversmoothing In-Reply-To: <1cd32cbb0901032100j4f4eec57h8421a5f0a11ff456@mail.gmail.com> References: <1cd32cbb0901032100j4f4eec57h8421a5f0a11ff456@mail.gmail.com> Message-ID: <3d375d730901032105h264059a1l3a0de927773f26f8@mail.gmail.com> On Sat, Jan 3, 2009 at 23:00, wrote: > I was working on an example for stats.gaussian_kde. > > In one example I have a one-dimensional mixture of normal distributions, > and the density estimated by stats.gaussian_kde is too smooth; the > peaks are too small compared to the original distribution. > > What's the easiest way to reduce the bandwidth for stats.gaussian_kde? > I didn't find any direct option. Is subclassing the only way? Currently, yes. Feel free to enhance the code to allow for more flexibility. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From josef.pktd at gmail.com Sun Jan 4 01:05:25 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 4 Jan 2009 01:05:25 -0500 Subject: [SciPy-user] stats.gaussian_kde prevent oversmoothing In-Reply-To: <3d375d730901032105h264059a1l3a0de927773f26f8@mail.gmail.com> References: <1cd32cbb0901032100j4f4eec57h8421a5f0a11ff456@mail.gmail.com> <3d375d730901032105h264059a1l3a0de927773f26f8@mail.gmail.com> Message-ID: <1cd32cbb0901032205m18c22fd6xa85c26bdc0083bef@mail.gmail.com> On Sun, Jan 4, 2009 at 12:05 AM, Robert Kern wrote: > On Sat, Jan 3, 2009 at 23:00, wrote: >> I was working on an example for stats.gaussian_kde. >> >> In one example I have a one-dimensional mixture of normal distributions, >> and the density estimated by stats.gaussian_kde is too smooth; the >> peaks are too small compared to the original distribution. >> >> What's the easiest way to reduce the bandwidth for stats.gaussian_kde? >> I didn't find any direct option. Is subclassing the only way? > > Currently, yes. Feel free to enhance the code to allow for more flexibility. > Thanks, I tried some quick monkey patching and it works, e.g.

def covariance_factor(self):
    return 0.1

gkde = stats.gaussian_kde(xn)  # get the kde for the original sample
# bind the new factor as a method of this instance, then rebuild the
# covariance so the smaller bandwidth actually takes effect
setattr(gkde, 'covariance_factor', covariance_factor.__get__(gkde, type(gkde)))
gkde._compute_covariance()

and then call gkde.evaluate. The automatic covariance_factor was at around 0.25 in the example. After setting it to 0.1, the kde gets both peaks of the mixture correctly. After some googling, I found a discussion on the mailing list http://www.nabble.com/Width-of-the-gaussian-in-stats.kde.gaussian_kde---td19558924.html Currently I am just trying to find out how the different functions and classes in scipy.stats work. Josef From scott.p.macdonald at gmail.com Sun Jan 4 14:53:42 2009 From: scott.p.macdonald at gmail.com (Scott MacDonald) Date: Sun, 4 Jan 2009 12:53:42 -0700 Subject: [SciPy-user] mapminmax function? Message-ID: I was wondering if there is a function analogous to Matlab's mapminmax (in the neural network toolbox)? Thanks, Scott -------------- next part -------------- An HTML attachment was scrubbed...
URL: From contact at pythonxy.com Sun Jan 4 15:43:07 2009 From: contact at pythonxy.com (Pierre Raybaut) Date: Sun, 04 Jan 2009 21:43:07 +0100 Subject: [SciPy-user] [ Python(x,y) ] New release : 2.1.8 Message-ID: <49611F5B.70009@pythonxy.com> Hi all, Release 2.1.8 is now available on http://www.pythonxy.com: - All-in-One Installer ("Full Edition"), - Plugin Installer -- to be downloaded with xyweb, - Update Changes history Version 2.1.8 (01-04-2009) * Added: o SciTE 1.77.0 (replacement for Notepad++) o WinMerge 2.10.2 - Open Source differencing and merging tool for Windows * Updated: o Console 2.0.141.6 o VPython 5.0.1.0 o xy 1.0.16 o xydoc 1.0.2 o IPython 0.9.1.6 * Corrected: o Issues 50, 51, 52 Regards, Pierre Raybaut From robert.kern at gmail.com Sun Jan 4 16:12:56 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 4 Jan 2009 15:12:56 -0600 Subject: [SciPy-user] mapminmax function? In-Reply-To: References: Message-ID: <3d375d730901041312o3e172193m9cfd8ca13b7c0877@mail.gmail.com> On Sun, Jan 4, 2009 at 13:53, Scott MacDonald wrote: > I was wondering if there is a function analogous to Matlab's mapminmax (in > the neural network toolbox)? What does it do? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From scott.p.macdonald at gmail.com Sun Jan 4 17:54:18 2009 From: scott.p.macdonald at gmail.com (Scott MacDonald) Date: Sun, 4 Jan 2009 15:54:18 -0700 Subject: [SciPy-user] mapminmax function? In-Reply-To: <3d375d730901041312o3e172193m9cfd8ca13b7c0877@mail.gmail.com> References: <3d375d730901041312o3e172193m9cfd8ca13b7c0877@mail.gmail.com> Message-ID: Oops, I guess that information would have been helpful. The description: % MAPMINMAX processes matrices by normalizing the minimum and maximum values % of each row to [YMIN, YMAX]. % MAPMINMAX(X,YMIN,YMAX) takes X and optional parameters, % X - NxQ matrix or a 1xTS row cell array of NxQ matrices. % YMIN - Minimum value for each row of Y. (Default is -1) % YMAX - Maximum value for each row of Y. (Default is +1) % and returns, % Y - Each MxQ matrix (where M == N) (optional). % PS - Process settings, to allow consistent processing of values. % Examples % % Here is how to format a matrix so that the minimum and maximum % values of each row are mapped to default interval [-1,+1]. % % x1 = [1 2 4; 1 1 1; 3 2 2; 0 0 0] % [y1,ps] = mapminmax(x1) % % Next, we apply the same processing settings to new values. % % x2 = [5 2 3; 1 1 1; 6 7 3; 0 0 0] % y2 = mapminmax('apply',x2,ps) % % Here we reverse the processing of y1 to get x1 again. % % x1_again = mapminmax('reverse',y1,ps) % % Algorithm % % It is assumed that X has only finite real values, and that % the elements of each row are not all equal. % % y = (ymax-ymin)*(x-xmin)/(xmax-xmin) + ymin; On Sun, Jan 4, 2009 at 2:12 PM, Robert Kern wrote: > On Sun, Jan 4, 2009 at 13:53, Scott MacDonald > wrote: > > I was wondering if there is a function analogous to Matlab's mapminmax > (in > > the neural network toolbox)? > > What does it do? > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." 
> -- Umberto Eco > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sun Jan 4 18:11:33 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 4 Jan 2009 18:11:33 -0500 Subject: [SciPy-user] mapminmax function? In-Reply-To: References: <3d375d730901041312o3e172193m9cfd8ca13b7c0877@mail.gmail.com> Message-ID: <3d375d730901041511o7a3238b7qd7bc7d931ce6bf0f@mail.gmail.com> On Sun, Jan 4, 2009 at 17:54, Scott MacDonald wrote: > Oops, I guess that information would have been helpful. The description: > > % MAPMINMAX processes matrices by normalizing the minimum and maximum > values > % of each row to [YMIN, YMAX]. > > % MAPMINMAX(X,YMIN,YMAX) takes X and optional parameters, > % X - NxQ matrix or a 1xTS row cell array of NxQ matrices. > % YMIN - Minimum value for each row of Y. (Default is -1) > % YMAX - Maximum value for each row of Y. (Default is +1) > % and returns, > % Y - Each MxQ matrix (where M == N) (optional). > % PS - Process settings, to allow consistent processing of values. > > % Examples > % > % Here is how to format a matrix so that the minimum and maximum > % values of each row are mapped to default interval [-1,+1]. > % > % x1 = [1 2 4; 1 1 1; 3 2 2; 0 0 0] > % [y1,ps] = mapminmax(x1) > % > % Next, we apply the same processing settings to new values. > % > % x2 = [5 2 3; 1 1 1; 6 7 3; 0 0 0] > % y2 = mapminmax('apply',x2,ps) Every time I manage to forget why I hate Matlab, they make a function with a schizoid argument spec like this. No, there's nothing floating around that does this. Here's a quick implementation. It needs robustifying (it doesn't handle the all-values-equal case), but it doesn't overload the function's arguments. Sometimes objects really are the solution. In [1]: import numpy as np In [3]: class MapMinMaxApplier(object): def __init__(self, slope, intercept): self.slope = slope self.intercept = intercept def __call__(self, x): return x * self.slope + self.intercept def reverse(self, y): return (y-self.intercept) / self.slope ....: ....: In [11]: def mapminmax(x, ymin=-1, ymax=+1): ....: x = np.asanyarray(x) ....: xmax = x.max(axis=-1) ....: xmin = x.min(axis=-1) ....: if (xmax==xmin).any(): ....: raise ValueError("some rows have no variation") ....: slope = ((ymax-ymin) / (xmax - xmin))[:,np.newaxis] ....: intercept = (-xmin*(ymax-ymin)/(xmax-xmin))[:,np.newaxis] + ymin ....: ps = MapMinMaxApplier(slope, intercept) ....: return ps(x), ps ....: In [12]: x1 = np.array([[1.,2,4], [1,1,2], [3,2,2],[0,0,1]]) In [14]: y1, ps = mapminmax(x1) In [15]: y1 Out[15]: array([[-1. , -0.33333333, 1. ], [-1. , -1. , 1. ], [ 1. , -1. , -1. ], [-1. , -1. , 1. ]]) In [16]: ps(x1) Out[16]: array([[-1. , -0.33333333, 1. ], [-1. , -1. , 1. ], [ 1. , -1. , -1. ], [-1. , -1. , 1. ]]) In [17]: ps.reverse(y1) Out[17]: array([[ 1., 2., 4.], [ 1., 1., 2.], [ 3., 2., 2.], [ 0., 0., 1.]]) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From scott.p.macdonald at gmail.com Sun Jan 4 18:29:06 2009 From: scott.p.macdonald at gmail.com (Scott MacDonald) Date: Sun, 4 Jan 2009 16:29:06 -0700 Subject: [SciPy-user] mapminmax function? 
In-Reply-To: <3d375d730901041511o7a3238b7qd7bc7d931ce6bf0f@mail.gmail.com> References: <3d375d730901041312o3e172193m9cfd8ca13b7c0877@mail.gmail.com> <3d375d730901041511o7a3238b7qd7bc7d931ce6bf0f@mail.gmail.com> Message-ID: Thank you, much appreciated. Scott On Sun, Jan 4, 2009 at 4:11 PM, Robert Kern wrote: > On Sun, Jan 4, 2009 at 17:54, Scott MacDonald > wrote: > > Oops, I guess that information would have been helpful. The description: > > > > % MAPMINMAX processes matrices by normalizing the minimum and maximum > > values > > % of each row to [YMIN, YMAX]. > > > > % MAPMINMAX(X,YMIN,YMAX) takes X and optional parameters, > > % X - NxQ matrix or a 1xTS row cell array of NxQ matrices. > > % YMIN - Minimum value for each row of Y. (Default is -1) > > % YMAX - Maximum value for each row of Y. (Default is +1) > > % and returns, > > % Y - Each MxQ matrix (where M == N) (optional). > > % PS - Process settings, to allow consistent processing of values. > > > > % Examples > > % > > % Here is how to format a matrix so that the minimum and maximum > > % values of each row are mapped to default interval [-1,+1]. > > % > > % x1 = [1 2 4; 1 1 1; 3 2 2; 0 0 0] > > % [y1,ps] = mapminmax(x1) > > % > > % Next, we apply the same processing settings to new values. > > % > > % x2 = [5 2 3; 1 1 1; 6 7 3; 0 0 0] > > % y2 = mapminmax('apply',x2,ps) > > Every time I manage to forget why I hate Matlab, they make a > function with a schizoid argument spec like this. > > No, there's nothing floating around that does this. Here's a quick > implementation. It needs robustifying (it doesn't handle the > all-values-equal case), but it doesn't overload the function's > arguments. Sometimes objects really are the solution. > > In [1]: import numpy as np > > In [3]: class MapMinMaxApplier(object): > def __init__(self, slope, intercept): > self.slope = slope > self.intercept = intercept > def __call__(self, x): > return x * self.slope + self.intercept > def reverse(self, y): > return (y-self.intercept) / self.slope > ....: > ....: > > In [11]: def mapminmax(x, ymin=-1, ymax=+1): > ....: x = np.asanyarray(x) > ....: xmax = x.max(axis=-1) > ....: xmin = x.min(axis=-1) > ....: if (xmax==xmin).any(): > ....: raise ValueError("some rows have no variation") > ....: slope = ((ymax-ymin) / (xmax - xmin))[:,np.newaxis] > ....: intercept = (-xmin*(ymax-ymin)/(xmax-xmin))[:,np.newaxis] + > ymin > ....: ps = MapMinMaxApplier(slope, intercept) > ....: return ps(x), ps > ....: > > In [12]: x1 = np.array([[1.,2,4], [1,1,2], [3,2,2],[0,0,1]]) > > In [14]: y1, ps = mapminmax(x1) > > In [15]: y1 > Out[15]: > array([[-1. , -0.33333333, 1. ], > [-1. , -1. , 1. ], > [ 1. , -1. , -1. ], > [-1. , -1. , 1. ]]) > > In [16]: ps(x1) > Out[16]: > array([[-1. , -0.33333333, 1. ], > [-1. , -1. , 1. ], > [ 1. , -1. , -1. ], > [-1. , -1. , 1. ]]) > > In [17]: ps.reverse(y1) > Out[17]: > array([[ 1., 2., 4.], > [ 1., 1., 2.], > [ 3., 2., 2.], > [ 0., 0., 1.]]) > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From stef.mientki at gmail.com Sun Jan 4 18:38:38 2009 From: stef.mientki at gmail.com (Stef Mientki) Date: Mon, 05 Jan 2009 00:38:38 +0100 Subject: [SciPy-user] [ Python(x,y) ] New release : 2.1.8 In-Reply-To: <49611F5B.70009@pythonxy.com> References: <49611F5B.70009@pythonxy.com> Message-ID: <4961487E.9000300@gmail.com> Pierre Raybaut wrote: > Hi all, > > Release 2.1.8 is now available on http://www.pythonxy.com: > - All-in-One Installer ("Full Edition"), > - Plugin Installer -- to be downloaded with xyweb, > - Update > > Changes history > Version 2.1.8 (01-04-2009) > > * Added: > o SciTE 1.77.0 (replacement for Notepad++) > o WinMerge 2.10.2 - Open Source differencing and merging tool > for Windows > * Updated: > o Console 2.0.141.6 > o VPython 5.0.1.0 > Isn't VPython-5 still a little buggy and missing features of VPython-3? And why only for Windows? I would suggest adding both VPython-3 and VPython-5, and using a programmatic switch between the two. cheers, Stef From pgmdevlist at gmail.com Sun Jan 4 19:17:54 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Sun, 4 Jan 2009 19:17:54 -0500 Subject: [SciPy-user] rewriting stats.spearmanr In-Reply-To: <1cd32cbb0901022129o5277e217k1911b93406fb4fc7@mail.gmail.com> References: <1cd32cbb0901022129o5277e217k1911b93406fb4fc7@mail.gmail.com> Message-ID: <7D9E357E-2049-42F3-AF4D-4564E59B0578@gmail.com> On Jan 3, 2009, at 12:29 AM, josef.pktd at gmail.com wrote: > spearmanr in scipy.stats does not handle ties correctly. > > comparison to stats.mstats.spearmanr > * both have correct tie handling > * mstats.spearmanr > - ravels if more than 1 variable per array > - calculates only one correlation coefficient, no correlation > matrix > - uses masked arrays Josef, Please feel free to modify mstats.spearmanr to match your new implementation (especially the correlation matrix). In any case, the two functions should have the same signature and output the same results for arrays w/o missing values. From josef.pktd at gmail.com Sun Jan 4 20:41:29 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 4 Jan 2009 20:41:29 -0500 Subject: [SciPy-user] scipy tutorial for stats Message-ID: <1cd32cbb0901041741i5b420024s8def9df9fcd4560a@mail.gmail.com> Since the scipy tutorial for stats was almost empty, I started to write some comments and examples for it. A first draft is at http://docs.scipy.org/scipy/docs/scipy-docs/tutorial/stats.rst/ (I'm not sure if all the formatting is correct). Currently it is very heavily focused on the parts that I was rewriting, especially distributions and tests (ks and t). Since scipy.stats is an agglomeration of different functions and groups of functions, it is not so obvious what a good structure and topic list for a stats tutorial is. Hopefully the functions and methods will get their own docstring examples; then, I think, it would be more useful if the tutorial showed how to group or tie the functions together. I saw that py4science and sage have introductory stats examples, and something like that would be helpful for new users. Proposals for a tutorial structure and clarifying the objective would be helpful. Personally, I prefer recipes and longer example scripts to tutorials, as for example the matplotlib example page. Is there a place in the new docs where we can list and link to recipes and example scripts? Josef
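To give a flavor of the kind of short, self-contained snippet such a tutorial (or the docstrings) could carry, here is a hypothetical example exercising the distributions and tests mentioned above; the sample size and parameters are arbitrary.

import numpy as np
from scipy import stats

np.random.seed(0)
x = stats.norm.rvs(loc=5.0, scale=2.0, size=1000)    # draw a sample

# Kolmogorov-Smirnov test of the standardized sample against N(0, 1)
D, ks_pval = stats.kstest((x - 5.0) / 2.0, 'norm')

# one-sample t-test: is the sample mean consistent with 5?
t, t_pval = stats.ttest_1samp(x, 5.0)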
From peter.skomoroch at gmail.com Sun Jan 4 21:37:03 2009 From: peter.skomoroch at gmail.com (Peter Skomoroch) Date: Sun, 4 Jan 2009 21:37:03 -0500 Subject: [SciPy-user] fast max() on sparse matrices Message-ID: Does anyone have suggestions on a fast max() function for sparse matrices (COO, CSC, or CSR format)? I was thinking of slicing CSC or CSR matrices, and iterating through the columns, but I suspect any loop-based approach will be slow.

from numpy import array

def sparse_amax(V):
    """Returns the max of a sparse CSR matrix V with shape (n, m):
    n = dimensionality of examples (# rows),
    m = number of examples (# columns)"""
    n, m = V.shape
    # if the type is CSR, slice by rows
    maxvals = []
    for row in xrange(n):
        # find the max of this row
        maxvals.append(max(array(V[row, :].todense())[0]))
    Vmax = max(maxvals)
    return Vmax

-- Peter N. Skomoroch peter.skomoroch at gmail.com http://www.datawrangling.com http://del.icio.us/pskomoroch -------------- next part -------------- An HTML attachment was scrubbed... URL: From wbaxter at gmail.com Sun Jan 4 21:48:11 2009 From: wbaxter at gmail.com (Bill Baxter) Date: Mon, 5 Jan 2009 11:48:11 +0900 Subject: [SciPy-user] fast max() on sparse matrices In-Reply-To: References: Message-ID: On Mon, Jan 5, 2009 at 11:37 AM, Peter Skomoroch wrote: > Does anyone have suggestions on a fast max() function for sparse matrices > (COO, CSC, or CSR format)? > > I was thinking of slicing CSC or CSR matrices, and iterating through the > columns, but I suspect any loop-based approach will be slow. > > def sparse_amax(V): > """Returns the max of a sparse CSR matrix V with shape (n, m): > n = dimensionality of examples (# rows), > m = number of examples (# columns)""" > n, m = V.shape > # if the type is CSR, slice by rows > maxvals = [] > for row in xrange(n): > # find the max of this row > maxvals.append(max(array(V[row, :].todense())[0])) > Vmax = max(maxvals) > return Vmax The CSC and CSR formats both internally store a dense array of all the non-zero values. I'm not sure what the Python interface looks like in SciPy's versions, but if there's a way to get at that values array, then you can just do the max of that. (But don't forget the corner case of an unset implicit zero value being the max). --bb From zhangchipr at gmail.com Sun Jan 4 22:01:52 2009 From: zhangchipr at gmail.com (zhang chi) Date: Mon, 5 Jan 2009 11:01:52 +0800 Subject: [SciPy-user] how to use SciPy.optimize.cobyla? Message-ID: <90c482ab0901041901w417f22b3h3b8ad01cc0653655@mail.gmail.com> hi I have a function Fm(x1,x2) that can't be written as a mathematical expression, but can be computed by a program. And x1 $\in$ [1,100]; x2 $\in$ [0.2,0.8]. So could I use Cobyla like the following:

def Fm(x1,x2):
    ..........
    return value

x0 = [50,0.5]
cons = [1:100;0.2:0.8]
min = fmin_cobyla(Fm, x0, cons, args=(), consargs=None, rhobeg=1.0, rhoend=1e-4, iprint=1, maxfun=1000)

thank you very much. -------------- next part -------------- An HTML attachment was scrubbed... URL:
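For reference, a runnable sketch of what such a call can look like, with the box constraints expressed as inequality functions (the same idea Gilles Rochefort uses in his replies below). The quadratic objective here is a made-up stand-in for the real Fm.

from scipy.optimize import fmin_cobyla

def fm(x):
    # stand-in objective for illustration; the real Fm comes from a program
    x1, x2 = x
    return (x1 - 30.0)**2 + (x2 - 0.5)**2

cons = [lambda x: x[0] - 1.0,     # x1 >= 1
        lambda x: 100.0 - x[0],   # x1 <= 100
        lambda x: x[1] - 0.2,     # x2 >= 0.2
        lambda x: 0.8 - x[1]]     # x2 <= 0.8

xmin = fmin_cobyla(fm, [50.0, 0.5], cons, rhobeg=1.0, rhoend=1e-4, iprint=0)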
From peter.skomoroch at gmail.com Sun Jan 4 22:58:18 2009 From: peter.skomoroch at gmail.com (Peter Skomoroch) Date: Sun, 4 Jan 2009 22:58:18 -0500 Subject: [SciPy-user] fast max() on sparse matrices In-Reply-To: References: Message-ID: I knew I overlooked something simple :) Thanks Bill >>> import scipy >>> from scipy.sparse import csr_matrix, csc_matrix >>> A = array([[1,2,3],[1,0,0],[4,5,0]]) >>> A array([[1, 2, 3], [1, 0, 0], [4, 5, 0]]) >>> B = csr_matrix(A) # just for this simple example, construct with COO for speed >>> B <3x3 sparse matrix of type '' with 6 stored elements in Compressed Sparse Row format> >>> print B (0, 0) 1 (0, 1) 2 (0, 2) 3 (1, 0) 1 (2, 0) 4 (2, 1) 5 >>> B.data array([1, 2, 3, 1, 4, 5]) >>> max(B.data) 5 On Sun, Jan 4, 2009 at 9:48 PM, Bill Baxter wrote: > On Mon, Jan 5, 2009 at 11:37 AM, Peter Skomoroch > wrote: > > Does anyone have suggestions on a fast max() function for sparse matrices > > (COO, CSC, or CSR format)? > > > > I was thinking of slicing CSC or CSR matrices, and iterating through the > > columns, but I suspect any loop-based approach will be slow. > > > > def sparse_amax(V): > > """Returns the max of a sparse CSR matrix V with shape (n, m): > > n = dimensionality of examples (# rows), > > m = number of examples (# columns)""" > > n, m = V.shape > > # if the type is CSR, slice by rows > > maxvals = [] > > for row in xrange(n): > > # find the max of this row > > maxvals.append(max(array(V[row, :].todense())[0])) > > Vmax = max(maxvals) > > return Vmax > > The CSC and CSR formats both internally store a dense array of all the > non-zero values. > I'm not sure what the Python interface looks like in SciPy's versions, > but if there's a way to get at that values array, then you can just do > the max of that. (But don't forget the corner case of an unset > implicit zero value being the max). > > --bb > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -- Peter N. Skomoroch peter.skomoroch at gmail.com http://www.datawrangling.com http://del.icio.us/pskomoroch -------------- next part -------------- An HTML attachment was scrubbed... URL: From gilles.rochefort at gmail.com Mon Jan 5 08:47:17 2009 From: gilles.rochefort at gmail.com (Gilles Rochefort) Date: Mon, 5 Jan 2009 14:47:17 +0100 Subject: [SciPy-user] how to use SciPy.optimize.cobyla? Message-ID: Hello, I am not sure I fully understand what you want to do. Assuming you want to minimize a function Fm with bound constraints, maybe fmin_tnc or fmin_l_bfgs_b is a better choice, depending on the nature of the function (continuity, differentiability, etc.). 2009/1/5 zhang chi > hi > I have a function Fm(x1,x2) that can't be written as a mathematical expression, but > can be computed by a program. And x1 $\in$ [1,100]; x2 $\in$ [0.2,0.8]. > So could I use Cobyla like the following: > > def Fm(x1,x2): > .......... > return value > > x0 = [50,0.5] > cons = [1:100;0.2:0.8] > min = fmin_cobyla(Fm, x0, cons, args=(), consargs=None, rhobeg=1.0, rhoend=1e-4, iprint=1, maxfun=1000) > > thank you very much. > > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gilles.rochefort at gmail.com Mon Jan 5 08:53:44 2009 From: gilles.rochefort at gmail.com (Gilles Rochefort) Date: Mon, 5 Jan 2009 14:53:44 +0100 Subject: [SciPy-user] how to use SciPy.optimize.cobyla?
In-Reply-To: References: Message-ID: Anyway, cons is a list of functions, not a list of values. In that case, I guess these functions have to be your bound constraints:

x1 >= 1   -->  C1 = lambda x: x[0] - 1
x1 <= 100 -->  C2 = lambda x: 100 - x[0]
x2 >= .2  -->  C3 = lambda x: x[1] - .2
x2 <= .8  -->  C4 = lambda x: .8 - x[1]

and finally cons = [C1, C2, C3, C4] Best regards, Gilles Rochefort. 2009/1/5 Gilles Rochefort > Hello, > > I am not sure I fully understand what you want to do. > > Assuming you want to minimize a function Fm with bound constraints, maybe > fmin_tnc or fmin_l_bfgs_b is a better choice, depending on the > nature of the function (continuity, differentiability, etc.). > > > > 2009/1/5 zhang chi > >> hi >> I have a function Fm(x1,x2) that can't be written as a mathematical expression, but >> can be computed by a program. And x1 $\in$ [1,100]; x2 $\in$ [0.2,0.8]. >> So could I use Cobyla like the following: >> >> def Fm(x1,x2): >> .......... >> return value >> >> x0 = [50,0.5] >> cons = [1:100;0.2:0.8] >> min = fmin_cobyla(Fm, x0, cons, args=(), consargs=None, rhobeg=1.0, rhoend=1e-4, iprint=1, maxfun=1000) >> >> thank you very much. >> >> >> _______________________________________________ >> SciPy-user mailing list >> SciPy-user at scipy.org >> http://projects.scipy.org/mailman/listinfo/scipy-user >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vanforeest at gmail.com Mon Jan 5 16:43:53 2009 From: vanforeest at gmail.com (nicky van foreest) Date: Mon, 5 Jan 2009 22:43:53 +0100 Subject: [SciPy-user] fast max() on sparse matrices In-Reply-To: References: Message-ID: Hi, A few days ago I encountered just the same problem, and solved it by taking the max of the values(), just as suggested below. However, it took me some minutes to figure this out, and I first, of course, tried the max() function. Thus, I suggest that a max function be added to the sparse class. Is there a reason not to do so? bye Nicky 2009/1/5 Peter Skomoroch : > I knew I overlooked something simple :) Thanks Bill > > >>>> import scipy >>>> from scipy.sparse import csr_matrix, csc_matrix >>>> A = array([[1,2,3],[1,0,0],[4,5,0]]) >>>> A > array([[1, 2, 3], > [1, 0, 0], > [4, 5, 0]]) >>>> B = csr_matrix(A) # just for this simple example, construct with COO >>>> for speed >>>> B > <3x3 sparse matrix of type '' > with 6 stored elements in Compressed Sparse Row format> >>>> print B > (0, 0) 1 > (0, 1) 2 > (0, 2) 3 > (1, 0) 1 > (2, 0) 4 > (2, 1) 5 >>>> B.data > array([1, 2, 3, 1, 4, 5]) >>>> max(B.data) > 5 > > > > On Sun, Jan 4, 2009 at 9:48 PM, Bill Baxter wrote: >> >> On Mon, Jan 5, 2009 at 11:37 AM, Peter Skomoroch >> wrote: >> > Does anyone have suggestions on a fast max() function for sparse >> > matrices >> > (COO, CSC, or CSR format)? >> > >> > I was thinking of slicing CSC or CSR matrices, and iterating through the >> > columns, but I suspect any loop-based approach will be slow. >> > >> > def sparse_amax(V): >> > """Returns the max of a sparse CSR matrix V with shape (n, m): >> > n = dimensionality of examples (# rows), >> > m = number of examples (# columns)""" >> > n, m = V.shape >> > # if the type is CSR, slice by rows >> > maxvals = [] >> > for row in xrange(n): >> > # find the max of this row >> > maxvals.append(max(array(V[row, :].todense())[0])) >> > Vmax = max(maxvals) >> > return Vmax >> >> The CSC and CSR formats both internally store a dense array of all the >> non-zero values.
>> I'm not sure what the Python interface looks like in SciPy's versions, >> but if there's a way to get at that values array, then you can just do >> the max of that. (But don't forget the corner case of an unset >> implicit zero value being the max). >> >> --bb >> _______________________________________________ >> SciPy-user mailing list >> SciPy-user at scipy.org >> http://projects.scipy.org/mailman/listinfo/scipy-user > > -- > Peter N. Skomoroch > peter.skomoroch at gmail.com > http://www.datawrangling.com > http://del.icio.us/pskomoroch > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > > From vanforeest at gmail.com Mon Jan 5 17:21:59 2009 From: vanforeest at gmail.com (nicky van foreest) Date: Mon, 5 Jan 2009 23:21:59 +0100 Subject: [SciPy-user] tutorial on solving large Markov chains Message-ID: Hi, I submitted a cookbook tutorial on how to solve large Markov chains, see http://www.scipy.org/Cookbook/Solving_Large_Markov_Chains. In case any of you has ideas on how to improve/extend this, please let me know. bye Nicky From wnbell at gmail.com Mon Jan 5 17:40:44 2009 From: wnbell at gmail.com (Nathan Bell) Date: Mon, 5 Jan 2009 17:40:44 -0500 Subject: [SciPy-user] fast max() on sparse matrices In-Reply-To: References: Message-ID: On Mon, Jan 5, 2009 at 4:43 PM, nicky van foreest wrote: > > A few days ago I encountered just the same problem, and solved it by > taking the max of the values(), just as suggested below. However, it > took me some minutes to figure this out, and I first, of course, tried > the max() function. Thus, I suggest that a max function be > added to the sparse class. Is there a reason not to do so? > Hi Nicky, It should be added, but it's not as straightforward as you might think. For conformity with dense matrices, max() should return zero if the nonzero entries of the matrix are all negative and there is at least one missing value in the matrix. This might surprise people who expect the largest nonzero value instead. For instance, csr_matrix([[0,-1]]).max() should be 0. Another minor problem is that some matrices permit duplicate entries. Currently, we implicitly sum duplicate values together (e.g. when computing sparse matrix-vector products) and when converting to other formats. We'd probably want to make max() and min() agree with this behavior. -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From s.mientki at ru.nl Mon Jan 5 17:48:00 2009 From: s.mientki at ru.nl (Stef Mientki) Date: Mon, 05 Jan 2009 23:48:00 +0100 Subject: [SciPy-user] Getting error scipy / cookbook Message-ID: <49628E20.5050102@ru.nl> hello, I get an error trying to access the Scipy Cookbook: http://www.scipy.org/Cookbook Does anyone recognize this problem? btw: The error message is formatted in such a way that I can't read it (with Mozilla on WinXP). cheers, Stef From robert.kern at gmail.com Mon Jan 5 17:54:01 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 5 Jan 2009 17:54:01 -0500 Subject: [SciPy-user] Getting error scipy / cookbook In-Reply-To: <49628E20.5050102@ru.nl> References: <49628E20.5050102@ru.nl> Message-ID: <3d375d730901051454l3791cc4ar776151da569203eb@mail.gmail.com> On Mon, Jan 5, 2009 at 17:48, Stef Mientki wrote: > hello, > > I get an error trying to access the Scipy Cookbook: > http://www.scipy.org/Cookbook > > Does anyone recognize this problem? I did get an error, but it worked when I tried again.
> btw: The error message is formatted in such a way, > that I can't read it ( with Mozilla on winXP) What do you mean? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From s.mientki at ru.nl Mon Jan 5 18:32:06 2009 From: s.mientki at ru.nl (Stef Mientki) Date: Tue, 06 Jan 2009 00:32:06 +0100 Subject: [SciPy-user] Getting error scipy / cookbook In-Reply-To: <3d375d730901051454l3791cc4ar776151da569203eb@mail.gmail.com> References: <49628E20.5050102@ru.nl> <3d375d730901051454l3791cc4ar776151da569203eb@mail.gmail.com> Message-ID: <49629876.8030309@ru.nl> Robert Kern wrote: > On Mon, Jan 5, 2009 at 17:48, Stef Mientki wrote: > >> hello, >> >> I get an error trying to access the Scipy Cookbook: >> http://www.scipy.org/Cookbook >> >> Anyone recognizes this problem ? >> > > I did get an error, but it worked when I tried again. > > Ok after 3 attempts it worked, thanks. >> btw: The error message is formatted in such a way, >> that I can't read it ( with Mozilla on winXP) >> > > What do you mean? > see attached image cheers, Stef -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: pw_application_vpython3_img4.png Type: image/png Size: 7791 bytes Desc: not available URL: From robert.kern at gmail.com Mon Jan 5 18:34:06 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 5 Jan 2009 18:34:06 -0500 Subject: [SciPy-user] Getting error scipy / cookbook In-Reply-To: <49629876.8030309@ru.nl> References: <49628E20.5050102@ru.nl> <3d375d730901051454l3791cc4ar776151da569203eb@mail.gmail.com> <49629876.8030309@ru.nl> Message-ID: <3d375d730901051534s3f093912p8c16bc5bc3b7e5f@mail.gmail.com> On Mon, Jan 5, 2009 at 18:32, Stef Mientki wrote: > > > Robert Kern wrote: > > On Mon, Jan 5, 2009 at 17:48, Stef Mientki wrote: > btw: The error message is formatted in such a way, > that I can't read it ( with Mozilla on winXP) > > > What do you mean? > > > see attached image I think you sent the wrong image. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From s.mientki at ru.nl Mon Jan 5 18:36:21 2009 From: s.mientki at ru.nl (Stef Mientki) Date: Tue, 06 Jan 2009 00:36:21 +0100 Subject: [SciPy-user] Getting error scipy / cookbook In-Reply-To: <49629876.8030309@ru.nl> References: <49628E20.5050102@ru.nl> <3d375d730901051454l3791cc4ar776151da569203eb@mail.gmail.com> <49629876.8030309@ru.nl> Message-ID: <49629975.6070702@ru.nl> sorry wrong image Stef -------------- next part -------------- A non-text attachment was scrubbed... Name: pylab_works_temp_img10.png Type: image/png Size: 9972 bytes Desc: not available URL: From peter.skomoroch at gmail.com Mon Jan 5 20:04:18 2009 From: peter.skomoroch at gmail.com (Peter Skomoroch) Date: Mon, 5 Jan 2009 20:04:18 -0500 Subject: [SciPy-user] fast max() on sparse matrices In-Reply-To: References: Message-ID: Nathan, You said: "... some matrices permit duplicate entries. Currently, we implicitly sum duplicate values together (e.g. when computing sparse matrix-vector products) and when converting to other formats." Could you elaborate on that a bit? 
I'm trying to track down a nasty bug right now where the result of a sparse matrix-matrix product (A_sparse * B_dense) does not agree with the corresponding dense product (A_dense * B_dense). -Pete On Mon, Jan 5, 2009 at 5:40 PM, Nathan Bell wrote: > On Mon, Jan 5, 2009 at 4:43 PM, nicky van foreest > wrote: > > > > A few days ago I encountered just the same problem, and solved by > > taking the max of the values(), just as suggested below. However, it > > took me some minutes to fiugre this out, and I first, of course, tried > > the max() function. Thus, I suggest that the max function will be > > added to the sparse class. Is there a reason not to do so? > > > > Hi Nicky, > > It should be added, but it's not as straightforward as you might think. > > For conformity with dense matrices, max() should return zero if the > nonzero entries of the matrix are all negative and there is at least > one missing value in the matrix. This might surprise people who > expect the largest nonzero value instead. For instance, > csr_matrix([[0,-1]]).max() should be 0. > > Another minor problem is that some matrices permit duplicate entries. > Currently, we implicitly sum duplicate values together (e.g. when > computing sparse matrix-vector products) and when converting to other > formats. We'd probably want to make max() and min() agree with this > behavior. > > -- > Nathan Bell wnbell at gmail.com > http://graphics.cs.uiuc.edu/~wnbell/ > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -- Peter N. Skomoroch peter.skomoroch at gmail.com http://www.datawrangling.com http://del.icio.us/pskomoroch -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon Jan 5 20:32:05 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 5 Jan 2009 19:32:05 -0600 Subject: [SciPy-user] fast max() on sparse matrices In-Reply-To: References: Message-ID: <3d375d730901051732x62216dev77397f791e43a333@mail.gmail.com> On Mon, Jan 5, 2009 at 19:04, Peter Skomoroch wrote: > Nathan, > > You said: > > "... some matrices permit duplicate entries. > Currently, we implicitly sum duplicate values together (e.g. when > computing sparse matrix-vector products) and when converting to other > formats." > > Could you elaborate on that a bit? I'm trying to track down a nasty bug > right now where the result of a sparse matrix-matrix product (A_sparse * > B_dense) does not agree with the corresponding dense product (A_dense * > B_dense). Note that if A_dense and B_dense are ndarray objects rather than (dense) matrix objects, then (A_dense*B_dense) does elementwise multiplication, not matrix multiplication. spmatrix objects do matrix multiplication with the * operator. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From wnbell at gmail.com Mon Jan 5 20:42:36 2009 From: wnbell at gmail.com (Nathan Bell) Date: Mon, 5 Jan 2009 20:42:36 -0500 Subject: [SciPy-user] fast max() on sparse matrices In-Reply-To: References: Message-ID: On Mon, Jan 5, 2009 at 8:04 PM, Peter Skomoroch wrote: > Nathan, > > You said: > > "... some matrices permit duplicate entries. > Currently, we implicitly sum duplicate values together (e.g. when > computing sparse matrix-vector products) and when converting to other > formats." 
> > Could you elaborate on that a bit? I'm trying to track down a nasty bug > right now where the result of a sparse matrix-matrix product (A_sparse * > B_dense) does not agree with the corresponding dense product (A_dense * > B_dense). > It's a little costly to detect the presence of duplicates in the CSR, CSC, and COO formats, so we adopt the convention that a matrix with duplicates should behave as if those duplicates were summed together. If A and B are sparse matrices then A * B should be close to dot(A.toarray(), B.toarray()). The only difference between the two would be due to the order of operations. Note that there's another oddity w.r.t. sorting of the indices in the CSR/CSC formats. Certain operations will shuffle the nonzeros about, so it's dangerous to share arrays between multiple CSR/CSC matrices. I suspect Robert's suggestion might be the source of your problems. If not, try to reduce the problem to something small and reproducible and we'll try to sort it out. -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From vanforeest at gmail.com Tue Jan 6 02:50:16 2009 From: vanforeest at gmail.com (nicky van foreest) Date: Tue, 6 Jan 2009 08:50:16 +0100 Subject: [SciPy-user] fast max() on sparse matrices In-Reply-To: References: Message-ID: Hi Nathan, Thanks for your feedback. > For conformity with dense matrices, max() should return zero if the > nonzero entries of the matrix are all negative and there is at least > one missing value in the matrix. This might surprise people who > expect the largest nonzero value instead. For instance, > csr_matrix([[0,-1]]).max() should be 0. To bring things in line with dense matrices, a norm operator would be the most logical, so that norm(A, infty) would yield the max of the absolute values, etc. I suppose this will be more difficult to implement than max(), but that is what I would ultimately expect to use, and it would be consistent with the dense matrices. bye Nicky From bayer.justin at googlemail.com Tue Jan 6 10:46:09 2009 From: bayer.justin at googlemail.com (Justin Bayer) Date: Tue, 6 Jan 2009 16:46:09 +0100 Subject: [SciPy-user] Weave: Distinction between integers being either py:object or int Message-ID: Hi group, I am currently having a problem for which I am seeking a workaround. I searched the mailing list archives but did not find anything useful. Consider the following code: http://privatepaste.com/881oUqpHFJ This will compile two different versions of the snippet: one for number being an integer and one for it being a py::object. I would like to always have a long in my snippet, which I could achieve with PyInt_AsLong(), I guess. But I don't know of a way to reliably tell whether the C++ snippet has been given an integer or a py::object. Of course it would be cool if weave reliably converted a Python long to a C long. Any ideas for workarounds for my problem? This is becoming a real problem for me. Regards, -Justin -- P.S.: No Dogs!
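One possible workaround sketch, assuming the snippet only ever needs the integer value: coerce on the Python side before calling weave.inline, so the type converter always sees a plain int and emits a C long. The function and variable names here are made up.

from scipy import weave

def double_it(number):
    # force a plain Python int so weave's converter produces a C long,
    # never a py::object
    number = int(number)
    code = "return_val = number * 2;"
    return weave.inline(code, ['number'])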
From alexandre.fayolle at logilab.fr Tue Jan 6 10:38:48 2009 From: alexandre.fayolle at logilab.fr (Alexandre Fayolle) Date: Tue, 6 Jan 2009 16:38:48 +0100 Subject: [SciPy-user] looking for a consultant for the design of an interface for our mathematical models In-Reply-To: References: Message-ID: <200901061638.57115.alexandre.fayolle@logilab.fr> On Wednesday 24 December 2008 23:23:53, Marko Loparic wrote: > Hi, > > I am looking for someone that could help us to design (perhaps also to > implement) a user interface (GUI + repository of data) for our > mathematical models. > > I work for a company in the energy sector. Currently in our department > we have 5 different mathematical models using different GUIs and excel > hacks to allow users to feed the data and get results. We would like > to have a single, powerful, user-friendly interface for all those > models. We need the help of an experienced and inventive software > designer to help us to choose the technology to use and to make the > design (possibly also the implementation) of the tool. Of course we > would like to reuse existing tools whenever possible. > > We propose to pay for one or two days of consultancy when we will > describe our needs and discuss the possible design choices. Depending > on the conclusions we get we can work further together. We are located > near Brussels. > > Usage of python is not a request but it is a natural choice since it > is the main language we use. Hi, Logilab is located in Paris, and we could certainly send someone to Brussels to investigate your problem. Designing user interfaces for scientific data is part of the things we do for our customers, and Python is our language of choice. -- Alexandre Fayolle LOGILAB, Paris (France) Python, Zope, Plone, Debian training courses: http://www.logilab.fr/formations Custom software development: http://www.logilab.fr/services Scientific computing: http://www.logilab.fr/science -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 481 bytes Desc: This is a digitally signed message part. URL: From bayer.justin at googlemail.com Tue Jan 6 12:48:48 2009 From: bayer.justin at googlemail.com (Justin Bayer) Date: Tue, 6 Jan 2009 18:48:48 +0100 Subject: [SciPy-user] scipy.linalg.inv does not work Message-ID: Hi, I am using numpy rev 6297 and scipy rev 5331, together with Python 2.6.1 (compiled from source) on Mac OS 10.5 Leopard, Intel 64 bit, and am getting this output: >>> from scipy import array >>> from scipy.linalg import inv >>> array(((2, 3), (1, 5))) array([[2, 3], [1, 5]]) >>> inv(_) Traceback (most recent call last): File "", line 1, in File "/usr/local/lib/python2.6/site-packages/scipy/linalg/basic.py", line 369, in inv lwork = calc_lwork.getri(getri.prefix,a1.shape[0]) RuntimeError: more argument specifiers than keyword list entries (remaining format:'|:calc_lwork.getri') Any ideas how I can fix this? Regards, -Justin -- P.S.: No Dogs! From peter.skomoroch at gmail.com Tue Jan 6 12:58:29 2009 From: peter.skomoroch at gmail.com (Peter Skomoroch) Date: Tue, 6 Jan 2009 12:58:29 -0500 Subject: [SciPy-user] fast max() on sparse matrices In-Reply-To: References: Message-ID: Nathan, Thanks for all the help, the sparse module is pretty powerful stuff. I'll pull together a small-scale example and post it tonight.
-Pete On Mon, Jan 5, 2009 at 8:42 PM, Nathan Bell wrote: > On Mon, Jan 5, 2009 at 8:04 PM, Peter Skomoroch > wrote: > > Nathan, > > > > You said: > > > > "... some matrices permit duplicate entries. > > Currently, we implicitly sum duplicate values together (e.g. when > > computing sparse matrix-vector products) and when converting to other > > formats." > > > > Could you elaborate on that a bit? I'm trying to track down a nasty bug > > right now where the result of a sparse matrix-matrix product (A_sparse * > > B_dense) does not agree with the corresponding dense product (A_dense * > > B_dense). > > > > It's a little costly to detect the presence of duplicates in the CSR, > CSC, and COO formats, so we adopt the convention that a matrix with > duplicates should behave as if those duplicates were summed together. > If A and B are sparse matrices then A * B should be close to > dot(A.toarray(), B.toarray()). The only difference between the two > would be due to the order of operations. > > Note that there's another oddity w.r.t. sorting of the indices in the > CSR/CSC formats. Certain operations will shuffle the nonzeros about, > so it's dangerous to share arrays between multiple CSR/CSC matrices. > > I suspect Robert's suggestion might be the source of your problems. > If not, try to reduce the problem to something small and reproducible > and we'll try to sort it out. > > -- > Nathan Bell wnbell at gmail.com > http://graphics.cs.uiuc.edu/~wnbell/ > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -- Peter N. Skomoroch peter.skomoroch at gmail.com http://www.datawrangling.com http://del.icio.us/pskomoroch -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Tue Jan 6 12:48:06 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 07 Jan 2009 02:48:06 +0900 Subject: [SciPy-user] scipy.linalg.inv does not work In-Reply-To: References: Message-ID: <49639956.6010505@ar.media.kyoto-u.ac.jp> Justin Bayer wrote: > Hi, > > I am using numpy rev 6297 and scipy rev 5331, together with Python > 2.6.1 (compiled from source) on Mac OS 10.5 Leopard, Intel 64 bit and > am getting this output: > This is caused by a bug in python 2.6; but I am a bit surprised, because your numpy version is supposed to have a workaround. Did you rebuild scipy after updating numpy (e.g., did you rebuild from scratch with rm -rf build in the scipy sources)? cheers, David From kamran.husain at aramco.com Tue Jan 6 14:21:52 2009 From: kamran.husain at aramco.com (Husain, Kamran B) Date: Tue, 6 Jan 2009 22:21:52 +0300 Subject: [SciPy-user] Using fmin_slsqp Message-ID: Hello, While attempting to use fmin_slsqp, I keep getting the error "Error imode = 6 Singular Matrix C in LSQ subproblem". The same constraints (upper bound, lower bound) and a very simple sum(xvector) minimization function work well in Matlab using fmincon. I want to convince our users to use Scipy instead (for ease of programming in the future). Unfortunately, even a small exercise such as this one is proving to be impossible. The results from the cobyla call for a similar attempt came from a local minimum other than the one found by fmincon. I have seen the really trivial examples in ticket #570, but does someone have a more concrete example, or could someone point me in the right direction? Thanks, Kamran -------------- next part -------------- An HTML attachment was scrubbed... URL:
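A minimal working pattern for fmin_slsqp with plain box bounds, for reference (made-up numbers, not Kamran's model): the bounds go in as a list of (lower, upper) pairs rather than as constraint functions. A singular matrix C often points to degenerate constraints, e.g. a lower bound equal to its upper bound or badly scaled variables, so that is worth checking too.

import numpy as np
from scipy.optimize import fmin_slsqp

def objective(x):
    # trivial stand-in objective: minimize the sum of the variables
    return np.sum(x)

x0 = np.array([5.0, 5.0])
bounds = [(0.0, 10.0), (0.0, 10.0)]   # (lower, upper) for each variable

xopt = fmin_slsqp(objective, x0, bounds=bounds, iprint=0)
# expect xopt close to [0.0, 0.0]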
From contact at pythonxy.com Tue Jan 6 15:50:11 2009 From: contact at pythonxy.com (Pierre Raybaut) Date: Tue, 06 Jan 2009 21:50:11 +0100 Subject: [SciPy-user] [ Python(x,y) ] New release : 2.1.9 Message-ID: <4963C403.5020004@pythonxy.com> Hi all, Release 2.1.9 is now available on http://www.pythonxy.com: - All-in-One Installer ("Full Edition"), - Plugin Installer -- to be downloaded with xyweb, - Update Changes history Version 2.1.9 (01-06-2009) * Updated: o VTK 5.2.1 o Enthought Tool Suite 3.1.0.2 * Corrected: o Issues 54, 55 Regards, Pierre Raybaut From stef.mientki at gmail.com Tue Jan 6 16:05:56 2009 From: stef.mientki at gmail.com (Stef Mientki) Date: Tue, 06 Jan 2009 22:05:56 +0100 Subject: [SciPy-user] [python(x,y)] [ Python(x,y) ] New release : 2.1.9 In-Reply-To: <4963C403.5020004@pythonxy.com> References: <4963C403.5020004@pythonxy.com> Message-ID: <4963C7B4.80101@gmail.com> hi Pierre, Did you miss this question? Pierre Raybaut wrote: > Hi all, > > Release 2.1.8 is now available on http://www.pythonxy.com: > - All-in-One Installer ("Full Edition"), > - Plugin Installer -- to be downloaded with xyweb, > - Update > > Changes history > Version 2.1.8 (01-04-2009) > > * Added: > o SciTE 1.77.0 (replacement for Notepad++) > o WinMerge 2.10.2 - Open Source differencing and merging > tool for Windows > * Updated: > o Console 2.0.141.6 > o VPython 5.0.1.0 > Isn't VPython-5 still a little buggy and missing features of VPython-3? And why only for Windows? I would suggest adding both VPython-3 and VPython-5, and using a programmatic switch between the two. cheers, Stef Pierre Raybaut wrote: > Hi all, > > Release 2.1.9 is now available on http://www.pythonxy.com: > - All-in-One Installer ("Full Edition"), > - Plugin Installer -- to be downloaded with xyweb, > - Update > > Changes history > Version 2.1.9 (01-06-2009) > > * Updated: > o VTK 5.2.1 > o Enthought Tool Suite 3.1.0.2 > * Corrected: > o Issues 54, 55 > > > Regards, > Pierre Raybaut
From lorenzo.isella at gmail.com Tue Jan 6 17:09:35 2009 From: lorenzo.isella at gmail.com (Lorenzo Isella) Date: Tue, 06 Jan 2009 23:09:35 +0100 Subject: [SciPy-user] Efficient file reading Message-ID: <4963D69F.3030800@gmail.com> Dear All, I sometimes need to read rather large data files (~500Mb). These are plain text files (usually tables with 500 x 2e5 entries). It seems to me (but I have not done any serious test/benchmark) that R is faster than Python at reading/writing files. Or rather: maybe I am too naive when doing I/O operations in Python. I usually simply do the following import pylab as p my_arr = p.load("my_data.txt") which gets the job done, but is slow in this case. Probably there is a more efficient way of doing this, and I should also add that I know beforehand the dimensions of the data table I want to read into a scipy array. Any suggestions? Many thanks Lorenzo From stefan at sun.ac.za Tue Jan 6 17:24:01 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Wed, 7 Jan 2009 00:24:01 +0200 Subject: [SciPy-user] Efficient file reading In-Reply-To: <4963D69F.3030800@gmail.com> References: <4963D69F.3030800@gmail.com> Message-ID: <9457e7c80901061424v345890bfm8d9c7be7d8c0cb2d@mail.gmail.com> Hi Lorenzo 2009/1/7 Lorenzo Isella : > I sometimes need to read rather large data files (~500Mb). > These are plain text files (usually tables with 500 x 2e5 entries). > It seems to me (but I have not done any serious test/benchmark) that R > is faster than Python at reading/writing files. For simply formatted files, numpy.fromfile should do the trick, and is fast. Otherwise, try numpy.loadtxt. Cheers Stéfan
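A sketch of both options, assuming a whitespace-delimited table of floats whose shape (500 x 200000, matching the sizes above) is known in advance:

import numpy as np

# general-purpose text reader: flexible, but slower on huge files
arr = np.loadtxt("my_data.txt")

# faster for simple, uniform files when the shape is known beforehand
arr = np.fromfile("my_data.txt", sep=" ").reshape(500, 200000)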
From mueller at pitt.edu Tue Jan 6 18:07:32 2009
From: mueller at pitt.edu (James Mueller)
Date: Tue, 6 Jan 2009 18:07:32 -0500
Subject: [SciPy-user] [python(x,y)] [ Python(x,y) ] New release : 2.1.9
Message-ID: <95205F9B-4383-4F17-B944-F6CDEA5CA326@pitt.edu>

Stef,
VPython 5.0.ReleaseCandidate1 replaces VPython 4.0.beta26. VPython 3 has never been in Python(x,y). Given that version 3 relies on Numeric instead of Numpy, I am not sure how easy it would be for Pierre to add it in.

-Jim

From bayer.justin at googlemail.com Tue Jan 6 18:56:33 2009
From: bayer.justin at googlemail.com (Justin Bayer)
Date: Wed, 7 Jan 2009 00:56:33 +0100
Subject: [SciPy-user] scipy.linalg.inv does not work
In-Reply-To: <49639956.6010505@ar.media.kyoto-u.ac.jp>
References: <49639956.6010505@ar.media.kyoto-u.ac.jp>
Message-ID:

Thanks! It seems a 1.2.0 numpy installation still lurked around in my site-packages. I removed everything numpy/scipy related from site-packages and the install dirs and installed it again - inv() works now!

2009/1/6 David Cournapeau :
> Justin Bayer wrote:
>> Hi,
>>
>> I am using numpy rev 6297 and scipy rev 5331, together with Python
>> 2.6.1 (compiled from source) on Mac OS 10.5 Leopard, Intel 64 bit, and
>> am getting this output:
>
> This is caused by a bug in Python 2.6; but I am a bit surprised, because
> your numpy version is supposed to have a workaround. Did you rebuild
> scipy after updating numpy (e.g. did you rebuild from scratch by rm -rf
> build in the scipy sources)?
>
> cheers,
>
> David

--
P.S.: No Dogs!

From nicolas.wolfhurt at gmail.com Wed Jan 7 02:47:15 2009
From: nicolas.wolfhurt at gmail.com (Nicolas Vergnes)
Date: Wed, 7 Jan 2009 08:47:15 +0100
Subject: [SciPy-user] Fwd: Building scipy with lround
In-Reply-To: References: Message-ID:

Hello,

I have built SciPy 0.7.0b1 on SPARC Solaris 9 with FFTW, BLAS and LAPACK linked statically. When I use it I get this error message:

    Python 2.6.1 (r261:67515, Dec 26 2008, 13:02:49) [GCC 4.3.2] on sunos5
    >>> import numpy
    >>> import scipy
    >>> import scipy.interpolate
    Traceback (most recent call last):
      File "", line 1, in
      File "/Produits/publics/sparc.SunOS.5.9/python/2.6/lib/python2.6/site-packages/scipy/interpolate/__init__.py", line 7, in
        from interpolate import *
      File "/Produits/publics/sparc.SunOS.5.9/python/2.6/lib/python2.6/site-packages/scipy/interpolate/interpolate.py", line 13, in
        import scipy.special as spec
      File "/Produits/publics/sparc.SunOS.5.9/python/2.6/lib/python2.6/site-packages/scipy/special/__init__.py", line 8, in
        from basic import *
      File "/Produits/publics/sparc.SunOS.5.9/python/2.6/lib/python2.6/site-packages/scipy/special/basic.py", line 8, in
        from _cephes import *
    ImportError: ld.so.1: python: fatal: relocation error: file /Produits/publics/sparc.SunOS.5.9/python/2.6/lib/python2.6/site-packages/scipy/special/_cephes.so: symbol lround: referenced symbol not found

    calc-gen5-ci:/Produits/tmp/nicolas/python $ ldd /Produits/publics/sparc.SunOS.5.9/python/2.6/lib/python2.6/site-packages/scipy/special/_cephes.so
        libgfortran.so.3 => /Produits/publics/sparc.SunOS.5.9/gcc/4.3.2/lib/libgfortran.so.3
        libm.so.1 => /usr/lib/libm.so.1
        libgcc_s.so.1 => /Produits/publics/sparc.SunOS.5.9/gcc/4.3.2/lib/libgcc_s.so.1
        libc.so.1 => /usr/lib/libc.so.1
        libdl.so.1 => /usr/lib/libdl.so.1
        /usr/platform/SUNW,Sun-Fire-V890/lib/libc_psr.so.1

I think /usr/lib/libm.so does not have lround() on Solaris 9 (on Solaris 10 I am pretty sure it is OK). How can I compile SciPy correctly, please?

thank you all,
Nicolas
From zhangchipr at gmail.com Wed Jan 7 03:05:49 2009
From: zhangchipr at gmail.com (zhang chi)
Date: Wed, 7 Jan 2009 16:05:49 +0800
Subject: [SciPy-user] Can scipy resolve this problem?
Message-ID: <90c482ab0901070005p64179a87wd0966a83016b1ca7@mail.gmail.com>

hi,

I want to find the minimum value of a derivative-free optimization problem. The function F(x1,x2) cannot be given in closed form, but it can be implemented in Python, where x1 $\in$ [1,100] and x2 $\in$ [50,80]. Can scipy resolve this problem? I have tried cobyla, but it cannot find the minimum value.
By the way, the step of x1 and x2 is 1.

Thank you very much.

From w.richert at gmx.net Wed Jan 7 03:15:04 2009
From: w.richert at gmx.net (Willi Richert)
Date: Wed, 7 Jan 2009 09:15:04 +0100
Subject: [SciPy-user] Current status of spatial data structures
Message-ID: <200901070915.04411.w.richert@gmx.net>

Hi,

here are some observations regarding the current status of kd-tree support in Python:

- scipy 0.7 includes scipy.spatial and supports spatial searches via KDTree: http://docs.scipy.org/doc/scipy/reference/spatial.html

- the cookbook contains another kd-tree version: http://scipy.org/Cookbook/KDTree

- I have provided Python SWIG wrappers to the libkdtree++ library (http://libkdtree.alioth.debian.org/). Although the data structure has to be fixed (at compile time of libkdtree++), and thus one has to change the SWIG bindings if one needs to store a different type of vector, it is to my knowledge the only implementation that allows changes to the kd-tree data structure at runtime (add/remove support after initial setup). All the other approaches are "create once/query multiple times" approaches.

Maybe this is of interest to somebody on this list. The authors of libkdtree++ are working towards dynamic data structure support. If that is accomplished and I have adjusted the Python wrapper, will there be room for another kd-tree implementation in scipy.spatial? If yes, I would try to match the interface as closely as possible to the current one.

Regards,
wr

From robert.kern at gmail.com Wed Jan 7 03:15:19 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 7 Jan 2009 02:15:19 -0600
Subject: [SciPy-user] Can scipy resolve this problem?
In-Reply-To: <90c482ab0901070005p64179a87wd0966a83016b1ca7@mail.gmail.com>
References: <90c482ab0901070005p64179a87wd0966a83016b1ca7@mail.gmail.com>
Message-ID: <3d375d730901070015m3047c927r8c4e272b0721e8b0@mail.gmail.com>

On Wed, Jan 7, 2009 at 02:05, zhang chi wrote:
> hi
> I want to find the minimum value of a derivative-free optimization
> problem. [...] Can scipy resolve this problem?

Use fmin_tnc, fmin_l_bfgs_b, or fmin_slsqp for plain bounds like this.

> I have tried cobyla, but it cannot find the minimum value.
> By the way, the step of x1 and x2 is 1.

What do you mean by "the step"?

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
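For reference, a minimal sketch of the bounded, derivative-free usage suggested above (the objective F below is a stand-in for the poster's black-box function; approx_grad=True makes the optimizer estimate gradients by finite differences). Note that this treats x1 and x2 as continuous; the integer grid comes up later in the thread:

    import numpy as np
    from scipy.optimize import fmin_l_bfgs_b

    def F(x):
        # stand-in for the real black-box objective, x = [x1, x2]
        return (x[0] - 42.0)**2 + (x[1] - 60.0)**2

    x0 = np.array([50.0, 65.0])        # starting point inside the box
    bounds = [(1, 100), (50, 80)]      # x1 in [1, 100], x2 in [50, 80]
    xopt, fval, info = fmin_l_bfgs_b(F, x0, approx_grad=True, bounds=bounds)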
From zhangchipr at gmail.com Wed Jan 7 03:18:26 2009
From: zhangchipr at gmail.com (zhang chi)
Date: Wed, 7 Jan 2009 16:18:26 +0800
Subject: [SciPy-user] Can scipy resolve this problem?
In-Reply-To: <3d375d730901070015m3047c927r8c4e272b0721e8b0@mail.gmail.com>
References: [...]
Message-ID: <90c482ab0901070018s7a2ad967nb8d440cebf411440@mail.gmail.com>

Thank you.

By "step" I mean that if x1 $\in$ [2,5], then x1 takes only the integer values 2, 3, 4, 5.

On Wed, Jan 7, 2009 at 4:15 PM, Robert Kern wrote:
> [...]
> What do you mean by "the step"?

From robert.kern at gmail.com Wed Jan 7 03:23:37 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 7 Jan 2009 02:23:37 -0600
Subject: [SciPy-user] Can scipy resolve this problem?
In-Reply-To: <90c482ab0901070018s7a2ad967nb8d440cebf411440@mail.gmail.com>
References: [...]
Message-ID: <3d375d730901070023p7f1d667cj48b9d2260a9052b1@mail.gmail.com>

On Wed, Jan 7, 2009 at 02:18, zhang chi wrote:
> By "step" I mean that if x1 $\in$ [2,5], then x1 takes only the integer
> values 2, 3, 4, 5.

No, there is no combinatorial optimization in scipy. For a problem as small as yours, I recommend just doing a brute-force search.

--
Robert Kern

From matthieu.brucher at gmail.com Wed Jan 7 03:24:21 2009
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Wed, 7 Jan 2009 09:24:21 +0100
Subject: [SciPy-user] Can scipy resolve this problem?
In-Reply-To: <90c482ab0901070018s7a2ad967nb8d440cebf411440@mail.gmail.com>
References: [...]
Message-ID:

Then perhaps the best course of action is to explicitly test every possibility and then take the argmin?

Matthieu

2009/1/7 zhang chi :
> Thank you.
> By "step" I mean that if x1 $\in$ [2,5], then x1 takes only the integer
> values 2, 3, 4, 5.
> [...]

--
Information System Engineer, Ph.D.
Website: http://matthieu-brucher.developpez.com/
Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn: http://www.linkedin.com/in/matthieubrucher

From zhangchipr at gmail.com Wed Jan 7 03:29:31 2009
From: zhangchipr at gmail.com (zhang chi)
Date: Wed, 7 Jan 2009 16:29:31 +0800
Subject: [SciPy-user] Can scipy resolve this problem?
In-Reply-To: <3d375d730901070023p7f1d667cj48b9d2260a9052b1@mail.gmail.com>
References: [...]
Message-ID: <90c482ab0901070029l4480f164p855f1cc617a1e2aa@mail.gmail.com>

Thank you. Can the two functions anneal and brute in scipy solve this problem?

On Wed, Jan 7, 2009 at 4:23 PM, Robert Kern wrote:
> No, there is no combinatorial optimization in scipy. For a problem as
> small as yours, I recommend just doing a brute-force search.

From zhangchipr at gmail.com Wed Jan 7 03:31:49 2009
From: zhangchipr at gmail.com (zhang chi)
Date: Wed, 7 Jan 2009 16:31:49 +0800
Subject: [SciPy-user] Can scipy resolve this problem?
In-Reply-To: References: [...]
Message-ID: <90c482ab0901070031h6fd3a444o12c8620452e9636d@mail.gmail.com>

Thank you, but I only gave an example. In fact there are many points to be computed; if I evaluate every point, it will take me two days to complete this work.

On Wed, Jan 7, 2009 at 4:24 PM, Matthieu Brucher wrote:
> Then perhaps the best course of action is to explicitly test every
> possibility and then take the argmin?
> [...]
> > > > "step" I mean if x1 $\in$ [2,5], the x1 $\in$ [2,3,4,5] > > > > On Wed, Jan 7, 2009 at 4:15 PM, Robert Kern > wrote: > >> > >> On Wed, Jan 7, 2009 at 02:05, zhang chi wrote: > >> > hi > >> > I want to get the minimum value of a derivative free optimization > >> > problem. The function F(x1,x2) can't be given the expression, but the > >> > function can be realized using python language. Where x1 $\in$ > [1,100], > >> > and > >> > x2 $\in$ [50,80]. Can scipy resolve this problem? > >> > >> Use fmin_tnc, fmin_l_bfgs_b, or fmin_slsqp for plain bounds like this. > >> > >> > I have tried the cobyla, > >> > but it cannot find the minimum value. > >> > By the way, the step of x1 and x2 is 1. > >> > >> What do you mean by "the step"? > >> > >> -- > >> Robert Kern > >> > >> "I have come to believe that the whole world is an enigma, a harmless > >> enigma that is made terrible by our own mad attempt to interpret it as > >> though it had an underlying truth." > >> -- Umberto Eco > >> _______________________________________________ > >> SciPy-user mailing list > >> SciPy-user at scipy.org > >> http://projects.scipy.org/mailman/listinfo/scipy-user > > > > > > _______________________________________________ > > SciPy-user mailing list > > SciPy-user at scipy.org > > http://projects.scipy.org/mailman/listinfo/scipy-user > > > > > > > > -- > Information System Engineer, Ph.D. > Website: http://matthieu-brucher.developpez.com/ > Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 > LinkedIn: http://www.linkedin.com/in/matthieubrucher > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Jan 7 03:39:20 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 7 Jan 2009 02:39:20 -0600 Subject: [SciPy-user] Can scipy resolve this problem? In-Reply-To: <90c482ab0901070029l4480f164p855f1cc617a1e2aa@mail.gmail.com> References: <90c482ab0901070005p64179a87wd0966a83016b1ca7@mail.gmail.com> <3d375d730901070015m3047c927r8c4e272b0721e8b0@mail.gmail.com> <90c482ab0901070018s7a2ad967nb8d440cebf411440@mail.gmail.com> <3d375d730901070023p7f1d667cj48b9d2260a9052b1@mail.gmail.com> <90c482ab0901070029l4480f164p855f1cc617a1e2aa@mail.gmail.com> Message-ID: <3d375d730901070039i235bcafdhf8c48d1e995cb5a0@mail.gmail.com> On Wed, Jan 7, 2009 at 02:29, zhang chi wrote: > Thank you, the two function anneal, brute in scipy can resolve this problem? For anneal(), you will have to implement an appropriate annealing schedule that only picks values in your discrete domain, but you have to do it carefully. Note that brute() just loops over all of the possibilities, which you say will take too much time. It is entirely possible that anneal() will take at least as many evaluations as the brute force search, so you should wrap your evaluation function inside another function that will cache the results. If you only have to solve this problem once, just start doing the brute force search now. It will probably take as long to develop a correct annealing schedule as to just exhaustively search. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
From bkomaki at yahoo.com Wed Jan 7 05:19:21 2009
From: bkomaki at yahoo.com (Ch B Komaki)
Date: Wed, 7 Jan 2009 02:19:21 -0800 (PST)
Subject: [SciPy-user] Can scipy resolve this problem?
In-Reply-To: <90c482ab0901070005p64179a87wd0966a83016b1ca7@mail.gmail.com>
Message-ID: <34329.35417.qm@web30408.mail.mud.yahoo.com>

Hi,
You can use SciKits to solve your problem; you can see more here:
http://projects.scipy.org/scipy/scikits/browser/trunk/openopt/scikits/openopt/examples/nlp_1.py
bye

--- On Wed, 1/7/09, zhang chi wrote:
> [...]

From grh at mur.at Wed Jan 7 07:52:15 2009
From: grh at mur.at (Georg Holzmann)
Date: Wed, 07 Jan 2009 13:52:15 +0100
Subject: [SciPy-user] Numpy/SciPy and performance optimizations
Message-ID: <4964A57F.5010104@mur.at>

Hallo!

In the last days I went through some tutorials about Python and performance optimizations; basically the two main articles I looked through are (along with the references in there):
- http://www.scipy.org/PerformancePython
- http://wiki.cython.org/tutorials/numpy

So it seems that there now exist many possibilities to speed up some essential parts of Python code; however, I am still not satisfied with those solutions.

My problem:

I have parts in my projects where I have to iterate over loops (some recursive algorithms). In the past I developed the basic library in C++ (using SWIG to generate Python modules) - but now I want to switch fully to Python and only optimize some small parts, because I waste too much time while trying to extend the C++ library, which is already quite complex ...

Okay, of course weave in combination with blitz looked very attractive to me. After struggling through the documentation of weave and blitz++, I understood the concept and tried to implement an example. One example of such a typical loop would be (all variables are arrays, from numpy import *):

    for n in range(steps):
        x = dot(A, x)
        x += dot(B, u[:, n])
        x = tanh(x)
        y[:, n] = dot(C, r_[x, u[:, n]])

So I need in blitz++ some matrix-vector multiplications and similar stuff, which is unfortunately not very intuitive. One way is to use the blitz::sum function, which is IMHO not intuitive and very slow - slower than usual numpy (see for instance also a benchmark of C/C++ libraries I made last year: http://grh.mur.at/misc/sparselib_benchmark/index.html). Another way would be to use blas and write support code for every needed blas (or maybe also lapack) function - as demonstrated for instance in http://www.math.washington.edu/~jkantor/Numerical_Sage/node14.html. However, this was too much work for me ...

What I want:
- easy embeddable C/C++ code, without having to handle a complicated Python API (like in weave)
- basic matrix operations (blas, maybe also lapack) available in C/C++
- nice indexing, slicing etc. also in C/C++ (which is nice with blitz++)
- handling of sparse matrices also in C/C++ (at least basic blas methods for sparse matrices)

OK, this is quite a big wishlist ;)
However, ATM I can think of two possible solutions:

1. Add some additional header files to weave/blitz, so that it is possible out of the box to have at least blas functions available.

2. Write a new type converter for weave which supports a more feature-rich (and faster) C++ library than blitz++.

I don't know how hard 2. would be? At least I played with quite some C++ libraries last year (see again the benchmark http://grh.mur.at/misc/sparselib_benchmark/index.html) and there would be three nice candidates:
- MTL: http://www.osl.iu.edu/research/mtl/
- gmm++: http://home.gna.org/getfem/gmm_intro
- flens: http://flens.sourceforge.net/
(- maybe also boost ublas: http://www.boost.org/libs/numeric/)

These three libraries are very fast, header-only libs (like blitz++) and also have blas, lapack and sparse support. See also this more general benchmark, which shows advantages of MTL compared to Intel BLAS, blitz, fortran and c: http://projects.opencascade.org/btl/

So, it would be nice to get some feedback - maybe there are other solutions I don't know of? (Maybe it is easier to do all this in Fortran and use f2py?) How do other people optimize more complicated code? I would also be happy to get some remarks on whether it is useful to implement type converters for another C++ library than blitz++ (e.g. MTL or gmm++) - and maybe some suggestions for that ...

Thanks for any hints,
LG
Georg
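For comparison, the weave route mentioned above looks roughly like the following sketch, reduced to just the x = tanh(A*x) part of the loop (it assumes float64 inputs and that the headers pulled in by weave provide tanh; with the blitz type converters, numpy arrays are exposed to the C++ code as blitz::Array objects indexed with parentheses):

    import numpy as np
    from scipy import weave
    from scipy.weave import converters

    def iterate(A, x, steps):
        n = A.shape[0]
        tmp = np.zeros(n)
        code = """
        for (int k = 0; k < steps; ++k) {
            for (int i = 0; i < n; ++i) {
                double s = 0.0;
                for (int j = 0; j < n; ++j)
                    s += A(i, j) * x(j);
                tmp(i) = s;
            }
            for (int i = 0; i < n; ++i)
                x(i) = tanh(tmp(i));
        }
        """
        weave.inline(code, ['A', 'x', 'tmp', 'n', 'steps'],
                     type_converters=converters.blitz)
        return x

The first call pays a compilation cost; subsequent calls reuse the cached extension module.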
From ndbecker2 at gmail.com Wed Jan 7 08:28:20 2009
From: ndbecker2 at gmail.com (Neal Becker)
Date: Wed, 07 Jan 2009 08:28:20 -0500
Subject: [SciPy-user] Numpy/SciPy and performance optimizations
References: <4964A57F.5010104@mur.at>
Message-ID:

Georg Holzmann wrote:
> Hallo!
> [...]
> These three libraries are very fast, header-only libs (like blitz++) and
> also have blas, lapack and sparse support. See also this more general
> benchmark, which shows advantages of MTL compared to Intel BLAS, blitz,
> fortran and c: http://projects.opencascade.org/btl/
I have had best luck with boost::ublas. Limited to 2d though.
blitz is very nice, but has two problems:
- it suffers from lots of old cruft from supporting ancient C++ compilers
- it is poorly maintained - future uncertain, IMO

MTL is moving extremely slowly.

One very active project is eigen. I haven't used it myself.

From sturla at molden.no Wed Jan 7 08:41:06 2009
From: sturla at molden.no (Sturla Molden)
Date: Wed, 07 Jan 2009 14:41:06 +0100
Subject: [SciPy-user] Current status of spatial data structures
In-Reply-To: <200901070915.04411.w.richert@gmx.net>
References: <200901070915.04411.w.richert@gmx.net>
Message-ID: <4964B0F2.2020205@molden.no>

First of all: scipy.spatial.KDTree is better than my version in the Cookbook. Here is what Anne Archibald wrote about it:

"There is now a compiled kd-tree implementation in scipy.spatial. It is written in cython and based on the python implementation. It supports only optionally-bounded, optionally-approximate, k-nearest neighbor queries but runs without any per-point python code. It includes all the algorithmic optimizations described by the ANN authors (sliding midpoint subdivision, multiple-entry leaves, updating minimum-distance calculation, priority search, and short-circuit distance calculations). I think it's pretty good. The major feature it is missing, from what people have asked for, is an all-neighbors query."

Note that 'written in cython' means it is compiled to C.

I did not know of libkdtree++ until recently. It is written in C++ with the dimension statically defined as a template. This is a severe limitation, as a SciPy module would be bloated (even if you limit yourself to, say, d < 22 and single and double precision).

As for C++: I once wrote a version in C++ similar to that in the Cookbook. It ended up being slower than my Python prototype.
Can you demonstrate that libkdtree++ is faster than the Cython-compiled version in SVN?

I have not checked, but I hope the KDTree in scipy.spatial supports pickling or some other form of serialization, e.g. for use with multiprocessing or saving to disk.

The Cookbook KDTree must be changed after the next release. It is not that useful anymore.

Regards,
Sturla Molden

On 1/7/2009 9:15 AM, Willi Richert wrote:
> Hi,
>
> here are some observations regarding the current status of kd-tree
> support in Python:
> [...]

From sturla at molden.no Wed Jan 7 08:48:24 2009
From: sturla at molden.no (Sturla Molden)
Date: Wed, 07 Jan 2009 14:48:24 +0100
Subject: [SciPy-user] Current status of spatial data structures
In-Reply-To: <4964B0F2.2020205@molden.no>
References: <200901070915.04411.w.richert@gmx.net> <4964B0F2.2020205@molden.no>
Message-ID: <4964B2A8.9090502@molden.no>

On 1/7/2009 2:41 PM, Sturla Molden wrote:
> First of all: scipy.spatial.KDTree is better than my version in the
> Cookbook. Here is what Anne Archibald wrote about it:
> [...]

Anne's code is here:
http://svn.scipy.org/svn/scipy/trunk/scipy/spatial/ckdtree.pyx

Sturla Molden
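As a usage note, the scipy.spatial interface described in the quoted paragraph looks roughly like this (a sketch with random data purely for illustration):

    import numpy as np
    from scipy.spatial import KDTree

    data = np.random.rand(1000, 3)                 # 1000 points in 3 dimensions
    tree = KDTree(data)                            # build once ...
    d, i = tree.query(np.random.rand(5, 3), k=4)   # ... query many times
    # d[m, j] is the distance from query point m to its (j+1)-th nearest
    # neighbour; i[m, j] is that neighbour's row index into data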
From grh at mur.at Wed Jan 7 09:23:43 2009
From: grh at mur.at (Georg Holzmann)
Date: Wed, 07 Jan 2009 15:23:43 +0100
Subject: [SciPy-user] Numpy/SciPy and performance optimizations
In-Reply-To: References: <4964A57F.5010104@mur.at>
Message-ID: <4964BAEF.2000108@mur.at>

Hallo!

> I have had best luck with boost::ublas. Limited to 2d though.
> blitz is very nice, but has two problems: [...]

For me the biggest problem with blitz++ is that it is very unintuitive and slow to write blas-like statements.

> MTL is moving extremely slowly.
>
> One very active project is eigen. I haven't used it myself.

Thanks for the hint to eigen (http://eigen.tuxfamily.org/), I did not know this library - it looks very promising (although there are no benchmarks for sparse operations; I should try that!). Do you also know if this library is well maintained? I am now using a very nice C++ lib (flens: http://flens.sourceforge.net/), but with a very unforeseeable future (and many dependencies, hard to build) ...

However, what I mainly wanted to ask was: how do you use boost::ublas or eigen with numpy/scipy - how are both systems combined? For me ATM the optimal way would be to use e.g. eigen in weave like blitz++ is used in weave now ...

Thanks,
LG
Georg

From sturla at molden.no Wed Jan 7 09:58:41 2009
From: sturla at molden.no (Sturla Molden)
Date: Wed, 07 Jan 2009 15:58:41 +0100
Subject: [SciPy-user] parallelizing cKDTree
Message-ID: <4964C321.6090504@molden.no>

Speed is very important when searching kd-trees; otherwise we should not be using kd-trees but brute force. Thus exploiting multiple processors is important as well.

1. Multiprocessing: Must add support for pickling and unpickling to cKDTree (i.e. __reduce__ and __setstate__ methods). This would be useful for saving to disk as well.

2. Multithreading (Python): cKDTree.query calls cKDTree.__query with the GIL released (i.e. a 'with nogil:' block). I think this will be safe.

3. Multithreading (Cython): We could simply call cKDTree.__query in parallel using OpenMP pragmas. It would be a simple and quite portable hack.

Which do you prefer? All three?

(Forgive me for cross-posting. I did not know which list is the more appropriate.)

Regards,
Sturla Molden
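For point 1, the pickling hook could look something like the sketch below. The class is a stand-in, not the actual cKDTree internals: the idea is to pickle only the constructor arguments and rebuild the tree on unpickling, which avoids serializing the node structure itself:

    import numpy as np

    class PicklableTree(object):
        def __init__(self, data, leafsize=10):
            self.data = np.asarray(data)
            self.leafsize = leafsize
            self._build()              # construct the actual tree here

        def _build(self):
            pass                       # placeholder for the real build step

        def __reduce__(self):
            # pickle stores (callable, args); unpickling calls
            # PicklableTree(data, leafsize), which re-runs _build()
            return (self.__class__, (self.data, self.leafsize))

With that in place, multiprocessing can ship the tree to worker processes with plain pickle, and it can be dumped to disk as well.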
From contact at pythonxy.com Wed Jan 7 12:15:40 2009
From: contact at pythonxy.com (Pierre Raybaut)
Date: Wed, 07 Jan 2009 18:15:40 +0100
Subject: [SciPy-user] [python(x,y)] [ Python(x,y) ] New release : 2.1.9
In-Reply-To: References: Message-ID: <4964E33C.10109@pythonxy.com>

From: Stef Mientki
> hi Pierre,
> Did you miss this question?
> [...]
> Isn't VPython-5 still a little buggy and missing features of VPython-3?
> And why only for Windows?
> I would suggest to add both VPython-3 and VPython-5, and use a
> programmatic switch between these two.

From: James Mueller
> Stef,
> VPython 5.0.ReleaseCandidate1 replaces VPython 4.0.beta26. VPython 3 has
> never been in Python(x,y). Given that version 3 relies on Numeric instead
> of Numpy, I am not sure how easy it would be for Pierre to add it in.
>
> -Jim

Stef,

To the best of my knowledge (which is quite limited on this matter, as I'm not personally using this module), there never was a stable version of VPython since v3, which indeed relies on Numeric instead of NumPy, as Jim mentioned. Moreover, v5.0 being a release candidate, I guess that it's intended to be more stable than v4.0, which was a beta release.

Thanks for your interest in Python(x,y),
Cheers,
Pierre

From ellisonbg.net at gmail.com Wed Jan 7 15:00:03 2009
From: ellisonbg.net at gmail.com (Brian Granger)
Date: Wed, 7 Jan 2009 12:00:03 -0800
Subject: [SciPy-user] Multiprocessing, GUIs and IPython
Message-ID: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com>

Hi,

I see that people are starting to use multiprocessing to parallelize numerical Python code. I am wondering if we want to allow/recommend using multiprocessing in scipy. Here are some of my concerns:

* Currently multiprocessing doesn't play well with IPython. Thus, if scipy starts to use multiprocessing, people will get very unpleasant surprises when using IPython. I don't know exactly what the problems are, but my feeling is that it is unlikely that IPython will ever have *full* support for multiprocessing. Some support might be possible, though.

* I have no idea how multiprocessing plays with GUIs. Because multiprocessing uses fork, my gut feeling is that GUIs would not be very happy with multiprocessing. But I imagine that it really depends on what exactly multiprocessing does when it forks. It would be bad if parts of scipy became unusable from a GUI because of multiprocessing.

* Multiprocessing doesn't play well with other things as well, such as Twisted. Again, if scipy uses multiprocessing, it would become unusable within Twisted-based servers.

What experience have others had with using multiprocessing in these contexts? Success? Failure? Based on that, what do other people recommend and think about using multiprocessing in scipy or numpy? I guess this also applies to any other project in this realm (sympy, pymc, ETS, matplotlib, etc., etc.).
Cheers,
Brian

From ndbecker2 at gmail.com Wed Jan 7 15:05:58 2009
From: ndbecker2 at gmail.com (Neal Becker)
Date: Wed, 07 Jan 2009 15:05:58 -0500
Subject: [SciPy-user] Numpy/SciPy and performance optimizations
References: <4964A57F.5010104@mur.at> <4964BAEF.2000108@mur.at>
Message-ID:

Georg Holzmann wrote:
> [...]
> However, what I mainly wanted to ask was: how do you use boost::ublas or
> eigen with numpy/scipy - how are both systems combined?

I haven't really found totally satisfactory solutions here. I used boost::ublas with boost::python 99% of the time, and only sometimes numpy. There are a number of efforts to try to do something better. One that interests me is pyublas. Another interesting thing is cython, which is supposed to be getting support for numpy. Personally I find cython a bit too strange and a bit too C-centric, but it is widely used (all of sage!).

From robince at gmail.com Wed Jan 7 15:21:34 2009
From: robince at gmail.com (Robin)
Date: Wed, 7 Jan 2009 20:21:34 +0000
Subject: [SciPy-user] Multiprocessing, GUIs and IPython
In-Reply-To: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com>
References: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com>
Message-ID:

On Wed, Jan 7, 2009 at 8:00 PM, Brian Granger wrote:
> I see that people are starting to use multiprocessing to parallelize
> numerical Python code. I am wondering if we want to allow/recommend
> using multiprocessing in scipy. [...]

I've used multiprocessing (or actually pyprocessing) a little bit with IPython. The main problem is that you can't use interactively defined functions - i.e. you can't do something like

    p = Pool(8)
    p.map(lambda x: somefunc(x, 2, 3), range(1, 100))

because pickling interactively defined stuff doesn't work in IPython.
Other than that, though, it's been working fine for me (i.e. just make sure anything you are using is defined in a module so pickle works):

    from module import somefunc
    p.map(somefunc, range(1, 100))

although I haven't been trying to do anything too clever (haven't had any plots open or anything like that).

I think it adds a very valuable feature - one that, for a beginner like me, is much easier to get to grips with than MPI or even the clustering features of IPython - to easily allow use of multi-core machines. It would be great if IPython could sort out the pickle business so you could pickle interactively defined functions (they currently don't show up in __main__, which is a FakeModule instance).

Robin

From karl.young at ucsf.edu Wed Jan 7 15:17:26 2009
From: karl.young at ucsf.edu (Young, Karl)
Date: Wed, 7 Jan 2009 12:17:26 -0800
Subject: [SciPy-user] multidimensional wavelet packages
Message-ID: <9D202D4E86A4BF47BA6943ABDF21BE78058FAB12@EXVS06.net.ucsf.edu>

I was just curious whether anyone on the list is aware of any multidimensional wavelet packages (either in Python or with a Python interface) - 3D and 4D is mainly what I'm looking for. I've searched a little and know there has been discussion and some development of wavelet packages for SciPy, but I haven't kept up with that, and it didn't look like there was anything multidimensional currently available or in the works. I don't need anything terribly fancy (e.g. just Haar wavelets would suffice) and can probably hack something, but thought I'd check, both for selfish reasons (lazy!) and because, if I'm going to do any work, contributing to a community effort, should any already exist, is certainly preferable.

Karl Young
Center for Imaging of Neurodegenerative Disease, UCSF
VA Medical Center, MRS Unit (114M)
Phone: (415) 221-4810 x3114
FAX: (415) 668-2864
Email: karl young at ucsf edu

From ellisonbg.net at gmail.com Wed Jan 7 15:35:25 2009
From: ellisonbg.net at gmail.com (Brian Granger)
Date: Wed, 7 Jan 2009 12:35:25 -0800
Subject: [SciPy-user] Multiprocessing, GUIs and IPython
In-Reply-To: References: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com>
Message-ID: <6ce0ac130901071235k37734820x7a75e9db09a3913a@mail.gmail.com>

> I think it adds a very valuable feature - one that, for a beginner like
> me, is much easier to get to grips with than MPI or even the clustering
> features of IPython - to easily allow use of multi-core machines.

I am not questioning that the features of multiprocessing are valuable, I just want to understand what the limitations of using fork are.

> It would be great if IPython could sort out the pickle business so you
> could pickle interactively defined functions (they currently don't
> show up in __main__, which is a FakeModule instance).

Isn't the problem pickle itself, though? It is my understanding that interactive functions can't be pickled, even in regular Python. How does multiprocessing get around this? I am aware of tricks/hacks that make this work, but I would be surprised if multiprocessing was using these. Do interactive functions work with multiprocessing in the standard interactive Python shell?
Cheers,
Brian

From robert.kern at gmail.com Wed Jan 7 15:49:29 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 7 Jan 2009 15:49:29 -0500
Subject: [SciPy-user] Multiprocessing, GUIs and IPython
In-Reply-To: <6ce0ac130901071235k37734820x7a75e9db09a3913a@mail.gmail.com>
References: [...]
Message-ID: <3d375d730901071249r642f7ba4o69fb61b9dfa9a1ea@mail.gmail.com>

On Wed, Jan 7, 2009 at 15:35, Brian Granger wrote:
>> It would be great if IPython could sort out the pickle business so you
>> could pickle interactively defined functions [...]
>
> Isn't the problem pickle itself, though? It is my understanding that
> interactive functions can't be pickled, even in regular Python. How
> does multiprocessing get around this?

In the regular interpreter, the functions are in the __main__ module, which the subprocess inherits (on UNIX, and if the function is defined before forking).

The FakeModule business is really the culprit in IPython. Which is a shame, because the comments for that class lead one to believe that it exists to support pickling.

--
Robert Kern

From ellisonbg.net at gmail.com Wed Jan 7 16:22:39 2009
From: ellisonbg.net at gmail.com (Brian Granger)
Date: Wed, 7 Jan 2009 13:22:39 -0800
Subject: [SciPy-user] Multiprocessing, GUIs and IPython
In-Reply-To: <3d375d730901071249r642f7ba4o69fb61b9dfa9a1ea@mail.gmail.com>
References: [...]
Message-ID: <6ce0ac130901071322y837ead9r49921cdcfa74bc16@mail.gmail.com>

> In the regular interpreter, the functions are in the __main__ module,
> which the subprocess inherits (on UNIX, and if the function is defined
> before forking).
>
> The FakeModule business is really the culprit in IPython. [...]

I am not familiar with this part of IPython, but I will ask Fernando or Ville when I get a chance. Hopefully this could be fixed. But is that the *only* issue that needs to be addressed to use IPython+multiprocessing?

Cheers,
Brian
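A small experiment illustrating the point about __main__: run as a plain script or typed into the stock Python shell (on UNIX, where multiprocessing forks), this works because the function pickles by reference to __main__ and the forked workers inherit that module:

    from multiprocessing import Pool

    def square(x):
        # lives in __main__; the children resolve it there after the fork
        return x * x

    if __name__ == '__main__':
        p = Pool(2)
        print p.map(square, range(10))

Under IPython, the same definition lands in a FakeModule instead, which is where the pickling breaks down.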
From robert.kern at gmail.com Wed Jan 7 16:29:30 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 7 Jan 2009 16:29:30 -0500
Subject: [SciPy-user] Multiprocessing, GUIs and IPython
In-Reply-To: <6ce0ac130901071322y837ead9r49921cdcfa74bc16@mail.gmail.com>
References: [...]
Message-ID: <3d375d730901071329n72ea34f2sb2cb2824b263f57a@mail.gmail.com>

On Wed, Jan 7, 2009 at 16:22, Brian Granger wrote:
> I am not familiar with this part of IPython, but I will ask Fernando or
> Ville when I get a chance. Hopefully this could be fixed. But is that
> the *only* issue that needs to be addressed to use
> IPython+multiprocessing?

There are probably smaller details floating around, but that's the most important one.

WRT multiprocessing and GUIs, we have a wxPython application that starts up a Process (that does not use a GUI) just fine on UNIX and Windows.

But why the sudden interest? And why on this list rather than ipython-devel, where we've discussed these issues before?

--
Robert Kern

From stefan at sun.ac.za Wed Jan 7 16:39:25 2009
From: stefan at sun.ac.za (Stéfan van der Walt)
Date: Wed, 7 Jan 2009 23:39:25 +0200
Subject: [SciPy-user] multidimensional wavelet packages
In-Reply-To: <9D202D4E86A4BF47BA6943ABDF21BE78058FAB12@EXVS06.net.ucsf.edu>
References: [...]
Message-ID: <9457e7c80901071339wac260ep5d30d8e20dad2bff@mail.gmail.com>

Hi Karl

The only Python wavelet library I've ever used is the one at

http://wavelets.scipy.org

It works pretty well, if only for 1D and 2D cases. IIRC, some of the orthogonal wavelet transforms are separable, so you may be able to construct a 3D transform using the 1D functions already implemented.

Regards
Stéfan

2009/1/7 Young, Karl :
> I was just curious whether anyone on the list is aware of any multidimensional wavelet packages (either in Python or with a Python interface) - 3D and 4D is mainly what I'm looking for. I've searched a little and know there has been discussion and some development of wavelet packages for SciPy, but I haven't kept up with that, and it didn't look like there was anything multidimensional currently available or in the works. I don't need anything terribly fancy (e.g. just Haar wavelets would suffice) and can probably hack something, but thought I'd check, both for selfish reasons (lazy!)
> > Karl Young > Center for Imaging of Neurodegenerative Disease, UCSF > VA Medical Center, MRS Unit (114M) > Phone: (415) 221-4810 x3114 > FAX: (415) 668-2864 > Email: karl young at ucsf edu > > > > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From Karl.Young at ucsf.edu Wed Jan 7 16:15:27 2009 From: Karl.Young at ucsf.edu (Karl Young) Date: Wed, 07 Jan 2009 13:15:27 -0800 Subject: [SciPy-user] multidimensional wavelet packages In-Reply-To: <9457e7c80901071339wac260ep5d30d8e20dad2bff@mail.gmail.com> References: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com> <9D202D4E86A4BF47BA6943ABDF21BE78058FAB12@EXVS06.net.ucsf.edu> <9457e7c80901071339wac260ep5d30d8e20dad2bff@mail.gmail.com> Message-ID: <49651B6F.6080001@ucsf.edu> Hi Stefan, Thanks; I'd looked a little at PyWavelets and figured that what you suggest might be what I ended up hacking but thought maybe some enterprising neuroimager (or other person working with 3D, 4D data) might have already done so :-) >Hi Karl > >The only Python wavelet library I've ever used is the one at > >http://wavelets.scipy.org > >It works pretty well, if only for 1D and 2D cases. IIRC, some of the >orthogonal wavelet transforms are separable, so you may be able to >construct a 3D transform using the 1D functions already implemented. > >Regards >St?fan > >2009/1/7 Young, Karl : > > >>I was just curious re. whether anyone on the list is aware of any multidimensional wavelet packages (either in python or with a python interface) - 3D and 4D is mainly what I'm looking for. I've searched a little and know there has been discussion and some development of wavelet packages for SciPy but I haven't kept up with that and it didn't look like their was anything multidimensional currently available or in the works. I don't need anything terribly fancy (e.g. just Haar wavelets would suffice) and can probably hack something but thought I'd check, both for selfish reasons (lazy !) and because if I'm going to do any work, contributing to a community effort should any already exist, is certainly preferable. >> >>Karl Young >>Center for Imaging of Neurodegenerative Disease, UCSF >>VA Medical Center, MRS Unit (114M) >>Phone: (415) 221-4810 x3114 >>FAX: (415) 668-2864 >>Email: karl young at ucsf edu >> >> >> >> >>_______________________________________________ >>SciPy-user mailing list >>SciPy-user at scipy.org >>http://projects.scipy.org/mailman/listinfo/scipy-user >> >> >> >_______________________________________________ >SciPy-user mailing list >SciPy-user at scipy.org >http://projects.scipy.org/mailman/listinfo/scipy-user > > > -- Karl Young Center for Imaging of Neurodegenerative Diseases, UCSF VA Medical Center (114M) Phone: (415) 221-4810 x3114 lab 4150 Clement Street FAX: (415) 668-2864 San Francisco, CA 94121 Email: karl young at ucsf edu From gael.varoquaux at normalesup.org Wed Jan 7 17:39:50 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Wed, 7 Jan 2009 23:39:50 +0100 Subject: [SciPy-user] Multiprocessing, GUIs and IPython In-Reply-To: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com> References: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com> Message-ID: <20090107223950.GA5186@phare.normalesup.org> On Wed, Jan 07, 2009 at 12:00:03PM -0800, Brian Granger wrote: > I see that people are starting to use multiprocessing to parallelize > numerical Python code. 
> I am wondering if we want to allow/recommend using multiprocessing in
> scipy.

Too late! I use it in almost all my code :). OK, none of this is in SciPy, but multiprocessing is starting to creep into various places.

> * Currently multiprocessing doesn't play well with IPython. Thus, if
> scipy starts to use multiprocessing, people will get very unpleasant
> surprises when using IPython. [...]

As Robert points out, that's because of wizardry done by IPython. That's really a pity, because in my experience multiprocessing is fairly robust. Nothing that's not fixable from IPython's side, though, I believe.

> * Multiprocessing doesn't play well with other things as well, such as
> Twisted. Again, if scipy uses multiprocessing, it would become
> unusable within Twisted-based servers.

IMHO that's a bug in Twisted :). More seriously, multiprocessing is now in the standard library. It may have some quirks, but I think everybody should try to play well with it, and I wouldn't be surprised to see things improving as people get familiar with it.

> What experience have others had with using multiprocessing in these
> contexts? Success? Failure?

I have tried every solution for parallel computing, and for single-machine parallel computing multiprocessing is my favorite option. The reason is that its API for spawning and killing processes is really light and quick (fork gives you speed). It does not eat many resources, and it allows sharing of arrays or other types. It implements a very light form of parallel computing which is very much what I need. Moreover, the fork gives automatic distribution of globals, which I like a lot. On the other hand, error management is less than ideal.

I must admit I would really like to see IPython using multiprocessing as a backend for single-computer parallel computing (I have 8 cores, so I do a lot of that). I don't know if it is compatible with IPython's architecture. Specifically, I would like to be able to use the same API as IPython, with a fork-based mechanism. I would also like the easy process management.

> Based on that, what do other people recommend and think about using
> multiprocessing in scipy or numpy? I guess this also applies to any
> other project in this realm (sympy, pymc, ETS, matplotlib, etc., etc.).

I think there are several solutions for parallel computing with Python. Right now they all have pros and cons. We need to strive to support as many as possible. Multiprocessing is especially important since it comes with the standard library.

My 2 cents,

Gaël
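Gaël's point about sharing arrays can be made concrete with multiprocessing.Array plus numpy.frombuffer - a sketch, assuming a UNIX fork so the children inherit the module-level view:

    import numpy as np
    from multiprocessing import Pool, Array

    n = 1000000
    shared = Array('d', n, lock=False)   # raw shared-memory doubles
    a = np.frombuffer(shared)            # numpy view, no copy
    a[:] = np.random.rand(n)

    def chunk_sum(bounds):
        lo, hi = bounds
        return a[lo:hi].sum()            # children see the same buffer

    if __name__ == '__main__':
        p = Pool(4)
        chunks = [(i, i + n // 4) for i in range(0, n, n // 4)]
        print sum(p.map(chunk_sum, chunks))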
I have no problem with people putting in parallel algorithms in addition, with whatever libraries they think are needed. We shouldn't impose any extra dependencies, so we should treat these like we treat the optional plotting helper functions that we have in scipy.stats.morestats. Pretty much all of the parallelizing libraries impose potential incompatibilities; multiprocessing isn't exceptional in this regard.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From stef.mientki at gmail.com Wed Jan 7 18:41:34 2009
From: stef.mientki at gmail.com (Stef Mientki)
Date: Thu, 08 Jan 2009 00:41:34 +0100
Subject: [SciPy-user] [python(x,y)] [ Python(x,y) ] New release : 2.1.9
In-Reply-To: <4964E33C.10109@pythonxy.com>
References: <4964E33C.10109@pythonxy.com>
Message-ID: <49653DAE.8030104@gmail.com>

James, Pierre, thanks for the information. Didn't realize that VPython-3 was using the old Numeric library.

cheers, Stef

Pierre Raybaut wrote:
>> Isn't VPython-5 still a little buggy and missing features of VPython-3 ?
>> And why only for Windows ?
>> I would suggest to add both VPython-3 and VPython-5,
>> and use a programmatic switch between these two.
>>
>> cheers,
>> Stef
>
> Stef,
>
> To the best of my knowledge (which is quite limited on this matter, as
> I'm not personally using this module), there has never been a stable version
> of VPython since v3, which indeed relies on Numeric instead of NumPy, as
> Jim mentioned.
> Moreover, v5.0 being a release candidate, I guess that it's intended to
> be more stable than v4.0, which was a beta release.
>
> Thanks for your interest in Python(x,y),
> Cheers,
> Pierre
>
> --~--~---------~--~----~------------~-------~--~----~
> You received this message because you are subscribed to the Google Groups "python(x,y)" group.
> To post to this group, send email to pythonxy at googlegroups.com
> To unsubscribe from this group, send email to pythonxy+unsubscribe at googlegroups.com
> For more options, visit this group at http://groups.google.com/group/pythonxy?hl=en
> -~----------~----~----~----~------~----~------~--~---

From filipwasilewski at gmail.com Wed Jan 7 19:14:00 2009
From: filipwasilewski at gmail.com (Filip Wasilewski)
Date: Thu, 8 Jan 2009 01:14:00 +0100
Subject: [SciPy-user] multidimensional wavelet packages
In-Reply-To: <49651B6F.6080001@ucsf.edu>
References: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com> <9D202D4E86A4BF47BA6943ABDF21BE78058FAB12@EXVS06.net.ucsf.edu> <9457e7c80901071339wac260ep5d30d8e20dad2bff@mail.gmail.com> <49651B6F.6080001@ucsf.edu>
Message-ID:

Hi Karl,

On Wed, Jan 7, 2009 at 22:15, Karl Young wrote:
>
> Hi Stefan,
>
> Thanks; I'd looked a little at PyWavelets and figured that what you
> suggest might be what I ended up hacking but thought maybe some
> enterprising neuroimager (or other person working with 3D, 4D data)
> might have already done so :-)

I haven't seen a 3D transform implementation in Python, but I can give you some hints on extending PyWavelets.

First of all take a look at the 2D DWT and IDWT implementation at [1]. It follows a standard pattern of transforming rows and then columns with the 1D transform, producing 2D arrays of coefficients.
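For instance, the single-level 2D case can be sketched in a few lines (an illustrative snippet only, not the actual code from [1]):

import numpy
import pywt

def dwt2_sketch(data, wavelet, mode='sym'):
    """Single-level 2D DWT built from the 1D transform: filter the
    rows first, then the columns of each intermediate result."""
    dwt_a = lambda v: pywt.dwt(v, wavelet, mode)[0]  # approximation
    dwt_d = lambda v: pywt.dwt(v, wavelet, mode)[1]  # details
    L = numpy.apply_along_axis(dwt_a, 1, data)   # lowpass over rows
    H = numpy.apply_along_axis(dwt_d, 1, data)   # highpass over rows
    LL = numpy.apply_along_axis(dwt_a, 0, L)     # then over columns
    LH = numpy.apply_along_axis(dwt_d, 0, L)
    HL = numpy.apply_along_axis(dwt_a, 0, H)
    HH = numpy.apply_along_axis(dwt_d, 0, H)
    return LL, (LH, HL, HH)

The same row/column idea is what generalizes to higher dimensions.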
As you may already know, to perform n-dimensional transform one will have to apply 1D transform over each dimension, doubling the number of n-dimensional coefficient arrays with every step (approximation and details coefficients) -- see [2] for a 3D example. Below is a naive implementation of this algorithm. As you can see it is very short and even seems to work (I have verified results for 2d case only), but unfortunately it has several major drawbacks in the recursive approach (worst possible memory management and twice the necessary computations because of PyWavelets missing true downcoef_a and downcoef_d functions for use with apply_along_axis[3]). I think it could be converted into something like the dwt2 from [1] with freeing intermediate arrays, but I guess the resulting code may become very complex, so the solution with apply_along_axis is still very attractive (it only needs adding optimized downcoef_a and downcoef_d functions to PyWavelets and converting recursion into iteration to better handle memory usage). Let me know if you come out with a more optimal solution, so if you agree I could include it in PyWavelets. Hope that will help you with n-dimensional implementation. [1] http://projects.scipy.org/wavelets/browser/pywt/trunk/pywt/multidim.py [2] http://taco.poly.edu/WaveletSoftware/standard3D.html [3] http://docs.scipy.org/doc/numpy/reference/generated/numpy.apply_along_axis.html #!/usr/bin/env python # Author: Filip Wasilewski # Licence: Public Domain import numpy import pywt # Helpers for numpy.apply_along_axis, which expects a 1D array # as the function output def downcoef_a(*args, **kwargs): """Returns DWT approximation coeffs.""" return pywt.dwt(*args, **kwargs)[0] def downcoef_d(*args, **kwargs): """Returns DWT details coeffs.""" return pywt.dwt(*args, **kwargs)[1] def dwt_n(data, wavelet, mode='sym', axis=0, subband=''): """N-dimensional Discrete Wavelet Transform Note: This is a proof of concept with worst possible memory usage characteristic. """ dim = len(data.shape) if axis < dim: cA = numpy.apply_along_axis(downcoef_a, axis, data, wavelet, mode) cD = numpy.apply_along_axis(downcoef_d, axis, data, wavelet, mode) return (dwt_n(cA, wavelet, mode, axis+1, subband=subband+'L'), dwt_n(cD, wavelet, mode, axis+1, subband=subband+'H')) else: return (subband, data) # (subband name, coeffs) if __name__ == '__main__': import pprint x = numpy.ones((4, 4, 4, 4)) # 4D array result = dwt_n(x, 'db1') pprint.pprint(result) Filip Wasilewski -- http://www.linkedin.com/in/filipwasilewski >>Hi Karl >> >>The only Python wavelet library I've ever used is the one at >> >>http://wavelets.scipy.org >> >>It works pretty well, if only for 1D and 2D cases. IIRC, some of the >>orthogonal wavelet transforms are separable, so you may be able to >>construct a 3D transform using the 1D functions already implemented. >> >>Regards >>St?fan >> >>2009/1/7 Young, Karl : >> >> >>>I was just curious re. whether anyone on the list is aware of any multidimensional wavelet packages (either in python or with a python interface) - 3D and 4D is mainly what I'm looking for. I've searched a little and know there has been discussion and some development of wavelet packages for SciPy but I haven't kept up with that and it didn't look like their was anything multidimensional currently available or in the works. I don't need anything terribly fancy (e.g. just Haar wavelets would suffice) and can probably hack something but thought I'd check, both for selfish reasons (lazy !) 
and because if I'm going to do any work, contributing to a community effort should any already exist, is certainly preferable. >>> From Karl.Young at ucsf.edu Wed Jan 7 20:29:16 2009 From: Karl.Young at ucsf.edu (Karl Young) Date: Wed, 07 Jan 2009 17:29:16 -0800 Subject: [SciPy-user] multidimensional wavelet packages In-Reply-To: References: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com> <9D202D4E86A4BF47BA6943ABDF21BE78058FAB12@EXVS06.net.ucsf.edu> <9457e7c80901071339wac260ep5d30d8e20dad2bff@mail.gmail.com> <49651B6F.6080001@ucsf.edu> Message-ID: <496556EC.7020401@ucsf.edu> Hi Filip, Thanks much (and thanks for the original package); I will go through the code and let you know if I come up with anything that would be worth incorporating (or let you know that your suggested addition works fine and should be added as is). >Hi Karl, > >On Wed, Jan 7, 2009 at 22:15, Karl Young wrote: > > >>Hi Stefan, >> >>Thanks; I'd looked a little at PyWavelets and figured that what you >>suggest might be what I ended up hacking but thought maybe some >>enterprising neuroimager (or other person working with 3D, 4D data) >>might have already done so :-) >> >> > >I haven't seen a 3D transform implementation in Python, but I can give >you some hints on extending PyWavelets. > >First of all take a look at 2D DWT and IDWT implementation at [1]. It >follows a standard pattern of transforming rows and then columns using >1D transform and producing 2D arrays of coefficients. > >As you may already know, to perform n-dimensional transform one will >have to apply 1D transform over each dimension, doubling the number of >n-dimensional coefficient arrays with every step (approximation and >details coefficients) -- see [2] for a 3D example. > >Below is a naive implementation of this algorithm. As you can see it >is very short and even seems to work (I have verified results for 2d >case only), but unfortunately it has several major drawbacks in the >recursive approach (worst possible memory management and twice the >necessary computations because of PyWavelets missing true downcoef_a >and downcoef_d functions for use with apply_along_axis[3]). > >I think it could be converted into something like the dwt2 from [1] >with freeing intermediate arrays, but I guess the resulting code may >become very complex, so the solution with apply_along_axis is still >very attractive (it only needs adding optimized downcoef_a and >downcoef_d functions to PyWavelets and converting recursion into >iteration to better handle memory usage). > >Let me know if you come out with a more optimal solution, so if you >agree I could include it in PyWavelets. > >Hope that will help you with n-dimensional implementation. 
> >[1] http://projects.scipy.org/wavelets/browser/pywt/trunk/pywt/multidim.py >[2] http://taco.poly.edu/WaveletSoftware/standard3D.html >[3] http://docs.scipy.org/doc/numpy/reference/generated/numpy.apply_along_axis.html > > >#!/usr/bin/env python ># Author: Filip Wasilewski ># Licence: Public Domain > >import numpy >import pywt > ># Helpers for numpy.apply_along_axis, which expects a 1D array ># as the function output >def downcoef_a(*args, **kwargs): > """Returns DWT approximation coeffs.""" > return pywt.dwt(*args, **kwargs)[0] > >def downcoef_d(*args, **kwargs): > """Returns DWT details coeffs.""" > return pywt.dwt(*args, **kwargs)[1] > > >def dwt_n(data, wavelet, mode='sym', axis=0, subband=''): > """N-dimensional Discrete Wavelet Transform > > Note: This is a proof of concept with worst possible memory usage > characteristic. > """ > dim = len(data.shape) > if axis < dim: > cA = numpy.apply_along_axis(downcoef_a, axis, data, wavelet, mode) > cD = numpy.apply_along_axis(downcoef_d, axis, data, wavelet, mode) > return (dwt_n(cA, wavelet, mode, axis+1, subband=subband+'L'), > dwt_n(cD, wavelet, mode, axis+1, subband=subband+'H')) > else: > return (subband, data) # (subband name, coeffs) > >if __name__ == '__main__': > import pprint > x = numpy.ones((4, 4, 4, 4)) # 4D array > result = dwt_n(x, 'db1') > pprint.pprint(result) > > > >Filip Wasilewski > > -- Karl Young Center for Imaging of Neurodegenerative Diseases, UCSF VA Medical Center (114M) Phone: (415) 221-4810 x3114 lab 4150 Clement Street FAX: (415) 668-2864 San Francisco, CA 94121 Email: karl young at ucsf edu From ellisonbg.net at gmail.com Wed Jan 7 22:28:11 2009 From: ellisonbg.net at gmail.com (Brian Granger) Date: Wed, 7 Jan 2009 19:28:11 -0800 Subject: [SciPy-user] Multiprocessing, GUIs and IPython In-Reply-To: <3d375d730901071329n72ea34f2sb2cb2824b263f57a@mail.gmail.com> References: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com> <6ce0ac130901071235k37734820x7a75e9db09a3913a@mail.gmail.com> <3d375d730901071249r642f7ba4o69fb61b9dfa9a1ea@mail.gmail.com> <6ce0ac130901071322y837ead9r49921cdcfa74bc16@mail.gmail.com> <3d375d730901071329n72ea34f2sb2cb2824b263f57a@mail.gmail.com> Message-ID: <6ce0ac130901071928u78b9bffbo5d1e52e392742fb2@mail.gmail.com> >>> The FakeModule business is really the culprit in IPython. Which is a >>> shame, because the comments for that class lead one to believe that it >>> exists to support pickling. >> >> I am not familiar with this part of IPython, but I will ask Fernando >> or Ville when I get a chance. Hopefully this could be fixed. But is >> that the *only* issue that has needs to be addressed to use >> IPython+multiprocessing? > There are probably smaller details floating around, but that's the > most important one. OK, that is good to know. > WRT multiprocessing and GUIs, we have a wxPython application that > starts up a Process (that does not use a GUI) just fine on UNIX and > Windows. But do higher level things Pool.map work? > But why the sudden interest? And why on this list rather than > ipython-devel where we've discussed these issues before? Mostly because of the recent thread on one of the scipy lists asking about how to parallelize the kdtree code *in scipy*. One option mentioned was multiprocessing. Agreed though, the IPython specific stuff should be discussed on ipython-dev. 
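For concreteness, the kind of usage that came up there is roughly the following (a hypothetical sketch only -- query_chunk is a stand-in for the real per-chunk kdtree work, not actual scipy code):

from multiprocessing import Pool
import numpy

def query_chunk(points):
    # stand-in for the real work, e.g. querying a kdtree with a batch
    return points.sum(axis=1)

if __name__ == '__main__':
    data = numpy.random.rand(10000, 3)
    chunks = numpy.array_split(data, 4)  # one batch per worker process
    pool = Pool(processes=4)
    results = pool.map(query_chunk, chunks)
    pool.close()
    pool.join()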
Cheers,

Brian

From robert.kern at gmail.com Wed Jan 7 22:36:16 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 7 Jan 2009 21:36:16 -0600
Subject: [SciPy-user] Multiprocessing, GUIs and IPython
In-Reply-To: <6ce0ac130901071928u78b9bffbo5d1e52e392742fb2@mail.gmail.com>
References: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com> <6ce0ac130901071235k37734820x7a75e9db09a3913a@mail.gmail.com> <3d375d730901071249r642f7ba4o69fb61b9dfa9a1ea@mail.gmail.com> <6ce0ac130901071322y837ead9r49921cdcfa74bc16@mail.gmail.com> <3d375d730901071329n72ea34f2sb2cb2824b263f57a@mail.gmail.com> <6ce0ac130901071928u78b9bffbo5d1e52e392742fb2@mail.gmail.com>
Message-ID: <3d375d730901071936y68ddae00j86cca61588aa7fe2@mail.gmail.com>

On Wed, Jan 7, 2009 at 21:28, Brian Granger wrote:
>> WRT multiprocessing and GUIs, we have a wxPython application that
>> starts up a Process (that does not use a GUI) just fine on UNIX and
>> Windows.
>
> But do higher level things Pool.map work?

Queues certainly do. I don't know of any reason why Pool.map() wouldn't.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From ellisonbg.net at gmail.com Wed Jan 7 23:00:25 2009
From: ellisonbg.net at gmail.com (Brian Granger)
Date: Wed, 7 Jan 2009 20:00:25 -0800
Subject: [SciPy-user] Multiprocessing, GUIs and IPython
In-Reply-To: <20090107223950.GA5186@phare.normalesup.org>
References: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com> <20090107223950.GA5186@phare.normalesup.org>
Message-ID: <6ce0ac130901072000i36f54cffx9ff6cd5c8b8de9df@mail.gmail.com>

> Too late! I use it in almost all code :).

I don't care if you use multiprocessing in your own code - I am thinking only about numpy/scipy here.

> OK, none of this is in Scipy,
> but multiprocessing is starting to creep in various places.

> As Robert points out, that's because of wizardry done by IPython. That's
> really a pity, because in my experience, multiprocessing is fairly
> robust. Nothing that's not fixable from IPython's side, though, I
> believe.

Yes, a bug report should be added to IPython's launchpad site about this.

>> * Multiprocessing doesn't play well with other things as well, such as
>> Twisted. Again, if scipy uses multiprocessing, it would become
>> unusable within Twisted based servers.
>
> IMHO that's a bug of Twisted :).

Then please file a bug report with Twisted :) More seriously, Twisted has been around *a bit* longer than multiprocessing and is much better tested, in both the unittest sense and in the real world sense. The informal word from the Twisted community is that there are fundamental incompatibilities between Twisted and multiprocessing and that in no way are these incompatibilities in the "Twisted bug category." But, I do hope these things are eventually worked out.

> More seriously, multiprocessing is now
> in the standard library. It may have some quirks, but I think everybody
> should try and play well with it, and I wouldn't be surprised to see
> things improving as people get familiar with it.

Yes, because it is in the standard library, we should all try to play well with it. And I do hope things improve. However, multiprocessing's implementation (as I understand it) carries some strong constraints that exclude certain potential friends (like Twisted).
> I must admit I would really like to see IPython using multiprocessing as > a backend for single-computer parallel computing (I have 8 cores, so I do > a lot of that). I don't know if it is compatible with IPython's > architecture. Specifically, I would like to be able to use the same API > than IPython, with a fork-based mechanism. I would also like the easy > process management. Because of multiprocessing's inability to play well with Twisted, this exact thing probably won't happen - at least anytime soon. However, it is very possible that IPython might have a multiprocessing-like API. From bayer.justin at googlemail.com Thu Jan 8 05:40:48 2009 From: bayer.justin at googlemail.com (Justin Bayer) Date: Thu, 8 Jan 2009 11:40:48 +0100 Subject: [SciPy-user] Swig and Numpy arrays Message-ID: Hi group, I am currently trying to connect a C++ library of mine via SWIG to Python/Scipy. I have several classes that have methods which expect a double* as an argument of which the length is known by the object. So what I want to do is to connect a method with the signature (double* array) to a Numpy array. I had a look at numpy.i and its typemaps, but it seems that only typemaps are supplied which also deal with such bound checking behaviour in the signature. As I said, the bounds are held in a field of the object. What is the best way to get around this? I am fairly new to swig and wanted to know if somebody else has already encountered this problem. Regards, -Justin From matthieu.brucher at gmail.com Thu Jan 8 05:47:52 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Thu, 8 Jan 2009 11:47:52 +0100 Subject: [SciPy-user] Swig and Numpy arrays In-Reply-To: References: Message-ID: Hi, In fact numpy typemaps extract the size of the array, so if I understand correctly, this is what you don't want. So you only have to delete this part of the typemap. Be aware that you will not have any size checks anymore, but you still could extract the size, compare it with your memorized size. Matthieu 2009/1/8 Justin Bayer : > Hi group, > > I am currently trying to connect a C++ library of mine via SWIG to > Python/Scipy. I have several classes that have methods which expect a > double* as an argument of which the length is known by the object. > > So what I want to do is to connect a method with the signature > (double* array) to a Numpy array. I had a look at numpy.i and its > typemaps, but it seems that only typemaps are supplied which also deal > with such bound checking behaviour in the signature. As I said, the > bounds are held in a field of the object. > > What is the best way to get around this? I am fairly new to swig and > wanted to know if somebody else has already encountered this problem. > > > Regards, > -Justin > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -- Information System Engineer, Ph.D. 
Website: http://matthieu-brucher.developpez.com/
Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn: http://www.linkedin.com/in/matthieubrucher

From cimrman3 at ntc.zcu.cz Thu Jan 8 06:29:03 2009
From: cimrman3 at ntc.zcu.cz (Robert Cimrman)
Date: Thu, 08 Jan 2009 12:29:03 +0100
Subject: [SciPy-user] Multiprocessing, GUIs and IPython
In-Reply-To: <20090107223950.GA5186@phare.normalesup.org>
References: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com> <20090107223950.GA5186@phare.normalesup.org>
Message-ID: <4965E37F.8040202@ntc.zcu.cz>

Gael Varoquaux wrote:
> On Wed, Jan 07, 2009 at 12:00:03PM -0800, Brian Granger wrote:
>> I see that people are starting to use multiprocessing to
>> parallelize numerical Python code. I am wondering if we want to
>> allow/recommend using multiprocessing in scipy.
>
> Too late! I use it in almost all code :). OK, none of this is in
> Scipy, but multiprocessing is starting to creep in various places.
> ...
> I must admit I would really like to see IPython using
> multiprocessing as a backend for single-computer parallel computing
> (I have 8 cores, so I do a lot of that). I don't know if it is
> compatible with IPython's architecture. Specifically, I would like to
> be able to use the same API as IPython, with a fork-based
> mechanism. I would also like the easy process management.

+1. With multiprocessing I have finally been able to resolve the problem of showing and updating matplotlib plots when doing a long computation with sfepy - the application feeds data to a Log class, which sends them via a pipe to another process that plots the data as they arrive.

To conclude, in my application it plays well with a GUI (GTKAgg), and I certainly would use it in relevant algorithms in scipy if someone is willing to implement it.

r.

From bayer.justin at googlemail.com Thu Jan 8 07:18:16 2009
From: bayer.justin at googlemail.com (Justin Bayer)
Date: Thu, 8 Jan 2009 13:18:16 +0100
Subject: [SciPy-user] Swig and Numpy arrays
In-Reply-To:
References:
Message-ID:

> In fact numpy typemaps extract the size of the array, so if I
> understand correctly, this is what you don't want. So you only have to
> delete this part of the typemap.

Is there an elegant way to do this while reusing as much functionality of numpy.i as possible?

I tried to just make my own typemap for this purpose and also a typemaps file, but moved it out of the "fragment". Now some functions which are defined in a numpy fragment are missing. %fragment seems to be a fairly underdocumented feature of swig, and I don't know how to elegantly get access to those functions except copy-pasting them somewhere, which gives me the shivers.

> Be aware that you will not have any size checks anymore, but you still
> could extract the size, compare it with your memorized size.
>
> Matthieu
>
> 2009/1/8 Justin Bayer :
>> Hi group,
>>
>> I am currently trying to connect a C++ library of mine via SWIG to
>> Python/Scipy. I have several classes that have methods which expect a
>> double* as an argument of which the length is known by the object.
>>
>> So what I want to do is to connect a method with the signature
>> (double* array) to a Numpy array. I had a look at numpy.i and its
>> typemaps, but it seems that only typemaps are supplied which also deal
>> with such bound checking behaviour in the signature. As I said, the
>> bounds are held in a field of the object.
>>
>> What is the best way to get around this?
>> I am fairly new to swig and wanted to know if somebody else has
>> already encountered this problem.
>>
>> Regards,
>> -Justin
>> _______________________________________________
>> SciPy-user mailing list
>> SciPy-user at scipy.org
>> http://projects.scipy.org/mailman/listinfo/scipy-user

--
P.S.: No Dogs!

From matthieu.brucher at gmail.com Thu Jan 8 07:41:08 2009
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Thu, 8 Jan 2009 13:41:08 +0100
Subject: [SciPy-user] Swig and Numpy arrays
In-Reply-To:
References:
Message-ID:

2009/1/8 Justin Bayer :
>> In fact numpy typemaps extract the size of the array, so if I
>> understand correctly, this is what you don't want. So you only have to
>> delete this part of the typemap.
>
> Is there an elegant way to do this while reusing as much functionality
> of numpy.i as possible?
>
> I tried to just make my own typemap for this purpose and also a
> typemaps file, but moved it out of the "fragment". Now some functions which
> are defined in a numpy fragment are missing. %fragment seems to be a
> fairly underdocumented feature of swig, and I don't know how to
> elegantly get access to those functions except copy-pasting them
> somewhere, which gives me the shivers.

You will have to copy and paste the typemaps.

The other solution is to create a new method with SWIG that will have additional parameters. The drawback is that you will have an additional routine level, but there are several advantages: you will use numpy.i, you can add checks inside your custom method, ...

Matthieu
--
Information System Engineer, Ph.D.
Website: http://matthieu-brucher.developpez.com/
Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn: http://www.linkedin.com/in/matthieubrucher

From bayer.justin at googlemail.com Thu Jan 8 07:53:50 2009
From: bayer.justin at googlemail.com (Justin Bayer)
Date: Thu, 8 Jan 2009 13:53:50 +0100
Subject: [SciPy-user] Swig and Numpy arrays
In-Reply-To:
References:
Message-ID:

> The other solution is to create a new method with SWIG that will have
> additional parameters. The drawback is that you will have an
> additional routine level, but there are several advantages: you will
> use numpy.i, you can add checks inside your custom method, ...

This sounds more interesting to me now. What are you referring to exactly? I skimmed around in the docs and examples but did not really find something like that.

From matthieu.brucher at gmail.com Thu Jan 8 08:10:46 2009
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Thu, 8 Jan 2009 14:10:46 +0100
Subject: [SciPy-user] Swig and Numpy arrays
In-Reply-To:
References:
Message-ID:

The easiest thing to do would be to check the numpy ML (which is more appropriate for numpy arrays ;)) for the thread ;)

You might just have to create a new method through the %extend feature, but you will have to check.

Matthieu

2009/1/8 Justin Bayer :
>> The other solution is to create a new method with SWIG that will have
>> additional parameters.
The drawback is that you will have an >> additional routine level, but there are several advantages: you will >> use numpy.i, you can add checks inside your custom method, ... > > This sounds more interesting to me now. What are you referring to > exactly? I skimmed around in the docs and examples but did not really > find something like that. > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From david at ar.media.kyoto-u.ac.jp Thu Jan 8 10:19:05 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 09 Jan 2009 00:19:05 +0900 Subject: [SciPy-user] Scipy 0.7, weave, windows Message-ID: <49661969.40905@ar.media.kyoto-u.ac.jp> Hi, I just did a full build/install/test dance of scipy 0.7 on windows, and things look good - except weave, which brings 205 errors when the full test suite is run. Do people use weave on windows ? I would think not many, because we discovered with Stefan some weave functions using python code not available at least since python 2.4, but I would like to make sure. Otherwise, I believe we will finally be able to release scipy 0.7, almost one year and a half after 0.6 :) thanks, David From daniel.wheeler2 at gmail.com Thu Jan 8 10:49:29 2009 From: daniel.wheeler2 at gmail.com (Daniel Wheeler) Date: Thu, 8 Jan 2009 10:49:29 -0500 Subject: [SciPy-user] Scipy 0.7, weave, windows In-Reply-To: <49661969.40905@ar.media.kyoto-u.ac.jp> References: <49661969.40905@ar.media.kyoto-u.ac.jp> Message-ID: <80b160a0901080749r3d419de0vf9c7dd65c508ec31@mail.gmail.com> On Thu, Jan 8, 2009 at 10:19 AM, David Cournapeau wrote: > Hi, > > I just did a full build/install/test dance of scipy 0.7 on windows, > and things look good - except weave, which brings 205 errors when the > full test suite is run. Do people use weave on windows ? Yes. Our test suite for fipy currently passes all it's weave tests on windows with python 2.5 and scipy version 0.6.0 and that includes a lot of auto generated weave code. Cheers -- Daniel Wheeler From josef.pktd at gmail.com Thu Jan 8 10:59:20 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 8 Jan 2009 10:59:20 -0500 Subject: [SciPy-user] Scipy 0.7, weave, windows In-Reply-To: <49661969.40905@ar.media.kyoto-u.ac.jp> References: <49661969.40905@ar.media.kyoto-u.ac.jp> Message-ID: <1cd32cbb0901080759q3e017e1fg6402f7b87aa773b2@mail.gmail.com> On Thu, Jan 8, 2009 at 10:19 AM, David Cournapeau wrote: > Hi, > > I just did a full build/install/test dance of scipy 0.7 on windows, > and things look good - except weave, which brings 205 errors when the > full test suite is run. Do people use weave on windows ? I would think > not many, because we discovered with Stefan some weave functions using > python code not available at least since python 2.4, but I would like to > make sure. 
> > Otherwise, I believe we will finally be able to release scipy 0.7, > almost one year and a half after 0.6 :) > > thanks, > > David > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Thu Jan 8 11:13:24 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 8 Jan 2009 11:13:24 -0500 Subject: [SciPy-user] Scipy 0.7, weave, windows In-Reply-To: <1cd32cbb0901080759q3e017e1fg6402f7b87aa773b2@mail.gmail.com> References: <49661969.40905@ar.media.kyoto-u.ac.jp> <1cd32cbb0901080759q3e017e1fg6402f7b87aa773b2@mail.gmail.com> Message-ID: <1cd32cbb0901080813y7b1a2d0csd71217d8d892d9e9@mail.gmail.com> On Thu, Jan 8, 2009 at 10:59 AM, wrote: > On Thu, Jan 8, 2009 at 10:19 AM, David Cournapeau > wrote: >> Hi, >> >> I just did a full build/install/test dance of scipy 0.7 on windows, >> and things look good - except weave, which brings 205 errors when the >> full test suite is run. Do people use weave on windows ? I would think >> not many, because we discovered with Stefan some weave functions using >> python code not available at least since python 2.4, but I would like to >> make sure. >> >> Otherwise, I believe we will finally be able to release scipy 0.7, >> almost one year and a half after 0.6 :) >> >> thanks, >> >> David >> _______________________________________________ >> SciPy-user mailing list >> SciPy-user at scipy.org >> http://projects.scipy.org/mailman/listinfo/scipy-user >> > (hit wrong button) WindowsXP, MingW, SWIG Version 1.3.36 (Compiled with i586-mingw32msvc-g++ [i686-pc-linux-gnu]) When testing weave with 'full', weave usually looks pretty good. It leaves a lot of temp files behind, but I don't get any failures or errors. 
(after test with cout crash is removed) the skips are the wxpython tests, I don't know what the other two knownfail are >>> import scipy.weave >>> scipy.weave.test('full') Running unit tests for scipy.weave NumPy version 1.3.0.dev6139 NumPy is installed in C:\Programs\Python25\lib\site-packages\numpy SciPy version 0.7.0.dev # 5286 SciPy is installed in C:\Programs\Python25\lib\site-packages\scipy Python version 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Int el)] nose version 0.10.4 ------------------------------------------------------------- Ran 449 tests in 827.922s OK (KNOWNFAIL=3, SKIP=7) the log contains these error messages, but they don't cause a failure ..................................................error removing c:\docume~1\car r\locals~1\temp\tmpjqj0tkcat_test: c:\docume~1\carr\locals~1\temp\tmpjqj 0tkcat_test: The directory is not empty ..building extensions here: c:\docume~1\carr\locals~1\temp\Carr\python25 _compiled\m61 ......K c:\docume~1\carr\locals~1\temp\Carr\python25_compiled\m61\sc_2b01bfa9cce 5c43d4c49a1d7e13f43d21.cpp: In function `PyObject* compiled_func(PyObject*, PyOb ject*)': c:\docume~1\carr\locals~1\temp\Carr\python25_compiled\m61\sc_2b01bfa9cce 5c43d4c49a1d7e13f43d21.cpp:664: error: no match for 'operator<' in 'a < 2' c:\docume~1\carr\locals~1\temp\Carr\python25_compiled\m61\sc_2b01bfa9cce 5c43d4c49a1d7e13f43d21.cpp:668: error: no match for 'operator+' in 'a + 1' Josef From cournape at gmail.com Thu Jan 8 11:15:36 2009 From: cournape at gmail.com (David Cournapeau) Date: Fri, 9 Jan 2009 01:15:36 +0900 Subject: [SciPy-user] Scipy 0.7, weave, windows In-Reply-To: <80b160a0901080749r3d419de0vf9c7dd65c508ec31@mail.gmail.com> References: <49661969.40905@ar.media.kyoto-u.ac.jp> <80b160a0901080749r3d419de0vf9c7dd65c508ec31@mail.gmail.com> Message-ID: <5b8d13220901080815r1eb9b82r19c93a56e79e559b@mail.gmail.com> On Fri, Jan 9, 2009 at 12:49 AM, Daniel Wheeler wrote: > On Thu, Jan 8, 2009 at 10:19 AM, David Cournapeau > wrote: >> Hi, >> >> I just did a full build/install/test dance of scipy 0.7 on windows, >> and things look good - except weave, which brings 205 errors when the >> full test suite is run. Do people use weave on windows ? > > Yes. Our test suite for fipy currently passes all it's weave tests on > windows with python 2.5 and scipy version 0.6.0 and that includes a > lot of auto generated weave code. Thanks for the info. Would you mind testing it with scipy 0.7.x branch ? There are some test failures which showed some old code which could not have worked (like using python code which was removed from python svn 5 years ago), but as I am not a weave user myself, I can't really assess what's significant and what's not. I could make a binary installer if that makes it easier for you to test, David From cournape at gmail.com Thu Jan 8 12:37:38 2009 From: cournape at gmail.com (David Cournapeau) Date: Fri, 9 Jan 2009 02:37:38 +0900 Subject: [SciPy-user] Scipy 0.7, weave, windows In-Reply-To: <1cd32cbb0901080813y7b1a2d0csd71217d8d892d9e9@mail.gmail.com> References: <49661969.40905@ar.media.kyoto-u.ac.jp> <1cd32cbb0901080759q3e017e1fg6402f7b87aa773b2@mail.gmail.com> <1cd32cbb0901080813y7b1a2d0csd71217d8d892d9e9@mail.gmail.com> Message-ID: <5b8d13220901080937p5a63610drb0afa3132433740e@mail.gmail.com> On Fri, Jan 9, 2009 at 1:13 AM, wrote: > > When testing weave with 'full', weave usually looks pretty good. It > leaves a lot of temp files behind, but I don't get any failures or > errors. (after test with cout crash is removed) Hm, strange. 
I tried on another machine, and I still get a lot of those failures... Not the same though. Which compilers have you installed on your computer ? Do you have any MS compilers installed ?

David

From timmichelsen at gmx-topmail.de Thu Jan 8 12:40:58 2009
From: timmichelsen at gmx-topmail.de (Timmie)
Date: Thu, 8 Jan 2009 17:40:58 +0000 (UTC)
Subject: [SciPy-user] converting hourly series to annual unneccessaryly masks data
Message-ID:

Hello,
I would like to build an average over an hourly timeseries stretching over more than one year. I converted it to annual and now a lot of the data got masked.

In [83]: test = ts.time_series(np.arange(17520), start_date=ts.now('H'))

In [84]: test
Out[84]:
timeseries([ 0 1 2 ..., 17517 17518 17519],
   dates = [08-Jan-2009 18:00 ... 08-Jan-2011 17:00],
   freq = H)

In [85]: test = ts.time_series(np.arange(17520), start_date=ts.now('H'))

In [86]: atest = test.convert('A')

In [87]: test
Out[87]:
timeseries([ 0 1 2 ..., 17517 17518 17519],
   dates = [08-Jan-2009 18:00 ... 08-Jan-2011 17:00],
   freq = H)

In [88]: atest
Out[88]:
timeseries(
 [[-- -- -- ..., -- -- --]
 [8574 8575 8576 ..., -- -- --]
 [17334 17335 17336 ..., -- -- --]],
   dates =
 [2009 ... 2011],
   freq = A-DEC)

I entered it at:
http://scipy.org/scipy/scikits/ticket/84

I'd be glad to receive a comment on what is happening here.

Thanks in advance.
Timmie

From pgmdevlist at gmail.com Thu Jan 8 13:03:05 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Thu, 8 Jan 2009 13:03:05 -0500
Subject: [SciPy-user] converting hourly series to annual unneccessaryly masks data
In-Reply-To:
References:
Message-ID:

Timmie,
The documentation is still a bit lacking, sorry. Still, in the docstring of convert, you can see that if you don't specify a func input parameter, the series is converted to 2D, as stated:
`
If ``func`` is not given, the output series group the points of the
initial series that share the same new date. For example, if the
initial series has a daily frequency and is 1D, the output series is
2D.
`
In your case, each line corresponds to a year, and each column to one given hour, starting at 01/01-01:00 (or 00:00, I can't remember right now). Check the shape of your atest variable:

>>> atest.shape
(3, 8784)

Note that 8784 = 366*24: we actually use years of 366 days in that case, to take leap years into account.

The missing data you observe comes from the facts that:
1. You're not starting at 01/01-00:00, but 8 days later
2. We are using this 366d year: as there are no leap years in your range of years, the last 24 data points of each line will be masked.
3. You don't finish at 12/31-23:00, but (365-8) days earlier.

So all is well and works as expected (developer-wise), no need for a ticket (good reflex, though). Now, of course, you need to tell us what you were expecting, and what kind of average you wanted to calculate.

> test = ts.time_series(np.arange(17520), start_date=ts.now('H'))
> atest = test.convert('A')
>
> In [87]: test
> Out[87]:
> timeseries([ 0 1 2 ..., 17517 17518 17519],
> dates = [08-Jan-2009 18:00 ... 08-Jan-2011 17:00],
> freq = H)
>
> In [88]: atest
> Out[88]:
> timeseries(
> [[-- -- -- ..., -- -- --]
> [8574 8575 8576 ..., -- -- --]
> [17334 17335 17336 ..., -- -- --]],
> dates =
> [2009 ... 2011],
> freq = A-DEC)
>
> I entered it at:
> http://scipy.org/scipy/scikits/ticket/84
>
> I'd be glad to receive a comment on what is happening here.
>
> Thanks in advance.
> Timmie
>
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user

From josef.pktd at gmail.com Thu Jan 8 13:12:31 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 8 Jan 2009 13:12:31 -0500
Subject: [SciPy-user] Scipy 0.7, weave, windows
In-Reply-To: <5b8d13220901080937p5a63610drb0afa3132433740e@mail.gmail.com>
References: <49661969.40905@ar.media.kyoto-u.ac.jp> <1cd32cbb0901080759q3e017e1fg6402f7b87aa773b2@mail.gmail.com> <1cd32cbb0901080813y7b1a2d0csd71217d8d892d9e9@mail.gmail.com> <5b8d13220901080937p5a63610drb0afa3132433740e@mail.gmail.com>
Message-ID: <1cd32cbb0901081012k4535ca84v54bae6be73adcdff@mail.gmail.com>

On Thu, Jan 8, 2009 at 12:37 PM, David Cournapeau wrote:
> On Fri, Jan 9, 2009 at 1:13 AM, wrote:
>>
>> When testing weave with 'full', weave usually looks pretty good. It
>> leaves a lot of temp files behind, but I don't get any failures or
>> errors. (after test with cout crash is removed)
>
> Hm, strange. I tried on another machine, and I still get a lot of
> those failures... Not the same though. Which compilers have you
> installed on your computer ? Do you have any MS compilers installed ?
>

Essentially only the official MingW 3.4.5; I also have an older dev-cpp with a separate MingW which is on the Windows path behind the official MingW.

Some time ago I also installed Microsoft Visual 2005 Express Edition, but I never use it, since it's not compatible with python. (I don't have the 2003 Edition.)

In general, setuptools and MingW work very well, so I never needed to dig more into the compilation details. (The only exception is that I don't have Boost for MingW, since there is no premade installer.) My compiler knowledge is almost only cut and paste and `setup.py bdist`.

Josef

From cournape at gmail.com Thu Jan 8 13:22:55 2009
From: cournape at gmail.com (David Cournapeau)
Date: Fri, 9 Jan 2009 03:22:55 +0900
Subject: [SciPy-user] Scipy 0.7, weave, windows
In-Reply-To: <1cd32cbb0901081012k4535ca84v54bae6be73adcdff@mail.gmail.com>
References: <49661969.40905@ar.media.kyoto-u.ac.jp> <1cd32cbb0901080759q3e017e1fg6402f7b87aa773b2@mail.gmail.com> <1cd32cbb0901080813y7b1a2d0csd71217d8d892d9e9@mail.gmail.com> <5b8d13220901080937p5a63610drb0afa3132433740e@mail.gmail.com> <1cd32cbb0901081012k4535ca84v54bae6be73adcdff@mail.gmail.com>
Message-ID: <5b8d13220901081022od1c11d0o5cb4a3403027b148@mail.gmail.com>

On Fri, Jan 9, 2009 at 3:12 AM, wrote:
> On Thu, Jan 8, 2009 at 12:37 PM, David Cournapeau wrote:
>> On Fri, Jan 9, 2009 at 1:13 AM, wrote:
>>>
>>> When testing weave with 'full', weave usually looks pretty good. It
>>> leaves a lot of temp files behind, but I don't get any failures or
>>> errors. (after test with cout crash is removed)
>>
>> Hm, strange. I tried on another machine, and I still get a lot of
>> those failures... Not the same though. Which compilers have you
>> installed on your computer ? Do you have any MS compilers installed ?
>>
>
> Essentially only the official MingW 3.4.5; I also have an older
> dev-cpp with a separate MingW which is on the Windows path behind the
> official MingW.
>

Ah, that's why. The problems could only be seen when MS compilers were installed - I think I solved the problem - no errors anymore, now.
David

From josef.pktd at gmail.com Thu Jan 8 13:39:01 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 8 Jan 2009 13:39:01 -0500
Subject: [SciPy-user] Scipy 0.7, weave, windows
In-Reply-To: <5b8d13220901081022od1c11d0o5cb4a3403027b148@mail.gmail.com>
References: <49661969.40905@ar.media.kyoto-u.ac.jp> <1cd32cbb0901080759q3e017e1fg6402f7b87aa773b2@mail.gmail.com> <1cd32cbb0901080813y7b1a2d0csd71217d8d892d9e9@mail.gmail.com> <5b8d13220901080937p5a63610drb0afa3132433740e@mail.gmail.com> <1cd32cbb0901081012k4535ca84v54bae6be73adcdff@mail.gmail.com> <5b8d13220901081022od1c11d0o5cb4a3403027b148@mail.gmail.com>
Message-ID: <1cd32cbb0901081039x475c77c3r37eea7ca8db54a67@mail.gmail.com>

On Thu, Jan 8, 2009 at 1:22 PM, David Cournapeau wrote:
> On Fri, Jan 9, 2009 at 3:12 AM, wrote:
>> On Thu, Jan 8, 2009 at 12:37 PM, David Cournapeau wrote:
>>> On Fri, Jan 9, 2009 at 1:13 AM, wrote:
>>>>
>>>> When testing weave with 'full', weave usually looks pretty good. It
>>>> leaves a lot of temp files behind, but I don't get any failures or
>>>> errors. (after test with cout crash is removed)
>>>
>>> Hm, strange. I tried on another machine, and I still get a lot of
>>> those failures... Not the same though. Which compilers have you
>>> installed on your computer ? Do you have any MS compilers installed ?
>>
>> Essentially only the official MingW 3.4.5; I also have an older
>> dev-cpp with a separate MingW which is on the Windows path behind the
>> official MingW.
>
> Ah, that's why. The problems could only be seen when MS compilers were
> installed - I think I solved the problem - no errors anymore, now.

It would be useful to some users to have this information (e.g. in the docs) if they run into similar problems.

I usually try to keep my Windows path clean, and that's also the reason I'm quite wary of automatic installers or programs that mess with the registry.

Josef

From timmichelsen at gmx-topmail.de Thu Jan 8 13:49:15 2009
From: timmichelsen at gmx-topmail.de (Timmie)
Date: Thu, 8 Jan 2009 18:49:15 +0000 (UTC)
Subject: [SciPy-user] =?utf-8?q?converting_hourly_series_to_annual_unnecce?= =?utf-8?q?ssaryly=09masks_data?=
References:
Message-ID:

Hello Pierre,

> The documentation is still a bit lacking, sorry. Still, in the
> docstring of convert, you can see that if you don't specify a func
> input parameter, the series is converted to 2D, as stated:
> `
> If ``func`` is not given, the output series group the points of the
> initial series that share the same new date. For example, if the
> initial series has a daily frequency and is 1D, the output series is
> 2D.

No problem here. We discussed it already here:
aggregation of long-term time series
http://article.gmane.org/gmane.comp.python.scientific.user/15584

> 1. You're not starting at 01/01-00:00, but 8 days later

Yes, I am aware of it.

> 2. We are using this 366d year: as there are no leap years in your
> range of years, the last 24 data points of each line will be masked.

This explains what I was looking for, because it affects how I handle the data later.

I need averages for all hours over the years:

atest.mean(0)
=> this is the data array for the new one-year hourly time series (8760 h).

And since the data is masked at the end, I am lacking a day when I build the timeseries. Is there a way to handle this generically? I mean, if my long-term years contain a leap year I need the masked points, but normally not. How would you suggest to build the one-year hourly average time series in a flexible way?
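Something like this is what I have in mind (an untested sketch, reusing the `test` series from my first mail and simply dropping the trailing leap-day columns):

import numpy as np
import scikits.timeseries as ts

test = ts.time_series(np.arange(17520), start_date=ts.now('H'))
atest = test.convert('A')          # 2D: one row per year, 366*24 columns
hourly_avg = atest.mean(axis=0)    # masked entries are ignored by the mean
hourly_avg = hourly_avg[:365*24]   # keep only the 8760 hours of a normal year

But this hard-codes the 366-day layout, which is why I am asking for a more flexible way.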
An example case of what I am aiming at: average hourly temperatures over 20 years of data.

> 3. You don't finish at 12/31-23:00, but (365-8) days earlier.

I am aware of this one too.

> So all is well and works as expected (developer-wise), no need for a
> ticket (good reflex, though).

Sorry, I was too fast with the ticket.

> Now, of course, you need to tell us what you were expecting, and what
> kind of average you wanted to calculate.

See above.

From pgmdevlist at gmail.com Thu Jan 8 14:08:00 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Thu, 8 Jan 2009 14:08:00 -0500
Subject: [SciPy-user] converting hourly series to annual unneccessaryly masks data
In-Reply-To:
References:
Message-ID:

Timmie

> > I need averages for all hours over the years:

Irrespective of the day ? That is, you need just 24 values ? Why wouldn't you convert to daily instead, and take the mean over axis=0 ? If you need hourly averages per year, try looping over the years, selecting the data falling into each year, converting to daily and averaging. Note that if you don't specify `func` in `convert`, you're currently limited to 1D data in input.

> atest.mean(0)
> => this is the data array for the new one-year hourly time series (8760 h).
> And since the data is masked at the end, I am lacking a day when I
> build the timeseries.

??? You have one extra day, with data completely masked. That shouldn't change your results.

> I mean, if my long-term years contain a leap year I need the masked
> points, but normally not.

Sorry, that won't be possible. If you really wanna stick to 365d years, just drop the last 24 points of each line

>>> atest[:,:-24]

From daniel.wheeler2 at gmail.com Thu Jan 8 15:00:12 2009
From: daniel.wheeler2 at gmail.com (Daniel Wheeler)
Date: Thu, 8 Jan 2009 15:00:12 -0500
Subject: [SciPy-user] Scipy 0.7, weave, windows
In-Reply-To: <5b8d13220901080815r1eb9b82r19c93a56e79e559b@mail.gmail.com>
References: <49661969.40905@ar.media.kyoto-u.ac.jp> <80b160a0901080749r3d419de0vf9c7dd65c508ec31@mail.gmail.com> <5b8d13220901080815r1eb9b82r19c93a56e79e559b@mail.gmail.com>
Message-ID: <80b160a0901081200v1745d5dch9f3198a86d1ab18f@mail.gmail.com>

On Thu, Jan 8, 2009 at 11:15 AM, David Cournapeau wrote:
> On Fri, Jan 9, 2009 at 12:49 AM, Daniel Wheeler
> wrote:
>> On Thu, Jan 8, 2009 at 10:19 AM, David Cournapeau
>> wrote:
>>> Hi,
>>>
>>> I just did a full build/install/test dance of scipy 0.7 on windows,
>>> and things look good - except weave, which brings 205 errors when the
>>> full test suite is run. Do people use weave on windows ?
>>
>> Yes. Our test suite for fipy currently passes all it's weave tests on
>> windows with python 2.5 and scipy version 0.6.0 and that includes a
>> lot of auto generated weave code.
>
> Thanks for the info. Would you mind testing it with scipy 0.7.x branch
> ? There are some test failures which showed some old code which could
> not have worked (like using python code which was removed from python
> svn 5 years ago), but as I am not a weave user myself, I can't really
> assess what's significant and what's not.
>
> I could make a binary installer if that makes it easier for you to test,

That would be great if you have it set up to build quickly and easily. I don't fancy figuring out how to build scipy on windows. Cheers.
--
Daniel Wheeler

From timmichelsen at gmx-topmail.de Thu Jan 8 16:34:40 2009
From: timmichelsen at gmx-topmail.de (Timmie)
Date: Thu, 8 Jan 2009 21:34:40 +0000 (UTC)
Subject: [SciPy-user] =?utf-8?q?converting_hourly_series_to_annual=09unnec?= =?utf-8?q?cessaryly=09masks_data?=
References:
Message-ID:

Thanks for the fast response.

> > I need averages for all hours over the years:

I think I have to give you a better example. I'll post that tomorrow.

Regards,
Timmie

From pgmdevlist at gmail.com Thu Jan 8 16:39:54 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Thu, 8 Jan 2009 16:39:54 -0500
Subject: [SciPy-user] converting hourly series to annual unneccessaryly masks data
In-Reply-To:
References:
Message-ID: <2F5D9FEF-4017-442D-8A49-B6017901541F@gmail.com>

On Jan 8, 2009, at 4:34 PM, Timmie wrote:
>
>>> I need averages for all hours over the years:
> I think I have to give you a better example.

Indeed. Specify the shape of the output you expect (24, 24*365...).

From gael.varoquaux at normalesup.org Thu Jan 8 17:42:44 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Thu, 8 Jan 2009 23:42:44 +0100
Subject: [SciPy-user] Multiprocessing, GUIs and IPython
In-Reply-To: <6ce0ac130901072000i36f54cffx9ff6cd5c8b8de9df@mail.gmail.com>
References: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com> <20090107223950.GA5186@phare.normalesup.org> <6ce0ac130901072000i36f54cffx9ff6cd5c8b8de9df@mail.gmail.com>
Message-ID: <20090108224244.GD9026@phare.normalesup.org>

On Wed, Jan 07, 2009 at 08:00:25PM -0800, Brian Granger wrote:
> > As Robert points out, that's because of wizardry done by IPython. That's
> > really a pity, because in my experience, multiprocessing is fairly
> > robust. Nothing that's not fixable from IPython's side, though, I
> > believe.

> Yes, a bug report should be added to IPython's launchpad site about this.

Good point. I just did so, with a test case.

Cheers,

Gaël

From wizzard028wise at gmail.com Thu Jan 8 18:06:02 2009
From: wizzard028wise at gmail.com (Dorian)
Date: Fri, 9 Jan 2009 00:06:02 +0100
Subject: [SciPy-user] Iterative proportional fitting
Message-ID: <674a602a0901081506y22139020l1a9df6bfc12da2e1@mail.gmail.com>

Hi all,
I have some marginal density functions and I'm looking for a good way to find their joint density function. I would like to know if there is any package or script in Scipy for iterative proportional fitting (IPF), or any web link to help me get started.

Thanks in advance

Dorian
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From robert.kern at gmail.com Thu Jan 8 18:17:28 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 8 Jan 2009 17:17:28 -0600
Subject: [SciPy-user] Iterative proportional fitting
In-Reply-To: <674a602a0901081506y22139020l1a9df6bfc12da2e1@mail.gmail.com>
References: <674a602a0901081506y22139020l1a9df6bfc12da2e1@mail.gmail.com>
Message-ID: <3d375d730901081517o6916b8d3sc16d3cc1d1eafd48@mail.gmail.com>

On Thu, Jan 8, 2009 at 17:06, Dorian wrote:
> Hi all,
> I have some marginal density functions and I'm looking for a good way to
> find their joint density function.

There are potentially an infinite number of such joint density functions that have the same marginal densities. Adding some constraints, like a correlation between two variables, helps, but it's still an ill-defined problem.

> I would like to know if there is any package or script in Scipy for
> iterative proportional fitting (IPF),
> or any web link to help me get started.

No, there is nothing in scipy for this.
I think IPF applies more to data than to distributions, per se. Estimating a joint distribution from marginal distribution is usually called a copula, in my experience.

http://en.wikipedia.org/wiki/Copula_(statistics)

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From fperez.net at gmail.com Thu Jan 8 18:50:56 2009
From: fperez.net at gmail.com (Fernando Perez)
Date: Thu, 8 Jan 2009 15:50:56 -0800
Subject: [SciPy-user] Multiprocessing, GUIs and IPython
In-Reply-To: <20090108224244.GD9026@phare.normalesup.org>
References: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com> <20090107223950.GA5186@phare.normalesup.org> <6ce0ac130901072000i36f54cffx9ff6cd5c8b8de9df@mail.gmail.com> <20090108224244.GD9026@phare.normalesup.org>
Message-ID:

On Thu, Jan 8, 2009 at 2:42 PM, Gael Varoquaux wrote:
> On Wed, Jan 07, 2009 at 08:00:25PM -0800, Brian Granger wrote:
>> > As Robert points out, that's because of wizardry done by IPython. That's
>> > really a pity, because in my experience, multiprocessing is fairly
>> > robust. Nothing that's not fixable from IPython's side, though, I
>> > believe.
>
>> Yes, a bug report should be added to IPython's launchpad site about this.
>
> Good point. I just did so, with a test case.

Thanks, I just saw it. The culprit here, FakeModule, is *very old* code that indeed was added to support pickling at the very birth of ipython. Unfortunately at the time I had no testing, so I never encoded anywhere exactly what the cases for needing such a hack were. I'll try to rip it out and see if I can find pickle-related failures, and we can then look for a better solution.

Further discussion of this will obviously happen on ipython-dev, I just wanted to say here that we'll definitely do our best to play nicely with multiprocessing from our side.

Cheers,

f

From wizzard028wise at gmail.com Thu Jan 8 19:07:38 2009
From: wizzard028wise at gmail.com (Dorian)
Date: Fri, 9 Jan 2009 01:07:38 +0100
Subject: [SciPy-user] Iterative proportional fitting
In-Reply-To: <3d375d730901081517o6916b8d3sc16d3cc1d1eafd48@mail.gmail.com>
References: <674a602a0901081506y22139020l1a9df6bfc12da2e1@mail.gmail.com> <3d375d730901081517o6916b8d3sc16d3cc1d1eafd48@mail.gmail.com>
Message-ID: <674a602a0901081607g4a085cfbt47feef9a0392ab0b@mail.gmail.com>

Thanks for your quick response. You are right, I've tried that, but copulas are limited to the case where the marginal distributions are uniform over the interval zero to one.

As I read in the literature, the IPF method is more general and can also be applied to marginal distributions that are not limited to the interval zero to one.

Thanks again,

Dorian

2009/1/9 Robert Kern
> On Thu, Jan 8, 2009 at 17:06, Dorian wrote:
> > Hi all,
> > I have some marginal density functions and I'm looking for a good way to
> > find their joint density function.
>
> There are potentially an infinite number of such joint density
> functions that have the same marginal densities. Adding some
> constraints, like a correlation between two variables, helps, but it's
> still an ill-defined problem.
>
> > I would like to know if there is any package or script in Scipy for
> > iterative proportional fitting (IPF),
> > or any web link to help me get started.
>
> No, there is nothing in scipy for this. I think IPF applies more to
> data than to distributions, per se.
Estimating a joint distribution > from marginal distribution is usually called a copula, in my > experience. > > http://en.wikipedia.org/wiki/Copula_(statistics) > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Jan 8 19:17:42 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 8 Jan 2009 18:17:42 -0600 Subject: [SciPy-user] Iterative proportional fitting In-Reply-To: <674a602a0901081607g4a085cfbt47feef9a0392ab0b@mail.gmail.com> References: <674a602a0901081506y22139020l1a9df6bfc12da2e1@mail.gmail.com> <3d375d730901081517o6916b8d3sc16d3cc1d1eafd48@mail.gmail.com> <674a602a0901081607g4a085cfbt47feef9a0392ab0b@mail.gmail.com> Message-ID: <3d375d730901081617o7272b66et7b021abc346f3015@mail.gmail.com> On Thu, Jan 8, 2009 at 18:07, Dorian wrote: > Thanks for your quick response. You are right , I've tried that, but copula > are limited only > to the case that the marginal distributions are uniform over the interval > zero to one. No, you transform your marginal distributions to uniform and also transform the constraints appropriately, too. You find the uniform copula and then apply the inverse transformations to get the original joint density. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From afraser at lanl.gov Thu Jan 8 19:32:26 2009 From: afraser at lanl.gov (Andy Fraser) Date: Thu, 08 Jan 2009 17:32:26 -0700 Subject: [SciPy-user] C extension to manipulate sparse lil matrix Message-ID: <87wsd5qpyd.fsf@lanl.gov> I want to move some time critical bits of code for hidden Markov models from python to C. I've written code that works and uses sparse matrices. Next, I want to implement the "backward" algorithm in C. As an intermediate step, I've coded/prototyped the manipulations that I want to do on the internals of the sparse matrices using python. I'll append that code at the end here. Now, I am trying to figure out how to manipulate lil sparse matrices. In particular calling such a matrix "SM", and supposing that "t" is the index for a row, I want to assign new arrays to "SM.rows[t]" and "SM.data[t]". I would be grateful if someone posted C code that interchanged two rows of a lil sparse matrix. I think I could glean what I need from that example. Since I'm new to C extensions, I'd like to see type checking and reference counting done right too. The basic recursion for the backward algorithm is beta[t-1] = beta[t] {op1} Py[t] {op2} gamma[t] {op3} ScS where beta[t-1], beta[t], and Py[t] are vectors, gamma[t] is a scalar, and ScS is a matrix, and {op1} is element-wise multiplication of two vectors, {op2} is division of a vector by a scalar, and {op3} is a vector matrix product. 
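As a dense reference (just an illustrative numpy sketch to check results against, not the code I plan to port), the recursion reads:

import numpy

def backward_dense(Py, gamma, ScS):
    """Dense version of the backward recursion:
    beta[t-1] = ((beta[t] * Py[t]) / gamma[t]) dot ScS,
    with beta[T-1] initialized to a vector of ones."""
    T, N = Py.shape
    beta = numpy.ones((T, N))
    for t in xrange(T - 1, 0, -1):
        beta[t - 1] = numpy.dot(beta[t] * Py[t] / gamma[t], ScS)
    return beta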
Here is my python code for the backward algorithm with sparse matrices:

======================================================================
def backsteps(N, T, gamma, Py_data, Py_rows, ScS_data, ScS_indices,
              ScS_indptr, beta_data, beta_rows):
    """ To imitate and check C.backsteps for debugging."""
    last_rows = numpy.array(range(N), numpy.int32)
    last_data = numpy.ones(N, numpy.float64)
    for t in xrange(T-1, -1, -1):
        beta_data[t] = last_data
        beta_rows[t] = last_rows
        gamma_t = gamma[t]
        Pyt_rows = Py_rows[t]
        Pyt_data = Py_data[t]
        mul_rows = []
        mul_data = []
        j0 = 0
        for i in xrange(len(Pyt_rows)):
            I = Pyt_rows[i]
            for j in xrange(j0, len(last_rows)):
                J = last_rows[j]
                if J == I:
                    mul_rows.append(I)
                    mul_data.append(Pyt_data[i]*last_data[j]/gamma_t)
                    j0 = j+1
                    break
                if J > I:
                    if j > j0:
                        j0 = j-1
                    break
        prod = numpy.zeros(N)
        for i in xrange(len(mul_rows)):
            I = mul_rows[i]
            for j in xrange(ScS_indptr[I], ScS_indptr[I+1]):
                J = ScS_indices[j]
                prod[J] += ScS_data[j]*mul_data[i]
        M = 0
        for i in xrange(N):
            if prod[i] != 0:
                M += 1
        last_rows = numpy.empty(M, numpy.int32)
        last_data = numpy.empty(M, numpy.float64)
        j = 0
        for i in xrange(N):
            if prod[i] != 0:
                last_rows[j] = i
                last_data[j] = prod[i]
                j += 1
    return

From coughlan at ski.org  Thu Jan  8 19:14:05 2009
From: coughlan at ski.org (James Coughlan)
Date: Thu, 08 Jan 2009 16:14:05 -0800
Subject: [SciPy-user] Iterative proportional fitting
In-Reply-To: <674a602a0901081607g4a085cfbt47feef9a0392ab0b@mail.gmail.com>
References: <674a602a0901081506y22139020l1a9df6bfc12da2e1@mail.gmail.com>
	<3d375d730901081517o6916b8d3sc16d3cc1d1eafd48@mail.gmail.com>
	<674a602a0901081607g4a085cfbt47feef9a0392ab0b@mail.gmail.com>
Message-ID: <496696CD.6090408@ski.org>

You can use the maximum entropy method to estimate a joint distribution
given marginals (or arbitrary functions of marginals), e.g. see the pdf
tutorial "Maximum Entropy Distributions and Their Relationship to Maximum
Likelihood" at:

http://www.ski.org/Rehab/Coughlan_lab/General/Tutorials.html

Assuming your marginals are defined numerically (e.g. histograms or
means/variances/moments) this should work. Once you've set up the problem
this way, you can solve it numerically using gradient descent.

Best,

James

Dorian wrote:
> Thanks for your quick response. You are right, I've tried that, but
> copulas are limited to the case where the marginal distributions are
> uniform over the interval from zero to one.
>
> As I read in the literature, the IPF method is more general and can also
> be applied to marginal distributions that are not limited to the interval
> from zero to one.
>
> Thanks again,
>
> Dorian
>
> 2009/1/9 Robert Kern
>
>     On Thu, Jan 8, 2009 at 17:06, Dorian wrote:
>     > Hi all,
>     > I have some marginal density functions and I'm looking for a good
>     > way to find their joint density function.
>
>     There are potentially an infinite number of such joint density
>     functions that have the same marginal densities. Adding some
>     constraints, like a correlation between two variables, helps, but
>     it's still an ill-defined problem.
>
>     > I would like to know if there is any package or script in Scipy for
>     > iterative proportional fitting (IPF), or any web link to help me
>     > start.
>
>     No, there is nothing in scipy for this. I think IPF applies more to
>     data than to distributions, per se.
>     Estimating a joint distribution from marginal distributions is
>     usually called a copula, in my experience.
>
>     http://en.wikipedia.org/wiki/Copula_(statistics)
>
>     --
>     Robert Kern
>
>     "I have come to believe that the whole world is an enigma, a harmless
>     enigma that is made terrible by our own mad attempt to interpret it
>     as though it had an underlying truth."
>     -- Umberto Eco
>     _______________________________________________
>     SciPy-user mailing list
>     SciPy-user at scipy.org
>     http://projects.scipy.org/mailman/listinfo/scipy-user
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user

-- 
-------------------------------------------------------
James Coughlan, Ph.D., Scientist

The Smith-Kettlewell Eye Research Institute

Email: coughlan at ski.org

URL: http://www.ski.org/Rehab/Coughlan_lab/

Phone: 415-345-2146 Fax: 415-345-8455
-------------------------------------------------------

From wizzard028wise at gmail.com  Thu Jan  8 20:15:20 2009
From: wizzard028wise at gmail.com (Dorian)
Date: Fri, 9 Jan 2009 02:15:20 +0100
Subject: [SciPy-user] Iterative proportional fitting
In-Reply-To: <3d375d730901081617o7272b66et7b021abc346f3015@mail.gmail.com>
References: <674a602a0901081506y22139020l1a9df6bfc12da2e1@mail.gmail.com>
	<3d375d730901081517o6916b8d3sc16d3cc1d1eafd48@mail.gmail.com>
	<674a602a0901081607g4a085cfbt47feef9a0392ab0b@mail.gmail.com>
	<3d375d730901081617o7272b66et7b021abc346f3015@mail.gmail.com>
Message-ID: <674a602a0901081715h5bdaf005s197fc5c7fcae14a3@mail.gmail.com>

Could you give me an appropriate example of how to add the constraints?

As an example, take the case of two given marginal Gaussian distributions.
I have written the corresponding bivariate Gaussian copula density, but
after the inverse transformation (using Sklar's theorem) to get the joint
density function, there is no correlation coefficient left to infer,
because the joint density is not necessarily a Gaussian density; that is
where I got stuck.

I'll also try what James suggested about maximum entropy.

Thanks for your kind help

Dorian

2009/1/9 Robert Kern

> On Thu, Jan 8, 2009 at 18:07, Dorian wrote:
> > Thanks for your quick response. You are right, I've tried that, but
> > copulas are limited to the case where the marginal distributions are
> > uniform over the interval from zero to one.
>
> No, you transform your marginal distributions to uniform and transform
> the constraints appropriately, too. You find the uniform copula and then
> apply the inverse transformations to get the original joint density.
>
> --
> Robert Kern
>
> "I have come to believe that the whole world is an enigma, a harmless
> enigma that is made terrible by our own mad attempt to interpret it as
> though it had an underlying truth."
> -- Umberto Eco
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From robert.kern at gmail.com  Thu Jan  8 20:33:01 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 8 Jan 2009 19:33:01 -0600
Subject: [SciPy-user] Iterative proportional fitting
In-Reply-To: <674a602a0901081715h5bdaf005s197fc5c7fcae14a3@mail.gmail.com>
References: <674a602a0901081506y22139020l1a9df6bfc12da2e1@mail.gmail.com>
	<3d375d730901081517o6916b8d3sc16d3cc1d1eafd48@mail.gmail.com>
	<674a602a0901081607g4a085cfbt47feef9a0392ab0b@mail.gmail.com>
	<3d375d730901081617o7272b66et7b021abc346f3015@mail.gmail.com>
	<674a602a0901081715h5bdaf005s197fc5c7fcae14a3@mail.gmail.com>
Message-ID: <3d375d730901081733g23a88452pd76c689b5b25ecb7@mail.gmail.com>

On Thu, Jan 8, 2009 at 19:15, Dorian wrote:
> Could you give me an appropriate example of how to add the constraints?
>
> As an example, take the case of two given marginal Gaussian
> distributions. I have written the corresponding bivariate Gaussian copula
> density, but after the inverse transformation (using Sklar's theorem) to
> get the joint density function, there is no correlation coefficient left
> to infer, because the joint density is not necessarily a Gaussian
> density; that is where I got stuck.

Hmm, I could be talking out of my butt, here. The last time I looked at
something like this was years ago, and my problem was just generating
random numbers, not trying to derive density functions. I was looking at
the NORTA (NORmal To Anything) method. It might be possible to derive a
method for estimating a joint density using a similar approach.

What information do you have? Just the marginal densities? Can you
describe your problem at a higher level?

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From chuck.l.norris at gmail.com  Thu Jan  8 20:29:27 2009
From: chuck.l.norris at gmail.com (Kevin Webster)
Date: Fri, 9 Jan 2009 01:29:27 +0000 (UTC)
Subject: [SciPy-user] Selecting Array Indicies with an array of values?!?
Message-ID: 

Hello,

I am rather new to numpy and scipy so some of this may come from my
ignorance, but I am having an issue with using numpy to edit a large array
of values. I want to selectively edit items in an array with a list of
those items.
Some code might explain better:

arrSV[usr_mov_ids[a[i]:a[i+1]]] = \
    (abs(SV1) - abs(arrSV[usr_mov_ids[a[i]:a[i+1]]])).clip(min=-1, max=1)

Here I am using numpy's great indexing expressions to specify the range
that I want to work with. Inside usr_mov_ids I have an array of index
values, in a specific order, that I want to place inside the array arrSV.
Because of memory restrictions, I had to chunk up the array operations, so
I use the array a[] to hold the chunked-up index values. This runs without
errors, but instead of using the values coming from usr_mov_ids it just
fills every item in the array with the same values.

I thought I could sidestep the problem if I used weave and just inlined
some C to flip through the array quickly. Here is the code that I wrote:

for (int x=a(i); x [...]

From robert.kern at gmail.com (Robert Kern)
Subject: [SciPy-user] Selecting Array Indicies with an array of values?!?
References: 
Message-ID: <3d375d730901081749ta785482i7cecce1efd815561@mail.gmail.com>

On Thu, Jan 8, 2009 at 19:29, Kevin Webster wrote:
> Hello,
>
> I am rather new to numpy and scipy so some of this may come from my
> ignorance, but I am having an issue with using numpy to edit a large
> array of values. I want to selectively edit items in an array with a
> list of those items. Some code might explain better:
>
> arrSV[usr_mov_ids[a[i]:a[i+1]]] = \
>     (abs(SV1) - abs(arrSV[usr_mov_ids[a[i]:a[i+1]]])).clip(min=-1, max=1)

Can you give us a small, self-contained, complete example that
demonstrates the problem?

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From wizzard028wise at gmail.com  Thu Jan  8 22:04:00 2009
From: wizzard028wise at gmail.com (Dorian)
Date: Fri, 9 Jan 2009 04:04:00 +0100
Subject: [SciPy-user] Iterative proportional fitting
In-Reply-To: <3d375d730901081733g23a88452pd76c689b5b25ecb7@mail.gmail.com>
References: <674a602a0901081506y22139020l1a9df6bfc12da2e1@mail.gmail.com>
	<3d375d730901081517o6916b8d3sc16d3cc1d1eafd48@mail.gmail.com>
	<674a602a0901081607g4a085cfbt47feef9a0392ab0b@mail.gmail.com>
	<3d375d730901081617o7272b66et7b021abc346f3015@mail.gmail.com>
	<674a602a0901081715h5bdaf005s197fc5c7fcae14a3@mail.gmail.com>
	<3d375d730901081733g23a88452pd76c689b5b25ecb7@mail.gmail.com>
Message-ID: <674a602a0901081904w3e5cd494ua9397528449d42cd@mail.gmail.com>

Hi Kern, James

I looked closely at the "maximum entropy method" and the "NORTA method";
they correspond exactly to what I was looking for to start thinking deeply
about the problem of approximating the joint density function that
corresponds to given marginal density functions.

Thanks a lot,

Dorian

P.S. to Kern: As English isn't my first language (I speak French, and I'm
still learning English), I didn't understand the meaning of "butt" at
first, and I was really confused by the definition given by Google. Then I
googled the whole phrase "talk out of my butt" and understood what you
meant.

2009/1/9 Robert Kern

> On Thu, Jan 8, 2009 at 19:15, Dorian wrote:
> > Could you give me an appropriate example of how to add the constraints?
> >
> > As an example, take the case of two given marginal Gaussian
> > distributions. I have written the corresponding bivariate Gaussian
> > copula density, but after the inverse transformation (using Sklar's
> > theorem) to get the joint density function, there is no correlation
> > coefficient left to infer, because the joint density is not necessarily
> > a Gaussian density; that is where I got stuck.
>
> Hmm, I could be talking out of my butt, here. The last time I looked at
> something like this was years ago, and my problem was just generating
> random numbers, not trying to derive density functions. I was looking at
> the NORTA (NORmal To Anything) method. It might be possible to derive a
> method for estimating a joint density using a similar approach.
>
> What information do you have? Just the marginal densities? Can you
> describe your problem at a higher level?
>
> --
> Robert Kern
>
> "I have come to believe that the whole world is an enigma, a harmless
> enigma that is made terrible by our own mad attempt to interpret it as
> though it had an underlying truth."
> -- Umberto Eco
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 338.gif
Type: image/gif
Size: 541 bytes
Desc: not available
URL: 

From robert.kern at gmail.com  Thu Jan  8 22:32:37 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 8 Jan 2009 21:32:37 -0600
Subject: [SciPy-user] Iterative proportional fitting
In-Reply-To: <674a602a0901081904w3e5cd494ua9397528449d42cd@mail.gmail.com>
References: <674a602a0901081506y22139020l1a9df6bfc12da2e1@mail.gmail.com>
	<3d375d730901081517o6916b8d3sc16d3cc1d1eafd48@mail.gmail.com>
	<674a602a0901081607g4a085cfbt47feef9a0392ab0b@mail.gmail.com>
	<3d375d730901081617o7272b66et7b021abc346f3015@mail.gmail.com>
	<674a602a0901081715h5bdaf005s197fc5c7fcae14a3@mail.gmail.com>
	<3d375d730901081733g23a88452pd76c689b5b25ecb7@mail.gmail.com>
	<674a602a0901081904w3e5cd494ua9397528449d42cd@mail.gmail.com>
Message-ID: <3d375d730901081932x5dcc7e4au961956a5194ed4c9@mail.gmail.com>

On Thu, Jan 8, 2009 at 21:04, Dorian wrote:
>
> Hi Kern, James
>
> I looked closely at the "maximum entropy method" and the "NORTA method";
> they correspond exactly to what I was looking for to start thinking
> deeply about the problem of approximating the joint density function that
> corresponds to given marginal density functions.

I think NORTA may be adapted to your problem. NORTA is a method for
generating N-D random variates from a distribution characterized by N
marginal distributions and a correlation matrix. You sample from an N-D
normal distribution using a correlation matrix derived from the target
correlation matrix, then apply the inverse CDFs of the marginal
distributions. The magic is all in finding the right transformation of the
correlation matrix.

Instead of transforming randomly sampled points, you could instead
transform a grid. On that grid, you can find the values of the N-D CDF of
the corresponding NORTA normal distribution. Transforming the grid
locations back to your original space, the warped grid should now
correspond to the N-D CDF of the target joint distribution. Apply your
favorite interpolation scheme to evaluate the N-D CDF numerically on a
regular grid in the original space, and you should be able to evaluate the
PDF from that.

This will probably work okay for 2 dimensions, but it would be quite
challenging to do this for many more.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From josef.pktd at gmail.com  Thu Jan  8 23:02:58 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 8 Jan 2009 23:02:58 -0500
Subject: [SciPy-user] Iterative proportional fitting
In-Reply-To: <674a602a0901081904w3e5cd494ua9397528449d42cd@mail.gmail.com>
References: <674a602a0901081506y22139020l1a9df6bfc12da2e1@mail.gmail.com>
	<3d375d730901081517o6916b8d3sc16d3cc1d1eafd48@mail.gmail.com>
	<674a602a0901081607g4a085cfbt47feef9a0392ab0b@mail.gmail.com>
	<3d375d730901081617o7272b66et7b021abc346f3015@mail.gmail.com>
	<674a602a0901081715h5bdaf005s197fc5c7fcae14a3@mail.gmail.com>
	<3d375d730901081733g23a88452pd76c689b5b25ecb7@mail.gmail.com>
	<674a602a0901081904w3e5cd494ua9397528449d42cd@mail.gmail.com>
Message-ID: <1cd32cbb0901082002i13e008a6pead603840b30deef@mail.gmail.com>

On Thu, Jan 8, 2009 at 10:04 PM, Dorian wrote:
> Hi Kern, James
>
> I looked closely at the "maximum entropy method" and the "NORTA method";
> they correspond exactly to what I was looking for to start thinking
> deeply about the problem of approximating the joint density function that
> corresponds to given marginal density functions.

I was reading a bit during this thread, since apart from copulas I hadn't
heard of the other methods.

Dorian, you haven't mentioned what kind of data you have. From some quick
reading, it seems that iterative proportional fitting is often used for
contingency tables, while copulas are used in finance, where the
underlying distribution is continuous and usually many observations are
available. The first few google searches for NORTA treat it as a normal
copula with discrete marginals. There is a maximum entropy estimation
package in scipy that I don't know much about; applications show up mostly
for ontologies/language (see scipy\maxentropy\examples). So, I guess, the
popularity of the approach depends on the field and the data set.

In my search on copulas, I found a good description at
http://www.vosesoftware.com/ModelRiskHelp/index.htm#Modeling_correlation/Copulas.htm
where they use Kendall's tau to estimate the correlation parameter for the
normal copula (and also in other copulas). The Wikipedia article is
unfortunately silent on estimation.
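The estimation step itself is short: for the normal copula, Kendall's tau
maps to the correlation parameter as rho = sin(pi*tau/2). A quick sketch
(not checked against R, with made-up data):

    import numpy as np
    from scipy import stats

    x = np.random.randn(500)
    y = x + np.random.randn(500)      # made-up dependent sample
    tau, p = stats.kendalltau(x, y)   # rank correlation, robust to ties
    rho = np.sin(np.pi * tau / 2.0)   # normal copula parameter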
Since the problem of generating multivariate distributions is pretty
widespread, it would be useful to add some recipes to the cookbook, or to
this thread. So, if your search produces some examples that you are
willing to share, I and, I guess, the next user with a similar question
would appreciate it.

Josef
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From hoytak at cs.ubc.ca  Fri Jan  9 01:33:23 2009
From: hoytak at cs.ubc.ca (Hoyt Koepke)
Date: Thu, 8 Jan 2009 22:33:23 -0800
Subject: [SciPy-user] C extension to manipulate sparse lil matrix
In-Reply-To: <87wsd5qpyd.fsf@lanl.gov>
References: <87wsd5qpyd.fsf@lanl.gov>
Message-ID: <4db580fd0901082233r14a2035bx8c32b9476fbeb576@mail.gmail.com>

Hello Andy,

I don't know if I can be of much help answering your questions, but here
are a few thoughts:

> I want to move some time-critical bits of code for hidden Markov models
> from python to C. I've written code that works and uses sparse matrices.
> Next, I want to implement the "backward" algorithm in C.

> I think I could glean what I need from that example. Since I'm new to C
> extensions, I'd like to see type checking and reference counting done
> right too.

Have you tried using cython (http://www.cython.org)? It makes writing C
extensions almost as painless as typing your variables, works well with
numpy arrays, and handles all the messy stuff for you. If your goal is to
learn the ins and outs of how python works with extensions, then stick
with C. But if you just want to optimize your code, you can't beat cython.
In particular, see http://wiki.cython.org/tutorials/numpy for how to work
with numpy.

> I would be grateful if someone posted C code that interchanges two rows
> of a lil sparse matrix.

I'm not sure it's what you're looking for, but I might recommend using an
intermediate index-mapping array, and making all your accesses to the
sparse matrix go through it. In other words, have m be a bijective map on
the indices, and use something like SM.rows[m[i]] to access stuff. Mapping
indices are easy to swap in C or cython. Then, at the end, do the whole
transformation either in python or on the index arrays of a csr or csc
matrix all at once.

Some other experts on the list might have a better way though,

--Hoyt

++++++++++++++++++++++++++++++++++++++++++++++++
+ Hoyt Koepke
+ University of Washington Department of Statistics
+ http://www.stat.washington.edu/~hoytak/
+ hoytak at gmail.com
++++++++++++++++++++++++++++++++++++++++++

From grh at mur.at  Fri Jan  9 03:23:33 2009
From: grh at mur.at (Georg Holzmann)
Date: Fri, 09 Jan 2009 09:23:33 +0100
Subject: [SciPy-user] C extension to manipulate sparse lil matrix
In-Reply-To: <4db580fd0901082233r14a2035bx8c32b9476fbeb576@mail.gmail.com>
References: <87wsd5qpyd.fsf@lanl.gov>
	<4db580fd0901082233r14a2035bx8c32b9476fbeb576@mail.gmail.com>
Message-ID: <49670985.20505@mur.at>

Hallo!

>> I want to move some time-critical bits of code for hidden Markov models
>> from python to C. I've written code that works and uses sparse
>> matrices. Next, I want to implement the "backward" algorithm in C.
>
>> I think I could glean what I need from that example. Since I'm new to C
>> extensions, I'd like to see type checking and reference counting done
>> right too.
>
> Have you tried using cython (http://www.cython.org)? It makes writing C
> extensions almost as painless as typing your variables, works well with
> numpy arrays, and handles all the messy stuff for you. If your goal is
> to learn the ins and outs of how python works with extensions, then
> stick with C. But if you just want to optimize your code, you can't beat
> cython. In particular, see http://wiki.cython.org/tutorials/numpy for
> how to work with numpy.

You can also use weave.inline:
http://www.scipy.org/PerformancePython

There you just embed the critical C code directly in the python file and
everything gets compiled automatically ...
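For example, something like this (a toy sketch from memory, untested):

    import numpy as np
    from scipy import weave
    from scipy.weave import converters

    a = np.arange(10, dtype=np.float64)
    total = np.zeros(1)
    code = """
           for (int i = 0; i < Na[0]; i++) {
               total(0) += a(i) * a(i);   // blitz-style indexing
           }
           """
    weave.inline(code, ['a', 'total'], type_converters=converters.blitz)

With the blitz converters the arrays are indexed as a(i), and Na holds the
shape of a.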
LG
Georg

From cournape at gmail.com  Fri Jan  9 07:31:20 2009
From: cournape at gmail.com (David Cournapeau)
Date: Fri, 9 Jan 2009 21:31:20 +0900
Subject: [SciPy-user] Scipy 0.7, weave, windows
In-Reply-To: <80b160a0901081200v1745d5dch9f3198a86d1ab18f@mail.gmail.com>
References: <49661969.40905@ar.media.kyoto-u.ac.jp>
	<80b160a0901080749r3d419de0vf9c7dd65c508ec31@mail.gmail.com>
	<5b8d13220901080815r1eb9b82r19c93a56e79e559b@mail.gmail.com>
	<80b160a0901081200v1745d5dch9f3198a86d1ab18f@mail.gmail.com>
Message-ID: <5b8d13220901090431u7b9b0de4n4ce26d187fc5e20d@mail.gmail.com>

On Fri, Jan 9, 2009 at 5:00 AM, Daniel Wheeler wrote:
> On Thu, Jan 8, 2009 at 11:15 AM, David Cournapeau wrote:
>> On Fri, Jan 9, 2009 at 12:49 AM, Daniel Wheeler
>> wrote:
>>> On Thu, Jan 8, 2009 at 10:19 AM, David Cournapeau
>>> wrote:
>>>> Hi,
>>>>
>>>> I just did a full build/install/test dance of scipy 0.7 on windows,
>>>> and things look good - except weave, which brings 205 errors when the
>>>> full test suite is run. Do people use weave on windows ?
>>>
>>> Yes. Our test suite for fipy currently passes all its weave tests on
>>> windows with python 2.5 and scipy version 0.6.0, and that includes a
>>> lot of auto-generated weave code.
>>
>> Thanks for the info. Would you mind testing it with the scipy 0.7.x
>> branch ? There are some test failures which showed some old code which
>> could not have worked (like using python code which was removed from
>> python svn 5 years ago), but as I am not a weave user myself, I can't
>> really assess what's significant and what's not.
>>
>> I could make a binary installer if that makes it easier for you to test,
>
> That would be great if you have it set to build quickly and easily.
> Don't fancy figuring out how to build scipy on windows. Cheers.

No need to worry, I am the one who coded the tools for the windows binary
installer, so hopefully I am still familiar with it :) Here we are:

http://www.ar.media.kyoto-u.ac.jp/members/david/archives/scipy/scipy-0.7.0.dev5410-win32-superpack-python2.5.exe

David

From migita at gmail.com  Fri Jan  9 12:47:44 2009
From: migita at gmail.com (zzzz)
Date: Fri, 9 Jan 2009 09:47:44 -0800 (PST)
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
Message-ID: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>

Hi!

I've made a direct comparison of the time numpy and MATLAB need to
calculate the inverse of a matrix. Since (as far as I know) both call
standard packages such as LAPACK internally, I thought that for large
matrices inversion time should be approximately the same. Contrary to my
expectations, the timings of Python's numpy.linalg.inv and MATLAB actually
diverge (with Python being approximately 6 times slower than MATLAB for
matrices of size 1000).

I use the following "naive" code to estimate inversion time (and a similar
code for MATLAB):

import numpy as np
import time
import csv

def get_rand_mtx(n):
    X = np.random.rand(n, n) + 10*np.sqrt(n)*np.eye(n)
    # print 'cond = ', np.linalg.cond(X)
    return X

def inverse_time(X):
    t0 = time.clock()
    Xinv = np.linalg.inv(X)
    return time.clock()-t0

if __name__ == "__main__":
    n_list = range(200, 1000, 10)
    times = {}
    for n in n_list:
        times[n] = inverse_time(get_rand_mtx(n))

Did I miss something? Thanks.
From david at ar.media.kyoto-u.ac.jp  Fri Jan  9 12:52:09 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Sat, 10 Jan 2009 02:52:09 +0900
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
Message-ID: <49678EC9.3030303@ar.media.kyoto-u.ac.jp>

zzzz wrote:
> Hi!
>
> I've made a direct comparison of the time numpy and MATLAB need to
> calculate the inverse of a matrix. Since (as far as I know) both call
> standard packages such as LAPACK internally, I thought that for large
> matrices inversion time should be approximately the same. Contrary to my
> expectations, the timings of Python's numpy.linalg.inv and MATLAB
> actually diverge (with Python being approximately 6 times slower than
> MATLAB for matrices of size 1000).

For such big matrices, you are testing your lapack implementation; at this
point, this has nothing to do with numpy or scipy, unless matlab and scipy
have the same lapack implementation - which is highly unlikely. One order
of magnitude of difference can easily be seen between LAPACK
implementations, especially for matrix-matrix operations (BLAS level 3).

Which lapack are you using for numpy ?

David

From sturla at molden.no  Fri Jan  9 14:57:17 2009
From: sturla at molden.no (Sturla Molden)
Date: Fri, 9 Jan 2009 20:57:17 +0100 (CET)
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: <49678EC9.3030303@ar.media.kyoto-u.ac.jp>
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
	<49678EC9.3030303@ar.media.kyoto-u.ac.jp>
Message-ID: <4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>

> zzzz wrote:

> Which lapack are you using for numpy ?

Or which BLAS?

Matlab ships with Intel MKL, at least on Windows. NumPy comes with ATLAS
(I think), which may not be optimized properly for the hardware. Bottom
line: build libraries like NumPy from source. If you have an Intel
processor, consider buying an MKL license.

From bkomaki at yahoo.com  Fri Jan  9 15:17:38 2009
From: bkomaki at yahoo.com (Ch B Komaki)
Date: Fri, 9 Jan 2009 12:17:38 -0800 (PST)
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
Message-ID: <15171.72904.qm@web30402.mail.mud.yahoo.com>

Hallo,
Comparing the two software packages is possible, but the fact is that
Python is designed for arrays while Matlab was originally designed for
matrices.

--- On Fri, 1/9/09, zzzz wrote:

From: zzzz
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
To: scipy-user at scipy.org
Date: Friday, January 9, 2009, 9:17 PM

Hi!

I've made a direct comparison of the time numpy and MATLAB need to
calculate the inverse of a matrix. Since (as far as I know) both call
standard packages such as LAPACK internally, I thought that for large
matrices inversion time should be approximately the same. Contrary to my
expectations, the timings of Python's numpy.linalg.inv and MATLAB actually
diverge (with Python being approximately 6 times slower than MATLAB for
matrices of size 1000).
I use the following "naive" code to estimate inversion time (and a similar
code for MATLAB):

import numpy as np
import time
import csv

def get_rand_mtx(n):
    X = np.random.rand(n, n) + 10*np.sqrt(n)*np.eye(n)
    # print 'cond = ', np.linalg.cond(X)
    return X

def inverse_time(X):
    t0 = time.clock()
    Xinv = np.linalg.inv(X)
    return time.clock()-t0

if __name__ == "__main__":
    n_list = range(200, 1000, 10)
    times = {}
    for n in n_list:
        times[n] = inverse_time(get_rand_mtx(n))

Did I miss something? Thanks.

_______________________________________________
SciPy-user mailing list
SciPy-user at scipy.org
http://projects.scipy.org/mailman/listinfo/scipy-user
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From sturla at molden.no  Fri Jan  9 16:20:06 2009
From: sturla at molden.no (Sturla Molden)
Date: Fri, 9 Jan 2009 22:20:06 +0100 (CET)
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: <15171.72904.qm@web30402.mail.mud.yahoo.com>
References: <15171.72904.qm@web30402.mail.mud.yahoo.com>
Message-ID: <3e6dd31532bad9ec2d39fdbd38cc8775.squirrel@webmail.uio.no>

> Comparing the two software packages is possible, but the fact is that
> Python is designed for arrays while Matlab was originally designed for
> matrices.

It doesn't matter. The internal representation is the same. The external
C/Fortran code in LAPACK can't tell the difference. That is, LAPACK is
based on Fortran and probably a bit more efficient when working on Fortran
ordered arrays (Matlab disallows C order, but NumPy allows both, with 'C'
being the default). That aside, NumPy has a Matrix class and Matlab has
element-wise (array) operators.

The difference that matters most performance-wise is the BLAS version.
LAPACK makes calls into BLAS. Matlab ships by default with the best BLAS
library for Intel laptop and desktop computers (that is, Intel MKL). NumPy
does not.

1. Buy an MKL license from Intel
2. Compile LAPACK with MKL as BLAS
3. Build NumPy against LAPACK and MKL

If you do this, Matlab and NumPy should invert matrices equally fast. If
you cannot use MKL, build a version of ATLAS customized to your hardware.

There is also another difference: Matlab is 'smart'. Matlab's \ operator
and inv function call a LAPACK wrapper of ~80,000 lines of code that tries
to solve the linalg problem in the best possible way. With NumPy you must
know your linear algebra better, and select between Gaussian elimination,
LU, QR, SVD, Cholesky etc. manually. Just asking NumPy to invert a matrix
(numpy.linalg.inv) will work, but it will use a safe but not necessarily
efficient method (I think it defaults to backsubstitution).

When it comes to solving linear algebra on a personal computer, it is
nearly impossible to beat the performance of Matlab. It uses the best
available libraries by default and selects the methods intelligently. If
that is what you want, buy a Matlab license.
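To make the "know your linear algebra" point concrete (a sketch along the
lines of the original poster's test matrix): if what you actually need is
the solution of a linear system, solve beats forming the inverse:

    import numpy as np

    n = 1000
    A = np.random.rand(n, n) + 10*np.sqrt(n)*np.eye(n)
    b = np.random.rand(n)

    x1 = np.dot(np.linalg.inv(A), b)  # explicit inverse: much more work
    x2 = np.linalg.solve(A, b)        # LU factorization + triangular solves
    print np.allclose(x1, x2)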
From cournape at gmail.com  Fri Jan  9 23:44:24 2009
From: cournape at gmail.com (David Cournapeau)
Date: Sat, 10 Jan 2009 13:44:24 +0900
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: <4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
	<49678EC9.3030303@ar.media.kyoto-u.ac.jp>
	<4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>
Message-ID: <5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com>

On Sat, Jan 10, 2009 at 4:57 AM, Sturla Molden wrote:
>> zzzz wrote:
>
>> Which lapack are you using for numpy ?
>
> Or which BLAS?

I use LAPACK generically :) BLAS/LAPACK is the exact term, I guess.

>
> Matlab ships with Intel MKL, at least on Windows. NumPy comes with ATLAS
> (I think)

Numpy does not come with ATLAS: it uses whatever blas/lapack you have
available. If you don't have any, numpy has an internal copy of a light
lapack, which is not the fastest.

> , which may not be optimized properly for the hardware. Bottom
> line: build libraries like NumPy from source. If you have an Intel
> processor, consider buying an MKL license.

Matrix inversion speed is not a good benchmark if you want to compare
matlab/numpy - it may well be the worst benchmark, actually. I sometimes
have the feeling that people who care about speed only do matrix
inversions/products :)

David

From sturla at molden.no  Sat Jan 10 07:23:41 2009
From: sturla at molden.no (Sturla Molden)
Date: Sat, 10 Jan 2009 13:23:41 +0100 (CET)
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: <5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com>
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
	<49678EC9.3030303@ar.media.kyoto-u.ac.jp>
	<4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>
	<5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com>
Message-ID: <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no>

> On Sat, Jan 10, 2009 at 4:57 AM, Sturla Molden wrote:

> Numpy does not come with ATLAS: it uses whatever blas/lapack you have
> available. If you don't have any, numpy has an internal copy of a light
> lapack, which is not the fastest.

Ok. If I want to use ATLAS or MKL, must NumPy or SciPy be rebuilt? Or can
I just replace the DLL?

From akshaysrinivasan at gmail.com  Sat Jan 10 07:47:59 2009
From: akshaysrinivasan at gmail.com (Akshay Srinivasan)
Date: Sat, 10 Jan 2009 18:17:59 +0530
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no>
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
	<49678EC9.3030303@ar.media.kyoto-u.ac.jp>
	<4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>
	<5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com>
	<444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no>
Message-ID: <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com>

2009/1/10 Sturla Molden :
>> On Sat, Jan 10, 2009 at 4:57 AM, Sturla Molden wrote:
>
>> Numpy does not come with ATLAS: it uses whatever blas/lapack you have
>> available. If you don't have any, numpy has an internal copy of a light
>> lapack, which is not the fastest.
>
> Ok. If I want to use ATLAS or MKL, must NumPy or SciPy be rebuilt? Or can
> I just replace the DLL?
>
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user
>

I don't think you need to rebuild Numpy or Scipy, if the dynamic
libraries behave the same way - which I'm guessing is true.
From sturla at molden.no Sat Jan 10 07:52:10 2009 From: sturla at molden.no (Sturla Molden) Date: Sat, 10 Jan 2009 13:52:10 +0100 (CET) Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com> References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <49678EC9.3030303@ar.media.kyoto-u.ac.jp> <4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no> <5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com> <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no> <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com> Message-ID: > 2009/1/10 Sturla Molden : > I don't think you need to rebuild Numpy or Scipy, if the dynamic > libraries behave the same way - which I'm guessing is true. http://scipy.org/Installing_SciPy/Windows#head-711101b83618cd49bcd3283dc5eea28ceb734116 This claims NumPy/SciPy only uses static libraries. Is this still valid? From michael.abshoff at googlemail.com Sat Jan 10 07:58:37 2009 From: michael.abshoff at googlemail.com (Michael Abshoff) Date: Sat, 10 Jan 2009 04:58:37 -0800 Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com> References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <49678EC9.3030303@ar.media.kyoto-u.ac.jp> <4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no> <5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com> <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no> <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com> Message-ID: <49689B7D.9040901@gmail.com> Akshay Srinivasan wrote: > 2009/1/10 Sturla Molden : >>> On Sat, Jan 10, 2009 at 4:57 AM, Sturla Molden wrote: >>> Numpy does not come with ATLAS: it uses whatever blas/lapack you have >>> available. If you don't have any, numpy has an internal copy of a >>> light lapack, which is not the fastest. >> Ok. If I want to use ATLAS or MKL, must NumPy or SciPy be rebuilt? Or can >> I just replace the DLL? >> >> >> >> _______________________________________________ >> SciPy-user mailing list >> SciPy-user at scipy.org >> http://projects.scipy.org/mailman/listinfo/scipy-user >> > > I don't think you need to rebuild Numpy or Scipy, if the dynamic > libraries behave the same way - which I'm guessing is true. Nope, the names are different and you cannot just switch them out. This also assumes you link dynamically which in many cases is not true. Cheers, Michael > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From david at ar.media.kyoto-u.ac.jp Sat Jan 10 07:53:03 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sat, 10 Jan 2009 21:53:03 +0900 Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no> References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <49678EC9.3030303@ar.media.kyoto-u.ac.jp> <4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no> <5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com> <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no> Message-ID: <49689A2F.3020102@ar.media.kyoto-u.ac.jp> Sturla Molden wrote: >> On Sat, Jan 10, 2009 at 4:57 AM, Sturla Molden wrote: >> > > >> Numpy does not come with ATLAS: it uses whatever blas/lapack you have >> available. 
If you don't have any, numpy has an internal copy of a
>> light lapack, which is not the fastest.
>>
>
> Ok. If I want to use ATLAS or MKL, must NumPy or SciPy be rebuilt? Or can
> I just replace the DLL?
>

You have to rebuild. Blas/Lapack libraries are actually a very messy
business; no two of them are compatible (library name, conventions, link
options, etc...). MKL is not always well supported on all platforms (they
keep changing the names and conventions for each version, in particular,
which makes it very awkward to use reliably).

David

From david at ar.media.kyoto-u.ac.jp  Sat Jan 10 07:55:19 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Sat, 10 Jan 2009 21:55:19 +0900
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: 
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
	<49678EC9.3030303@ar.media.kyoto-u.ac.jp>
	<4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>
	<5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com>
	<444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no>
	<6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com>
Message-ID: <49689AB7.7030102@ar.media.kyoto-u.ac.jp>

Sturla Molden wrote:
>> 2009/1/10 Sturla Molden :
>
>> I don't think you need to rebuild Numpy or Scipy, if the dynamic
>> libraries behave the same way - which I'm guessing is true.
>
> http://scipy.org/Installing_SciPy/Windows#head-711101b83618cd49bcd3283dc5eea28ceb734116
>
> This claims NumPy/SciPy only uses static libraries. Is this still valid?
>

Yes. Windows has no reliable way that I know of to link several binaries
against one library (if you have foo/bar.dll and foo/bar/fubar.dll which
link against libbla.dll, libbla.dll must be in both foo and foo/bar
directories, or in a system directory).

cheers,

David

From david at ar.media.kyoto-u.ac.jp  Sat Jan 10 07:57:17 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Sat, 10 Jan 2009 21:57:17 +0900
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com>
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
	<49678EC9.3030303@ar.media.kyoto-u.ac.jp>
	<4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>
	<5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com>
	<444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no>
	<6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com>
Message-ID: <49689B2D.4040005@ar.media.kyoto-u.ac.jp>

Akshay Srinivasan wrote:
> 2009/1/10 Sturla Molden :
>
>>> On Sat, Jan 10, 2009 at 4:57 AM, Sturla Molden wrote:
>>>
>>> Numpy does not come with ATLAS: it uses whatever blas/lapack you have
>>> available. If you don't have any, numpy has an internal copy of a light
>>> lapack, which is not the fastest.
>>>
>> Ok. If I want to use ATLAS or MKL, must NumPy or SciPy be rebuilt? Or can
>> I just replace the DLL?
>>
>> _______________________________________________
>> SciPy-user mailing list
>> SciPy-user at scipy.org
>> http://projects.scipy.org/mailman/listinfo/scipy-user
>>
>
> I don't think you need to rebuild Numpy or Scipy, if the dynamic
> libraries behave the same way - which I'm guessing is true.
>

You guess wrong, there are many issues :) Names are only one problem:
there are also mixed ABI conventions (passing floats by value or by
reference, for example, which fortran runtime, etc...), which means it is
very difficult to reliably support dynamic linking of those libraries.
David

From josef.pktd at gmail.com  Sat Jan 10 08:20:55 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sat, 10 Jan 2009 08:20:55 -0500
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: <49689B2D.4040005@ar.media.kyoto-u.ac.jp>
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
	<49678EC9.3030303@ar.media.kyoto-u.ac.jp>
	<4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>
	<5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com>
	<444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no>
	<6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com>
	<49689B2D.4040005@ar.media.kyoto-u.ac.jp>
Message-ID: <1cd32cbb0901100520s4ed842agfe41a39b37a5ee0e@mail.gmail.com>

On Sat, Jan 10, 2009 at 7:57 AM, David Cournapeau wrote:
> Akshay Srinivasan wrote:
>> 2009/1/10 Sturla Molden :
>>
>>>> On Sat, Jan 10, 2009 at 4:57 AM, Sturla Molden wrote:
>>>>
>>>> Numpy does not come with ATLAS: it uses whatever blas/lapack you have
>>>> available. If you don't have any, numpy has an internal copy of a
>>>> light lapack, which is not the fastest.
>>>>
>>> Ok. If I want to use ATLAS or MKL, must NumPy or SciPy be rebuilt? Or
>>> can I just replace the DLL?
>>>
>>> _______________________________________________
>>> SciPy-user mailing list
>>> SciPy-user at scipy.org
>>> http://projects.scipy.org/mailman/listinfo/scipy-user
>>>
>>
>> I don't think you need to rebuild Numpy or Scipy, if the dynamic
>> libraries behave the same way - which I'm guessing is true.
>>
>
> You guess wrong, there are many issues :) Names are only one problem:
> there are also mixed ABI conventions (passing floats by value or by
> reference, for example, which fortran runtime, etc...), which means it
> is very difficult to reliably support dynamic linking of those
> libraries.
>
> David
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user
>

How do I find out whether and which ATLAS and blas and lapack my installed
versions of numpy/scipy are using? I misplaced the information and cannot
find it anymore.

Josef

From david at ar.media.kyoto-u.ac.jp  Sat Jan 10 08:12:58 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Sat, 10 Jan 2009 22:12:58 +0900
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: <1cd32cbb0901100520s4ed842agfe41a39b37a5ee0e@mail.gmail.com>
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
	<49678EC9.3030303@ar.media.kyoto-u.ac.jp>
	<4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>
	<5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com>
	<444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no>
	<6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com>
	<49689B2D.4040005@ar.media.kyoto-u.ac.jp>
	<1cd32cbb0901100520s4ed842agfe41a39b37a5ee0e@mail.gmail.com>
Message-ID: <49689EDA.6030106@ar.media.kyoto-u.ac.jp>

josef.pktd at gmail.com wrote:
> On Sat, Jan 10, 2009 at 7:57 AM, David Cournapeau
> wrote:
>> Akshay Srinivasan wrote:
>>> 2009/1/10 Sturla Molden :
>>>
>>>>> On Sat, Jan 10, 2009 at 4:57 AM, Sturla Molden wrote:
>>>>>
>>>>> Numpy does not come with ATLAS: it uses whatever blas/lapack you have
>>>>> available. If you don't have any, numpy has an internal copy of a
>>>>> light lapack, which is not the fastest.
>>>>>
>>>> Ok. If I want to use ATLAS or MKL, must NumPy or SciPy be rebuilt? Or
>>>> can I just replace the DLL?
>>>>
>>>> _______________________________________________
>>>> SciPy-user mailing list
>>>> SciPy-user at scipy.org
>>>> http://projects.scipy.org/mailman/listinfo/scipy-user
>>>>
>>> I don't think you need to rebuild Numpy or Scipy, if the dynamic
>>> libraries behave the same way - which I'm guessing is true.
>>>
>> You guess wrong, there are many issues :) Names are only one problem:
>> there are also mixed ABI conventions (passing floats by value or by
>> reference, for example, which fortran runtime, etc...), which means it
>> is very difficult to reliably support dynamic linking of those
>> libraries.
>>
>> David
>> _______________________________________________
>> SciPy-user mailing list
>> SciPy-user at scipy.org
>> http://projects.scipy.org/mailman/listinfo/scipy-user
>>
>
> How do I find out whether and which ATLAS and blas and lapack my
> installed versions of numpy/scipy are using? I misplaced the
> information and cannot find it anymore.
>

python -c "import numpy; print numpy.show_config()"

Same for scipy.

A reliable way to know which dlls are actually linked to a python
extension is depends.exe, on windows:

http://www.dependencywalker.com/

It does not always work - in particular with the whole SxS mess on XP and
Vista, it does not always know where to find dlls which are there. Guess
it is one of those amazing abilities of the MS platform to consistently
surprise me with its lack of reliable tools for the most basic things,

fed up with wasting my time with windows-l'y,

David

From matthieu.brucher at gmail.com  Sat Jan 10 09:05:37 2009
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Sat, 10 Jan 2009 15:05:37 +0100
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: 
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
	<49678EC9.3030303@ar.media.kyoto-u.ac.jp>
	<4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>
	<5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com>
	<444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no>
	<6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com>
Message-ID: 

2009/1/10 Sturla Molden :
>> 2009/1/10 Sturla Molden :
>
>> I don't think you need to rebuild Numpy or Scipy, if the dynamic
>> libraries behave the same way - which I'm guessing is true.
>
> http://scipy.org/Installing_SciPy/Windows#head-711101b83618cd49bcd3283dc5eea28ceb734116
>
> This claims NumPy/SciPy only uses static libraries. Is this still valid?

And for the MKL, you always have to use the static libraries, even with
Linux.

Matthieu
-- 
Information System Engineer, Ph.D.
Website: http://matthieu-brucher.developpez.com/
Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn: http://www.linkedin.com/in/matthieubrucher

From matthieu.brucher at gmail.com  Sat Jan 10 09:08:06 2009
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Sat, 10 Jan 2009 15:08:06 +0100
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: <49689AB7.7030102@ar.media.kyoto-u.ac.jp>
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
	<49678EC9.3030303@ar.media.kyoto-u.ac.jp>
	<4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>
	<5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com>
	<444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no>
	<6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com>
	<49689AB7.7030102@ar.media.kyoto-u.ac.jp>
Message-ID: 

> Yes.
Windows has no reliable way that I know of to link several binaries
> against one library (if you have foo/bar.dll and foo/bar/fubar.dll which
> link against libbla.dll, libbla.dll must be in both foo and foo/bar
> directories, or in a system directory).

As on Linux, it's safe if you specify the exact folder where the library
will be (doable with manifest files), but then it won't be portable ;)

Matthieu
-- 
Information System Engineer, Ph.D.
Website: http://matthieu-brucher.developpez.com/
Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn: http://www.linkedin.com/in/matthieubrucher

From matthieu.brucher at gmail.com  Sat Jan 10 09:11:21 2009
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Sat, 10 Jan 2009 15:11:21 +0100
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: <49689EDA.6030106@ar.media.kyoto-u.ac.jp>
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
	<49678EC9.3030303@ar.media.kyoto-u.ac.jp>
	<4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>
	<5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com>
	<444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no>
	<6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com>
	<49689B2D.4040005@ar.media.kyoto-u.ac.jp>
	<1cd32cbb0901100520s4ed842agfe41a39b37a5ee0e@mail.gmail.com>
	<49689EDA.6030106@ar.media.kyoto-u.ac.jp>
Message-ID: 

> It does not always work - in particular with the whole SxS mess on XP
> and Vista, it does not always know where to find dlls which are there.
> Guess it is one of those amazing abilities of the MS platform to
> consistently surprise me with its lack of reliable tools for the most
> basic things,
>
> fed up with wasting my time with windows-l'y,

At least they tried to fix the dll hell that is also present on Linux.
Perhaps not in the best way, but it works ;)

Matthieu
-- 
Information System Engineer, Ph.D.
Website: http://matthieu-brucher.developpez.com/
Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn: http://www.linkedin.com/in/matthieubrucher

From david at ar.media.kyoto-u.ac.jp  Sat Jan 10 09:08:21 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Sat, 10 Jan 2009 23:08:21 +0900
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: 
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
	<49678EC9.3030303@ar.media.kyoto-u.ac.jp>
	<4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>
	<5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com>
	<444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no>
	<6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com>
	<49689AB7.7030102@ar.media.kyoto-u.ac.jp>
Message-ID: <4968ABD5.8040306@ar.media.kyoto-u.ac.jp>

Matthieu Brucher wrote:
>> Yes. Windows has no reliable way that I know of to link several binaries
>> against one library (if you have foo/bar.dll and foo/bar/fubar.dll which
>> link against libbla.dll, libbla.dll must be in both foo and foo/bar
>> directories, or in a system directory).
>>
>
> As on Linux, it's safe if you specify the exact folder where the library
> will be (doable with manifest files), but then it won't be portable ;)
>

It is not reliable: python itself has problems on some windows versions
because of that exact problem (for linking the python dll against the
msvcrt; that's why installing python for one user does not work on Vista),
and they removed manifests from extensions.
If python's community is not able to solve the problem, with people as
experienced as Martin v. Löwis, I think it is safe to say we won't be very
successful trying to do so in numpy. I even had discussions with people
developing a well known compiler on windows who were pulling their hair
out because of this manifest business.

So no, I won't use them unless strictly necessary.

David

From dg.gmane at thesamovar.net  Sat Jan 10 10:03:36 2009
From: dg.gmane at thesamovar.net (Dan Goodman)
Date: Sat, 10 Jan 2009 15:03:36 +0000 (UTC)
Subject: [SciPy-user] Scipy 0.7, weave, windows
References: <49661969.40905@ar.media.kyoto-u.ac.jp>
Message-ID: 

I use weave.inline extensively in my code and libraries. I'm currently
using scipy version 0.7.0.dev5180 and it works fine on Windows XP with
cygwin and gcc version 3.4.4. I haven't tried the build you posted on this
thread yet, but I can if it would be helpful.

Dan Goodman

From david at ar.media.kyoto-u.ac.jp  Sat Jan 10 09:58:33 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Sat, 10 Jan 2009 23:58:33 +0900
Subject: [SciPy-user] Scipy 0.7, weave, windows
In-Reply-To: 
References: <49661969.40905@ar.media.kyoto-u.ac.jp>
Message-ID: <4968B799.2070009@ar.media.kyoto-u.ac.jp>

Dan Goodman wrote:
> I use weave.inline extensively in my code and libraries. I'm currently
> using scipy version 0.7.0.dev5180 and it works fine on Windows XP with
> cygwin and gcc version 3.4.4. I haven't tried the build you posted on
> this thread yet, but I can if it would be helpful.
>

Yes, please, it would be very helpful. If the binary posted works fine on
windows, we can finally make an RC,

David

From sturla at molden.no  Sat Jan 10 11:41:25 2009
From: sturla at molden.no (Sturla Molden)
Date: Sat, 10 Jan 2009 17:41:25 +0100 (CET)
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: <49689AB7.7030102@ar.media.kyoto-u.ac.jp>
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
	<49678EC9.3030303@ar.media.kyoto-u.ac.jp>
	<4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>
	<5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com>
	<444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no>
	<6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com>
	<49689AB7.7030102@ar.media.kyoto-u.ac.jp>
Message-ID: 

> Sturla Molden wrote:

> Yes. Windows has no reliable way that I know of to link several binaries
> against one library (if you have foo/bar.dll and foo/bar/fubar.dll which
> link against libbla.dll, libbla.dll must be in both foo and foo/bar
> directories, or in a system directory).

Instead of using a static link library to connect with the DLL, you can
use LoadLibrary and GetProcAddress in windows.h to load the exported DLL
functions. You just need to specify the DLL name and method names as a
text strings. Another option is to use a COM object.
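From Python you get the same mechanism through ctypes (a sketch; the DLL
name and the export are made up for illustration):

    import ctypes

    lib = ctypes.WinDLL('libbla.dll')  # calls LoadLibrary under the hood
    f = lib.some_export                # GetProcAddress on attribute access
    f.restype = ctypes.c_double
    f.argtypes = [ctypes.c_double]
    print f(2.0)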
Sturla Molden

From david at ar.media.kyoto-u.ac.jp  Sat Jan 10 11:32:38 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Sun, 11 Jan 2009 01:32:38 +0900
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: 
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
	<49678EC9.3030303@ar.media.kyoto-u.ac.jp>
	<4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>
	<5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com>
	<444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no>
	<6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com>
	<49689AB7.7030102@ar.media.kyoto-u.ac.jp>
Message-ID: <4968CDA6.1030705@ar.media.kyoto-u.ac.jp>

Sturla Molden wrote:
>
> Instead of using a static link library to connect with the DLL, you can
> use LoadLibrary and GetProcAddress in windows.h to load the exported DLL
> functions. You just need to specify the DLL name and method names as a
> text strings. Another option is to use a COM object.
>

But LoadLibrary has the same semantics as the windows dynamic loader, so I
am not sure what this would change - except that we would have to first
rewrite all our code which uses external libraries to load them explicitly
(which would be useful on its own, though). And this does not solve the
problem of manifests and security restrictions in Vista, which I don't
claim to understand, but know from the Python-dev ML that they cause big
headaches for people who know more than me about windows (which granted is
not that difficult).

I would be happy to get patches to make the procedure usable, workable
with all the MS compilers in use and mingw, on both windows XP and VISTA,
if it is that easy, though :)

cheers,

David

From sturla at molden.no  Sat Jan 10 11:54:23 2009
From: sturla at molden.no (Sturla Molden)
Date: Sat, 10 Jan 2009 17:54:23 +0100 (CET)
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: <49689AB7.7030102@ar.media.kyoto-u.ac.jp>
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
	<49678EC9.3030303@ar.media.kyoto-u.ac.jp>
	<4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>
	<5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com>
	<444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no>
	<6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com>
	<49689AB7.7030102@ar.media.kyoto-u.ac.jp>
Message-ID: 

> Sturla Molden wrote:

> Yes. Windows has no reliable way that I know of to link several binaries
> against one library

It does. It is called COM (aka ActiveX and OLE). You specify the name of
the COM object, and Windows loads the correct DLL by looking it up in the
registry.

S. M.
From sturla at molden.no Sat Jan 10 11:59:08 2009 From: sturla at molden.no (Sturla Molden) Date: Sat, 10 Jan 2009 17:59:08 +0100 (CET) Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <49678EC9.3030303@ar.media.kyoto-u.ac.jp> <4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no> <5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com> <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no> <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com> <49689AB7.7030102@ar.media.kyoto-u.ac.jp> <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> Message-ID: <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> > But LoadLibrary has the same semantics as the windows dynamic loader, so > I am not sure what this would change Sorry, I was thinking the other way around: loading multiple DLLs from one binary. The cure for DLL hell on Windows is either COM or putting the DLL in a system folder. COM is the preferred solution. S.M. From david at ar.media.kyoto-u.ac.jp Sat Jan 10 11:47:21 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 11 Jan 2009 01:47:21 +0900 Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <49678EC9.3030303@ar.media.kyoto-u.ac.jp> <4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no> <5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com> <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no> <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com> <49689AB7.7030102@ar.media.kyoto-u.ac.jp> Message-ID: <4968D119.5080705@ar.media.kyoto-u.ac.jp> Sturla Molden wrote: >> Sturla Molden wrote: >> > > >> Yes. Windows has no reliable way that I know of to link several binaries >> against one library >> > > It does. It is called COM (aka ActiveX and OLE). You specify the name of > the COM object, and Windows loads the correct DLL by looking it up in the > registry. > How is the DLL registered in the registry ? Where should the dll be (can it be anywhere in the FS) ? What does "loads the correct dll" mean ? David From david at ar.media.kyoto-u.ac.jp Sat Jan 10 12:24:49 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 11 Jan 2009 02:24:49 +0900 Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <49678EC9.3030303@ar.media.kyoto-u.ac.jp> <4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no> <5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com> <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no> <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com> <49689AB7.7030102@ar.media.kyoto-u.ac.jp> <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> Message-ID: <4968D9E1.5040302@ar.media.kyoto-u.ac.jp> Sturla Molden wrote: >> But LoadLibrary has the same semantics as the windows dynamic loader, so >> I am not sure what this would change >> > > Sorry I was thinking the other way around: loading multiple DLLs from one > binary. > > The cure for DLL hell on Windows is either COM or putting the DLL in a > system folder. I thought putting the DLL in a system folder was the cause of DLL hell :) > COM is the preferred solution.
> I thought COM was deprecated since .net ? Another problem with dlls is zip modules; I don't know if that's a problem for numpy/scipy (can numpy/scipy eggs be in a zip ? I am not familiar with zipped eggs); windows cannot load DLLs from zip files http://mail.python.org/pipermail/python-list/2009-January/523570.html David > S.M. > > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > > > From matthieu.brucher at gmail.com Sat Jan 10 13:10:12 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Sat, 10 Jan 2009 19:10:12 +0100 Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no> <5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com> <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no> <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com> <49689AB7.7030102@ar.media.kyoto-u.ac.jp> <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> Message-ID: 2009/1/10 Sturla Molden : >> But LoadLibrary has the same semantics as the windows dynamic loader, so >> I am not sure what this would change > > Sorry I was thinking the other way around: loading multiple DLLs from one > binary. > > The cure for DLL hell on Windows is either COM or putting the DLL in a > system folder. COM is the preferred solution. Microsoft's real answer to DLL hell is manifest files (name and version of a DLL), but it cannot be applied everywhere :( Does COM handle DLL versions ? Matthieu -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From david at ar.media.kyoto-u.ac.jp Sat Jan 10 13:10:40 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 11 Jan 2009 03:10:40 +0900 Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no> <5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com> <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no> <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com> <49689AB7.7030102@ar.media.kyoto-u.ac.jp> <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> Message-ID: <4968E4A0.9030909@ar.media.kyoto-u.ac.jp> Matthieu Brucher wrote: > 2009/1/10 Sturla Molden : > >>> But LoadLibrary has the same semantics as the windows dynamic loader, so >>> I am not sure what this would change >>> >> Sorry I was thinking the other way around: loading multiple DLLs from one >> binary. >> >> The cure for DLL hell on Windows is either COM or putting the DLL in a >> system folder. COM is the preferred solution. >> > > Microsoft's real answer to DLL hell is manifest files (name and > version of a DLL), but it cannot be applied everywhere But this is quite complicated to apply in our case. Indeed, in my understanding, we would have to embed a manifest referring to the linked dll (say atlas).
This can only be done reliably (working as non admin on Vista) if atlas is installed in the SxS cache - using private assemblies requires the dll to be in the same dir as the .pyd (which means copying it everywhere we need it....); see this for a related problem (with the msvcrt; the problem is the same, since python 2.6 does not assume you have the msvcrt9): http://bugs.python.org/issue4120 As I see it, one solution would be to have a 'private SxS' inside of numpy - I don't know if it is possible at all. Now, all this is so hopelessly undocumented that I see little chance of being able to support this with both mingw and MS compilers (without even talking about Intel compilers) in finite time. Also, even if I can see how we could do it in theory for atlas, how can we do that for a library we can't distribute and control ourselves, like MKL ? MS could have used something like rpath + $ORIGIN, which has existed for, like, ten years at least, but no, they had to use all those crappy XML files embedded in binaries with obscure semantics documented nowhere... Why should I waste my time on such a crappy platform ? If people are really interested in that feature for windows, they should do it themselves - I won't do it. cheers, David From dg.gmane at thesamovar.net Sat Jan 10 13:31:39 2009 From: dg.gmane at thesamovar.net (Dan Goodman) Date: Sat, 10 Jan 2009 18:31:39 +0000 (UTC) Subject: [SciPy-user] Scipy 0.7, weave, windows References: <49661969.40905@ar.media.kyoto-u.ac.jp> <4968B799.2070009@ar.media.kyoto-u.ac.jp> Message-ID: David, Works fine here! Good stuff. Dan From matthieu.brucher at gmail.com Sat Jan 10 13:42:46 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Sat, 10 Jan 2009 19:42:46 +0100 Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: <4968E4A0.9030909@ar.media.kyoto-u.ac.jp> References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no> <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com> <49689AB7.7030102@ar.media.kyoto-u.ac.jp> <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> <4968E4A0.9030909@ar.media.kyoto-u.ac.jp> Message-ID: >> Microsoft's real answer to DLL hell is manifest files (name and >> version of a DLL), but it cannot be applied everywhere > > But this is quite complicated to apply in our case. I agree ;) -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From sturla at molden.no Sat Jan 10 14:59:10 2009 From: sturla at molden.no (Sturla Molden) Date: Sat, 10 Jan 2009 20:59:10 +0100 (CET) Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no> <5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com> <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no> <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com> <49689AB7.7030102@ar.media.kyoto-u.ac.jp> <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> Message-ID: > Microsoft's real answer to DLL hell is manifest files (name and > version of a DLL), but it cannot be applied everywhere :( > Does COM handle DLL versions ?
The idea is that a DLL is not just identified by a name, but by a number (the CLSID). The CLSID is stored in the registry and is looked up to get the fully qualified path of the DLL. The rules of COM specified that the CLSID should change whenever the version of the DLL changed. Unfortunately, developers broke this rule, so DLL hell persisted. But if the developer is honest and complies with this, COM does not produce DLL Hell. S.M. From sturla at molden.no Sat Jan 10 15:01:31 2009 From: sturla at molden.no (Sturla Molden) Date: Sat, 10 Jan 2009 21:01:31 +0100 (CET) Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: <4968D9E1.5040302@ar.media.kyoto-u.ac.jp> References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <49678EC9.3030303@ar.media.kyoto-u.ac.jp> <4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no> <5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com> <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no> <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com> <49689AB7.7030102@ar.media.kyoto-u.ac.jp> <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> <4968D9E1.5040302@ar.media.kyoto-u.ac.jp> Message-ID: <399854e6faeccbbdac45d388890ca238.squirrel@webmail.uio.no> > I thought COM was deprecated since .net ? ActiveX (internet-enabled COM objects) are certainly deprecated. S.M. From josef.pktd at gmail.com Sat Jan 10 15:28:21 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 10 Jan 2009 15:28:21 -0500 Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: <399854e6faeccbbdac45d388890ca238.squirrel@webmail.uio.no> References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no> <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com> <49689AB7.7030102@ar.media.kyoto-u.ac.jp> <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> <4968D9E1.5040302@ar.media.kyoto-u.ac.jp> <399854e6faeccbbdac45d388890ca238.squirrel@webmail.uio.no> Message-ID: <1cd32cbb0901101228k71d1563cxde1ae4aa888ceb68@mail.gmail.com> On Sat, Jan 10, 2009 at 3:01 PM, Sturla Molden wrote: > >> I thought COM was deprecated since .net ? > > ActiveX (internet-enabled COM objects) are certainly deprecated. > > S.M. The advantage of a numpy/scipy installation, compared to installing python itself, is that python is already available. Here is a quick script to find the dll version and path of mkl on my Windows XP machine. I had installed the trial version of mkl, and it put several variables in my environment, e.g. the lib path. The attachment uses a recipe from http://timgolden.me.uk/python/win32_how_do_i/get_dll_version.html which uses the win32api. But overall, I think the superpack is very good and, thanks to David, has lowered the entry cost for windows users to install scipy (with the correct sse) very much. Josef -------------- next part -------------- An embedded and charset-unspecified text was scrubbed...
Name: mkldllversion.py URL: From cournape at gmail.com Sat Jan 10 15:58:41 2009 From: cournape at gmail.com (David Cournapeau) Date: Sun, 11 Jan 2009 05:58:41 +0900 Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no> <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com> <49689AB7.7030102@ar.media.kyoto-u.ac.jp> <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> Message-ID: <5b8d13220901101258s3c5f8be8rc0367b9fd74ec7af@mail.gmail.com> On Sun, Jan 11, 2009 at 4:59 AM, Sturla Molden wrote: > >> Microsoft real answer to DLL hell is manifest files (address and >> version of a DLL), but it cannot be applied everywhere :( >> Does COM handle DLL versions ? > > The idea is that a DLL is not just identified by a name, but by a number > (the CLSID). The CLSID is stored in the registry and is looked up to get > the fully qualified path of the DLL. But what happens when you can't write to the registry ? Or is this per user ? > The rules of COM specified that the > CLSID should change whenever the version of the DLL changed. > Unfortunately, developers broke this rule, so DLL hell persisted. But if > the developer i honest and complies with this, COM does not not produce > DLL Hell. http://msdn.microsoft.com/en-us/library/ms973843.aspx#dplywithnet_sharing It does not sound like COM is a good idea - according to MSDN's own word, COM is responsible for DLL Hell. I actually quite like the GAC principle - I think it would be nice for python itself to have something similar. But AFAIK, it is not possible to use this for unmanaged code, without any CLR involvement. cheers, David From josef.pktd at gmail.com Sat Jan 10 17:41:07 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 10 Jan 2009 17:41:07 -0500 Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: <5b8d13220901101258s3c5f8be8rc0367b9fd74ec7af@mail.gmail.com> References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com> <49689AB7.7030102@ar.media.kyoto-u.ac.jp> <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> <5b8d13220901101258s3c5f8be8rc0367b9fd74ec7af@mail.gmail.com> Message-ID: <1cd32cbb0901101441i12c695dcpa6bb9548796e5cd5@mail.gmail.com> On Sat, Jan 10, 2009 at 3:58 PM, David Cournapeau wrote: > On Sun, Jan 11, 2009 at 4:59 AM, Sturla Molden wrote: >> >>> Microsoft real answer to DLL hell is manifest files (address and >>> version of a DLL), but it cannot be applied everywhere :( >>> Does COM handle DLL versions ? >> >> The idea is that a DLL is not just identified by a name, but by a number >> (the CLSID). The CLSID is stored in the registry and is looked up to get >> the fully qualified path of the DLL. > > But what happens when you can't write to the registry ? Or is this per user ? > >> The rules of COM specified that the >> CLSID should change whenever the version of the DLL changed. >> Unfortunately, developers broke this rule, so DLL hell persisted. But if >> the developer i honest and complies with this, COM does not not produce >> DLL Hell. > > > http://msdn.microsoft.com/en-us/library/ms973843.aspx#dplywithnet_sharing > > It does not sound like COM is a good idea - according to MSDN's own > word, COM is responsible for DLL Hell. 
I actually quite like the GAC > principle - I think it would be nice for python itself to have > something similar. But AFAIK, it is not possible to use this for > unmanaged code, without any CLR involvement. > on my computer, the trial version of mkl is not registered as a shared dll, so I didn't see a CLSID. The only place in the registry where I have mkl is the installer and uninstaller. Matlab has its own local copy of mkl. The mkl lib, include, .. directories are added to the windows environment, but not the bin/dll directory. I think the idea is the same as with virtualenv: make a local copy, then other programs cannot overwrite the required version. And there is no need to worry about shared libraries and no dll hell. I don't think program libraries for applications need to be installed system or user wide. For example, when I installed Python25 and GTK for it, GTK installed the new version in Program Files and broke my GTK install for Python24 - or maybe it was GIMP that overwrote the system wide GTK install. wxpython is installed completely in site-packages, and I never had any problems with it. So I think that, if you want to link against mkl, then the best would be to make a local copy of the dlls. This seems to be the common policy; for example, I have more than 10 programs (mostly open source, including numpy/scipy) that all have their own lapack in their local directories. Josef From matthieu.brucher at gmail.com Sat Jan 10 18:08:15 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Sun, 11 Jan 2009 00:08:15 +0100 Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: <1cd32cbb0901101441i12c695dcpa6bb9548796e5cd5@mail.gmail.com> References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <49689AB7.7030102@ar.media.kyoto-u.ac.jp> <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> <5b8d13220901101258s3c5f8be8rc0367b9fd74ec7af@mail.gmail.com> <1cd32cbb0901101441i12c695dcpa6bb9548796e5cd5@mail.gmail.com> Message-ID: > So I think that, if you want to link against mkl, then the best would > be to make a local copy of the dlls. This seems to be the common > policy, for example, I have more than 10 programs (mostly open > source, including numpy/scipy) that all have their own lapack in the > local directories. Contrary to usual programs, Python is scattered in several folders, and in principle the dll should be put in the main Python folder to be seen by Python at runtime. But then what would you do if other modules provide their own MKL dlls? The only solution is to load the dlls on the fly, a kind of plugin system. This means that someone has to propose a patch to support this, without slowdowns compared to the current situation (which is very good thanks to David's efforts to provide fully functional Numpy/Scipy installers, even if he does not have much time, PhD power). The situation is not optimal, but it is not bad, it is very good. If someone wants to do better, he can always propose something ;) Matthieu -- Information System Engineer, Ph.D.
Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From josef.pktd at gmail.com Sat Jan 10 18:42:19 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 10 Jan 2009 18:42:19 -0500 Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <49689AB7.7030102@ar.media.kyoto-u.ac.jp> <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> <5b8d13220901101258s3c5f8be8rc0367b9fd74ec7af@mail.gmail.com> <1cd32cbb0901101441i12c695dcpa6bb9548796e5cd5@mail.gmail.com> Message-ID: <1cd32cbb0901101542t26c77256gdb2c5e4ff138af6d@mail.gmail.com> On Sat, Jan 10, 2009 at 6:08 PM, Matthieu Brucher wrote: >> So I think that, if you want to link against mkl, then the best would >> be to make a local copy of the dlls. This seems to be the common >> policy, for example, I have more than 10 programs (mostly open >> source, including numpy/scipy) that all have their own lapack in the >> local directories. > > Contrary to usual programs, Python is scattered in several folders, > and in the absolute, the dll should be put in the main Python folder > to be seen by Python at runtime. But then what would you do if other > modules provide their own MKL dlls? Currently scipy has duplicates of lapack, blas in different folders, so having several copies of another set of dlls wouldn't make much difference. If they are put in the main python folder, they could be renamed (if that is possible, since it would be a "private" copy) to lapack_scipy.dll. But I have no idea what happens when other extensions, e.g. scikits want to compile against the same libraries and how that can be supported. But whatever the mechanism, I think numpy/scipy should have its own copies of the dlls and not rely on a system wide install. However, for these things, I'm a pure user, who only suffers every once in a while if programs don't stick to their own territory. Josef > The only solution is to load on the fly the dlls, a kind of plugin > system. This means that someone has to propose a patch to support > this, without slowdowns compared to the current position (which is > very good thanks to David efforts to provide fully functional > Numpy/Scipy installers, even if he does not have much time, PhD > power). The situation is not optimal, but it is not bad, it is very > good. If someone wants to do better, he can always propose something > ;) > > Matthieu > -- > Information System Engineer, Ph.D. 
> Website: http://matthieu-brucher.developpez.com/ > Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 > LinkedIn: http://www.linkedin.com/in/matthieubrucher > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From matthieu.brucher at gmail.com Sat Jan 10 18:55:22 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Sun, 11 Jan 2009 00:55:22 +0100 Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: <1cd32cbb0901101542t26c77256gdb2c5e4ff138af6d@mail.gmail.com> References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> <5b8d13220901101258s3c5f8be8rc0367b9fd74ec7af@mail.gmail.com> <1cd32cbb0901101441i12c695dcpa6bb9548796e5cd5@mail.gmail.com> <1cd32cbb0901101542t26c77256gdb2c5e4ff138af6d@mail.gmail.com> Message-ID: >> Contrary to usual programs, Python is scattered in several folders, >> and in principle the dll should be put in the main Python folder >> to be seen by Python at runtime. But then what would you do if other >> modules provide their own MKL dlls? > > Currently scipy has duplicates of lapack, blas in different folders, > so having several copies > of another set of dlls wouldn't make much difference. If they are put > in the main python folder, they could be renamed (if that is possible, > since it would be a "private" copy) to lapack_scipy.dll. But you can't access them as a standard dll; they are accessed by Python through a dynamic loader, with a specific interface. So you have to provide a plugin mechanism. > But I have no idea what happens when other extensions, e.g. scikits > want to compile against the same libraries and how that can be > supported. But whatever the mechanism, I think numpy/scipy should have > its own copies of the dlls and not rely on a system wide install. First, with Windows, you have to compile against the lib files, which you cannot distribute in the case of the MKL. So you use the plugin mechanism to access the library, thus a numpy or scipy interface. Problem closed. > However, for these things, I'm a pure user, who only suffers every > once in a while if programs don't stick to their own territory. And we need someone to try to get this idea working ;) Matthieu -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From josef.pktd at gmail.com Sat Jan 10 21:29:06 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 10 Jan 2009 21:29:06 -0500 Subject: [SciPy-user] cdf and integration for multivariate normal distribution in stats.kde Message-ID: <1cd32cbb0901101829w3890212cwa18694d958b6700b@mail.gmail.com> I found the fortran code for rectangular integration of the multivariate normal distribution in stats.kde, which can be used to calculate the cdf. I didn't see this function exposed anywhere in scipy. Did I miss it?
I wrote a quick wrapper, with two functions:

mvstdnormcdf(lower, upper, corrcoef, **kwds)
    direct wrapper, with just some reparameterization for convenience, for the standard normal

mvnormcdf(lower, upper, mu, cov, **kwds)
    allows a non-standard multivariate normal distribution; normalizes the distribution and calls mvstdnormcdf

Both calculate the integral only for a single area; no vectorization yet. Also, this is not yet a clean version. I wanted to ask first whether I missed it and it is already used somewhere in scipy. Josef -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: mvncdf.py URL: From robert.kern at gmail.com Sat Jan 10 21:35:26 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 10 Jan 2009 20:35:26 -0600 Subject: [SciPy-user] cdf and integration for multivariate normal distribution in stats.kde In-Reply-To: <1cd32cbb0901101829w3890212cwa18694d958b6700b@mail.gmail.com> References: <1cd32cbb0901101829w3890212cwa18694d958b6700b@mail.gmail.com> Message-ID: <3d375d730901101835u4e644898n50424f436b545775@mail.gmail.com> On Sat, Jan 10, 2009 at 20:29, wrote: > I found the fortran code for rectangular integration of the > multivariate normal distribution in stats kde, which can be used to > calculate the cdf. > > I didn't see this function exposed anywhere in scipy. Did I miss it? No, I didn't expose it. Not for any particular reason; it's just that the only use case I had was the KDE stuff. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From josef.pktd at gmail.com Sat Jan 10 21:53:29 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 10 Jan 2009 21:53:29 -0500 Subject: [SciPy-user] cdf and integration for multivariate normal distribution in stats.kde In-Reply-To: <3d375d730901101835u4e644898n50424f436b545775@mail.gmail.com> References: <1cd32cbb0901101829w3890212cwa18694d958b6700b@mail.gmail.com> <3d375d730901101835u4e644898n50424f436b545775@mail.gmail.com> Message-ID: <1cd32cbb0901101853q4696aeb4l7e21c10aee1c92ee@mail.gmail.com> On Sat, Jan 10, 2009 at 9:35 PM, Robert Kern wrote: > On Sat, Jan 10, 2009 at 20:29, wrote: >> I found the fortran code for rectangular integration of the >> multivariate normal distribution in stats kde, which can be used to >> calculate the cdf. >> >> I didn't see this function exposed anywhere in scipy. Did I miss it? > > No, I didn't expose it. Not for any particular reason; it's just that > the only use case I had was the KDE stuff. > I will add it to stats.distributions when I find time to clean it up and add tests. mvn cdf will be useful to construct normal copulas.
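As a sketch of that use, a bivariate Gaussian copula cdf can be written in terms of the mvstdnormcdf described above (assuming corrcoef is the correlation matrix and that infinite bounds are accepted; the import from the attached mvncdf.py is an assumption, since the function is not yet in scipy):

import numpy as np
from scipy import stats
from mvncdf import mvstdnormcdf  # the attached wrapper, not yet in scipy

def gausscopula_cdf(u, v, rho):
    # Pull the uniform marginals back to standard normal space...
    x, y = stats.norm.ppf(u), stats.norm.ppf(v)
    # ...and integrate the standard bivariate normal over the
    # rectangle (-inf, x] x (-inf, y] with correlation rho.
    return mvstdnormcdf([-np.inf, -np.inf], [x, y],
                        [[1.0, rho], [rho, 1.0]])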
Josef From cournape at gmail.com Sun Jan 11 02:57:04 2009 From: cournape at gmail.com (David Cournapeau) Date: Sun, 11 Jan 2009 16:57:04 +0900 Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: <1cd32cbb0901101542t26c77256gdb2c5e4ff138af6d@mail.gmail.com> References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> <5b8d13220901101258s3c5f8be8rc0367b9fd74ec7af@mail.gmail.com> <1cd32cbb0901101441i12c695dcpa6bb9548796e5cd5@mail.gmail.com> <1cd32cbb0901101542t26c77256gdb2c5e4ff138af6d@mail.gmail.com> Message-ID: <5b8d13220901102357m5107c70cs5a2ccf2826e6035a@mail.gmail.com> On Sun, Jan 11, 2009 at 8:42 AM, wrote: > On Sat, Jan 10, 2009 at 6:08 PM, Matthieu Brucher > wrote: >>> So I think that, if you want to link against mkl, then the best would >>> be to make a local copy of the dlls. This seems to be the common >>> policy, for example, I have more than 10 programs (mostly open >>> source, including numpy/scipy) that all have their own lapack in the >>> local directories. >> >> Contrary to usual programs, Python is scattered in several folders, >> and in principle the dll should be put in the main Python folder >> to be seen by Python at runtime. But then what would you do if other >> modules provide their own MKL dlls? > > Currently scipy has duplicates of lapack, blas in different folders, Not really: we have a copy of blas/lapack in scipy/lib, but that has nothing to do with build issues: the code itself is duplicated. > so having several copies > of another set of dlls wouldn't make much difference. Sure, but then you have to wonder: what's the point of dynamic linking :) Statically linking everything is more or less the same as copying the dll everywhere, with the benefit that it is more reliable. The drawback is memory waste (since the code cannot be shared), but well, it is not like a few MB will make a difference on windows. > If they are put > in the main python folder, they could be renamed (if that is possible, > since it would be a "private" copy) to lapack_scipy.dll. I don't think we should install anything in the main python folder. I personally would be pissed if other software did that. > I think numpy/scipy should have > its own copies of the dlls and not rely on a system wide install. Yes, it would be good if that was possible. But as you see, it is difficult. It is difficult on any platform, but windows makes it particularly difficult. It is so difficult that almost no one does it: either they put all their dlls in one directory (as matlab does, as you pointed out) - but we can't do that - or they install globally, or in the SxS. Manifests could in theory solve this (they are a MS mechanism for referring to dlls from inside other dlls) - but the system was obviously not designed to be used by other tools; it can be safely considered an implementation scheme specific to MS compilers. As with almost anything MS related, you have to use only MS tools if you want to use this feature at this point.
David From david at ar.media.kyoto-u.ac.jp Sun Jan 11 04:42:13 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 11 Jan 2009 18:42:13 +0900 Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: <5b8d13220901102357m5107c70cs5a2ccf2826e6035a@mail.gmail.com> References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> <5b8d13220901101258s3c5f8be8rc0367b9fd74ec7af@mail.gmail.com> <1cd32cbb0901101441i12c695dcpa6bb9548796e5cd5@mail.gmail.com> <1cd32cbb0901101542t26c77256gdb2c5e4ff138af6d@mail.gmail.com> <5b8d13220901102357m5107c70cs5a2ccf2826e6035a@mail.gmail.com> Message-ID: <4969BEF5.9000801@ar.media.kyoto-u.ac.jp> David Cournapeau wrote: > On Sun, Jan 11, 2009 at 8:42 AM, wrote: > >> On Sat, Jan 10, 2009 at 6:08 PM, Matthieu Brucher >> wrote: >> >>>> So I think that, if you want to link against mkl, then the best would >>>> be to make a local copy of the dlls. This seems to be the common >>>> policy, for example, I have more than 10 programs (mostly open >>>> source, including numpy/scipy) that all have their own lapack in the >>>> local directories. >>>> >>> Contrary to usual programs, Python is scattered in several folders, >>> and in the absolute, the dll should be put in the main Python folder >>> to be seen by Python at runtime. But then what would you do if other >>> modules provide their own MKL dlls? >>> >> Currently scipy has duplicates of lapack, blas in different folders, >> > > Not really: we have copy of blas/lapack in scipy/lib, but that has > nothing to do with build issues: the code itself is duplicated. > > >> so having several copies >> of another set of dlls wouldn't make much difference. >> > > Sure, but then you have to wonder: what's the point of dynamic linking > :) Statically linking everything is more or less the same as copying > the dll everywhere, with the benefit it is more reliable. The drawback > is memory waste (since the code cannot be shared), but well, it is not > like a few MB will make a difference on windows. > Now that I think about it, I am not even sure that the dll can be shared anyway if we copy it. After all, they are copied because the dynamic loader needs them there, and I am not sure the loader will check that they are the same (Even if the files are exactly the same, maybe two copies were provided to avoid some sharing - but this is getting far beyond my knowledge on how this stuff work). David From contact at pythonxy.com Sun Jan 11 07:13:33 2009 From: contact at pythonxy.com (Pierre Raybaut) Date: Sun, 11 Jan 2009 13:13:33 +0100 Subject: [SciPy-user] PyQtShell Message-ID: <4969E26D.8050906@pythonxy.com> Hi all, I would like to share with you this little open-source project of mine, PyQtShell: http://pypi.python.org/pypi/PyQtShell/ http://code.google.com/p/pyqtshell/ I've just started it a few days ago and I worked on it only a couple of hours at home this week and saturday morning... so do not expect a revolution here. But I thought that some of you might be interested in contributing or simply testing it. 
Here is an extract from the Google Code website:

PyQtShell is intended to be an extension to PyQt4 (module PyQt4.QtShell) providing a console application (see screenshots below) based on independent widgets which interact with each other:
- QShell, a Python shell with useful options (like a '-os' switch for importing os and os.path as osp, a '-pylab' switch for importing matplotlib in interactive mode, ...) and advanced features like code completion (requires QScintilla, i.e. module PyQt4.Qsci)
- CurrentDirChanger: shows the current directory and allows changing it

Not implemented:
- GlobalsExplorer: shows the globals() list with some properties for each global (e.g. value for int or float, min and max values for arrays, ...) and allows opening an appropriate GUI editor
- and other widgets: FileExplorer, CodeEditor, ...

Cheers, Pierre From dineshbvadhia at hotmail.com Sun Jan 11 07:15:03 2009 From: dineshbvadhia at hotmail.com (Dinesh B Vadhia) Date: Sun, 11 Jan 2009 04:15:03 -0800 Subject: [SciPy-user] dimension mismatch error Message-ID: I want to do a vector-matrix multiplication as follows: z = y * A ... where y is a (1 x J) vector, A is a (I x J) Scipy (csr) Sparse matrix, and the resulting z a (1 x J) vector. The calculation results in this dimension mismatch error:

Traceback (most recent call last):
  File " ... .py", line 260, in ...
  File "C:\Python25\Lib\site-packages\scipy\sparse\base.py", line 350, in __rmul__
    return (self.transpose() * tr).transpose()
  File "C:\Python25\Lib\site-packages\scipy\sparse\base.py", line 299, in __mul__
    raise ValueError('dimension mismatch')
ValueError: dimension mismatch

Any ideas? Dinesh -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Sun Jan 11 07:26:27 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 11 Jan 2009 21:26:27 +0900 Subject: [SciPy-user] dimension mismatch error In-Reply-To: References: Message-ID: <4969E573.3040206@ar.media.kyoto-u.ac.jp> Dinesh B Vadhia wrote: > I want to do a vector-matrix multiplication as follows: > > z = y * A > > ... where y is a (1 x J) vector, A is a (I x J) Scipy (csr) Sparse > matrix, and the resulting z a (1 x J) vector. Is y's dimension (1xJ) a typo or the actual dimension ?
In that later > case, a ValueError is expected for matrix product :) > > cheers, > > David > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > I guess this is the problem It looks like, which multiplication is used, is defined by the order, __rmult__, __mult__ >>> A = sparse.lil_matrix([[0.0,0,5,3],]).tocsr() >>> A <1x4 sparse matrix of type '' with 2 stored elements in Compressed Sparse Row format> >>> np.ones(4)*A Traceback (most recent call last): File "", line 1, in np.ones(4)*A File "\Programs\Python25\Lib\site-packages\scipy\sparse\base.py", line 350, in __rmul__ return (self.transpose() * tr).transpose() File "\Programs\Python25\Lib\site-packages\scipy\sparse\base.py", line 299, in __mul__ raise ValueError('dimension mismatch') ValueError: dimension mismatch >>> A*np.ones(4) array([ 8.]) Josef From H.Zahiri at curtin.edu.au Sun Jan 11 07:48:55 2009 From: H.Zahiri at curtin.edu.au (Hani Zahiri) Date: Sun, 11 Jan 2009 21:48:55 +0900 Subject: [SciPy-user] Problem with reading binary file (diffrener result between MATLAB and Python) Message-ID: <82200558F6DE2C479D381D3000D1551C04DC9010@EXMSK1.staff.ad.curtin.edu.au> Hi folks, I am trying to translate one of my MATLAB scripts to Python and I am experiencing a strange problem (at least to me!) and I am desperetly looking for help. The binary file is a raw binary containing header information (first 720 bytes) following by radar data. For better illustration and using python basic functions, first 800 bytes of the file is look like this: >>> fid = open("file_name","rb") >>> fid.read(800) '\x00\x00\x00\x012\xc0\x12\x12\x00\x00\x02\xd0A CEOS-SAR-CCT A A 1.00 1AL1 PSRBIMOP FSEQ 1 4FTYP 5 4FLGT 9 4 18432 88220 32 2 8 1 18432 0 10976 0 0 0BSQ 1 1 412 87808 0 13 4PB 49 2PB 45 4PB 21 4PB 29 4PB 97 4PB COMPLEX*8 C*8 0 0 \x00\x00\x00\x022\n\x12\x14\x00\x01X\x9c\x00\x00\x00\x01\x00\x00\x00\x01 \x00\x00\x00\x00\x00\x00*\xe0\x00\x00\x00\x00\x00\x00\x00\x00\ ... x00\x00\x07\xd6\x00\x00\x00\x8d\x02\xdd\xfe\x95\x00\x01\x00\x00\x00\x00\ x00\x00\x00\x1c\xae\x93\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00ix ... \x00\x00\x00\x00I}.\xd0' And now the problem is: If I read the file in MATLAB, let say to find out length of header part, I will get the correct answer: EDU>> fid=fopen('file_name','r','b'); EDU>> fseek(fid,8,'bof'); EDU>> fread(fid,1,'uint32') ans = 720 However if I read this in python I am keep getting this wrong: >>> fid.seek(8) >>> scipy.fromfile(fid,'uint32',1) array([3489792000], dtype=uint32) I have almost tried every Scipy and Numpy classes with no result. I need a quick answer to this and I appreciate if anybody can help me with this problem. Cheers, Hani -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sun Jan 11 08:12:59 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 11 Jan 2009 07:12:59 -0600 Subject: [SciPy-user] Problem with reading binary file (diffrener result between MATLAB and Python) In-Reply-To: <82200558F6DE2C479D381D3000D1551C04DC9010@EXMSK1.staff.ad.curtin.edu.au> References: <82200558F6DE2C479D381D3000D1551C04DC9010@EXMSK1.staff.ad.curtin.edu.au> Message-ID: <3d375d730901110512ged50265o95c8749e551f0845@mail.gmail.com> On Sun, Jan 11, 2009 at 06:48, Hani Zahiri wrote: > Hi folks, > > > > I am trying to translate one of my MATLAB scripts to Python and I am > experiencing a strange problem (at least to me!) and I am desperetly looking > for help. 
The binary file is a raw binary containing header information > (first 720 bytes) following by radar data. For better illustration and using > python basic functions, first 800 bytes of the file is look like this: > > > >>>> fid = open("file_name","rb") > >>>> fid.read(800) > > '\x00\x00\x00\x012\xc0\x12\x12\x00\x00\x02\xd0A CEOS-SAR-CCT A A > 1.00 1AL1 PSRBIMOP FSEQ 1 4FTYP 5 4FLGT > 9 4 > 18432 88220 32 2 8 1 18432 0 > 10976 0 0 0BSQ 1 1 412 87808 0 13 4PB 49 2PB 45 4PB 21 > 4PB 29 4PB 97 4PB > COMPLEX*8 C*8 0 > 0 > \x00\x00\x00\x022\n\x12\x14\x00\x01X\x9c\x00\x00\x00\x01\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00*\xe0\x00\x00\x00\x00\x00\x00\x00\x00\ > ... > x00\x00\x07\xd6\x00\x00\x00\x8d\x02\xdd\xfe\x95\x00\x01\x00\x00\x00\x00\x00\x00\x00\x1c\xae\x93\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00ix > ... > > \x00\x00\x00\x00I}.\xd0' > > > > And now the problem is: > > If I read the file in MATLAB, let say to find out length of header part, I > will get the correct answer: > > > > EDU>> fid=fopen('file_name','r','b'); > > EDU>> fseek(fid,8,'bof'); > > EDU>> fread(fid,1,'uint32') > > > > ans = > > > > 720 > > > > However if I read this in python I am keep getting this wrong: > >>>> fid.seek(8) > >>>> scipy.fromfile(fid,'uint32',1) > > array([3489792000], dtype=uint32) I suspect that you are running your Matlab script on a bigendian machine and your Python script on a littleendian machine. The length marker in your file is bigendian. Use dtype='>i4' to read it. You will probably also need to use bigendian dtypes for the rest of the data, too. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From H.Zahiri at curtin.edu.au Sun Jan 11 09:44:05 2009 From: H.Zahiri at curtin.edu.au (Hani Zahiri) Date: Sun, 11 Jan 2009 23:44:05 +0900 Subject: [SciPy-user] Problem with reading binary file (diffrener resultbetween MATLAB and Python) In-Reply-To: <3d375d730901110512ged50265o95c8749e551f0845@mail.gmail.com> References: <82200558F6DE2C479D381D3000D1551C04DC9010@EXMSK1.staff.ad.curtin.edu.au> <3d375d730901110512ged50265o95c8749e551f0845@mail.gmail.com> Message-ID: <82200558F6DE2C479D381D3000D1551C04DC9015@EXMSK1.staff.ad.curtin.edu.au> Hi Robert, Many Thanks, you were right and it's work. Since, I run both MATLAB and Python on windows, I didn't suspect byte order issue. Probably, original file was generated using different byte order. Is it the case that MATLAB can recognise the original byte order (because it is platform-independent) and Python does not?! Anyway, many thanks. You made my day! Cheers, Hani -----Original Message----- From: scipy-user-bounces at scipy.org [mailto:scipy-user-bounces at scipy.org] On Behalf Of Robert Kern Sent: Sunday, 11 January 2009 10:13 PM To: SciPy Users List Subject: Re: [SciPy-user] Problem with reading binary file (diffrener resultbetween MATLAB and Python) On Sun, Jan 11, 2009 at 06:48, Hani Zahiri wrote: > Hi folks, > > > > I am trying to translate one of my MATLAB scripts to Python and I am > experiencing a strange problem (at least to me!) and I am desperetly looking > for help. The binary file is a raw binary containing header information > (first 720 bytes) following by radar data. 
For better illustration and using > python basic functions, first 800 bytes of the file is look like this: > > > >>>> fid = open("file_name","rb") > >>>> fid.read(800) > > '\x00\x00\x00\x012\xc0\x12\x12\x00\x00\x02\xd0A CEOS-SAR-CCT A A > 1.00 1AL1 PSRBIMOP FSEQ 1 4FTYP 5 4FLGT > 9 4 > 18432 88220 32 2 8 1 18432 0 > 10976 0 0 0BSQ 1 1 412 87808 0 13 4PB 49 2PB 45 4PB 21 > 4PB 29 4PB 97 4PB > COMPLEX*8 C*8 0 > 0 > \x00\x00\x00\x022\n\x12\x14\x00\x01X\x9c\x00\x00\x00\x01\x00\x00\x00\x01 \x00\x00\x00\x00\x00\x00*\xe0\x00\x00\x00\x00\x00\x00\x00\x00\ > ... > x00\x00\x07\xd6\x00\x00\x00\x8d\x02\xdd\xfe\x95\x00\x01\x00\x00\x00\x00\ x00\x00\x00\x1c\xae\x93\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00ix > ... > > \x00\x00\x00\x00I}.\xd0' > > > > And now the problem is: > > If I read the file in MATLAB, let say to find out length of header part, I > will get the correct answer: > > > > EDU>> fid=fopen('file_name','r','b'); > > EDU>> fseek(fid,8,'bof'); > > EDU>> fread(fid,1,'uint32') > > > > ans = > > > > 720 > > > > However if I read this in python I am keep getting this wrong: > >>>> fid.seek(8) > >>>> scipy.fromfile(fid,'uint32',1) > > array([3489792000], dtype=uint32) I suspect that you are running your Matlab script on a bigendian machine and your Python script on a littleendian machine. The length marker in your file is bigendian. Use dtype='>i4' to read it. You will probably also need to use bigendian dtypes for the rest of the data, too. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco _______________________________________________ SciPy-user mailing list SciPy-user at scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user From wizzard028wise at gmail.com Sun Jan 11 11:02:07 2009 From: wizzard028wise at gmail.com (Dorian) Date: Sun, 11 Jan 2009 17:02:07 +0100 Subject: [SciPy-user] cdf and integration for multivariate normal distribution in stats.kde In-Reply-To: <1cd32cbb0901101853q4696aeb4l7e21c10aee1c92ee@mail.gmail.com> References: <1cd32cbb0901101829w3890212cwa18694d958b6700b@mail.gmail.com> <3d375d730901101835u4e644898n50424f436b545775@mail.gmail.com> <1cd32cbb0901101853q4696aeb4l7e21c10aee1c92ee@mail.gmail.com> Message-ID: <674a602a0901110802r68b79992w80f4ecf4f5195b3f@mail.gmail.com> I've used the R package "copula" which is easy to handle. http://cran.r-project.org/web/packages/copula/index.html And I've written already many pieces of code to play with copula based on the previous R package. I'm not a programmer, but mathematician so I really don't know, how to put them together and made them available for others [?] 2009/1/11 > On Sat, Jan 10, 2009 at 9:35 PM, Robert Kern > wrote: > > On Sat, Jan 10, 2009 at 20:29, wrote: > >> I found the fortran code for rectangular integration of the > >> multivariate normal distribution in stats kde, which can be used to > >> calculate the cdf. > >> > >> I didn't see this function exposed anywhere in scipy. Did I miss it? > > > > No, I didn't expose it. Not for any particular reason; it's just that > > the only use case I had was the KDE stuff. > > > > I will add it to stats.distributions when I find time to clean it up > and add tests. > > mvn cdf will be useful to construct normal copulas. 
> Josef > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 361.gif Type: image/gif Size: 226 bytes Desc: not available URL: From robert.kern at gmail.com Sun Jan 11 19:55:32 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 11 Jan 2009 18:55:32 -0600 Subject: [SciPy-user] Problem with reading binary file (diffrener resultbetween MATLAB and Python) In-Reply-To: <82200558F6DE2C479D381D3000D1551C04DC9015@EXMSK1.staff.ad.curtin.edu.au> References: <82200558F6DE2C479D381D3000D1551C04DC9010@EXMSK1.staff.ad.curtin.edu.au> <3d375d730901110512ged50265o95c8749e551f0845@mail.gmail.com> <82200558F6DE2C479D381D3000D1551C04DC9015@EXMSK1.staff.ad.curtin.edu.au> Message-ID: <3d375d730901111655l1ba7a674xde26c4404e7c280a@mail.gmail.com> On Sun, Jan 11, 2009 at 08:44, Hani Zahiri wrote: > Hi Robert, > > Many Thanks, you were right and it's work. Since, I run both MATLAB and > Python on windows, I didn't suspect byte order issue. > Probably, original file was generated using different byte order. Is it > the case that MATLAB can recognise the original byte order (because it > is platform-independent) and Python does not?! No, you are explicitly telling MATLAB that the file is big-endian when you use 'b' (short for 'ieee-be') as the third argument to fopen(). In Python, the file objects neither know nor care about integer formats; they just give bytes. You have to use that knowledge when you convert to a numpy object by picking the right dtype. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From fragon25 at yahoo.com Sun Jan 11 22:58:53 2009 From: fragon25 at yahoo.com (Tan Tran) Date: Sun, 11 Jan 2009 19:58:53 -0800 (PST) Subject: [SciPy-user] problem ValueError: array is not broadcastable to correct shape - convert shape (x, ) to (x, 1) References: Message-ID: <426807.42274.qm@web39206.mail.mud.yahoo.com> Hello, I'm trying to convert a command like this in matlab to python numpy:

mysel((col3 == 11) & (col4 == 16)) = y((col3 == 11) & (col4 == 16), col0);

from numpy import *

y = array([ [ 0, 1,11,15, 4], \
            [10,11,12,16,14], \
            [20,21,11,17,24], \
            [30,31,12,15,34], \
            [40,41,11,16,44], \
            [50,51,12,17,54], \
            [60,61,11,15,64], \
            [70,71,11,16,74], \
            [80,81,11,17,84], \
            [90,91,12,15,94]])

col3 = 2
col4 = 3
col0 = 0
mysel = zeros((10, 1),int)

mypick = y[:,col0][(y[:,col3] == 11) & (y[:,col4] == 16)]
print mypick
print mypick.shape

aa = mysel[(y[:,col3] == 11) & (y[:,col4] == 16)]
print aa
print aa.shape

mysel[(y[:,col3] == 11) & (y[:,col4] == 16)] = mypick   <-- error here
ValueError: array is not broadcastable to correct shape

I check the shapes of the two sides and see they do not match. The shape of mypick is (2,) and the shape of mysel[(y[:,col3] == 11) & (y[:,col4] == 16)] is (2,1). I have a problem selecting elements in numpy: it always returns something with shape (x,). How can I reshape it to the shape that I want, like (x,1)? Is there another way to do the task? Thanks, -------------- next part -------------- An HTML attachment was scrubbed...
URL: From david at ar.media.kyoto-u.ac.jp Sun Jan 11 23:03:03 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 12 Jan 2009 13:03:03 +0900 Subject: [SciPy-user] problem ValueError: array is not broadcastable to correct shape - convert shape (x, ) to (x, 1) In-Reply-To: <426807.42274.qm@web39206.mail.mud.yahoo.com> References: <426807.42274.qm@web39206.mail.mud.yahoo.com> Message-ID: <496AC0F7.3050109@ar.media.kyoto-u.ac.jp> Tan Tran wrote: > Hello, > > I'm trying to convert a command like this in matlab to python numpy > mysel((col3 == 11) & (col4 == 16)) = y((col3 == 11) & (col4 == 16), col0); > > from numpy import * > > y = array([ [ 0, 1,11,15, 4], ... [90,91,12,15,94]]) > ... > mysel[(y[:,col3] == 11) & (y[:,col4] == 16)] = mypick <-- error here > ValueError: array is not broadcastable to correct shape > > I have a problem selecting elements in numpy. It always returns > something with shape (x,). How can I reshape it to the shape that I > want, like (x,1)? This should do it:

import numpy as np

a = np.random.randn(10, 2)  # 10 rows, 2-column array
a1 = a[:, 0]    # first column, (10,) array
a2 = a[:, 0:1]  # columns 0 to 1, 1 not included -> (10, 1) array

David From rmay31 at gmail.com Mon Jan 12 13:24:16 2009 From: rmay31 at gmail.com (Ryan May) Date: Mon, 12 Jan 2009 12:24:16 -0600 Subject: [SciPy-user] pupynere/scipy.io.netcdf Message-ID: <496B8AD0.2000600@gmail.com> Hi, Anyone know if pupynere (a version of which is in scipy.io.netcdf) supports writing files with 64-bit offsets? This allows writing files larger than 2GB. Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From daniel.wheeler2 at gmail.com Mon Jan 12 14:59:52 2009 From: daniel.wheeler2 at gmail.com (Daniel Wheeler) Date: Mon, 12 Jan 2009 14:59:52 -0500 Subject: [SciPy-user] Scipy 0.7, weave, windows In-Reply-To: <5b8d13220901090431u7b9b0de4n4ce26d187fc5e20d@mail.gmail.com> References: <49661969.40905@ar.media.kyoto-u.ac.jp> <80b160a0901080749r3d419de0vf9c7dd65c508ec31@mail.gmail.com> <5b8d13220901080815r1eb9b82r19c93a56e79e559b@mail.gmail.com> <80b160a0901081200v1745d5dch9f3198a86d1ab18f@mail.gmail.com> <5b8d13220901090431u7b9b0de4n4ce26d187fc5e20d@mail.gmail.com> Message-ID: <80b160a0901121159m4f1695e9o1c4b8cca7f95033a@mail.gmail.com> Hi David, I ran all the weave tests in FiPy (on windows) with the binary posted and I get absolutely no errors. I was a little confused by this because we generally get a mountain of doctest errors the first time we run the weave tests. These are associated with the "weave compiling" output noise. I deleted the ".python25_compiled" directory and I only get one doctest error associated with the creation of that directory. Anyway, the long and the short of it is that everything seems cool and as an addition weave is no longer puking that annoying "weave compiling" noise that breaks all the doctests.
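For reference, a minimal sketch of the kind of weave.inline call these tests exercise (nothing FiPy-specific; just the basic pattern):

from scipy import weave

def add_one(a):
    # weave.inline compiles the C snippet on the first call and caches
    # the result in a directory like ".python25_compiled" -- hence the
    # one-time "weave compiling" output mentioned above.
    return weave.inline("return_val = a + 1;", ['a'])

print add_one(41)  # -> 42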
Cheers On Fri, Jan 9, 2009 at 7:31 AM, David Cournapeau wrote: > On Fri, Jan 9, 2009 at 5:00 AM, Daniel Wheeler > wrote: >> On Thu, Jan 8, 2009 at 11:15 AM, David Cournapeau wrote: >>> On Fri, Jan 9, 2009 at 12:49 AM, Daniel Wheeler >>> wrote: >>>> On Thu, Jan 8, 2009 at 10:19 AM, David Cournapeau >>>> wrote: >>>>> Hi, >>>>> >>>>> I just did a full build/install/test dance of scipy 0.7 on windows, >>>>> and things look good - except weave, which brings 205 errors when the >>>>> full test suite is run. Do people use weave on windows ? >>>> >>>> Yes. Our test suite for fipy currently passes all it's weave tests on >>>> windows with python 2.5 and scipy version 0.6.0 and that includes a >>>> lot of auto generated weave code. >>> >>> Thanks for the info. Would you mind testing it with scipy 0.7.x branch >>> ? There are some test failures which showed some old code which could >>> not have worked (like using python code which was removed from python >>> svn 5 years ago), but as I am not a weave user myself, I can't really >>> assess what's significant and what's not. >>> >>> I could make a binary installer if that makes it easier for you to test, >> >> That would be great if you have it set to build quickly and easily. >> Don't fancy figuring out how to build scipy on windows. Cheers. > > No need to worry, I am the one who coded the tools for the windows > binary installer, so hopefully I am still familiar with it :) > > Here we are: > > http://www.ar.media.kyoto-u.ac.jp/members/david/archives/scipy/scipy-0.7.0.dev5410-win32-superpack-python2.5.exe > > David > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -- Daniel Wheeler From afraser at lanl.gov Mon Jan 12 15:36:43 2009 From: afraser at lanl.gov (Andy Fraser) Date: Mon, 12 Jan 2009 13:36:43 -0700 Subject: [SciPy-user] C extension to manipulate sparse lil matrix In-Reply-To: <87wsd5qpyd.fsf@lanl.gov> (Andy Fraser's message of "Thu\, 08 Jan 2009 17\:32\:26 -0700") References: <87wsd5qpyd.fsf@lanl.gov> Message-ID: <8763kkqn1g.fsf@lanl.gov> >>>>> "A" == Andy Fraser writes: A> I want to move some time critical bits of code for hidden A> Markov models from python to C. [...] A> Now, I am trying to figure out how to manipulate lil sparse A> matrices. In particular calling such a matrix "SM", and A> supposing that "t" is the index for a row, I want to assign new A> arrays to "SM.rows[t]" and "SM.data[t]". A> I would be grateful if someone posted C code that interchanged A> two rows of a lil sparse matrix. [...] I've answered my own question. I append the code below. I found that lil matrices consist of numpy arrays of python lists. I also found that I could replace the python lists with numpy arrays. 
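At the Python level, the swap that the C code below implements is simply this (a sketch using the .rows and .data object arrays mentioned above):

from scipy import sparse

SM = sparse.lil_matrix([[1., 0., 2.], [0., 3., 0.]])
i, j = 0, 1
# Swap the per-row index lists and the per-row data lists together.
SM.rows[i], SM.rows[j] = SM.rows[j], SM.rows[i]
SM.data[i], SM.data[j] = SM.data[j], SM.data[i]
print SM.todense()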
Here are some resources that helped:

http://www.tramy.us/ Oliphant's "Guide to Numpy"
http://docs.python.org/extending/
http://docs.python.org/c-api/memory.html
http://www.scipy.org/Cookbook/C_Extensions/NumPy_arrays
http://docs.python.org/c-api/arg.html#arg-parsing

Here is code that swaps "rows":

static PyObject *
hmm_test1(PyObject *self, PyObject *args)
/* Python equivalent of:
   def test1(A_O):  # an array of objects, like the "rows" of a sparse lil matrix
       temp = A_O[0]
       A_O[0] = A_O[1]
       A_O[1] = temp
       return
*/
{
    PyObject **object0, **object1, *temp;
    PyArrayObject *A_O;

    if (!PyArg_ParseTuple(args, "O&", PyArray_Converter, &A_O))
        return NULL;
    object0 = PyArray_GETPTR1(A_O, 0);
    object1 = PyArray_GETPTR1(A_O, 1);
    temp = *object0;
    *object0 = *object1;
    *object1 = temp;
    Py_DECREF(A_O);            /* release the reference added by PyArray_Converter */
    return Py_BuildValue("");  /* return None */
}

From Matt.Fago at itt.com Mon Jan 12 16:26:10 2009 From: Matt.Fago at itt.com (Fago, Matt - AES) Date: Mon, 12 Jan 2009 16:26:10 -0500 Subject: [SciPy-user] Scipy 0.7.0 Beta and umfpack Message-ID: <4918C587EEA86D46A802E1ED4E44E8DFB58DEF496D@01AESMX09-1.aes.de.ittind.com>

I'm testing SciPy 0.7.0 b1 for Fedora 9 and have come across an issue:

>>from scipy.interpolate.fitpack import splev

gives the warning:

/usr/lib64/python2.5/site-packages/scipy/sparse/linalg/dsolve/linsolve.py:20: DeprecationWarning: scipy.sparse.linalg.dsolve.umfpack will be removed, install scikits.umfpack instead
  ' install scikits.umfpack instead', DeprecationWarning )

I've searched on this topic and found a similar discussion involving linsolve:

http://article.gmane.org/gmane.comp.python.scientific.devel/9359

that was fixed in svn trunk revision 5214, but evidently I'm seeing a separate issue?

Thanks,
Matt

From nwagner at iam.uni-stuttgart.de Mon Jan 12 16:54:52 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Mon, 12 Jan 2009 22:54:52 +0100 Subject: [SciPy-user] Scipy 0.7.0 Beta and umfpack In-Reply-To: <4918C587EEA86D46A802E1ED4E44E8DFB58DEF496D@01AESMX09-1.aes.de.ittind.com> References: <4918C587EEA86D46A802E1ED4E44E8DFB58DEF496D@01AESMX09-1.aes.de.ittind.com> Message-ID:

On Mon, 12 Jan 2009 16:26:10 -0500 "Fago, Matt - AES" wrote:
> I'm testing SciPy 0.7.0 b1 for Fedora 9 and have come across an issue:
>
>>>from scipy.interpolate.fitpack import splev
>
> gives the warning:
>
> /usr/lib64/python2.5/site-packages/scipy/sparse/linalg/dsolve/linsolve.py:20:
> DeprecationWarning: scipy.sparse.linalg.dsolve.umfpack will be removed, install
> scikits.umfpack instead
>   ' install scikits.umfpack instead', DeprecationWarning )
>
> I've searched on this topic and found a similar discussion involving linsolve:
>
> http://article.gmane.org/gmane.comp.python.scientific.devel/9359
>
> that was fixed in svn trunk revision 5214, but evidently I'm seeing a separate issue?
>
> Thanks,
> Matt
Python 2.6 (r26:66714, Dec 3 2008, 10:55:18)
[GCC 4.3.2 [gcc-4_3-branch revision 141291]] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from scipy.interpolate.fitpack import splev
>>> import scipy
>>> scipy.__version__
'0.8.0.dev5446'

From pav at iki.fi Mon Jan 12 16:54:28 2009 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 12 Jan 2009 21:54:28 +0000 (UTC) Subject: [SciPy-user] Scipy 0.7.0 Beta and umfpack References: <4918C587EEA86D46A802E1ED4E44E8DFB58DEF496D@01AESMX09-1.aes.de.ittind.com> Message-ID:

Mon, 12 Jan 2009 16:26:10 -0500, Fago, Matt - AES wrote:
> I'm testing SciPy 0.7.0 b1 for Fedora 9 and have come across an issue:
>>>from scipy.interpolate.fitpack import splev
>
> gives the warning:
>
> /usr/lib64/python2.5/site-packages/scipy/sparse/linalg/dsolve/linsolve.py:20:
> DeprecationWarning: scipy.sparse.linalg.dsolve.umfpack will be removed,
> install scikits.umfpack instead
>   ' install scikits.umfpack instead', DeprecationWarning )
[clip]
> that was fixed in svn trunk revision 5214, but evidently I'm seeing a
> separate issue?

It's the same issue. The fix didn't make it into the beta (but will be in rc/final).

-- Pauli Virtanen

From scott.sinclair.za at gmail.com Tue Jan 13 00:31:01 2009 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Tue, 13 Jan 2009 07:31:01 +0200 Subject: [SciPy-user] pupynere/scipy.io.netcdf In-Reply-To: <496B8AD0.2000600@gmail.com> References: <496B8AD0.2000600@gmail.com> Message-ID: <6a17e9ee0901122131l15bf6756x7025c2f93d5c70ba@mail.gmail.com>

> 2009/1/12 Ryan May :
> Anyone know if pupynere (a version of which is in scipy.io.netcdf) supports
> writing files with 64-bit offsets? This allows writing files larger than 2GB.

You might try asking on the Matplotlib mailing list. Jeff Whitaker includes pupynere as part of the Basemap toolkit, he may have an answer.

http://matplotlib.sourceforge.net/basemap/doc/html/api/basemap_api.html#mpl_toolkits.basemap.NetCDFFile

Alternatively, take a look at the Python NetCDF4 interface, which should support this functionality:

http://code.google.com/p/netcdf4-python/

Cheers,
Scott

From Matt.Fago at itt.com Tue Jan 13 11:59:15 2009 From: Matt.Fago at itt.com (Fago, Matt - AES) Date: Tue, 13 Jan 2009 11:59:15 -0500 Subject: [SciPy-user] Scipy 0.7.0 Beta and umfpack Message-ID: <4918C587EEA86D46A802E1ED4E44E8DFB58DEF4972@01AESMX09-1.aes.de.ittind.com>

Pauli Virtanen wrote:
> It's the same issue. The fix didn't make it into the beta (but will be in
> rc/final).

Great, thanks. I've let the Fedora SciPy packager know.

- Matt
From timmichelsen at gmx-topmail.de Tue Jan 13 15:19:16 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Tue, 13 Jan 2009 21:19:16 +0100 Subject: [SciPy-user] scikits.timeseries: tsfromtxt Message-ID:

Hello Pierre & Matt,
I have seen that you are expanding timeseries.

How robust is the function tsfromtxt?
Can I use it, or do you (better, we all) need more testing?
Is this a rewrite of np.loadtxt?

You recently said that the extras.py functions are not yet for common use. Could you indicate in the docstrings the status of those functions that are still under development?

Thanks and kind regards,
Timmie

The links:
http://scipy.org/scipy/scikits/browser/trunk/timeseries/scikits/timeseries/extras.py
http://scipy.org/scipy/scikits/browser/trunk/timeseries/scikits/timeseries/_preview.py

From josh.k.lawrence at gmail.com Tue Jan 13 15:23:24 2009 From: josh.k.lawrence at gmail.com (Josh Lawrence) Date: Tue, 13 Jan 2009 15:23:24 -0500 Subject: [SciPy-user] f2py, fortran modules, and dynamic arrays Message-ID: <1A47D3A6-6C6F-4A2A-BF59-4BE7710BC3AF@gmail.com>

Hello,

I want to write some code in fortran to perform looping calculations on numpy arrays, and I would like to accomplish this using modules in my fortran code. I want to approach the problem in this manner because I have numerous arrays that would need to be passed, and I would prefer not to have a fortran function with 10, 20, or more arguments. The following code illustrates what I am trying to accomplish.

module foo
  integer :: alen(2), xlen, blen
  complex*16, pointer :: a(:,:), x(:), b(:)
  ! or complex*16, allocatable :: a(:,:), etc.
end module foo

subroutine bar
  use foo
  integer :: i, j
  if (associated(a) .and. xlen > 0 .and. alen(1) .eq. alen(2)) then
    blen = xlen
    do i = 1, alen(1)
      b(i) = 0.0
      do j = 1, alen(2)
        b(i) = b(i) + a(i,j) * x(j)
      end do
    end do
  end if
end

Then, if I compile this as FortranMod, I would like to be able to do the following in python:

import FortranMod as formod
formod.foo.alen[0] = 2
formod.foo.alen[1] = 2
formod.foo.a = np.array([[1, 2], [3, 4]])
formod.foo.blen = 2
formod.foo.x = np.array([5, 6])
formod.bar()
print formod.foo.blen
print formod.foo.b

Now, I do not care whether I need to use pointers or allocatables or whatever, but I am curious whether this type of functionality is possible using fortran modules and f2py.

Thanks,
Josh

From timmichelsen at gmx-topmail.de Tue Jan 13 15:33:00 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Tue, 13 Jan 2009 21:33:00 +0100 Subject: [SciPy-user] predicting values based on (linear) models Message-ID:

Hello,
I had to do several statistical computations lately. I therefore looked at the statistical language R, since it already seems to contain many models and much functionality.

Is there some function like "predict" [1] in Python?

Example:

x <- rnorm(15)
y <- x + rnorm(15)
t = predict(lm(y ~ x))

t => the predicted data determined by the linear model (compare to scipy.stats.linregress)

How is this done in pure Python?

Are there many people using Rpy (rpy2) to access the statistical functionality provided by R? What are your experiences with this?

Programming in python seems to be more convenient than in R, but it lacks R's vast statistics.

Thanks in advance,
Timmie

[1] predict is a generic function for predictions from the results of various model fitting functions.
The function invokes particular methods which depend on the class of the first argument. Most prediction methods which are similar to fitting linear models have an argument newdata specifying the first place to look for explanatory variables to be used for prediction. Some considerable attempts are made to match up the columns in newdata to those used for fitting, for example that they are of comparable types and that any factors have the same level set in the same order (or can be transformed to be so). Time series prediction methods in package stats have an argument n.ahead specifying how many time steps ahead to predict.

Example:

x <- rnorm(15)
y <- x + rnorm(15)
t = predict(lm(y ~ x))

t => the predicted data determined by the linear model (compare to scipy.stats.linregress)

From pgmdevlist at gmail.com Tue Jan 13 15:40:18 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 13 Jan 2009 15:40:18 -0500 Subject: [SciPy-user] scikits.timeseries: tsfromtxt In-Reply-To: References: Message-ID:

On Jan 13, 2009, at 3:19 PM, Tim Michelsen wrote:
>
> How robust is the function tsfromtxt?
> Can I use it, or do you (better, we all) need more testing?

It's fairly robust, but testing is always needed, of course. I'm pretty sure we can run into some nasty corner cases.

> Is this a rewrite of np.loadtxt?

It's actually an adaptation of numpy.genfromtxt, the rewrite of np.loadtxt/mlab.csv2rec that I implemented last month and that I still don't know where to put in the numpy distribution. As I copied the necessary code into scikits.timeseries, you won't need to install anything else.

> You recently said that the extras.py functions are not yet for
> common use.

Did I? I was probably exaggerating a bit. The functions (should) work. tsfromtxt most certainly does, and it replaces trecords.fromtextfile, which was badly broken.

> Could you indicate in the docstrings the status of those functions
> that are still under development?

Will do. But once again, feel free to try tsfromtxt and to send some feedback.

From timmichelsen at gmx-topmail.de Tue Jan 13 18:29:15 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Wed, 14 Jan 2009 00:29:15 +0100 Subject: [SciPy-user] scikits.timeseries: tsfromtxt In-Reply-To: References: Message-ID:

Hi,

> Will do. But once again, feel free to try tsfromtxt and to send some
> feedback.

I guess I need some help on dateconverter.

I used:

data = ts.tsfromtxt('test.csv', datecols=(0,1), skiprows=1)

Then got the error:
TypeError: <lambda>() takes exactly 1 argument (2 given)

A sample column of my data:

2009-01-14 12:00; 23; 46

How would I read such data in?

Kind regards,
Timmie

From pgmdevlist at gmail.com Tue Jan 13 19:30:17 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 13 Jan 2009 19:30:17 -0500 Subject: [SciPy-user] scikits.timeseries: tsfromtxt In-Reply-To: References: Message-ID: <9271C377-A3B2-4D03-9E16-CC21B439CFAF@gmail.com>

On Jan 13, 2009, at 6:29 PM, Tim Michelsen wrote:
>
> I used:
>
> data = ts.tsfromtxt('test.csv', datecols=(0,1), skiprows=1)
>
> Then got the error:
> TypeError: <lambda>() takes exactly 1 argument (2 given)
>
> A sample column of my data:
>
> 2009-01-14 12:00; 23; 46

I will assume that you mean row...
* First, your separator isn't a comma, but a semicolon. Use delimiter=";".
* Second, your date is actually only in the first column, so you should use datecols=0.
* Last, you don't need to define a converter for the dates in that case, as it should be recognized by the date parser.
However, you should provide a freq argument, such as freq="H".

From josef.pktd at gmail.com Tue Jan 13 20:24:21 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 13 Jan 2009 20:24:21 -0500 Subject: [SciPy-user] predicting values based on (linear) models In-Reply-To: References: Message-ID: <1cd32cbb0901131724i1c03233etecb0bdbf06debeda@mail.gmail.com>

On Tue, Jan 13, 2009 at 3:33 PM, Tim Michelsen wrote:
> Hello,
> I had to do several statistical computations lately. I therefore looked
> at the statistical language R, since it already seems to contain many
> models and much functionality.
>
> Is there some function like "predict" [1] in Python?
>
> Example:
>
> x <- rnorm(15)
> y <- x + rnorm(15)
> t = predict(lm(y ~ x))
>
> t => the predicted data determined by the linear model (compare
> to scipy.stats.linregress)
>
> How is this done in pure Python?
>
> Are there many people using Rpy (rpy2) to access the statistical
> functionality provided by R?
> What are your experiences with this?
>
> Programming in python seems to be more convenient than in R, but it
> lacks R's vast statistics.
>
> Thanks in advance,
> Timmie
>
> [1] predict is a generic function for predictions from the results of
> various model fitting functions. The function invokes particular methods
> which depend on the class of the first argument.
>
> Most prediction methods which are similar to fitting linear models have an
> argument newdata specifying the first place to look for explanatory
> variables to be used for prediction. Some considerable attempts are made
> to match up the columns in newdata to those used for fitting, for
> example that they are of comparable types and that any factors have the
> same level set in the same order (or can be transformed to be so).
> Time series prediction methods in package stats have an argument
> n.ahead specifying how many time steps ahead to predict.
>
> Example:
>
> x <- rnorm(15)
> y <- x + rnorm(15)
> t = predict(lm(y ~ x))
>
> t => the predicted data determined by the linear model (compare
> to scipy.stats.linregress)

This is on the todo list.

scipy.stats.linregress treats only the case with a single explanatory variable.

Doing it explicitly:
----------------------------
This assumes x is the data without a constant, y is the endogenous variable for estimation, and xnew holds the observations of the explanatory variables for prediction. See the scipy tutorial; the only thing to watch out for is the matrix/array dimensions.

>>> from scipy import linalg
>>> b,resid,rank,sigma = linalg.lstsq(np.c_[np.ones((x.shape[0],1)),x],y)
>>> b
array([[ 5.47073574],
       [ 0.6575267 ],
       [ 2.09241884]])
>>> xnewwc = np.c_[np.ones((xnew.shape[0],1)),xnew]
>>> ypred = np.dot(xnewwc,b)  # prediction with ols estimate of parameters b
>>> print np.c_[ynew, ypred, ynew - ypred]
[[ 8.23128832  8.69250962 -0.46122129]
 [ 9.14636291  9.66243911 -0.51607621]
 [-0.10198498 -0.27382934  0.17184436]]

or using the ols example from the cookbook, to which I added a predict method:
-----------------------------------------------------------------------------

#------------------------ try_olsexample.py
import numpy as np
from olsexample import ols

def generate_data(nobs):
    x = np.random.randn(nobs,2)
    btrue = np.array([[5,1,2]]).T
    y = np.dot(x, btrue[1:,:]) + btrue[0,:] + 0.5 * np.random.randn(nobs,1)
    return y,x

y,x = generate_data(15)
est = ols(y,x)  # initialize and estimate with ols, constant added by default
print 'ols estimate'
print est.b
print np.array([[5,1,2]])  # true coefficients

ynew,xnew = generate_data(3)
ypred = est.predict(xnew)
print '   ytrue       ypred       error'
print np.c_[ynew, ypred, ynew - ypred]
#------------------------------- EOF

output:

ols estimate
[[ 5.47073574]
 [ 0.6575267 ]
 [ 2.09241884]]
[[5 1 2]]
   ytrue       ypred       error
[[ 8.23128832  8.69250962 -0.46122129]
 [ 9.14636291  9.66243911 -0.51607621]
 [-0.10198498 -0.27382934  0.17184436]]

olsexample.py in the attachment is from the cookbook; I'm slowly reworking it. Fancier models will be in scipy.stats.models when they are ready for inclusion.

I'm using rpy (version 1) to check scipy.stats functions, and for sure the available methods in R are very extensive, while coverage of statistics and econometrics in python packages, including scipy, is spotty: some good spots and many missing pieces.

Josef

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: olsexample.py
URL:

From fredmfp at gmail.com Wed Jan 14 09:32:57 2009 From: fredmfp at gmail.com (fred) Date: Wed, 14 Jan 2009 15:32:57 +0100 Subject: [SciPy-user] IBM float point format... Message-ID: <496DF799.2010406@gmail.com>

Hi all,

Is there something ready-to-use in numpy/scipy to read data stored in IBM floating point format from file?

I usually use scipy.io.numpyio.{fread, fwrite} to read my files, but for this peculiar case, I don't know how I can handle it...

Any clue?

TIA.
Cheers,

-- Fred

From Scott.Askey at afit.edu Wed Jan 14 12:34:14 2009 From: Scott.Askey at afit.edu (Askey Scott A Capt AFIT/ENY) Date: Wed, 14 Jan 2009 12:34:14 -0500 Subject: [SciPy-user] optimize.fsolve starting guess Message-ID: <792700546363C941B876B9D41AF44759057188C8@MS-AFIT-03.afit.edu>

I am using fsolve to solve a system of nonlinear equations for a dynamics problem that is marching forward in time:

x(i+1) = fsolve(F, x(i), args=(x(i),))

F is a vector. My problem is that fsolve, as written above, fails to converge; F(x[i], x[i]) contains many zeros.

It does converge if

x(i+1) = fsolve(F, .99999*x(i), args=(x(i),))

is used. Is there a clever way to avoid the .99999 perturbation of the initial guess in a computationally efficient manner? array(x, dtype=float32), x.round(6)?

Thanks

Scott

From josef.pktd at gmail.com Wed Jan 14 13:33:53 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 14 Jan 2009 13:33:53 -0500 Subject: [SciPy-user] optimize.fsolve starting guess In-Reply-To: <792700546363C941B876B9D41AF44759057188C8@MS-AFIT-03.afit.edu> References: <792700546363C941B876B9D41AF44759057188C8@MS-AFIT-03.afit.edu> Message-ID: <1cd32cbb0901141033m43fdd4d3wdf2e69fe22f2c36a@mail.gmail.com>

On Wed, Jan 14, 2009 at 12:34 PM, Askey Scott A Capt AFIT/ENY wrote:
> I am using fsolve to solve a system of nonlinear equations for a
> dynamics problem that is marching forward in time:
>
> x(i+1) = fsolve(F, x(i), args=(x(i),))
>
> F is a vector. My problem is that fsolve, as written above, fails to
> converge; F(x[i], x[i]) contains many zeros.
>
> It does converge if
>
> x(i+1) = fsolve(F, .99999*x(i), args=(x(i),))
>
> is used.

Does it converge if you try:

x(i+1) = fsolve(F, 1.0*x(i), args=(x(i),))

then 1.0*x[i] makes a temporary copy, like x[i].copy().

I don't know about your specific problem, but one thing to watch out for in python is mutable arguments. If fsolve doesn't make a copy of its arguments, then the args values in your F function might change during the solution search, and it would be evaluating F at the fixed point.

Josef

From robert.kern at gmail.com Wed Jan 14 16:30:41 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 14 Jan 2009 15:30:41 -0600 Subject: [SciPy-user] IBM float point format... In-Reply-To: <496DF799.2010406@gmail.com> References: <496DF799.2010406@gmail.com> Message-ID: <3d375d730901141330u6d28efefn1f2bfaf3991cf5d1@mail.gmail.com>

On Wed, Jan 14, 2009 at 08:32, fred wrote:
> Hi all,
>
> Is there something ready-to-use in numpy/scipy to read data stored in
> IBM floating point format from file?
>
> I usually use scipy.io.numpyio.{fread, fwrite} to read my files, but for
> this peculiar case, I don't know how I can handle it...

Not in numpy or scipy. I do have a ufunc in my own code for this.
It's also fairly straightforward to implement in pure numpy if you can bear the memory cost of the temporaries (and of course, you can chunk the input to reduce that cost).

Here is a Python implementation:

def ibm2ieee(ibm):
    """ Converts an IBM floating point number into IEEE format. """
    sign = ibm >> 31 & 0x01
    exponent = ibm >> 24 & 0x7f
    mantissa = ibm & 0x00ffffff
    mantissa = (mantissa * 1.0) / pow(2, 24)
    ieee = (1 - 2 * sign) * mantissa * pow(16.0, exponent - 64)
    return ieee

Here is the ufunc:

#define IBM_MANTISSA_UNIT (16777216.0)

static void
ibm2ieee_loop(char **args, npy_intp *dimensions, npy_intp *steps, void *data)
{
    char *ibmbuf = args[0];
    char *output = args[1];
    npy_intp i, n = dimensions[0];
    npy_int32 ibm_val, exponent, mantissa;
    npy_float32 ieee_val;
    short sign;

    for (i = 0; i < n; i++, ibmbuf += steps[0], output += steps[1]) {
        ibm_val = *(npy_int32*)ibmbuf;
        sign = (ibm_val >> 31) & 0x01;
        exponent = ((ibm_val >> 24) & 0x7F) - 64;
        mantissa = ibm_val & 0x00FFFFFF;
        ieee_val = (1 - 2*sign) * (mantissa / IBM_MANTISSA_UNIT) * powf(16.0, exponent);
        *(npy_float32*)output = ieee_val;
    }
}

static char ibm2ieee_sigs[] = { NPY_INT32, NPY_FLOAT32 };
static char ibm2ieee_doc[] = "Convert an IBM floating point number (viewed as a 32-bit integer) to an IEEE-754 float32.";
static PyUFuncGenericFunction ibm2ieee_functions[] = {ibm2ieee_loop};
static void* ibm2ieee_data[] = {NULL};

-- Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From fredmfp at gmail.com Wed Jan 14 17:16:52 2009 From: fredmfp at gmail.com (fred) Date: Wed, 14 Jan 2009 23:16:52 +0100 Subject: [SciPy-user] IBM float point format... In-Reply-To: <3d375d730901141330u6d28efefn1f2bfaf3991cf5d1@mail.gmail.com> References: <496DF799.2010406@gmail.com> <3d375d730901141330u6d28efefn1f2bfaf3991cf5d1@mail.gmail.com> Message-ID: <496E6454.1050201@gmail.com>

Robert Kern a écrit :

Hi Robert,

First, the python implementation. python complains that operand >> is not supported on numpy.float32 and int (which I understand quite well for float32):

      2     """ Converts an IBM floating point number into IEEE format. """
      3
----> 4     sign = ibm >> 31 & 0x01
      5
      6     exponent = ibm >> 24 & 0x7f

TypeError: unsupported operand type(s) for >>: 'numpy.float32' and 'int'

What am I doing wrong?

Cheers,

-- Fred

From zachary.pincus at yale.edu Wed Jan 14 17:24:42 2009 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Wed, 14 Jan 2009 17:24:42 -0500 Subject: [SciPy-user] IBM float point format... In-Reply-To: <496E6454.1050201@gmail.com> References: <496DF799.2010406@gmail.com> <3d375d730901141330u6d28efefn1f2bfaf3991cf5d1@mail.gmail.com> <496E6454.1050201@gmail.com> Message-ID: <0DBD5C3D-532F-4430-B3B0-A34BB83C9317@yale.edu>

> python complains that operand >> is not supported on numpy.float32 and
> int (which I understand quite well for float32):
>
>       2     """ Converts an IBM floating point number into IEEE format. """
>       3
> ----> 4     sign = ibm >> 31 & 0x01
>       5
>       6     exponent = ibm >> 24 & 0x7f
>
> TypeError: unsupported operand type(s) for >>: 'numpy.float32' and 'int'
>
> What am I doing wrong?

I presume you need to read in the data from disk as an int32, which then gets processed to a float by Robert's code. The ufunc operates in the same way -- look at its signature.
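For instance, a minimal sketch of the whole round trip (the filename is made up, and the '>i4' dtype assumes the file is big-endian):

import numpy as np

raw = np.fromfile('data.ibm', dtype='>i4')   # raw 32-bit words, not floats
vals = ibm2ieee(raw.astype(np.int32))        # reuse the function above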
Zach

From fredmfp at gmail.com Wed Jan 14 17:44:17 2009 From: fredmfp at gmail.com (fred) Date: Wed, 14 Jan 2009 23:44:17 +0100 Subject: [SciPy-user] IBM float point format... In-Reply-To: <0DBD5C3D-532F-4430-B3B0-A34BB83C9317@yale.edu> References: <496DF799.2010406@gmail.com> <3d375d730901141330u6d28efefn1f2bfaf3991cf5d1@mail.gmail.com> <496E6454.1050201@gmail.com> <0DBD5C3D-532F-4430-B3B0-A34BB83C9317@yale.edu> Message-ID: <496E6AC1.7040801@gmail.com>

Zachary Pincus a écrit :

> I presume you need to read in the data from disk as an int32, which
> then gets processed to a float by Robert's code.

Sorry, I'm afraid I don't understand you here.

Do you mean that I have to read my data as int32 from my file which contains float32?

> The ufunc operates in the same way -- look at its signature.

Yes, but I have not looked at it, as I know nothing about ufuncs for now :-(

Cheers,

-- Fred

From david at ar.media.kyoto-u.ac.jp Wed Jan 14 17:32:49 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 15 Jan 2009 07:32:49 +0900 Subject: [SciPy-user] IBM float point format... In-Reply-To: <496E6AC1.7040801@gmail.com> References: <496DF799.2010406@gmail.com> <3d375d730901141330u6d28efefn1f2bfaf3991cf5d1@mail.gmail.com> <496E6454.1050201@gmail.com> <0DBD5C3D-532F-4430-B3B0-A34BB83C9317@yale.edu> <496E6AC1.7040801@gmail.com> Message-ID: <496E6811.7090000@ar.media.kyoto-u.ac.jp>

fred wrote:
> Zachary Pincus a écrit :
>> I presume you need to read in the data from disk as an int32, which
>> then gets processed to a float by Robert's code.
>
> Sorry, I'm afraid I don't understand you here.
>
> Do you mean that I have to read my data as int32 from my file which
> contains float32?

Yes. You can't read them as floating point, since your machine's representation and IBM's representation are fundamentally different. You want to import a type which is not supported by your CPU, so you have to bypass the type system completely. Reading the values as int32 means you treat your bytes as a raw set of bits, which is what Robert's code is doing,

David

From fredmfp at gmail.com Wed Jan 14 17:55:23 2009 From: fredmfp at gmail.com (fred) Date: Wed, 14 Jan 2009 23:55:23 +0100 Subject: [SciPy-user] IBM float point format... In-Reply-To: <496E6811.7090000@ar.media.kyoto-u.ac.jp> References: <496DF799.2010406@gmail.com> <3d375d730901141330u6d28efefn1f2bfaf3991cf5d1@mail.gmail.com> <496E6454.1050201@gmail.com> <0DBD5C3D-532F-4430-B3B0-A34BB83C9317@yale.edu> <496E6AC1.7040801@gmail.com> <496E6811.7090000@ar.media.kyoto-u.ac.jp> Message-ID: <496E6D5B.3090509@gmail.com>

David Cournapeau a écrit :
>
> Yes. You can't read them as floating point, since your machine's
> representation and IBM's representation are fundamentally different. You
> want to import a type which is not supported by your CPU, so you have to
> bypass the type system completely. Reading the values as int32 means you
> treat your bytes as a raw set of bits, which is what Robert's code is
> doing,

Thanks a lot to you all, I get it.

I have to convert my data from big to little endian too to get the right result.

Cheers,

-- Fred

From timmichelsen at gmx-topmail.de Wed Jan 14 19:24:03 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Thu, 15 Jan 2009 01:24:03 +0100 Subject: [SciPy-user] scikits.timeseries: tsfromtxt In-Reply-To: <9271C377-A3B2-4D03-9E16-CC21B439CFAF@gmail.com> References: <9271C377-A3B2-4D03-9E16-CC21B439CFAF@gmail.com> Message-ID:

> I will assume that you mean row...
> * First, your separator isn't a comma, but a semicolon. Use
> delimiter=";".
> * Second, your date is actually only in the first column, so you
> should use datecols=0.
> * Last, you don't need to define a converter for the dates in that
> case, as it should be recognized by the date parser. However, you
> should provide a freq argument, such as freq="H".

I tried it on a small random data set (see below). Here are the ipython script and output:

In [2]: import scikits.timeseries as ts

In [3]: series = ts.tsfromtxt('test_ts.csv', delimiter=';', freq='H', datecols=0, skiprows=1)
/usr/lib/python2.5/site-packages/numpy/ma/core.py:1383: UserWarning: MaskedArray.__setitem__ on fields: The mask is NOT affected!
  warnings.warn("MaskedArray.__setitem__ on fields: "\

In [4]: series
Out[4]:
timeseries([(10,) (1,) (13,) (7,) (17,) (1,) (4,) (15,) (11,) (15,) (15,) (6,)
 (1,) (16,) (3,) (19,) (11,) (16,) (12,) (8,) (11,) (19,) (15,) (10,) (6,)
 (0,) (14,) (6,) (12,) (1,) (13,) (12,) (2,) (12,) (16,) (18,) (9,) (5,)
 (19,) (5,) (14,) (14,) (18,) (1,) (14,) (20,) (13,) (11,)],
           dtype = [('f1', '<i8')], ...

From timmichelsen at gmx-topmail.de (Tim Michelsen) Subject: Re: [SciPy-user] predicting values based on (linear) models In-Reply-To: <1cd32cbb0901131724i1c03233etecb0bdbf06debeda@mail.gmail.com> References: <1cd32cbb0901131724i1c03233etecb0bdbf06debeda@mail.gmail.com> Message-ID:

Hello Josef,
thanks for your extensive answer. I really appreciate it and will see how I can use it.

> olsexample.py in the attachment is from the cookbook; I'm slowly reworking it.
> Fancier models will be in scipy.stats.models when they are ready for inclusion.

Do you have scipy.stats.models in a SVN repository somewhere?

> I'm using rpy (version 1) to check scipy.stats functions, and for sure
> the available methods in R are very extensive, while coverage of
> statistics and econometrics in python packages, including scipy, is
> spotty: some good spots and many missing pieces.

As you are checking against R with rpy, do you think that the R functions are more accurate? Do you see benefit in re-programming the stats functions in scipy?

Thanks and regards,
Timmie

From filipwasilewski at gmail.com Wed Jan 14 19:50:44 2009 From: filipwasilewski at gmail.com (Filip Wasilewski) Date: Thu, 15 Jan 2009 01:50:44 +0100 Subject: [SciPy-user] multidimensional wavelet packages In-Reply-To: <496556EC.7020401@ucsf.edu> References: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com> <9D202D4E86A4BF47BA6943ABDF21BE78058FAB12@EXVS06.net.ucsf.edu> <9457e7c80901071339wac260ep5d30d8e20dad2bff@mail.gmail.com> <49651B6F.6080001@ucsf.edu> <496556EC.7020401@ucsf.edu> Message-ID:

Hi Karl,

On Thu, Jan 8, 2009 at 02:29, Karl Young wrote:
>
> Hi Filip,
>
> Thanks much (and thanks for the original package); I will go through the
> code and let you know if I come up with anything that would be worth
> incorporating (or let you know that your suggested addition works fine
> and should be added as is).
>
>> Hi Karl,
>>
>> On Wed, Jan 7, 2009 at 22:15, Karl Young wrote:
>>
>>> Hi Stefan,
>>>
>>> Thanks; I'd looked a little at PyWavelets and figured that what you
>>> suggest might be what I ended up hacking but thought maybe some
>>> enterprising neuroimager (or other person working with 3D, 4D data)
>>> might have already done so :-)

Guess what. I had forgotten that some time ago I already implemented the proper `downcoef` routine in the PyWavelets svn version.
Below is an updated recipe for n-dimensional 1-level dwt:

#!/usr/bin/env python
# Author: Filip Wasilewski
# Licence: Public Domain

import numpy
import pywt

def downcoef(data, wavelet, mode, type):
    """Adapts pywt.downcoef call for numpy.apply_along_axis"""
    return pywt.downcoef(type, data, wavelet, mode, level=1)

def dwt_n(data, wavelet, mode='sym'):
    """N-dimensional Discrete Wavelet Transform"""
    data = numpy.asarray(data)
    dim = len(data.shape)
    coeffs = [('', data)]
    for axis in range(dim):
        new_coeffs = []
        for subband, x in coeffs:
            new_coeffs.extend([
                (subband+'L', numpy.apply_along_axis(downcoef, axis, x, wavelet, mode, 'a')),
                (subband+'H', numpy.apply_along_axis(downcoef, axis, x, wavelet, mode, 'd'))
            ])
        coeffs = new_coeffs
    return dict(coeffs)

if __name__ == '__main__':
    import pprint
    data = numpy.ones((4, 4, 4, 4))  # 4D array
    result = dwt_n(data, 'db1')
    pprint.pprint(result)

Filip Wasilewski

-- http://www.linkedin.com/in/filipwasilewski

From pgmdevlist at gmail.com Wed Jan 14 20:18:54 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 14 Jan 2009 20:18:54 -0500 Subject: [SciPy-user] scikits.timeseries: tsfromtxt In-Reply-To: References: <9271C377-A3B2-4D03-9E16-CC21B439CFAF@gmail.com> Message-ID:

Tim,
It works on my machine: you do end up with a series with a structured dtype [('f1',int)], and the date is correctly processed...

>>> import StringIO
>>> data = """datetime;test
... 01.01.07 00:00; 10
... 01.01.07 01:00; 15 """
>>> ts.tsfromtxt(StringIO.StringIO(data), delimiter=';', skiprows=1, datecols=0, freq='H')
timeseries([(10,) (15,)],
           dtype = [('f1', '<i8')], ...

From josef.pktd at gmail.com (josef.pktd at gmail.com) Subject: Re: [SciPy-user] predicting values based on (linear) models In-Reply-To: <1cd32cbb0901131724i1c03233etecb0bdbf06debeda@mail.gmail.com> References: <1cd32cbb0901131724i1c03233etecb0bdbf06debeda@mail.gmail.com> Message-ID: <1cd32cbb0901141915o75b28401n1f2dd5eea54180e1@mail.gmail.com>

On Wed, Jan 14, 2009 at 7:37 PM, Tim Michelsen wrote:
> Hello Josef,
> thanks for your extensive answer.
> I really appreciate it and will see how I can use it.
>
>> olsexample.py in the attachment is from the cookbook; I'm slowly reworking it.
>> Fancier models will be in scipy.stats.models when they are ready for inclusion.
> Do you have scipy.stats.models in a SVN repository somewhere?

The main current location is at
http://bazaar.launchpad.net/~nipy-developers/nipy/trunk/files/head%3A/neuroimaging/fixes/scipy/stats/

I made a few changes to stats.models, so that all existing tests pass, at
http://bazaar.launchpad.net/~josef-pktd/%2Bjunk/nipy_stats_models2/files/head%3A/neuroimaging/fixes/scipy/stats/models/

>> I'm using rpy (version 1) to check scipy.stats functions, and for sure
>> the available methods in R are very extensive, while coverage of
>> statistics and econometrics in python packages, including scipy, is
>> spotty: some good spots and many missing pieces.
> As you are checking against R with rpy, do you think that the R
> functions are more accurate?

The functions in stats that I tested or rewrote are usually identical to around 1e-15, but in some cases R has a more accurate test distribution for small samples (option "exact" in R), while in scipy.stats we only have the asymptotic distribution. Also, not all existing functions in scipy.stats are tested (yet).

> Do you see benefit in re-programming the stats functions in scipy?

(Since R and its packages are GPL, we cannot copy from them directly, but I was looking at R and matlab for the interface/signature of statistical functions.)

I would like to see many of the basic statistics functions included in scipy (or in an addon, or initially as cookbook recipes).
Much of the basic supporting tools for statistics, like optimize, linalg, distributions, special and signal, are available, but it is a pain to figure out each time how to use them; for example, how to get the error and covariance estimates for linear or non-linear regression. There are many good specialized packages for python available, for example for machine learning or MCMC, but no complete collection of basic statistical functionality.

But, my impression is that, since scipy is mostly developer driven (?), what finally ends up in scipy depends on the needs of the developers, and their willingness to share the code and to incorporate user feedback.

Josef

From pgmdevlist at gmail.com Wed Jan 14 23:24:36 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 14 Jan 2009 23:24:36 -0500 Subject: [SciPy-user] predicting values based on (linear) models In-Reply-To: <1cd32cbb0901141915o75b28401n1f2dd5eea54180e1@mail.gmail.com> References: <1cd32cbb0901131724i1c03233etecb0bdbf06debeda@mail.gmail.com> <1cd32cbb0901141915o75b28401n1f2dd5eea54180e1@mail.gmail.com> Message-ID: <4B002A9D-98DB-4D50-B0B0-B506504EDBD4@gmail.com>

On Jan 14, 2009, at 10:15 PM, josef.pktd at gmail.com wrote:
> The functions in stats that I tested or rewrote are usually identical
> to around 1e-15, but in some cases R has a more accurate test
> distribution for small samples (option "exact" in R), while in
> scipy.stats we only have the asymptotic distribution.

We could try to reimplement part of it in C. In any case, it might be worth outputting a warning (or at least being very explicit in the doc) that the results may not hold for samples smaller than 10-20.

> Also, not all
> existing functions in scipy.stats are tested (yet).

We should also try to make sure missing data are properly supported (not always possible) and that the results are consistent between the masked and non-masked versions.

>> Do you see benefit in re-programming the stats functions in scipy?
>
> (Since R and its packages are GPL, we cannot copy from them directly, but
> I was looking at R and matlab for the interface/signature of
> statistical functions.)

There's one obvious advantage (on top of the pedagogical exercise): that's one dependency less.

> I would like to see many of the basic statistics functions included in
> scipy (or in an addon, or initially as cookbook recipes). Much of the
> basic supporting tools for statistics, like optimize, linalg,
> distributions, special and signal, are available, but it is a pain to
> figure out each time how to use them; for example, how to get the error
> and covariance estimates for linear or non-linear regression.

Very true, but it's also what attracted me to numpy/scipy in the first place: the functions I needed were at the time non-existent, and I was reluctant to rely on other software which, albeit more powerful, hid how values were actually calculated (what assumptions were made, what the validity domains were...). It was nice to have some time at hand.

> But, my impression is that, since scipy is mostly developer driven
> (?), what finally ends up in scipy depends on the needs of the
> developers, and their willingness to share the code and to incorporate
> user feedback.

IMHO, the readiness to incorporate user feedback is here. The feedback is not, or at least not as much as we'd like.
From josef.pktd at gmail.com Thu Jan 15 00:50:56 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 15 Jan 2009 00:50:56 -0500 Subject: [SciPy-user] predicting values based on (linear) models In-Reply-To: <4B002A9D-98DB-4D50-B0B0-B506504EDBD4@gmail.com> References: <1cd32cbb0901131724i1c03233etecb0bdbf06debeda@mail.gmail.com> <1cd32cbb0901141915o75b28401n1f2dd5eea54180e1@mail.gmail.com> <4B002A9D-98DB-4D50-B0B0-B506504EDBD4@gmail.com> Message-ID: <1cd32cbb0901142150l407ef557w94b46d8c1309ce25@mail.gmail.com>

On Wed, Jan 14, 2009 at 11:24 PM, Pierre GM wrote:
> On Jan 14, 2009, at 10:15 PM, josef.pktd at gmail.com wrote:
>> The functions in stats that I tested or rewrote are usually identical
>> to around 1e-15, but in some cases R has a more accurate test
>> distribution for small samples (option "exact" in R), while in
>> scipy.stats we only have the asymptotic distribution.
>
> We could try to reimplement part of it in C. In any case, it might
> be worth outputting a warning (or at least being very explicit in the doc)
> that the results may not hold for samples smaller than 10-20.

I am not a "C" person and I never went much beyond HelloWorld in C. I just checked some of the doc strings, and I usually mention that we use the asymptotic distribution, but there are still pretty vague statements in some of the doc strings, such as

"The p-values are not entirely reliable but are probably reasonable for datasets larger than 500 or so."

>> Also, not all
>> existing functions in scipy.stats are tested (yet).
>
> We should also try to make sure missing data are properly supported
> (not always possible) and that the results are consistent between the
> masked and non-masked versions.

I added a ticket so we don't forget to check this.

> IMHO, the readiness to incorporate user feedback is here. The feedback
> is not, or at least not as much as we'd like.

That depends on the subpackage; some problems in stats have been reported and known for quite some time, and the expected lifetime of a ticket can be pretty long. I was looking at different python packages that use statistics, and many of them are reluctant to use scipy, while numpy looks very well established. But I suppose this will improve with time, and the user base will increase, especially with the recent improvements in the build/distribution and the documentation.

Josef

From ndbecker2 at gmail.com Thu Jan 15 07:42:48 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 15 Jan 2009 07:42:48 -0500 Subject: [SciPy-user] interpolation/extrapolation Message-ID:

I am interested in using interpolate.interp1d(x, y, kind='linear'), but instead of throwing an exception (or using a fill value) for out-of-bounds, I would like extrapolation. Anything in scipy useful here?

From christopher.paul.taylor at gmail.com Thu Jan 15 09:02:54 2009 From: christopher.paul.taylor at gmail.com (christopher taylor) Date: Thu, 15 Jan 2009 09:02:54 -0500 Subject: [SciPy-user] import of scipy.sparse.linalg is breaking Message-ID:

Hello!

I have built myself a copy of scipy 0.7.0 and have tried to import the sparse.linalg module. I continue to get this error:

Python 2.5.1 (r251:54863, Nov 25 2008, 17:51:08)
[GCC 4.1.2 20071124 (Red Hat 4.1.2-42)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import scipy
>>> import scipy.sparse.linalg
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/scipy/sparse/linalg/__init__.py", line 5, in <module>
    from isolve import *
  File "/scipy/sparse/linalg/isolve/__init__.py", line 4, in <module>
    from iterative import *
  File "/scipy/sparse/linalg/isolve/iterative.py", line 5, in <module>
    import _iterative
ImportError: /scipy/sparse/linalg/isolve/_iterative.so: undefined symbol: slamch_
>>>

Any tips on how to resolve the undefined slamch_ symbol?

ct

From cournape at gmail.com Thu Jan 15 09:34:09 2009 From: cournape at gmail.com (David Cournapeau) Date: Thu, 15 Jan 2009 23:34:09 +0900 Subject: [SciPy-user] import of scipy.sparse.linalg is breaking In-Reply-To: References: Message-ID: <5b8d13220901150634i4427dbd6w4fc7a40a04f3ac47@mail.gmail.com>

On Thu, Jan 15, 2009 at 11:02 PM, christopher taylor wrote:
> Hello!
>
> I have built myself a copy of scipy 0.7.0 and have tried to import the
> sparse.linalg module. I continue to get this error:
> [...]
> ImportError: /scipy/sparse/linalg/isolve/_iterative.so:
> undefined symbol: slamch_

Which fortran compiler did you use to build blas/lapack, and which one did you use for numpy and scipy? Could you give us the output of ldd _iterative.so?

David

From bsouthey at gmail.com Thu Jan 15 10:09:39 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Thu, 15 Jan 2009 09:09:39 -0600 Subject: [SciPy-user] predicting values based on (linear) models In-Reply-To: <1cd32cbb0901142150l407ef557w94b46d8c1309ce25@mail.gmail.com> References: <1cd32cbb0901131724i1c03233etecb0bdbf06debeda@mail.gmail.com> <1cd32cbb0901141915o75b28401n1f2dd5eea54180e1@mail.gmail.com> <4B002A9D-98DB-4D50-B0B0-B506504EDBD4@gmail.com> <1cd32cbb0901142150l407ef557w94b46d8c1309ce25@mail.gmail.com> Message-ID: <496F51B3.9060400@gmail.com>

josef.pktd at gmail.com wrote:
> On Wed, Jan 14, 2009 at 11:24 PM, Pierre GM wrote:
>> On Jan 14, 2009, at 10:15 PM, josef.pktd at gmail.com wrote:
>>> The functions in stats that I tested or rewrote are usually identical
>>> to around 1e-15, but in some cases R has a more accurate test
>>> distribution for small samples (option "exact" in R), while in
>>> scipy.stats we only have the asymptotic distribution.
>>
>> We could try to reimplement part of it in C. In any case, it might
>> be worth outputting a warning (or at least being very explicit in the doc)
>> that the results may not hold for samples smaller than 10-20.
>
> I am not a "C" person and I never went much beyond HelloWorld in C.
> I just checked some of the doc strings, and I usually mention that
> we use the asymptotic distribution, but there are still pretty vague
> statements in some of the doc strings, such as
>
> "The p-values are not entirely reliable but are probably reasonable for
> datasets larger than 500 or so."

The 'exact' tests are usually Fisher's exact tests (http://en.wikipedia.org/wiki/Fisher%27s_exact_test), which are very different from asymptotic testing and can get very demanding.
Also, I do not think that such statements should be part of the doc strings.

>>> Also, not all
>>> existing functions in scipy.stats are tested (yet).
>>
>> We should also try to make sure missing data are properly supported
>> (not always possible) and that the results are consistent between the
>> masked and non-masked versions.
>
> I added a ticket so we don't forget to check this.
>
>>> IMHO, the readiness to incorporate user feedback is here. The feedback
>>> is not, or at least not as much as we'd like.
>>
>> That depends on the subpackage; some problems in stats have been
>> reported and known for quite some time, and the expected lifetime of a
>> ticket can be pretty long. I was looking at different python packages
>> that use statistics, and many of them are reluctant to use scipy, while
>> numpy looks very well established. But I suppose this will improve
>> with time, and the user base will increase, especially with the recent
>> improvements in the build/distribution and the documentation.
>>
>> Josef

There are different reasons for a lack of user base. One of the reasons for R is that many, many statistics classes use it.

Some of the reasons that I do not use scipy for stats (and have not looked at this in some time) included:
1) The difficulty of installation, which is considerably better now.
2) Lack of support for missing values, as virtually everything that I have worked with involves missing values at some stage.
3) Lack of a suitable statistical modeling interface where you can specify the model to be fit without having to create each individual array. The approach must work for a range of scenarios.

Bruce

From josef.pktd at gmail.com Thu Jan 15 10:11:49 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 15 Jan 2009 10:11:49 -0500 Subject: [SciPy-user] interpolation/extrapolation In-Reply-To: References: Message-ID: <1cd32cbb0901150711m40cbfe3bsd598db36d81dd318@mail.gmail.com>

On Thu, Jan 15, 2009 at 7:42 AM, Neal Becker wrote:
> I am interested in using interpolate.interp1d(x, y, kind='linear'), but
> instead of throwing an exception (or using a fill value) for out-of-bounds, I
> would like extrapolation. Anything in scipy useful here?

As far as I understand, interpolate.interp1d needs two points to interpolate in between, so you would need to tell it where you want it to go outside of the range of existing points, e.g. you could create artificial points outside of the range.

But in general this kind of extrapolation is a typical case for regression: in the 1D case stats.linregress should do it; if x is multivariate, then using e.g. OLS would be necessary. If your data doesn't look linear overall, you could just use a few points close to the boundary to estimate a local linear fit and extrapolate from there. If you want a connected line, then use a predicted value from the regression as an artificial point for interp1d.

Since there are many possible ways of extrapolating, it depends on the purpose and the shape of the x,y data.
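A minimal sketch of the linear variant (the helper name is made up, and x is assumed sorted ascending):

import numpy as np
from scipy import interpolate, stats

def interp_extrap1d(x, y, xnew):
    # linear interpolation inside the data range, NaN outside
    f = interpolate.interp1d(x, y, kind='linear', bounds_error=False)
    ynew = f(xnew)
    # replace the out-of-range points with values from a straight-line fit
    slope, intercept, r, p, stderr = stats.linregress(x, y)
    xnew = np.asarray(xnew, dtype=float)
    outside = (xnew < x[0]) | (xnew > x[-1])
    ynew[outside] = intercept + slope * xnew[outside]
    return ynew

This fits the line through all the data; for data that is only locally linear, fitting just a few points near each boundary would give the local variant.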
Josef

From christopher.paul.taylor at gmail.com Thu Jan 15 10:23:49 2009 From: christopher.paul.taylor at gmail.com (christopher taylor) Date: Thu, 15 Jan 2009 10:23:49 -0500 Subject: [SciPy-user] import of scipy.sparse.linalg is breaking In-Reply-To: <5b8d13220901150634i4427dbd6w4fc7a40a04f3ac47@mail.gmail.com> References: <5b8d13220901150634i4427dbd6w4fc7a40a04f3ac47@mail.gmail.com> Message-ID:

Here is the output from ldd:

ldd _iterative.so
    libg2c.so.0 => /usr/lib64/libg2c.so.0 (0x00002b8b6f3b5000)
    libm.so.6 => /lib64/libm.so.6 (0x00002b8b6f5d6000)
    libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00002b8b6f859000)
    libc.so.6 => /lib64/libc.so.6 (0x00002b8b6fa68000)
    /lib64/ld-linux-x86-64.so.2 (0x0000003414800000)

I'm seeing this when I run the scipy build script (setup.py):

customize GnuFCompiler
Found executable /usr/bin/g77
gnu: no Fortran 90 compiler found

When the scipy and numpy setup.py scripts recognize ATLAS, they print something like this out:

ATLAS version 3.8.2
INSTFLG  : -1 0 -a 1
ARCHDEFS : -DATL_OS_Linux -DATL_ARCH_UNKNOWNx86 -DATL_CPUMHZ=3192 -DATL_SSE3 -DATL_SSE2 -DATL_SSE1 -DATL_USE64BITS -DATL_GAS_x8664
F2CDEFS  : -DAdd_ -DF77_INTEGER=int -DStringSunStyle
CACHEEDGE: 163840
F77      : gfortran, version GNU Fortran (GCC) 4.1.2 20071124 (Red Hat 4.1.2-42)
F77FLAGS : -O -fPIC -m64
SMC      : gcc, version gcc (GCC) 4.1.2 20071124 (Red Hat 4.1.2-42)
SMCFLAGS : -O -fomit-frame-pointer -fPIC -m64
SKC      : gcc, version gcc (GCC) 4.1.2 20071124 (Red Hat 4.1.2-42)
SKCFLAGS : -O -fomit-frame-pointer -fPIC -m64

It also finds these libraries:

FOUND:
  libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas']

Thanks so much for your help!

ct

On Thu, Jan 15, 2009 at 9:34 AM, David Cournapeau wrote:
> On Thu, Jan 15, 2009 at 11:02 PM, christopher taylor wrote:
>> I have built myself a copy of scipy 0.7.0 and have tried to import the
>> sparse.linalg module. I continue to get this error:
>> [...]
>
> Which fortran compiler did you use to build blas/lapack, and which one did
> you use for numpy and scipy? Could you give us the output of ldd
> _iterative.so?
>
> David

From christopher.paul.taylor at gmail.com Thu Jan 15 10:29:38 2009 From: christopher.paul.taylor at gmail.com (christopher taylor) Date: Thu, 15 Jan 2009 10:29:38 -0500 Subject: [SciPy-user] import of scipy.sparse.linalg is breaking In-Reply-To: References: <5b8d13220901150634i4427dbd6w4fc7a40a04f3ac47@mail.gmail.com> Message-ID:

I think this is the problem.
lapack wants to use gfortran, which --version tells me is:

GNU Fortran (GCC) 4.1.2 20071124 (Red Hat 4.1.2-42)

whereas g77 --version identifies itself as:

GNU Fortran (GCC) 3.4.6 20060404 (Red Hat 3.4.6-4)

So there you have it, ladies and gentlemen. I think that's a great starting point, though I would hope GNU would allow for compatibility between GNU Fortran versions... :-(

ct

On Thu, Jan 15, 2009 at 10:23 AM, christopher taylor wrote:
> Here is the output from ldd:
>
> ldd _iterative.so
>     libg2c.so.0 => /usr/lib64/libg2c.so.0 (0x00002b8b6f3b5000)
>     [...]
>
> It also finds these libraries:
>
> FOUND:
>   libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas']
>
> Thanks so much for your help!
>
> ct

From ndbecker2 at gmail.com Thu Jan 15 10:39:50 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 15 Jan 2009 10:39:50 -0500 Subject: [SciPy-user] recursion limit in plot Message-ID:

What's wrong here?
This code snippet:

from pylab import plot, show
print Id
print pout

plot(Id, pout)
show()

produces:

['50', '100', '150', '200', '250', '300', '350', '400', '450', '500', '550', '600', '650', '700', '750', '800', '850', '900', '950', '1000', '1050']
['0', '7.4', '11.4', '14.2', '16.3', '18.1', '19.3', '20.6', '21.6', '22.6', '23.4', '24.1', '24.9', '25.4', '26.1', '26.5', '26.9', '27.1', '27.3', '27.4', '27.4']

Traceback (most recent call last):
  File "./read_current_drive.py", line 26, in <module>
    plot(Id, pout)
  File "/usr/lib/python2.5/site-packages/matplotlib-0.98.5.1-py2.5-linux-x86_64.egg/matplotlib/pyplot.py", line 2096, in plot
    ret = gca().plot(*args, **kwargs)
  File "/usr/lib/python2.5/site-packages/matplotlib-0.98.5.1-py2.5-linux-x86_64.egg/matplotlib/axes.py", line 3277, in plot
    for line in self._get_lines(*args, **kwargs):
  File "/usr/lib/python2.5/site-packages/matplotlib-0.98.5.1-py2.5-linux-x86_64.egg/matplotlib/axes.py", line 394, in _grab_next_args
    for seg in self._plot_2_args(remaining, **kwargs):
  File "/usr/lib/python2.5/site-packages/matplotlib-0.98.5.1-py2.5-linux-x86_64.egg/matplotlib/axes.py", line 298, in _plot_2_args
    x, y, multicol = self._xy_from_xy(x, y)
  File "/usr/lib/python2.5/site-packages/matplotlib-0.98.5.1-py2.5-linux-x86_64.egg/matplotlib/axes.py", line 214, in _xy_from_xy
    bx = self.axes.xaxis.update_units(x)
  File "/usr/lib/python2.5/site-packages/matplotlib-0.98.5.1-py2.5-linux-x86_64.egg/matplotlib/axis.py", line 939, in update_units
    converter = munits.registry.get_converter(data)
  File "/usr/lib/python2.5/site-packages/matplotlib-0.98.5.1-py2.5-linux-x86_64.egg/matplotlib/units.py", line 137, in get_converter
    converter = self.get_converter( thisx )
  File "/usr/lib/python2.5/site-packages/matplotlib-0.98.5.1-py2.5-linux-x86_64.egg/matplotlib/units.py", line 137, in get_converter
[...]
recursion limit reached

From jdh2358 at gmail.com Thu Jan 15 10:48:33 2009 From: jdh2358 at gmail.com (John Hunter) Date: Thu, 15 Jan 2009 09:48:33 -0600 Subject: [SciPy-user] recursion limit in plot In-Reply-To: References: Message-ID: <88e473830901150748h301b91bdnccbe399fea17d81b@mail.gmail.com>

On Thu, Jan 15, 2009 at 9:39 AM, Neal Becker wrote:
> What's wrong here?
> This code snippet:
>
> from pylab import plot, show
> print Id
> print pout
>
> plot(Id, pout)
> show()
>
> produces:
> ['50', '100', '150', '200', ...]
> ['0', '7.4', '11.4', '14.2', ...]

You are passing lists of strings in -- convert them to floats first, e.g.

Id = np.array(Id, np.float)
pout = np.array(pout, np.float)

Plotting questions using matplotlib should be directed to matplotlib-users:
http://lists.sourceforge.net/mailman/listinfo/matplotlib-users

JDH

From rmay31 at gmail.com Thu Jan 15 10:51:47 2009 From: rmay31 at gmail.com (Ryan May) Date: Thu, 15 Jan 2009 09:51:47 -0600 Subject: [SciPy-user] recursion limit in plot In-Reply-To: References: Message-ID: <496F5B93.3030302@gmail.com>

Neal Becker wrote:
> What's wrong here?
> This code snippet:
>
> from pylab import plot, show
> print Id
> print pout
>
> plot (Id, pout)
> show()
>
> produces:
> ['50', '100', '150', '200', '250', '300', '350', '400', '450', '500', '550',
> '600', '650', '700', '750', '800', '850', '900', '950', '1000', '1050']
> ['0', '7.4', '11.4', '14.2', '16.3', '18.1', '19.3', '20.6', '21.6', '22.6',
> '23.4', '24.1', '24.9', '25.4', '26.1', '26.5', '26.9', '27.1', '27.3',
> '27.4', '27.4']

The problem here is that you're trying to plot lists of strings instead of
lists of numbers. You need to convert all of these values to numbers.
However, matplotlib could behave a bit more nicely in this case rather
than simply recursing until it hits the limit.

Ryan

-- 
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma

From pgmdevlist at gmail.com Thu Jan 15 11:35:16 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Thu, 15 Jan 2009 11:35:16 -0500
Subject: [SciPy-user] predicting values based on (linear) models
In-Reply-To: <496F51B3.9060400@gmail.com>
References: <1cd32cbb0901131724i1c03233etecb0bdbf06debeda@mail.gmail.com>
	<1cd32cbb0901141915o75b28401n1f2dd5eea54180e1@mail.gmail.com>
	<4B002A9D-98DB-4D50-B0B0-B506504EDBD4@gmail.com>
	<1cd32cbb0901142150l407ef557w94b46d8c1309ce25@mail.gmail.com>
	<496F51B3.9060400@gmail.com>
Message-ID: <4467307B-1603-4AB4-9C29-F24467B6D6CB@gmail.com>

On Jan 15, 2009, at 10:09 AM, Bruce Southey wrote:
>>
> Some of the reasons that I do not use scipy for stats (and have not
> looked at this in some time) included:
> 1) The difficulty of installation which is considerably better now.
> 2) Lack of support for missing values as virtually everything that I
> have worked with involves missing values at some stage.

Can you give us examples of your needs? That way we could improve
scipy.stats.mstats

>
> 3) Lack of a suitable statistical modeling interface where you can
> specify the model to be fit without having to create each individual
> array. The approach must work for a range of scenarios.

Here again, a short example would help.
Thx a lot in advance,
P.

From josef.pktd at gmail.com Thu Jan 15 11:56:07 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 15 Jan 2009 11:56:07 -0500
Subject: [SciPy-user] predicting values based on (linear) models
In-Reply-To: <496F51B3.9060400@gmail.com>
References: <1cd32cbb0901131724i1c03233etecb0bdbf06debeda@mail.gmail.com>
	<1cd32cbb0901141915o75b28401n1f2dd5eea54180e1@mail.gmail.com>
	<4B002A9D-98DB-4D50-B0B0-B506504EDBD4@gmail.com>
	<1cd32cbb0901142150l407ef557w94b46d8c1309ce25@mail.gmail.com>
	<496F51B3.9060400@gmail.com>
Message-ID: <1cd32cbb0901150856p4fb70ea1ia888c9cb1efc991f@mail.gmail.com>

On Thu, Jan 15, 2009 at 10:09 AM, Bruce Southey wrote:
> josef.pktd at gmail.com wrote:
>> On Wed, Jan 14, 2009 at 11:24 PM, Pierre GM wrote:
>>
>>> On Jan 14, 2009, at 10:15 PM, josef.pktd at gmail.com wrote:
>>>
>>>> The function in stats, that I tested or rewrote, are usually identical
>>>> to around 1e-15, but in some cases R has a more accurate test
>>>> distribution for small samples (option "exact" in R), while in
>>>> scipy.stats we only have the asymptotic distribution.
>>>>
>>> We could try to reimplement part of it in C. In any case, it might
>>> be worth to output a warning (or at least be very explicit in the doc)
>>> that the results may not hold for samples smaller than 10-20.
>>>
>>
>> I am not a "C" person and I never went much beyond HelloWorld in C.
>> I just checked some of the doc strings, and I usually mention that
>> we use the asymptotic distribution, but there are still pretty vague
>> statements in some of the doc strings, such as
>>
>> "The p-values are not entirely reliable but are probably reasonable for
>> datasets larger than 500 or so."
>>
>>
>>
> The 'exact' tests are usually Fisher's exact tests
> (http://en.wikipedia.org/wiki/Fisher%27s_exact_test) which are very
> different from the asymptotic testing and can get very demanding. Also I
> do not think that such statements should be part of the doc strings.

According to the wikipedia reference this is for contingency tables; the
two cases I worked on were the exact two-sided Kolmogorov-Smirnov
distribution, where I found a good approximation, and the exact
distribution for the Spearman correlation coefficient for the Null of no
correlation.

>
>>>> Also, not all
>>>> existing functions in scipy.stats are tested (yet).
>>>>
>>> We should also try to make sure missing data are properly supported
>>> (not always possible) and that the results are consistent between the
>>> masked and non-masked versions.
>>>
>>>
>>
>> I added a ticket so we don't forget to check this.
>>
>>
>>
>>
>>> IMHO, the readiness to incorporate user feedback is here. The feedback
>>> is not, or at least not as much as we'd like.
>>>
>>
>> That depends on the subpackage, some problems in stats have been
>> reported and known for quite some time and the expected lifetime of a
>> ticket can be pretty long. I was looking at different python packages
>> that use statistics, and many of them are reluctant to use scipy while
>> numpy looks very well established. But, I suppose this will improve
>> with time and the user base will increase, especially with the recent
>> improvements in the build/distribution and the documentation.
>>
>> Josef
>> _______________________________________________
>> SciPy-user mailing list
>> SciPy-user at scipy.org
>> http://projects.scipy.org/mailman/listinfo/scipy-user
>>
> There are different reasons for a lack of user base. One of the reasons
> for R is that many, many statistics classes use it.
>
> Some of the reasons that I do not use scipy for stats (and have not
> looked at this in some time) included:
> 1) The difficulty of installation which is considerably better now.
> 2) Lack of support for missing values as virtually everything that I
> have worked with involves missing values at some stage.
> 3) Lack of a suitable statistical modeling interface where you can
> specify the model to be fit without having to create each individual
> array. The approach must work for a range of scenarios.
>

With 2 and 3 I have little experience.
Missing observations I usually remove or clean in the initial data
preparation. mstats provides functions for masked arrays, but stats
mostly assumes no missing values. What would be the generic treatment
for missing observations: just dropping all observations that have
NaNs, or converting them to masked arrays and expanding the functions
that can handle those?

Jonathan Taylor included a formula framework in stats.models similar
to R, but I haven't looked very closely at it. I haven't learned much
of R's syntax and I usually prefer to build my own arrays (with some
exceptions such as polynomials) rather than hide them behind a mini
model language.
For both stats.models and for the interface for general stats
functions, feedback would be very appreciated.
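For illustration, the build-your-own-design-matrix approach described
above might look like the following minimal sketch; the data and the
variable names (x1, x2) are invented for the example, and np.linalg.lstsq
stands in for whichever solver is actually used:

import numpy as np

# hypothetical data: y depends linearly on x1 and x2
rng = np.random.RandomState(0)
x1, x2 = rng.randn(100), rng.randn(100)
y = 5 + 1.0 * x1 + 2.0 * x2 + 0.5 * rng.randn(100)

# the formula lm(Y ~ x1 + x2) becomes an explicit design matrix
X = np.column_stack((np.ones_like(x1), x1, x2))
b, resid, rank, sv = np.linalg.lstsq(X, y)
# b holds the estimates of (intercept, coefficient of x1, coefficient of x2)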
Josef From pgmdevlist at gmail.com Thu Jan 15 12:19:16 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 15 Jan 2009 12:19:16 -0500 Subject: [SciPy-user] predicting values based on (linear) models In-Reply-To: <1cd32cbb0901150856p4fb70ea1ia888c9cb1efc991f@mail.gmail.com> References: <1cd32cbb0901131724i1c03233etecb0bdbf06debeda@mail.gmail.com> <1cd32cbb0901141915o75b28401n1f2dd5eea54180e1@mail.gmail.com> <4B002A9D-98DB-4D50-B0B0-B506504EDBD4@gmail.com> <1cd32cbb0901142150l407ef557w94b46d8c1309ce25@mail.gmail.com> <496F51B3.9060400@gmail.com> <1cd32cbb0901150856p4fb70ea1ia888c9cb1efc991f@mail.gmail.com> Message-ID: <0E60E206-26E7-4CA8-8C9C-EBD6490549EA@gmail.com> > > With 2 and 3 I have little experience > Missing observations, I usually remove or clean in the initial data > preparation. mstats provides functions for masked arrays, but stats > mostly assumes no missing values. What would be the generic treatment > for missing observations, just dropping all observations that have > NaNs or converting them to masked arrays and expand the function that > can handle those? > That depends on the situation. For linear fitting, missing values could be dropped (using the MaskedArray.compressed method if the data is 1D, or by using something like a[~np.isnan(a)]). In other cases, the missing values have to be taken into account. From bsouthey at gmail.com Thu Jan 15 12:36:43 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Thu, 15 Jan 2009 11:36:43 -0600 Subject: [SciPy-user] predicting values based on (linear) models In-Reply-To: <1cd32cbb0901150856p4fb70ea1ia888c9cb1efc991f@mail.gmail.com> References: <1cd32cbb0901131724i1c03233etecb0bdbf06debeda@mail.gmail.com> <1cd32cbb0901141915o75b28401n1f2dd5eea54180e1@mail.gmail.com> <4B002A9D-98DB-4D50-B0B0-B506504EDBD4@gmail.com> <1cd32cbb0901142150l407ef557w94b46d8c1309ce25@mail.gmail.com> <496F51B3.9060400@gmail.com> <1cd32cbb0901150856p4fb70ea1ia888c9cb1efc991f@mail.gmail.com> Message-ID: <496F742B.7040202@gmail.com> josef.pktd at gmail.com wrote: >> There are different reasons for a lack of user base. One of the reasons >> for R is that many, many statistics classes use it. >> >> Some of the reasons that I do not use scipy for stats (and have not >> looked at this in some time) included: >> 1) The difficulty of installation which is considerably better now. >> 2) Lack of support for missing values as virtually everything that I >> have worked with involves missing values at some stage. >> 3) Lack of an suitable statistical modeling interface where you can >> specify the model to be fit without having to create each individual >> array. The approach must work for a range of scenarios. >> >> > > With 2 and 3 I have little experience > Missing observations, I usually remove or clean in the initial data > preparation. mstats provides functions for masked arrays, but stats > mostly assumes no missing values. What would be the generic treatment > for missing observations, just dropping all observations that have > NaNs or converting them to masked arrays and expand the function that > can handle those? > No! We have had considerable discussion on this aspect in the past on the numpy/scipy lists. Basically a missing observation should not be treated as an NaNs (and there are different types of NaNs) because they are not the same. 
In some cases, missing values disappear in the calculations such as creating the X'X matrix etc but you probably do not want that if you have real NaNs in your data (say after taking square root of an array that includes negative numbers). > Jonathan Taylor included a formula framework in stats.models similar > to R, but I haven't looked very closely at it. I haven't learned much > of R's syntax and I usually prefer to build by own arrays (with some > exceptions such as polynomials) than hide them behind a mini model > language. > For both stats.models and for the interface for general stats > functions, feedback would be very appreciated. > > Josef > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > If you look at R's lm function you can see that you can fit a model using a formula. Without a similar framework, you can not do useful stats. Also you must have a 'mini model language' because the inputs must be created correctly and it gets very repetitive very quickly. For example, in R (and all major stats languages like SAS) you can just fit regression models like lm(Y~ x2) and lm( Y~ x3 + x1), where Y, x1, x2, and x3 are with the appropriate dataframe (not necessarily in that order). If I understand mstats.linregress correctly, I have to create two arrays just to fit one of these two models. In the second case, I have to create yet another array. If I have my original data in one array, now I have unnecessarily duplicated 3 columns of that array not to mention had to do all this extra work, hopefully error free, just to do 2 lines of R code. Jonathan's formula is along the right approach but, based on the doc string, rather cumbersome and does not use array inputs. It probably would be more effective with a record masked array. Bruce PS Way back when I did give feedback to the direction of stats stuff. From timmichelsen at gmx-topmail.de Thu Jan 15 12:59:20 2009 From: timmichelsen at gmx-topmail.de (Timmie) Date: Thu, 15 Jan 2009 17:59:20 +0000 (UTC) Subject: [SciPy-user] scikits.timeseries: tsfromtxt References: <9271C377-A3B2-4D03-9E16-CC21B439CFAF@gmail.com> Message-ID: > It looks like you're using an old version of numpy (older than mine > anyway...): which one is it ? I am using 1.2.1.1 from PythonXY. I will test again and report back. Thanks so far. Timmie From pgmdevlist at gmail.com Thu Jan 15 13:05:04 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 15 Jan 2009 13:05:04 -0500 Subject: [SciPy-user] predicting values based on (linear) models In-Reply-To: <496F742B.7040202@gmail.com> References: <1cd32cbb0901131724i1c03233etecb0bdbf06debeda@mail.gmail.com> <1cd32cbb0901141915o75b28401n1f2dd5eea54180e1@mail.gmail.com> <4B002A9D-98DB-4D50-B0B0-B506504EDBD4@gmail.com> <1cd32cbb0901142150l407ef557w94b46d8c1309ce25@mail.gmail.com> <496F51B3.9060400@gmail.com> <1cd32cbb0901150856p4fb70ea1ia888c9cb1efc991f@mail.gmail.com> <496F742B.7040202@gmail.com> Message-ID: <56796EA1-00FD-4426-8227-65F487FCAD93@gmail.com> On Jan 15, 2009, at 12:36 PM, Bruce Southey wrote: > No! We have had considerable discussion on this aspect in the past on > the numpy/scipy lists. Basically a missing observation should not be > treated as an NaNs (and there are different types of NaNs) because > they > are not the same. 
In some cases, missing values disappear in the > calculations such as creating the X'X matrix etc but you probably do > not > want that if you have real NaNs in your data (say after taking square > root of an array that includes negative numbers). numpy.ma implements equivalents of ufuncs that return a masked array, where invalid outputs are masked (the output is invalid if the input is masked or if it falls outside the validity domain of the function), so we're set. There are functions that mask full rows or columns of a 2D array, or even get rid of the columns/rows that contain one or several missing values which can be used in some cases. >> > If you look at R's lm function you can see that you can fit a model > using a formula. Without a similar framework, you can not do useful > stats. Also you must have a 'mini model language' because the inputs > must be created correctly and it gets very repetitive very quickly. > For example, in R (and all major stats languages like SAS) you can > just > fit regression models like lm(Y~ x2) and lm( Y~ x3 + x1), where Y, > x1, > x2, and x3 are with the appropriate dataframe (not necessarily in that > order). Well, we could adapt the functions to accept a structured array as input and define your x1, x2... from the fields of this array. I tried to significantly improve the support of structured arrays in numpy.ma 1.3., so it shouldn't be that difficult to use masked arrays by default. > If I understand mstats.linregress correctly, I have to create two > arrays > just to fit one of these two models. In the second case, I have to > create yet another array. If I have my original data in one array, > now I > have unnecessarily duplicated 3 columns of that array not to mention > had > to do all this extra work, hopefully error free, just to do 2 lines > of R > code. > For the first case (Y~x2), you don't need 2 arrays, you can use a 2D array with either 2 rows or 2 columns and that would work. mstats.linregress use the same approach as stats.linregress. The second case is a tad more complex, but could probably be adapted relatively easily. > Jonathan's formula is along the right approach but, based on the doc > string, rather cumbersome and does not use array inputs. It probably > would be more effective with a record masked array. OK, more on my todo list... From josef.pktd at gmail.com Thu Jan 15 13:25:45 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 15 Jan 2009 13:25:45 -0500 Subject: [SciPy-user] predicting values based on (linear) models In-Reply-To: <496F742B.7040202@gmail.com> References: <1cd32cbb0901131724i1c03233etecb0bdbf06debeda@mail.gmail.com> <1cd32cbb0901141915o75b28401n1f2dd5eea54180e1@mail.gmail.com> <4B002A9D-98DB-4D50-B0B0-B506504EDBD4@gmail.com> <1cd32cbb0901142150l407ef557w94b46d8c1309ce25@mail.gmail.com> <496F51B3.9060400@gmail.com> <1cd32cbb0901150856p4fb70ea1ia888c9cb1efc991f@mail.gmail.com> <496F742B.7040202@gmail.com> Message-ID: <1cd32cbb0901151025l55d60598la7ccb1f4882e2a86@mail.gmail.com> On Thu, Jan 15, 2009 at 12:36 PM, Bruce Southey wrote: > josef.pktd at gmail.com wrote: >>> There are different reasons for a lack of user base. One of the reasons >>> for R is that many, many statistics classes use it. >>> >>> Some of the reasons that I do not use scipy for stats (and have not >>> looked at this in some time) included: >>> 1) The difficulty of installation which is considerably better now. 
>>> 2) Lack of support for missing values as virtually everything that I >>> have worked with involves missing values at some stage. >>> 3) Lack of an suitable statistical modeling interface where you can >>> specify the model to be fit without having to create each individual >>> array. The approach must work for a range of scenarios. >>> >>> >> >> With 2 and 3 I have little experience >> Missing observations, I usually remove or clean in the initial data >> preparation. mstats provides functions for masked arrays, but stats >> mostly assumes no missing values. What would be the generic treatment >> for missing observations, just dropping all observations that have >> NaNs or converting them to masked arrays and expand the function that >> can handle those? >> > No! We have had considerable discussion on this aspect in the past on > the numpy/scipy lists. Basically a missing observation should not be > treated as an NaNs (and there are different types of NaNs) because they > are not the same. In some cases, missing values disappear in the > calculations such as creating the X'X matrix etc but you probably do not > want that if you have real NaNs in your data (say after taking square > root of an array that includes negative numbers). > >> Jonathan Taylor included a formula framework in stats.models similar >> to R, but I haven't looked very closely at it. I haven't learned much >> of R's syntax and I usually prefer to build by own arrays (with some >> exceptions such as polynomials) than hide them behind a mini model >> language. >> For both stats.models and for the interface for general stats >> functions, feedback would be very appreciated. >> >> Josef >> _______________________________________________ >> SciPy-user mailing list >> SciPy-user at scipy.org >> http://projects.scipy.org/mailman/listinfo/scipy-user >> > If you look at R's lm function you can see that you can fit a model > using a formula. Without a similar framework, you can not do useful > stats. Also you must have a 'mini model language' because the inputs > must be created correctly and it gets very repetitive very quickly. > > For example, in R (and all major stats languages like SAS) you can just > fit regression models like lm(Y~ x2) and lm( Y~ x3 + x1), where Y, x1, > x2, and x3 are with the appropriate dataframe (not necessarily in that > order). For the simple case, it could be done with accepting a sequence of args, and building the design matrix inside the function, e.g. ols( Y, x3, x1, x2**2 ) To build design matrices, I wrote, for myself, functions like simplex(x,n) where x is a 2D column matrix and it builds the interaction terms matrix, x[:,1], x[:,2], x[:,1]*x[:,2], ... x[:,1]**n, which if I read the R stats help correctly would correspond to (x[:,1] + x[:,2])^n My ols call would then be ols(Y, simplex(x3,x1,2) ), This uses explicit functions and avoids the mini-language, but it requires some design building functions. Being able to access some meta-information to data arrays would be nice, but I haven't used these features much, except for building my own classes in python or structs in matlab. Josef From timmichelsen at gmx-topmail.de Thu Jan 15 18:10:39 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Fri, 16 Jan 2009 00:10:39 +0100 Subject: [SciPy-user] scikits.timeseries: tsfromtxt In-Reply-To: References: <9271C377-A3B2-4D03-9E16-CC21B439CFAF@gmail.com> Message-ID: Timmie schrieb: >> It looks like you're using an old version of numpy (older than mine >> anyway...): which one is it ? 
> I am using 1.2.1.1 from PythonXY.
>
> I will test again and report back.
Yes, it doesn't work with version 1.2.1.1. Too bad that I have to wait
now. Do you know when Numpy 1.3.x will be released? I didn't find a
roadmap site.

Kind regards,
Timmie

From timmichelsen at gmx-topmail.de Thu Jan 15 18:22:00 2009
From: timmichelsen at gmx-topmail.de (Tim Michelsen)
Date: Fri, 16 Jan 2009 00:22:00 +0100
Subject: [SciPy-user] predicting values based on (linear) models
In-Reply-To: <4B002A9D-98DB-4D50-B0B0-B506504EDBD4@gmail.com>
References: <1cd32cbb0901131724i1c03233etecb0bdbf06debeda@mail.gmail.com>
	<1cd32cbb0901141915o75b28401n1f2dd5eea54180e1@mail.gmail.com>
	<4B002A9D-98DB-4D50-B0B0-B506504EDBD4@gmail.com>
Message-ID: 

Hello,
thanks very much for continuing this discussion. It is very helpful for
me and perhaps others who need to choose the right tools for statistical
processing and analysis.

> IMHO, the readiness to incorporate user feedback is here. The feedback
> is not, or at least not as much as we'd like.
I am often very much occupied writing my own special routines building on
top of scipy/numpy etc. And therefore, I find it difficult to contribute
code. But I can do testing and bug reporting. Since I want the libraries
I rely on to work well, I have a vital interest in this.

Please just indicate where testing is needed. If it matches with my
knowledge and application I am happy to contribute.

Kind regards,
Timmie

From fragon25 at yahoo.com Fri Jan 16 03:23:22 2009
From: fragon25 at yahoo.com (Tan Tran)
Date: Fri, 16 Jan 2009 00:23:22 -0800 (PST)
Subject: [SciPy-user] Handle large array
Message-ID: <13220.42827.qm@web39206.mail.mud.yahoo.com>

Hello,

I'm trying to do some & operations like this

xx = (d[:,0:1] == 0) & (d[:,2:3] == 2) & (d[:, 1:2]==1) & (d[:, 1:2]==2)

If d is small, 19 columns and about 5000 rows, the code runs fine. But if
I have large data, say d with about 40k rows, I get the error message:
MemoryError

I tried to make separate variables but still have problems when trying to
& them
aa = d[:,0:1] == 0
bb = d[:,2:3] == 2
cc = d[:, 1:2]==1
dd = d[:, 1:2]==2

xx = aa & bb & cc & dd <-- MemoryError's here

Has anybody seen this problem before? How to play with large data?

Thanks,

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From robert.kern at gmail.com Fri Jan 16 03:44:00 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 16 Jan 2009 02:44:00 -0600
Subject: [SciPy-user] Handle large array
In-Reply-To: <13220.42827.qm@web39206.mail.mud.yahoo.com>
References: <13220.42827.qm@web39206.mail.mud.yahoo.com>
Message-ID: <3d375d730901160044s2f6a946ey499e84d1d569a84d@mail.gmail.com>

On Fri, Jan 16, 2009 at 02:23, Tan Tran wrote:
> Hello,
>
> I'm trying to do some & operations like this
>
> xx = (d[:,0:1] == 0) & (d[:,2:3] == 2) & (d[:, 1:2]==1) & (d[:, 1:2]==2)
>
> If d is small, 19 columns and about 5000 rows, the code runs fine. But if
> I have large data, say d with about 40k rows, I get the error message:
> MemoryError
>
> I tried to make separate variables but still have problems when trying to
> & them
> aa = d[:,0:1] == 0
> bb = d[:,2:3] == 2
> cc = d[:, 1:2]==1
> dd = d[:, 1:2]==2
>
> xx = aa & bb & cc & dd <-- MemoryError's here
>
> Has anybody seen this problem before? How to play with large data?

I usually chunk things up using iterators.
For example:


def chunked_slices(ntotal, chunksize):
    nchunks, nlast = divmod(ntotal, chunksize)
    for i in range(nchunks):
        yield slice(i*chunksize, (i+1)*chunksize)
    if nlast > 0:
        start = nchunks * chunksize
        yield slice(start, start + nlast)

xx = np.empty([len(d)], dtype=bool)

for slc in chunked_slices(len(d), 1000):
    xx[slc] = (d[slc,0] == 0) & (d[slc,2] == 2) & (d[slc,1]==1) & (d[slc,1]==2)

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From faltet at pytables.org Fri Jan 16 04:34:50 2009
From: faltet at pytables.org (Francesc Alted)
Date: Fri, 16 Jan 2009 10:34:50 +0100
Subject: [SciPy-user] Handle large array
In-Reply-To: <3d375d730901160044s2f6a946ey499e84d1d569a84d@mail.gmail.com>
References: <13220.42827.qm@web39206.mail.mud.yahoo.com>
	<3d375d730901160044s2f6a946ey499e84d1d569a84d@mail.gmail.com>
Message-ID: <200901161034.50758.faltet@pytables.org>

On Friday 16 January 2009, Robert Kern wrote:
> On Fri, Jan 16, 2009 at 02:23, Tan Tran wrote:
> > Hello,
> >
> > I'm trying to do some & operations like this
> >
> > xx = (d[:,0:1] == 0) & (d[:,2:3] == 2) & (d[:, 1:2]==1) & (d[:,
> > 1:2]==2)
> >
> > If d is small, 19 columns and about 5000 rows, the code runs fine.
> > But if I have large data, say d with about 40k rows, I get the error
> > message: MemoryError
> >
> > I tried to make separate variables but still have problems when
> > trying to & them
> > aa = d[:,0:1] == 0
> > bb = d[:,2:3] == 2
> > cc = d[:, 1:2]==1
> > dd = d[:, 1:2]==2
> >
> > xx = aa & bb & cc & dd <-- MemoryError's here
> >
> > Has anybody seen this problem before? How to play with large data?
>
> I usually chunk things up using iterators. For example:
>
>
> def chunked_slices(ntotal, chunksize):
>     nchunks, nlast = divmod(ntotal, chunksize)
>     for i in range(nchunks):
>         yield slice(i*chunksize, (i+1)*chunksize)
>     if nlast > 0:
>         start = nchunks * chunksize
>         yield slice(start, start + nlast)
>
> xx = np.empty([len(d)], dtype=bool)
>
> for slc in chunked_slices(len(d), 1000):
>     xx[slc] = (d[slc,0] == 0) & (d[slc,2] == 2) & (d[slc,1]==1) &
> (d[slc,1]==2)

Another option could be using numexpr [1], which avoids the use of
temporaries during the expression evaluation:

xx = numexpr.evaluate("aa & bb & cc & dd")

However, I think that your problem here is that your initial array, d, is
too large and takes almost all of your available memory. You may want to
save it into a file and read columns from it when you need them. There
are several ways to achieve this, like memmapped arrays or using
HDF5/NetCDF4 for saving them. Here it is a quick example following the
HDF5 path (through PyTables [2]):

In [1]: import numpy as np

In [2]: import tables as tb

In [3]: import tables.numexpr as ne

In [4]: f = tb.openFile('mydata.h5', 'w')

In [5]: d = f.createCArray(f.root, 'mydata', tb.Int32Atom(), (19,40000))

In [6]: for ncol in range(19):  # Write data column by column
   ...:     d[ncol] = np.arange(40000)*ncol
   ...:

In [7]: a, b, c = d[0,:], d[1,:], d[2,:]

In [8]: xx = ne.evaluate('(a == 0) & (c == 2) & (b == 1) & (b == 2)')

In [9]: xx
Out[9]: array([False, False, False, ..., False, False, False], dtype=bool)

In [10]: f.close()

With this, you will only have 4 columns (a, b, c and xx) of your data as
maximum in memory while the d array is completely on disk. Note that I've
transposed your original d array for read efficiency reasons.
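As an aside, the condition in the original question requires the same
column to equal both 1 and 2 at once, which can never be true; presumably
an | (or) between those two comparisons was intended. For completeness,
here is a minimal sketch of the memmapped-array route mentioned above;
the file name, dtype and shape are invented for the example:

import numpy as np

# the data live on disk; only the pages actually touched are read into RAM
d = np.memmap('mydata.dat', dtype=np.int32, mode='w+', shape=(40000, 19))

col0, col1, col2 = d[:, 0], d[:, 1], d[:, 2]
xx = (col0 == 0) & (col2 == 2) & ((col1 == 1) | (col1 == 2))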
Also, numexpr is already integrated in PyTables, so you don't need to install it separately (although you can if you want). [1] http://code.google.com/p/numexpr/ [2] http://www.pytables.org Hope that helps, -- Francesc Alted From fredmfp at gmail.com Fri Jan 16 05:58:00 2009 From: fredmfp at gmail.com (fred) Date: Fri, 16 Jan 2009 11:58:00 +0100 Subject: [SciPy-user] ndimage convolve vs. RAM issue... Message-ID: <49706838.50808@gmail.com> Hi all, On a bi-xeon quad core (debian 64 bits) with 8 GB of RAM, if I want to convolve a 102*122*143 float array (~7 MB) with a kernel of 77*77*41 cells (~1 MB), I get a MemoryError in correlate: File "/usr/lib/python2.5/site-packages/scipy/ndimage/filters.py", line 331, in convolve origin, True) File "/usr/lib/python2.5/site-packages/scipy/ndimage/filters.py", line 312, in _correlate_or_convolve _nd_image.correlate(input, weights, output, mode, cval, origins) MemoryError Why ? Is there a workaround to compute such convolution ? TIA. Cheers, -- Fred From josef.pktd at gmail.com Fri Jan 16 11:52:58 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 16 Jan 2009 11:52:58 -0500 Subject: [SciPy-user] Ols for np.arrays and masked arrays Message-ID: <1cd32cbb0901160852g49b70c1eg67b45024e5aaf832@mail.gmail.com> I tried to see how masked arrays and nans, and a sequence of input arguments, can be handled in a regression (linear model) The attached file works, but array dimension handling and concatenation looks pretty messy, if we want column orientation instead of rows. I left some of the unsuccessful attempts as comments, if someone can propose a better way. Also this is the first time, that I use masked arrays, and I'm not sure I found the best way. At the end of the file are my test cases. I treat masked arrays or nans by removing all observations with masked or nan values for the internal calculation, but keep the mask around, for the case when it is needed for some output. I looked at np.ma.polyfit, which uses a dummy fill value (0) before calling least squares, but in general it will be difficult to find "neutral" fill values.. Does this look like a reasonable way to write functions or classes that can handle plain np.arrays and masked arrays at the same time with minimal overhead? I haven't looked at extending it to record, structured arrays. Josef class Regression(object): def __init__(self,y,*args,**kwds): ''' Parameters ---------- y: array, 1D or 2D variable of regression. if 2D, it needs to be one column args: sequence of 1D or 2D arrays one or several arrays, if 2D than interpretation is that each column represents one variable and rows are observations kwds: addconst (default: True) if True then a constant is added to the regressor return ------ class instance with regression results Notes ----- Observation (rows) that contain masked values in a masked array or nans in any of the regressors or in y will be dropped for the calculation. Arrays that correspond to observation, e.g. estimated y (yhat) or residuals, are returned as masked arrays if masked values or nans are in the input data. 
Usage ----- estm = Ols(y, x1, x2) estm.b # estimated parameter vector example polynomial estv = Ols(y, np.vander(x[:,0],3), x[:,1], addconst=False) estv.b # estimated parameter vector estv.yhat # fitted values -------------- next part -------------- import numpy as np import numpy.ma as ma from scipy import linalg #from numpy.testing import assert_almost_equal from numpy.ma.testutils import assert_almost_equal def compressmiss(y,design,axis=0): '''compress rows with any masked value''' #print 'ma.mask_rows(design).mask', ma.mask_rows(design).mask.shape mask = ma.mask_or(ma.getmask(y), ma.mask_rows(design).mask[:,:1]).ravel() ok = (mask == False) #print ok, mask #print 'mask.shape', mask.shape, ok.shape #print 'y.shape', y.shape #print 'design.shape',design.shape #return y[ok,:], design[ok,:], mask #note this does not convert to nd.array y, design = y[ok,0], design[ok,:] # y[ok,0] use 0 for ma.compressed #print 'y.shape', y.shape, 'design.shape', design.shape #y = ma.masked_array(y, mask) #design = ma.masked_array(design, mask) #ma.compressed doesn't preserve shape (orientation) return ma.compressed(y)[:,np.newaxis], ma.compress_rows(design), mask #return ma.compressed(y), ma.compress_rows(design), mask class Regression(object): def __init__(self,y,*args,**kwds): ''' Parameters ---------- y: array, 1D or 2D variable of regression. If 2D, then it needs to be one column args: sequence of 1D or 2D arrays one or several arrays of regressors. If 2D than interpretation is that each column represents one variable and rows are observations kwds: addconst (default: True) if True then a constant is added to the regressor return ------ class instance with regression results Notes ----- Observation (rows) that contain masked values in a masked array or nans in any of the regressors or in y will be dropped for the calculation. Arrays that correspond to observation, e.g. estimated y (yhat) or residuals, are returned as masked arrays if masked values or nans are in the input data. Usage ----- estm = Ols(y, x1, x2) estm.b # estimated parameter vector example polynomial estv = Ols(y, np.vander(x[:,0],3), x[:,1], addconst=False) estv.b # estimated parameter vector estv.yhat # fitted values ''' #print 'init of class Regression' self.addconst = kwds.pop('addconst', True) #print y.shape ## for v in args: ## print v.shape if self.addconst: #design = np.concatenate( tuple([np.ones(y.shape)] + list(args)),axis=1) #design = np.concatenate( [np.ones(y.shape).T] + [it.T for it in args],axis=0).T #design = np.c_[ [np.ones(y.shape)] + list(args)] #simple to read but maybe inefficient intermediate matrices design = np.c_[args] # note: this encodes mask as nan design = np.c_[np.ones(y.shape), design] #print design.shape else: design = np.c_[args] if isinstance(y,ma.MaskedArray) or isinstance(design,ma.MaskedArray) \ or np.any(np.isnan(y)) or np.any(np.isnan(design)): design = ma.masked_array(design, np.isnan(design)) self.ymasked = y #keep around for simplicity self.y, self.design, self.mask = compressmiss(y,design) self.masked = True #not necessary self.ymasked = ma.masked_array(y, self.mask) # update mask ? 
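            # note: at this point compressmiss() has reduced self.y and
            # self.design to the rows without masked or nan values, while
            # self.mask still refers to the full-length input, so that the
            # fitted values can be re-expanded to the original length below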
#print self.design else: self.y = y self.design = design self.mask = None #print 'y x shapes before estimate', self.y.shape, self.design.shape self.estimate() #print 'y x b shapes before yhat', self.y.shape, self.design.shape, self.b.shape yhat = np.dot(self.design,self.b) if not self.mask == None: self.yhat = self.ymasked.copy() # just to remember shape self.yhat[self.mask == False] = yhat else: self.yhat = yhat def estimate(self): pass def get_yhat(self): pass def summary(self): pass def predict(self, x): pass class Ols(Regression): def __init__(self,y,*args,**kwds): super(self.__class__, self).__init__(y,*args,**kwds) def estimate(self): y,x = self.y, self.design #print 'y.shape, x.shape' #print y.shape, x.shape self.b,self.resid,rank,self.sigma = linalg.lstsq(x,y) #print rank def predict(self, x): '''redurn prediction todo: add prediction error, confidence intervall''' if self.addconst: x = np.c_[np.ones(x.shape[0]),x] #print x.shape, self.b.shape return np.dot(x,self.b) if __name__ == '__main__': import numpy as np #from olsexample import ols def generate_data(nobs): x = np.random.randn(nobs,2) btrue = np.array([[5,1,2]]).T y = np.dot(x, btrue[1:,:]) + btrue[0,:] + 0.5 * np.random.randn(nobs,1) return y,x y,x = generate_data(15) #benchmark no masked arrays, and one 2D array for x est = Ols(y[1:,:],x[1:,:]) # initialize and estimate with ols, constant added by default print 'ols estimate' est.estimate() print est.b.T #print np.array([[5,1,2]]) # true coefficients ynew,xnew = generate_data(3) ypred = est.predict(xnew) print ' ytrue ypred error' print np.c_[ynew, ypred, ynew - ypred] #case masked array y ym = y.copy() ym[0,:] = np.nan ym = ma.masked_array(ym, np.isnan(ym)) estm1 = Ols(ym,x) print estm1.b.T print estm1.yhat.shape print 'yhat' print estm1.yhat[:10,:] assert_almost_equal(estm1.yhat[1:,:], est.yhat) #masked y and several x args, addconst=False estm2 = Ols(ym,np.ones(ym.shape),x[:,0],x[:,1],addconst=False) print estm2.b.T assert_almost_equal(estm2.b, estm1.b) assert_almost_equal(estm2.yhat, estm1.yhat) #masked y and several x args, estm3 = Ols(ym,x[:,0],x[:,1]) print estm2.b.T assert_almost_equal(estm3.b, estm1.b) assert_almost_equal(estm3.yhat, estm1.yhat) #masked array in y and one x variable x_0 = x[:,0].copy() # is copy necessary? x_0[0] = np.nan x_0 = ma.masked_array(x_0, np.isnan(x_0)) estm4 = Ols(ym,x_0,x[:,1]) print estm4.b.T assert_almost_equal(estm4.b, estm1.b) assert_almost_equal(estm4.yhat, estm1.yhat) #masked array in one x variable, but not in y x_0 = x[:,0].copy() # is copy necessary? x_0[0] = np.nan x_0 = ma.masked_array(x_0, np.isnan(x_0)) estm5 = Ols(y,x_0,x[:,1]) #, addconst=False) print estm5.b.T assert_almost_equal(estm5.b, estm1.b) assert_almost_equal(estm5.yhat, estm1.yhat) #assert np.all(estm5.yhat == estm1.yhat) #example polynomial print 'example with one polynomial x added' estv = Ols(y,np.vander(x[:,0],3), x[:,1], addconst=False) print estv.b.T print estv.yhat From pgmdevlist at gmail.com Fri Jan 16 17:13:21 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 16 Jan 2009 17:13:21 -0500 Subject: [SciPy-user] Ols for np.arrays and masked arrays In-Reply-To: <1cd32cbb0901160852g49b70c1eg67b45024e5aaf832@mail.gmail.com> References: <1cd32cbb0901160852g49b70c1eg67b45024e5aaf832@mail.gmail.com> Message-ID: <47999FC3-AC98-4B6D-839C-CF788BF9D125@gmail.com> Josef, I'm rewriting your module, expect some update in the next few hours (days...). 
Still, I have some generic comments: * If you need a function that supports both ndarrays and masked arrays, force the inputs to be masked arrays, that'll be easier. * Use ma.fix_invalid to transform a ndarray w/ or w/o NaNs into a masked array: the NaNs will automatically be masked, and the underlying data fixed (a copy is made, no worry). * if you need to mask an element, just mask it directly: you don't have to set it to NaN and then use np.isnan for the mask. So, instead of: x_0 = x[:,0].copy() x_0[0] = np.nan x_0 = ma.masked_array(x_0, np.isnan(x_0)) just do: x_0 = ma.array(x[:,0]) x_0[0] = ma.masked * When mask is a boolean ndarray, just use x[~mask] instead of x[mask==False]. * To get rid of the missing data in x, use x.compressed() or emulate it with x.data[~ma.getmaskarray(x)]. ma.getmaskarray(x) always returns a ndarray with the same length as x, whereas ma.getmask(x) can return nomask. * when manipulating masked arrays, if performance is an issue, decompose the process in manipulating the data and the mask separately. The easiest is to use .filled to get a pure ndarray for the data. The choice of the fill_value depends on the application. In ma.polyfit, we fill y with 0, which doesn't really matter as the corresponding coefficients of x will be 0 (through vander). > Also this is the first time, that > I use masked arrays, and I'm not sure I found the best way. Don't worry, practice makes perfect. > > I treat masked arrays or nans by removing all observations with masked > or nan values for the internal calculation, but keep the mask around, > for the case when it is needed for some output. You keep the *common* mask, which sounds OK. Removing the missing observations seems the way to go > I looked at > np.ma.polyfit, which uses a dummy fill value (0) before calling least > squares, but in general it will be difficult to find "neutral" fill > values.. cf explanation above. > From josef.pktd at gmail.com Fri Jan 16 19:53:48 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 16 Jan 2009 19:53:48 -0500 Subject: [SciPy-user] Ols for np.arrays and masked arrays In-Reply-To: <47999FC3-AC98-4B6D-839C-CF788BF9D125@gmail.com> References: <1cd32cbb0901160852g49b70c1eg67b45024e5aaf832@mail.gmail.com> <47999FC3-AC98-4B6D-839C-CF788BF9D125@gmail.com> Message-ID: <1cd32cbb0901161653p741bf3a0v5e9402f2c748c27@mail.gmail.com> Thanks for the explanations, it was quite a bit of trial and error to find out, especially how the dimension handling and casting works. my basic idea was: * get a fast path through the function for (no nans, unmasked) np.arrays, that's why I didn't convert inputs automatically to masked arrays. * program basic statistical function for np.arrays without nans. I would like to limit the handling of different types of arrays to the input and output stages, so that the statistical core part does not need to be special cased. * use compressed not filled to convert masked data, because, in general, there is no neutral fill value for regressions. It's also easier to use existing functions, for example my version can use the standard np.vander. This works for calculating summary statistics, parameter estimates, covariance matrices, test statistics and so on, but for transformation of the input variables, error vector, predicted values, keeping masked arrays might be necessary or more convenient. 
I'm not yet very familiar with numpy details, for example when a view and when a copy or when intermediate arrays are created and what the performance overhead of casting back and forth is. If we get a general setting for handling different type of arrays, then this could be used to wrap standard statistical methods and functions without too much extra work. On Fri, Jan 16, 2009 at 5:13 PM, Pierre GM wrote: > Josef, > I'm rewriting your module, expect some update in the next few hours > (days...). > Still, I have some generic comments: > > * If you need a function that supports both ndarrays and masked > arrays, force the inputs to be masked arrays, that'll be easier. > > * Use ma.fix_invalid to transform a ndarray w/ or w/o NaNs into a > masked array: the NaNs will automatically be masked, and the > underlying data fixed (a copy is made, no worry). > > * if you need to mask an element, just mask it directly: you don't > have to set it to NaN and then use np.isnan for the mask. So, instead > of: > x_0 = x[:,0].copy() > x_0[0] = np.nan > x_0 = ma.masked_array(x_0, np.isnan(x_0)) > > just do: > x_0 = ma.array(x[:,0]) > x_0[0] = ma.masked I followed the docs examples. In your way x_0.data still has the original value (?), so I wouldn't have run into the problem with numpy.testing asserts? Would this hide some test cases? > > * When mask is a boolean ndarray, just use x[~mask] instead of > x[mask==False]. I didn't remember `~` > > * To get rid of the missing data in x, use x.compressed() or emulate > it with x.data[~ma.getmaskarray(x)]. ma.getmaskarray(x) always returns > a ndarray with the same length as x, whereas ma.getmask(x) can return > nomask. this makes shape manipulation and shape preserving compression easier it tried this x_0[~ma.getmaskarray(x)] and got a masked array back, when I wanted this x_0.data[~ma.getmaskarray(x)] > > > * when manipulating masked arrays, if performance is an issue, > decompose the process in manipulating the data and the mask > separately. The easiest is to use .filled to get a pure ndarray for > the data. The choice of the fill_value depends on the application. In > ma.polyfit, we fill y with 0, which doesn't really matter as the > corresponding coefficients of x will be 0 (through vander). compressed might be necessary, see above > >> Also this is the first time, that >> I use masked arrays, and I'm not sure I found the best way. > > Don't worry, practice makes perfect. > >> >> I treat masked arrays or nans by removing all observations with masked >> or nan values for the internal calculation, but keep the mask around, >> for the case when it is needed for some output. > > You keep the *common* mask, which sounds OK. Removing the missing > observations seems the way to go. Actually, after the discussion for 3D picture filling, that it would be possible to replace some of the missing values by their predicted value or their conditional expectation in a second stage. I think this would be the method specific "neutral" fill value. > >> I looked at >> np.ma.polyfit, which uses a dummy fill value (0) before calling least >> squares, but in general it will be difficult to find "neutral" fill >> values.. > > cf explanation above. 
>> > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > Josef From pgmdevlist at gmail.com Fri Jan 16 20:38:55 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 16 Jan 2009 20:38:55 -0500 Subject: [SciPy-user] Ols for np.arrays and masked arrays In-Reply-To: <1cd32cbb0901161653p741bf3a0v5e9402f2c748c27@mail.gmail.com> References: <1cd32cbb0901160852g49b70c1eg67b45024e5aaf832@mail.gmail.com> <47999FC3-AC98-4B6D-839C-CF788BF9D125@gmail.com> <1cd32cbb0901161653p741bf3a0v5e9402f2c748c27@mail.gmail.com> Message-ID: Josef, > > * get a fast path through the function for (no nans, unmasked) > np.arrays, that's why I didn't convert inputs automatically to masked > arrays. > > * program basic statistical function for np.arrays without nans. I > would like to limit the handling of different types of arrays to the > input and output stages, so that the statistical core part does not > need to be special cased. > Well, you can very well convert your inputs to MaskedArrays only (for example through ma.fix_invalid), get rid of the missing values to work only w/ standard ndarrays. I'm > * use compressed not filled to convert masked data, because, in > general, there is no neutral fill value for regressions. It's also > easier to use existing functions, for example my version can use the > standard np.vander. Indeed. > I'm not yet very familiar with numpy details, for example when a view > and when a copy or when intermediate arrays are created and what the > performance overhead of casting back and forth is. With a view, you don't create a new array, which is nice if you don't intend modifying ti. Creating a masked array version doesn't copy the data either, an extra array is sometimes created (the mask), but it can be modified relatively safely, modifications shouldn't be propagated. > If we get a general setting for handling different type of arrays, > then this could be used to wrap standard statistical methods and > functions without too much extra work. That depends on the situation again. For regressions, your approach works. In other cases, the masked values have to be taken into account (because they should be counted as ties, for example). Using masked arrays should make it easier to adapt the code to other objects (TimeSeries, for example) >> * if you need to mask an element, just mask it directly: you don't >> have to set it to NaN and then use np.isnan for the mask. So, instead >> of: >> x_0 = x[:,0].copy() >> x_0[0] = np.nan >> x_0 = ma.masked_array(x_0, np.isnan(x_0)) >> >> just do: >> x_0 = ma.array(x[:,0]) >> x_0[0] = ma.masked > > I followed the docs examples. In your way x_0.data still has the > original value (?), so I wouldn't have run into the problem with > numpy.testing asserts? Would this hide some test cases? I've never been happy with what was presented in the docs so far. Now that a draft doc for numpy.ma is available, that should change. In this example, yes, x_0.data[0] has the same value before and after masking, but that's not a problem as the mask will hide it (and that you'll drop it anyway later on). However, you want to use the numpy.ma.testutils for testing. > >> >> * To get rid of the missing data in x, use x.compressed() or emulate >> it with x.data[~ma.getmaskarray(x)]. ma.getmaskarray(x) always >> returns >> a ndarray with the same length as x, whereas ma.getmask(x) can return >> nomask. 
> > this makes shape manipulation and shape preserving compression easier > it tried this > x_0[~ma.getmaskarray(x)] > and got a masked array back, when I wanted this > x_0.data[~ma.getmaskarray(x)] I saw that. .compressed flattens the data, which is an issue in your case. Just selecting elements of .data is more convenient. >> Actually, after the discussion for 3D picture filling, that it would > be possible to replace some of the missing values by their predicted > value or their conditional expectation in a second stage. I think this > would be the method specific "neutral" fill value. Except that it won't work, as .filled takes only one element (all the masked data are filled w/ the same value). What you wanna do is to use putmask on your standard ndarray. From josef.pktd at gmail.com Fri Jan 16 21:13:47 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 16 Jan 2009 21:13:47 -0500 Subject: [SciPy-user] Ols for np.arrays and masked arrays In-Reply-To: References: <1cd32cbb0901160852g49b70c1eg67b45024e5aaf832@mail.gmail.com> <47999FC3-AC98-4B6D-839C-CF788BF9D125@gmail.com> <1cd32cbb0901161653p741bf3a0v5e9402f2c748c27@mail.gmail.com> Message-ID: <1cd32cbb0901161813s15d55380x590f633fedbd7855@mail.gmail.com> >>> Actually, after the discussion for 3D picture filling, that it would >> be possible to replace some of the missing values by their predicted >> value or their conditional expectation in a second stage. I think this >> would be the method specific "neutral" fill value. > > Except that it won't work, as .filled takes only one element (all the > masked data are filled w/ the same value). What you wanna do is to use > putmask on your standard ndarray. > What's the best way of unmasking a single masked element in a masked array? y.data[i] = 5 y.mask[i] = False Is there an ma.unmask(y[i],5) ? It's becoming clearer how this can work. Josef From pgmdevlist at gmail.com Fri Jan 16 21:27:58 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 16 Jan 2009 21:27:58 -0500 Subject: [SciPy-user] Ols for np.arrays and masked arrays In-Reply-To: <1cd32cbb0901161813s15d55380x590f633fedbd7855@mail.gmail.com> References: <1cd32cbb0901160852g49b70c1eg67b45024e5aaf832@mail.gmail.com> <47999FC3-AC98-4B6D-839C-CF788BF9D125@gmail.com> <1cd32cbb0901161653p741bf3a0v5e9402f2c748c27@mail.gmail.com> <1cd32cbb0901161813s15d55380x590f633fedbd7855@mail.gmail.com> Message-ID: <4D56EA74-07F9-4F33-BB9C-9779321E27AA@gmail.com> >> > > What's the best way of unmasking a single masked element in a masked > array? > > y.data[i] = 5 > y.mask[i] = False > > Is there an ma.unmask(y[i],5) ? Nope, but that's an idea. Meanwhile, the easiest (and recommended way) is to do: y[i] = 5 That way, you change the data and the mask at the same time. That works as long as the mask is soft (which it is, by default. To harden a mask, viz, to prevent masked data to be unmasked, you need to really want to). If you just want to unmask without changing the value, you need to check whether you have a mask which is not no.mask, and change it by y.mask[i] = False. Check the docs on the svn site, you'll find the draft documentation for numpy.ma under "maskedarray.html". You may have to build the doc with Sphinx, but that shouldn't be a problem. > It's becoming clearer how this can work. It's quite straightforward, don't worry. 
From pgmdevlist at gmail.com Fri Jan 16 22:19:36 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Fri, 16 Jan 2009 22:19:36 -0500
Subject: [SciPy-user] Ols for np.arrays and masked arrays
In-Reply-To: <1cd32cbb0901161813s15d55380x590f633fedbd7855@mail.gmail.com>
References: <1cd32cbb0901160852g49b70c1eg67b45024e5aaf832@mail.gmail.com>
	<47999FC3-AC98-4B6D-839C-CF788BF9D125@gmail.com>
	<1cd32cbb0901161653p741bf3a0v5e9402f2c748c27@mail.gmail.com>
	<1cd32cbb0901161813s15d55380x590f633fedbd7855@mail.gmail.com>
Message-ID: 

Josef,
Here's what I came up with. Note that it's just a draft. I'm not keen on
having .estimate() compute the predicted values for y given the original
xs, but I left it nevertheless. Using predict instead, with a default of
None that reverts to self.design, would work better.
Another element to consider is the name of some of the attributes. self.b
or self.yhat make sense for you, but far less for me (at least at first).
Let me know how it goes.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: regress.py
Type: text/x-python-script
Size: 6907 bytes
Desc: not available
URL: 
-------------- next part --------------

From hetland at tamu.edu Sat Jan 17 10:19:34 2009
From: hetland at tamu.edu (Rob Hetland)
Date: Sat, 17 Jan 2009 09:19:34 -0600
Subject: [SciPy-user] pupynere/scipy.io.netcdf
In-Reply-To: <496B8AD0.2000600@gmail.com>
References: <496B8AD0.2000600@gmail.com>
Message-ID: <0E885718-C487-4B6E-A93B-75CB2C686DA2@tamu.edu>

On Jan 12, 2009, at 12:24 PM, Ryan May wrote:
> Anyone know if pupynere (a version of which is in scipy.io.netcdf)
> supports
> writing files with 64-bit offsets? This allows writing files larger
> than 2GB.

As far as I know, pupynere (Pure Python NetCDF Reader) is a reader
only, and has no write capabilities, unless something has changed
recently. I'm not sure about the large-file support, but I would
suspect not.

-Rob

----
Rob Hetland, Associate Professor
Dept.
of Oceanography, Texas A&M University > http://pong.tamu.edu/~rob > phone: 979-458-0096, fax: 979-845-6331 > > > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From bkomaki at yahoo.com Sat Jan 17 14:49:22 2009 From: bkomaki at yahoo.com (Ch B Komaki) Date: Sat, 17 Jan 2009 11:49:22 -0800 (PST) Subject: [SciPy-user] pupynere/scipy.io.netcdf In-Reply-To: Message-ID: <207511.33802.qm@web30405.mail.mud.yahoo.com> but others ; can https://wiki.fysik.dtu.dk/ase/epydoc/ase.io.pupynere-pysrc.html http://xenocoder.wordpress.com/2008/07/21/trying-sage-mathematical-software-part-ii-running-trials-1/ --- On Sat, 1/17/09, Patrick Marsh wrote: From: Patrick Marsh Subject: Re: [SciPy-user] pupynere/scipy.io.netcdf To: "SciPy Users List" Date: Saturday, January 17, 2009, 7:15 PM pupynere does have write capabilities. I use it almost daily. However, I don't write out that large of files, so I can't answer Ryan's question. -Patrick On Sat, Jan 17, 2009 at 9:19 AM, Rob Hetland wrote: > > On Jan 12, 2009, at 12:24 PM, Ryan May wrote: >> Anyone know if pupynere (a version of which is in scipy.io.netcdf) >> supports >> writing files with 64-bit offsets? This allows writing files larger >> than 2GB. > > As far as I know, pupyrnere (Pure Python NetCDF Reader) is a reader > only, and has not write capabilities, unless something has changed > recently. I'm not sure a out the large file support, but I would > suspect not. > > -Rob > > ---- > Rob Hetland, Associate Professor > Dept. of Oceanography, Texas A&M University > http://pong.tamu.edu/~rob > phone: 979-458-0096, fax: 979-845-6331 > > > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ SciPy-user mailing list SciPy-user at scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From kpdere at verizon.net Sun Jan 18 11:17:41 2009 From: kpdere at verizon.net (Ken Dere) Date: Sun, 18 Jan 2009 11:17:41 -0500 Subject: [SciPy-user] PyQtShell References: <4969E26D.8050906@pythonxy.com> Message-ID: Pierre Raybaut wrote: > Hi all, > > I would like to share with you this little open-source project of mine, > PyQtShell: > http://pypi.python.org/pypi/PyQtShell/ > http://code.google.com/p/pyqtshell/ > > I've just started it a few days ago and I worked on it only a couple of > hours at home this week and saturday morning... so do not expect a > revolution here. > But I thought that some of you might be interested in contributing or > simply testing it. > > Here is an extract from the Google Code website: > > PyQtShell is intended to be an extension to PyQt4 (module PyQt4.QtShell) > providing a console application (see screenshots below) based on > independent widgets which interact with each other: > - QShell, a Python shell with useful options (like a '-os' switch > for importing os and os.path as osp, a '-pylab' switch for importing > matplotlib in interactive mode, ...) and advanced features like code > completion (requires QScintilla, i.e. module PyQt4.Qsci) > - CurrentDirChanger: shows the current directory and allows to change > it > Not implemented : > - GlobalsExplorer: shows globals() list with some properties for > each global (e.g. value for int or float, min and max values for arrays, > ...) 
> ...) and allows to open an appropriate GUI editor
> - and other widgets: FileExplorer, CodeEditor, ...
>
> Cheers,
> Pierre

Looks interesting. Can you run ipython inside?

--
K. Dere

From contact at pythonxy.com  Sun Jan 18 13:29:10 2009
From: contact at pythonxy.com (Pierre Raybaut)
Date: Sun, 18 Jan 2009 19:29:10 +0100
Subject: Re: [SciPy-user] PyQtShell
In-Reply-To:
References:
Message-ID: <497374F6.5010700@pythonxy.com>

Date: Sun, 18 Jan 2009 11:17:41 -0500
> From: Ken Dere
> Subject: Re: [SciPy-user] PyQtShell
> To: scipy-user at scipy.org
> Cc: pyqt at riverbankcomputing.com
> Message-ID:
> Content-Type: text/plain; charset=us-ascii
>
> Pierre Raybaut wrote:
>
>> Hi all,
>>
>> I would like to share with you this little open-source project of mine,
>> PyQtShell:
>> http://pypi.python.org/pypi/PyQtShell/
>> http://code.google.com/p/pyqtshell/
>>
>> I've just started it a few days ago and I worked on it only a couple of
>> hours at home this week and saturday morning... so do not expect a
>> revolution here.
>> But I thought that some of you might be interested in contributing or
>> simply testing it.
>>
>> Here is an extract from the Google Code website:
>>
>> PyQtShell is intended to be an extension to PyQt4 (module PyQt4.QtShell)
>> providing a console application (see screenshots below) based on
>> independent widgets which interact with each other:
>> - QShell, a Python shell with useful options (like a '-os' switch
>> for importing os and os.path as osp, a '-pylab' switch for importing
>> matplotlib in interactive mode, ...) and advanced features like code
>> completion (requires QScintilla, i.e. module PyQt4.Qsci)
>> - CurrentDirChanger: shows the current directory and allows to change it
>> Not implemented :
>> - GlobalsExplorer: shows globals() list with some properties for
>> each global (e.g. value for int or float, min and max values for arrays,
>> ...) and allows to open an appropriate GUI editor
>> - and other widgets: FileExplorer, CodeEditor, ...
>>
>> Cheers,
>> Pierre
>
> Looks interesting. Can you run ipython inside?

The current release does not support IPython -- the main reason being that I'm not using IPython so much. I think that it should be quite easy to add this feature to PyQtShell. But I won't have enough time to spend on this project to add features that I'm not directly interested in. That's why I mentioned the project here, to find contributors eventually.

I've just released another version with a lot of new features -- e.g. a MATLAB-like workspace (with array editor and list/dict editor), a history log, a multiline editor, ...
See for example: http://www.pythonxy.com/screenshot2.PNG
(http://code.google.com/p/pyqtshell/)

Pierre

From marko.loparic at gmail.com  Mon Jan 19 03:01:02 2009
From: marko.loparic at gmail.com (Marko Loparic)
Date: Mon, 19 Jan 2009 09:01:02 +0100
Subject: [SciPy-user] python (against java) advocacy for scientific projects
Message-ID:

Hello,

Could you suggest links justifying the use of python instead of java for a scientific project?

I work for an R&D department of a large company. We develop mathematical models, some of them in python.

For a new project one manager suggested I use java instead of python. He says that python has performance problems. (It is true that we had performance problems in one of our python projects, written by a colleague, but the code was written without paying attention to that issue.)

So I am collecting links supporting the choice of python.
http://python-advocacy.wikidot.com/

It is open for anonymous edit, so please add what you want. But if you prefer, answer me by email and I will edit the site myself.

What I would like to add:

1) I am already satisfied with what I collected for the python vs. java comparison. It shouldn't be too long, otherwise managers don't read it. But if you have a link with an impressive new argument/case study, please add it.

2) Comparison with java specifically for scientific computing. I would like to have links of two kinds:
- giving reasons
- giving evidence, i.e. advanced projects which use python (pytables, pyro)
Also I would like to have links confirming/supporting impressions that I have:
a) for a new scientific project there is no reason to prefer java
b) if someone opts for java it is because they don't know python or need a very specific library which does not exist in python (but there is no such library for scientific computing)
c) the quantity of new advanced java projects in scientific computing is very small (how can we show that?)
d) the important scientific computing projects in java started more than 5 years ago
I will also make a list of scientific resources: scipy, numpy, etc.

3) On the issue of performance:
- why exactly can we say that the speed limitation of python is *not* a problem for using it in high performance computing?
- argument to be developed/supported: a pure java loop can often beat a pure python loop, but a pure C loop also beats a pure java loop. So if we need high performance loops we need to use C/C++ routines (or numpy, or pyrex) anyway, and it is much better/easier to do the interface in python than in java.

If there is a better place where the community prefers to host this kind of information, please tell me.

Thanks!
Marko

From simon.palmer at gmail.com  Mon Jan 19 03:38:19 2009
From: simon.palmer at gmail.com (Simon Palmer)
Date: Mon, 19 Jan 2009 08:38:19 +0000
Subject: Re: [SciPy-user] python (against java) advocacy for scientific projects
In-Reply-To:
References:
Message-ID:

You may already have looked there, but there has been a bit of discussion related to this on stackoverflow:

http://stackoverflow.com/questions/371966/are-there-any-good-reasons-why-i-should-not-use-python
http://stackoverflow.com/questions/202337/how-do-i-make-the-business-case-for-python

Simon

On Mon, Jan 19, 2009 at 8:01 AM, Marko Loparic wrote:
> Hello,
>
> Could you suggest links justifying the use of python instead of java
> for a scientific project?
>
> I work for an R&D department of a large company. We develop
> mathematical models, some of them in python.
>
> For a new project one manager suggested I use java instead of
> python. He says that python has performance problems. (It is true that
> we had performance problems in one of our python projects, written by a
> colleague, but the code was written without paying attention to that
> issue.)
>
> So I am collecting links supporting the choice of python.
> http://python-advocacy.wikidot.com/
>
> It is open for anonymous edit, so please add what you want. But if you
> prefer, answer me by email and I will edit the site myself.
>
> What I would like to add:
>
> 1) I am already satisfied with what I collected for the python vs. java
> comparison. It shouldn't be too long, otherwise managers don't read it.
> But if you have a link with an impressive new argument/case study
> please add it.
>
> 2) Comparison with java specifically for scientific computing. I would
> like to have links of two kinds:
> - giving reasons
> - giving evidence, i.e.
advanced projects which use python (pytables, pyro)
> Also I would like to have links confirming/supporting impressions that I
> have:
> a) for a new scientific project there is no reason to prefer java
> b) if someone opts for java it is because they don't know python or
> need a very specific library which does not exist in python (but there
> is no such library for scientific computing)
> c) the quantity of new advanced java projects in scientific
> computing is very small (how can we show that?)
> d) the important scientific computing projects in java started more
> than 5 years ago
> I will also make a list of scientific resources: scipy, numpy, etc.
>
> 3) On the issue of performance:
> - why exactly can we say that the speed limitation of python is *not* a
> problem for using it in high performance computing?
> - argument to be developed/supported: a pure java loop can often beat
> a pure python loop, but a pure C loop also beats a pure java loop. So
> if we need high performance loops we need to use C/C++ routines (or
> numpy, or pyrex) anyway, and it is much better/easier to do the
> interface in python than in java.
>
> If there is a better place where the community prefers to host this
> kind of information, please tell me.
>
> Thanks!
> Marko
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From aisaac at american.edu  Mon Jan 19 08:35:31 2009
From: aisaac at american.edu (Alan G Isaac)
Date: Mon, 19 Jan 2009 08:35:31 -0500
Subject: Re: [SciPy-user] python (against java) advocacy for scientific projects
In-Reply-To:
References:
Message-ID: <497481A3.3050307@american.edu>

http://www.amazon.com/Python-Scripting-Computational-Science-Engineering/dp/3540739157/ref=sr_1_1?ie=UTF8&s=books&qid=1232371939&sr=1-1
The introduction has some relevant discussion.

Alan Isaac

From sturla at molden.no  Mon Jan 19 11:14:28 2009
From: sturla at molden.no (Sturla Molden)
Date: Mon, 19 Jan 2009 17:14:28 +0100
Subject: Re: [SciPy-user] python (against java) advocacy for scientific projects
In-Reply-To:
References:
Message-ID: <4974A6E4.70000@molden.no>

On 1/19/2009 9:01 AM, Marko Loparic wrote:

> For a new project one manager suggested I use java instead of
> python. He says that python has performance problems.

Some managers prefer Java because it is hyped; they tend to be ill-informed.

Python does not have performance problems. But a program written in Python might. Usually it is not due to Python. Most likely, programs written in Java will experience the same performance problems. An O(N**2) algorithm in Python will still be O(N**2) in Java. One needs to be a bit more clever than just swapping languages.

Python is used to run YouTube.com and a web spider called Googlebot. It is used to analyze NASA's images from the Hubble space telescope. Do you have performance issues that exceed that?

Java is not commonly used for scientific computing. Scientists generally prefer languages like IDL, R, Matlab, S-PLUS, Mathematica, Perl and Python. Java requires too much 'boilerplate' code. You don't get to focus on the important work. Matlab and Python programs tend to be much shorter than Java (1/10 to 1/5 the lines of code). As for performance, I tend to find that Python scripts (with NumPy) run faster than similar scripts written in Matlab. Matlab is perhaps the most commonly used language for numerical computing today.
The advantage of Java over Python for scientific computing is faster for loops. Anything else is in favour of Python. One particularly important issue is memory use. Python's strategy of reference counting keeps the memory use down at all times. Java is much more greedy on memory, and only collects garbage now and then. Using too much RAM can cause the OS to begin swapping out pages to disk. If you are worried about speed, you really don't want this to happen.

Sometimes speed does matter. If a calculation takes a day in pure Fortran, it may take half a year in pure Python. Remember C.A.R. Hoare's famous statement (sometimes erroneously attributed to D. Knuth) that 'premature optimization is the root of all evil in computer programming.' There is a reason we don't write everything in Fortran 77 or assembly, even if it generates the fastest code (faster than C). Focusing optimization on anything but the worst offending bottlenecks is a waste of effort. And that is why scientists don't use Java or Fortran all the time: Java and Fortran may be faster than Python or Matlab on average, but the computationally important parts will be concentrated in less than 5% of your code.

There is nothing that says a program written in Python must be 'pure Python'. If you migrate that offending 5% to Fortran or C, you would beat Java in terms of speed, and still retain all the advantages of Python. That is why we don't have performance problems when using Python for HPC. We don't use Python all the time; we use Python where it is convenient.

Here is a 10 point strategy for writing correct and fast programs with Python:

1. Write everything in Python with NumPy (and possibly SciPy, Matplotlib, wxPython, psyco, twisted, etc.) Get a verified, working program. Correctness is far more important than speed in scientific computing. Scientists must be pedantic about correctness.

2. If your program is fast enough, quit and be happy with it. You don't need to fix something that works. 9 out of 10 times, the development cycle ends here.

3. Identify the worst bottlenecks using a profiler. Your guess and gut feeling will likely be incorrect.

4. If the bottleneck is I/O (disk, network, SQL server) or calls to libraries like SciPy, there is very little that can be done about it. Faster hardware may help; Java certainly will not. Java or C does not read data from disk etc. faster than Python.

5. Hardware is expensive but much cheaper than labour. If you can solve the problems by buying more hardware or better hardware, then do that.

6. If bottlenecks are most easily solved by numerical libraries, e.g. LAPACK, FFTW, MKL, ATLAS, GSL, etc., then use these. People have spent years optimizing them. There is likely nothing you can hand-code - in any language - that will be faster. Remember that NumPy and SciPy will use some of these libraries as well.

7. Did you remember to use vectorized array syntax? Neither Python (with NumPy) nor Matlab is meant to be used like Java. For-loops are plain evil. Most of Peter J. Acklam's vectorization guide to Matlab applies to NumPy as well: http://home.online.no/~pjacklam/matlab/doc/mtt/doc/mtt.pdf

8. Check your algorithm. O(N) or O(N log N) is better than O(N**2) if N is large. This is where you can get really big speed improvements, regardless of language.

9. If the bottleneck cannot be solved by libraries or a change of algorithm, re-write these parts in Fortran 95. Compile with f2py to get a Python callable extension module. Real scientists do not use C++ (if we need OOP, we have Python.)
10. If you need to use parallel processors (e.g. multicore CPUs), begin by inserting OpenMP directives into your Fortran code. If this is not enough, use the standard lib packages 'multiprocessing' or 'threading' for coarser-grained parallelism. Ensure that the GIL is released if you choose 'threading'; f2py can release the GIL around thread-safe Fortran routines.

Regards,
Sturla Molden

From scotta_2002 at yahoo.com  Mon Jan 19 11:28:26 2009
From: scotta_2002 at yahoo.com (Scott Askey)
Date: Mon, 19 Jan 2009 08:28:26 -0800 (PST)
Subject: [SciPy-user] integrate.ode 2 dimensional simple harmonic oscillator code
Message-ID: <545293.8031.qm@web36502.mail.mud.yahoo.com>

Do ode and odeint work in multiple dimensions? I could not find any examples with more than one degree of freedom, and from the docstring it was not obvious how to solve simultaneous ODEs. The code for modelling a 2d simple harmonic oscillator or spherical pendulum would give me the insight I need.

I found and understand the following 1 D harmonic oscillator model from the scipy cookbook.

V/R

Scott

From bsouthey at gmail.com  Mon Jan 19 11:53:11 2009
From: bsouthey at gmail.com (Bruce Southey)
Date: Mon, 19 Jan 2009 10:53:11 -0600
Subject: Re: [SciPy-user] Ols for np.arrays and masked arrays
In-Reply-To:
References: <1cd32cbb0901160852g49b70c1eg67b45024e5aaf832@mail.gmail.com> <47999FC3-AC98-4B6D-839C-CF788BF9D125@gmail.com> <1cd32cbb0901161653p741bf3a0v5e9402f2c748c27@mail.gmail.com> <1cd32cbb0901161813s15d55380x590f633fedbd7855@mail.gmail.com>
Message-ID: <4974AFF7.7050902@gmail.com>

Hi,
I do not want to sound overly critical as I would like to assist with this.

I am not sure of what your design goal is and what the Regression class should actually contain and do. Do you want something like an R lm object?

But, I do not like that your _init_ function does so much work that probably does not belong there. I would have thought that it would just initialize certain important variables including the solutions (b). One reason is perhaps you just want to update the input arrays, but this design forces you to create a new object.

At what stage should you check for valid inputs and the correct dimensions of x?

I would also prefer the object having the standard errors of the solutions and eventually other 'useful' statistics like sum of squares and R-squared. However, I do not know how to get the standard errors using linalg.lstsq (I vaguely recall there is another way that would work). The others are easy to get, probably on demand like R's summary function.

Bruce
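For the standard errors Bruce asks about, the usual route is to compute them from the residual variance and (X'X)^-1 after linalg.lstsq, which is essentially what the cookbook recipe does. A minimal sketch, assuming a full-rank 2-D design matrix x and a column vector y (the function name is made up):

import numpy as np
from scipy import linalg

def ols_with_se(x, y):
    """Coefficients and their standard errors for y = dot(x, b) + e."""
    b, resid, rank, sv = linalg.lstsq(x, y)
    e = y - np.dot(x, b)                    # residuals
    df_e = x.shape[0] - x.shape[1]          # error degrees of freedom
    s2 = np.dot(e.T, e) / df_e              # residual variance estimate
    inv_xx = linalg.pinv(np.dot(x.T, x))    # (X'X)^-1
    se = np.sqrt(np.diagonal(s2 * inv_xx))  # coef. standard errors
    return b, se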
From bastian.weber at gmx-topmail.de  Mon Jan 19 12:20:46 2009
From: bastian.weber at gmx-topmail.de (Bastian Weber)
Date: Mon, 19 Jan 2009 18:20:46 +0100
Subject: Re: [SciPy-user] integrate.ode 2 dimensional simple harmonic oscillator code
In-Reply-To: <545293.8031.qm@web36502.mail.mud.yahoo.com>
References: <545293.8031.qm@web36502.mail.mud.yahoo.com>
Message-ID: <4974B66E.9060707@gmx-topmail.de>

Hello Scott,

> Do ode and odeint work in multiple dimensions?

of course they do. Here is an example of a 2-dim problem (the 1-D harmonic oscillator, which is a _second_ order system) I wrote some time ago.

#!/usr/bin/python
# -*- coding: utf8 -*-

from scipy import *
from scipy.integrate import odeint
import pylab

# parameters
delta = 0.1     # damping
omega_2 = 100   # means omega**2

t = r_[0:100:.01]   # time vector
x0 = [0, 10]        # initial state

def rhs(x, t):
    """ right hand side of the statespace equation,
    in this case two-dimensional """
    x1_dot = x[1]
    x2_dot = -(2*delta*x[1] + omega_2*x[0])
    return [x1_dot, x2_dot]

x = odeint(rhs, x0, t)
print shape(x)

x1 = x[:,0]
x2 = x[:,1]

pylab.plot(t, x1, "k-")
pylab.show()

Scott Askey schrieb:
>
> I could not find any examples with more than one degree of freedom,
> and from the docstring it was not obvious how to solve simultaneous
> ODEs. The code for modelling a 2d simple harmonic oscillator or
> spherical pendulum would give me the insight I need.
>
> I found and understand the following 1 D harmonic oscillator model from the scipy cookbook.
>
> V/R
>
> Scott
>

Reading your message again now, I get the impression that the code above might not fit your needs. However, what you have to do is get your system of interest into a *state space* representation, i.e. write it down as a system of first order differential equations. In the case of the 2-D oscillator, your x array would consist of 4 elements and rhs would return the time derivative of these 4 quantities, i.e. also an array of length 4; see the sketch below.

Regards,
Bastian.
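To make the 4-element state concrete, a minimal sketch of the 2-D case Bastian describes, with two uncoupled damped oscillators integrated at once (the parameters and initial state are arbitrary):

import numpy as np
from scipy.integrate import odeint

delta, omega_2 = 0.1, 100   # arbitrary damping and omega**2, same in both directions

def rhs2d(s, t):
    # state s = [x, x_dot, y, y_dot]
    x, xd, y, yd = s
    return [xd, -(2*delta*xd + omega_2*x),
            yd, -(2*delta*yd + omega_2*y)]

t = np.arange(0, 100, 0.01)
s = odeint(rhs2d, [0.0, 10.0, 5.0, 0.0], t)   # arbitrary initial state
x_pos, y_pos = s[:, 0], s[:, 2]               # the two position components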
From lorenzo.isella at gmail.com  Mon Jan 19 13:41:34 2009
From: lorenzo.isella at gmail.com (Lorenzo Isella)
Date: Mon, 19 Jan 2009 19:41:34 +0100
Subject: [SciPy-user] SciPy and Cython
Message-ID:

Dear All,
I am used to resorting to f2py for the numerically intensive bottlenecks of my Python codes. However, I have recently come across Cython. From the examples on:
http://docs.cython.org/docs/tutorial.html#the-basics-of-cython
it looks like I can directly write my functions in Python and then easily build a Cython extension.
This sounds like sweet music to me, but the fact is that more often than not my functions would need a scipy array as an input.
I read somewhere that Cython is better integrated with numpy rather than scipy; is this really the case?
Can anyone tell me if there is any caveat I should be aware of when writing Cython extensions which operate on, let's say, numpy arrays?
Kind Regards

Lorenzo

From robert.kern at gmail.com  Mon Jan 19 14:09:33 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 19 Jan 2009 13:09:33 -0600
Subject: Re: [SciPy-user] SciPy and Cython
In-Reply-To:
References:
Message-ID: <3d375d730901191109k124c157bh2ed3ed36788ca392@mail.gmail.com>

On Mon, Jan 19, 2009 at 12:41, Lorenzo Isella wrote:
> Dear All,
> I am used to resorting to f2py for the numerically intensive bottlenecks
> of my Python codes.
> However, I have recently come across Cython. From the examples on:
> http://docs.cython.org/docs/tutorial.html#the-basics-of-cython
> it looks like I can directly write my functions in Python and then
> easily build a Cython extension.
> This sounds like sweet music to me, but the fact is that more often than
> not my functions would need a scipy array as an input.
> I read somewhere that Cython is better integrated with numpy rather
> than scipy; is this really the case?

There is no such thing as a scipy array. scipy uses numpy.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco

From peridot.faceted at gmail.com  Mon Jan 19 14:24:52 2009
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Mon, 19 Jan 2009 14:24:52 -0500
Subject: Re: [SciPy-user] SciPy and Cython
In-Reply-To:
References:
Message-ID:

2009/1/19 Lorenzo Isella :
> I am used to resorting to f2py for the numerically intensive bottlenecks
> of my Python codes.
> However, I have recently come across Cython. From the examples on:
> http://docs.cython.org/docs/tutorial.html#the-basics-of-cython
> it looks like I can directly write my functions in Python and then
> easily build a Cython extension.
> This sounds like sweet music to me, but the fact is that more often than
> not my functions would need a scipy array as an input.
> I read somewhere that Cython is better integrated with numpy rather
> than scipy; is this really the case?
> Can anyone tell me if there is any caveat I should be aware of when
> writing Cython extensions which operate on, let's say, numpy arrays?

Yes, Cython is very well suited to just the sort of use you describe. If you'd like to see what it looks like, the current development versions of scipy include the module scipy.spatial, which contains a pure python and a derived cython implementation of kd-trees.

Anne

From sturla at molden.no  Mon Jan 19 14:42:21 2009
From: sturla at molden.no (Sturla Molden)
Date: Mon, 19 Jan 2009 20:42:21 +0100
Subject: Re: [SciPy-user] SciPy and Cython
In-Reply-To:
References:
Message-ID: <4974D79D.4030301@molden.no>

On 1/19/2009 7:41 PM, Lorenzo Isella wrote:

> Can anyone tell me if there is any caveat I should be aware of when
> writing Cython extensions which operate on, let's say, numpy arrays?

Fortran is a mature language, Cython is not. On the other hand, Cython looks more like Python, and has all of Python's types available. Cython also integrates more easily with C (though this will change with Fortran 2003).

Has anyone tried to compare Fortran 95 with Cython for numerical computing?

Sturla Molden

From matthew.brett at gmail.com  Mon Jan 19 14:40:36 2009
From: matthew.brett at gmail.com (Matthew Brett)
Date: Mon, 19 Jan 2009 14:40:36 -0500
Subject: Re: [SciPy-user] python (against java) advocacy for scientific projects
In-Reply-To: <4974A6E4.70000@molden.no>
References: <4974A6E4.70000@molden.no>
Message-ID: <1e2af89e0901191140m10b5557dt76be32f4164ae30b@mail.gmail.com>

Hi,

> Here is a 10 point strategy for writing correct and fast programs with
> Python:

Thank you for this excellent summary.

Best,
Matthew

From sturla at molden.no  Mon Jan 19 14:46:38 2009
From: sturla at molden.no (Sturla Molden)
Date: Mon, 19 Jan 2009 20:46:38 +0100
Subject: Re: [SciPy-user] SciPy and Cython
In-Reply-To:
References:
Message-ID: <4974D89E.5070303@molden.no>

On 1/19/2009 8:24 PM, Anne Archibald wrote:

> Yes, Cython is very well suited to just the sort of use you describe.
> If you'd like to see what it looks like, the current development
> versions of scipy include the module scipy.spatial, which contains a
> pure python and a derived cython implementation of kd-trees.

http://svn.scipy.org/svn/scipy/trunk/scipy/spatial/ckdtree.pyx

S.M.
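For a flavour of the caveats Lorenzo asks about, a minimal Cython sketch using the numpy buffer syntax; the module and function names are illustrative, and it assumes a Cython version with numpy buffer support:

# sumsq.pyx -- illustrative, not from the thread
import numpy as np
cimport numpy as np

def sumsq(np.ndarray[np.float64_t, ndim=1] x):
    # declaring dtype and ndim lets Cython compile the indexing below to plain C
    cdef Py_ssize_t i
    cdef double s = 0.0
    for i in range(x.shape[0]):
        s += x[i] * x[i]
    return s

The usual caveats are exactly those declarations: without the dtype/ndim annotation the loop falls back to generic Python indexing, and the build step must put numpy's headers (np.get_include()) on the include path.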
From lists_ravi at lavabit.com  Mon Jan 19 14:58:19 2009
From: lists_ravi at lavabit.com (Ravi)
Date: Mon, 19 Jan 2009 14:58:19 -0500
Subject: Re: [SciPy-user] python (against java) advocacy for scientific projects
In-Reply-To: <4974A6E4.70000@molden.no>
References: <4974A6E4.70000@molden.no>
Message-ID: <200901191458.19218.lists_ravi@lavabit.com>

Hi,
The advice from Mr. Molden is well-argued, but he does gloss over a few of the difficulties. These serious problems are also present in Matlab & Java for the most part.

1. The python packaging system is junk. Matlab & Java get around this problem by not really having a packaging system (leading to even worse confusion). PyPI & setuptools are painful (search the enthought-dev & ipython-dev lists for the last year, especially for posts from Fernando Perez & Gael Varoquaux, for more information).

2. Installation/compilation of C/C++ extensions/wrappers: Matlab's cohesive toolbox shines here; their method is clearly documented and works reasonably well across all the platforms they support (at least on Solaris, HPUX, Linux & Windows, the platforms I work with). Java extensions are, IMHO, reasonably straightforward to maintain, but python distutils takes everything to a whole new level of nightmare. For distutils difficulties, simply search the archives for this mailing list (especially posts from David Cournapeau).

3. The lack of a real JIT compiler is a serious issue if the use cases involve more than linear algebra and differential equation solvers. In many such cases, for-loops and/or while-loops are the only reasonable solutions, both of which, very often, execute much faster under Matlab or Java. Some operations are simply not vectorizable if you wish to have maintainable code, e.g., large groups of interacting state machines.

4. Both Java & Matlab have very well-thought-out IDEs. As I don't use IDEs myself, I cannot comment on their ease of use, but my colleagues who do work with them find them extremely useful. Neither eclipse-pydev nor eric3 is anywhere close to the Matlab IDE workspace. Java has several very nice IDEs but none of them are as useful as the Matlab IDE. A related issue is the lack of a decent debugger; pydb+ipython is the best one I have come across for python, but they are nowhere near Matlab/Java offerings.

In spite of the issues highlighted above, Python is still the best choice, because of the large library and because of the well-designed language specification. (CPython's shortcomings are well-known and will eventually be addressed by PyPy and the like; in some computation-intensive cases, even IronPython beats out CPython, go figure.) Mr. Molden has provided a very good summary of the Python workflow but there is one issue that keeps rearing its ugly head on the numpy/scipy lists over & over again:

On Monday 19 January 2009 11:14:28 Sturla Molden wrote:
> 9. If the bottleneck cannot be solved by libraries or a change of
> algorithm, re-write these parts in Fortran 95.
Compile with f2py to get
> a Python callable extension module. Real scientists do not use C++ (if
> we need OOP, we have Python.)

I completely agree with the first part of the point above. (Use Fortran 95 or many of the other languages which have very good numerical performance to speed up bottlenecks). However, the last part is merely ugly prejudice. Like python, Fortran, and other languages, C++ does have its place in scientific computing.

Here's one example which, in my experience, is completely impossible to do in python, Matlab, Java or even C: the bottleneck in one of our simulations is a fixed point FFT computation followed by a modified gradient search. Try implementing serious fixed-point computation with, say, 13-bit numbers, some of which are optimally expressed in log-normal form and the others in the standard form, in python/Matlab/Java/C. You will end up with either unmaintainable code or unusably slow code. C++ templates & a little bit of metaprogramming make prototyping the algorithm easy (because you can use doubles to verify data flow) while simultaneously making it easy to enhance the prototype quickly into fixed point code (simply by replacing types and running some automated tests to find appropriate bit-widths). In our case, we needed to optimize the radix of the underlying FFTs as well because of some high throughput considerations.

Admittedly, the problem considered above is pretty difficult & pretty specialized, but the beauty of C++ or even of PL/1 is that it makes certain difficult problems tractable: problems which are practically impossible to solve with python/Java/Matlab/C. Leave your programming language prejudices at home when you consider afresh the optimal solutions to your problem.

Regards,
Ravi

From josef.pktd at gmail.com  Mon Jan 19 15:30:57 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Mon, 19 Jan 2009 15:30:57 -0500
Subject: Re: [SciPy-user] Ols for np.arrays and masked arrays
In-Reply-To: <4974AFF7.7050902@gmail.com>
References: <1cd32cbb0901160852g49b70c1eg67b45024e5aaf832@mail.gmail.com> <47999FC3-AC98-4B6D-839C-CF788BF9D125@gmail.com> <1cd32cbb0901161653p741bf3a0v5e9402f2c748c27@mail.gmail.com> <1cd32cbb0901161813s15d55380x590f633fedbd7855@mail.gmail.com> <4974AFF7.7050902@gmail.com>
Message-ID: <1cd32cbb0901191230o2db1b78bpd6482c2c34749b6c@mail.gmail.com>

On Mon, Jan 19, 2009 at 11:53 AM, Bruce Southey wrote:
> Hi,
> I do not want to sound overly critical as I would like to assist with this.
>
> I am not sure of what your design goal is and what the Regression
> class should actually contain and do. Do you want something like an R lm object?
>

After our previous discussion, I was thinking about what a more useful interface for regression analysis could look like (without using a full formula framework). The ols estimation part was just a quick example for a regression. The main purpose of this is to find a good interface.

My basic idea was to create a general regression class that does the initialization and common calculation, with the specific estimation implemented in the subclasses, e.g. OLS, GLS, Ridge,...(?). For the basic statistics, which are calculated on demand, I'm planning something like http://www.scipy.org/Cookbook/OLS.

I would also like to get a similar wrapper for non-linear least squares, optimize.leastsq, because that doesn't produce the (normalized) parameter standard errors, or for non-linear maximum likelihood estimators. Also, I don't want to duplicate work in stats.models.
> But, I do not like that your _init_ function does so much work that
> probably does not belong there. I would have thought that it would just
> initialize certain important variables including the solutions (b). One
> reason is perhaps you just want to update the input arrays, but this
> design forces you to create a new object.
>
> At what stage should you check for valid inputs and the correct
> dimensions of x?

I think this is the main point of the __init__ function, to transform the data from a convenient specification for the user to the form that is used by the estimation procedures. If the design matrix is not already a "clean" array, then it needs to be copied at least once to be handed to linalg (as far as I understand this).

>
> I would also prefer the object having the standard errors of the
> solutions and eventually other 'useful' statistics like sum of squares
> and R-squared. However, I do not know how to get the standard errors using
> linalg.lstsq (I vaguely recall there is another way that would work).
> The others are easy to get, probably on demand like R's summary function.

My current thinking is that some basic statistics, such as estimated parameters, standard errors, pvalues and R^2, are returned immediately. More in-depth analysis of the residuals is returned on demand in a special result structure (class) (?).

Just to see how it works, I copied the cookbook ols recipe into my ols class; I had to adjust the dimension (1D to 2D columns). It looks ok, but I didn't test more than making sure it runs and looks reasonable. Several of the test functions can go into the regression class, since they also apply to other regression models, not just OLS.

The current version takes one or several arrays as inputs. Inputs can be 1D or 2D, and can be either numpy arrays or masked arrays. All observations with at least one masked or nan value are removed from the regression. Other array subclasses are not yet included. Predicted values, called yhat, are masked arrays if the input was masked or contained nans. If the inputs are plain numpy arrays with finite values, then the predicted value array, yhat, is also a np array. (I haven't checked the copied part yet, e.g. the error vector is on the compressed values.)

I attached the updated version.

lm in R looks a bit like an umbrella function, but I have not used it enough to tell what useful features should be copied by a regression class. Generalized linear models will be in stats.models.

Josef

-------------- next part --------------
import numpy as np
import numpy.ma as ma
import numpy.random as mtrand #import randn, seed
import matplotlib.pylab as plt
import time
from scipy import linalg, stats
#from numpy.testing import assert_almost_equal
from numpy.ma.testutils import assert_almost_equal
import regressionanalysis_ma2 as _ini


class Regression(object):
    def __init__(self,y,*args,**kwds):
        '''
        Parameters
        ----------
        y: array, 1D or 2D
            variable of regression. If 2D, then it needs to be one column
        args: sequence of 1D or 2D arrays
            one or several arrays of regressors. If 2D, then the
            interpretation is that each column represents one variable
            and rows are observations
        kwds: addconst (default: True)
            if True then a constant is added to the regressors

        return
        ------
        class instance with regression results

        Notes
        -----
        Observations (rows) that contain masked values in a masked array
        or nans in any of the regressors or in y will be dropped for the
        calculation. Arrays that correspond to observations, e.g.
        estimated y (yhat) or residuals, are returned as masked arrays
        if masked values or nans are in the input data.

        Usage
        -----
        estm = Ols(y, x1, x2)
        estm.b    # estimated parameter vector

        example polynomial
        estv = Ols(y, np.vander(x[:,0],3), x[:,1], addconst=False)
        estv.b    # estimated parameter vector
        estv.yhat # fitted values
        '''
        self.addconst = kwds.pop('addconst', True)
        self.y_varnm = kwds.pop('y_varnm','y')
        self.x_varnm = kwds.pop('x_varnm',[])
        if not isinstance(self.x_varnm,list):
            self.x_varnm = list(self.x_varnm)
        # Make sure that any NaN in y and args are masked, and force masked arrays
        y = ma.fix_invalid(y)
        if y.ndim == 1:
            y.shape = (-1, 1)
        design = list(args)
        #design = [ma.fix_invalid(x) for x in args]
        # We need a constant: add a column of 1
        if self.addconst:
            design.insert(0, np.ones(y.shape))
            self.x_varnm.insert(0, 'const')
        # stack first, then fix; saves one copy, maybe not
        design = ma.column_stack(design)
        # convert types and make copy
        #does it make copy if args = x single 2D array
        design = ma.fix_invalid(design, copy=False)
        # Find the masked elements
        (ymask, designmask) = (ma.getmaskarray(y), ma.getmaskarray(design))
        mask = ma.mask_or(ymask, designmask, shrink=False).any(axis=-1)
        valid = ~mask
        # And now, drop them
        self.y = y.data[valid,:]
        self.design = design.data[valid]
        self.x = self.design   # copy to use cookbook ols
        self.mask = mask
        self.ymasked = ma.array(y, mask=mask, keep_mask=True)
        for ii in range(len(self.x_varnm),self.design.shape[1]+1):
            self.x_varnm.append('var%002d'%ii)
        #
        #self.yhat = None
        self.estimate()

    def estimate(self):
        pass

    def get_yhat(self):
        pass

    def summary(self):
        pass

    def predict(self, x):
        pass


class Ols(Regression):
    def __init__(self,y,*args,**kwds):
        Regression.__init__(self, y, *args, **kwds)

    def estimate(self):
        (y, x) = self.y, self.design
        #print 'y.shape, x.shape'
        #print y.shape, x.shape
        (self.b, self.resid, rank, self.sigma) = linalg.lstsq(x, y)
        #
##        yhat = ma.empty_like(self.ymasked)
##        mask = self.mask
##        if mask is not ma.nomask:
##            yhat[~mask, :] = np.dot(self.design, self.b)
##        else:
##            yhat[:,:] = np.dot(self.design, self.b)
        #self.yhat = yhat
        self.yhat = self.predict()
        #print rank
        # estimating coefficients, and basic stats
        self.inv_xx = linalg.pinv(np.dot(self.x.T,self.x)) # use Moore-Penrose pseudoinverse
        xy = np.dot(self.x.T,self.y)   # estimate coefficients
        #print self.b.shape  # column vector
        self.nobs = self.y.shape[0]    # number of observations
        self.ncoef = self.x.shape[1]   # number of coef.
        self.df_e = self.nobs - self.ncoef   # degrees of freedom, error
        self.df_r = self.ncoef - 1     # degrees of freedom, regression
        self.e = self.y - np.dot(self.x,self.b)       # residuals
        self.sse = np.dot(self.e.T,self.e)/self.df_e  # SSE
        self.se = np.sqrt(np.diagonal(self.sse*self.inv_xx))[:,np.newaxis]  # coef. standard errors
        self.t = self.b / self.se      # coef. t-statistics
        self.p = (1-stats.t.cdf(np.abs(self.t), self.df_e)) * 2  # coef. p-values
        self.R2 = 1 - self.e.var()/self.y.var()   # model R-squared
        self.R2adj = 1-(1-self.R2)*((self.nobs-1.)/(self.nobs-self.ncoef))  # adjusted R-square
        self.F = (self.R2/self.df_r) / ((1-self.R2)/self.df_e)   # model F-statistic
        self.Fpv = 1-stats.f.cdf(self.F, self.df_r, self.df_e)   # F-statistic p-value

    def dw(self):
        """ Calculates the Durbin-Watson statistic """
        de = np.diff(self.e,1,axis=0)
        dw = np.dot(de.T,de) / np.dot(self.e.T,self.e)
        return dw

    def omni(self):
        """ Omnibus test for normality """
        return stats.normaltest(self.e)

    def JB(self):
        """ Calculate residual skewness, kurtosis, and do the JB test for normality """
        # Calculate residual skewness and kurtosis
        skew = stats.skew(self.e)
        kurtosis = 3 + stats.kurtosis(self.e)
        # Calculate the Jarque-Bera test for normality
        # (use float division to avoid Python 2 integer truncation)
        JB = (self.nobs/6.) * (np.square(skew) + (1/4.)*np.square(kurtosis-3))
        JBpv = 1-stats.chi2.cdf(JB,2)
        return JB, JBpv, skew, kurtosis

    def ll(self):
        """ Calculate model log-likelihood and two information criteria """
        # Model log-likelihood, AIC, and BIC criterion values
        ll = -(self.nobs/2.)*(1+np.log(2*np.pi)) - \
             (self.nobs/2.)*np.log(np.dot(self.e.T,self.e)/self.nobs)
        aic = -2*ll/self.nobs + (2.*self.ncoef/self.nobs)
        bic = -2*ll/self.nobs + (self.ncoef*np.log(self.nobs))/self.nobs
        return ll, aic, bic

    def summary(self):
        """ Printing model output to screen """
        # local time & date
        t = time.localtime()
        # extra stats
        ll, aic, bic = self.ll()
        JB, JBpv, skew, kurtosis = self.JB()
        omni, omnipv = self.omni()
        # printing output to screen
        print '\n=============================================================================='
        print "Dependent Variable: " + self.y_varnm
        print "Method: Ordinary Least Squares"
        print "Date: ", time.strftime("%a, %d %b %Y",t)
        print "Time: ", time.strftime("%H:%M:%S",t)
        print '# obs:       %5.0f' % self.nobs
        print '# variables: %5.0f' % self.ncoef
        print '=============================================================================='
        print 'variable     coefficient     std. Error     t-statistic     prob.'
        print '=============================================================================='
        for i in range(self.ncoef):
            #print self.x_varnm[i],self.b[i],self.se[i],self.t[i],self.p[i]
            print '''% -5s % 9.4f % 9.4f % 9.4f % 9.4f''' % tuple(
                [self.x_varnm[i],self.b.ravel()[i],self.se.ravel()[i],
                 self.t.ravel()[i],self.p.ravel()[i]])
        print '=============================================================================='
        print 'Models stats                         Residual stats'
        print '=============================================================================='
        print 'R-squared          % 9.4f   Durbin-Watson stat  % 9.4f' % tuple([self.R2, self.dw()])
        print 'Adjusted R-squared % 9.4f   Omnibus stat        % 9.4f' % tuple([self.R2adj, omni])
        print 'F-statistic        % 9.4f   Prob(Omnibus stat)  % 9.4f' % tuple([self.F, omnipv])
        print 'Prob (F-statistic) % 9.4f   JB stat             % 9.4f' % tuple([self.Fpv, JB])
        print 'Log likelihood     % 9.4f   Prob(JB)            % 9.4f' % tuple([ll, JBpv])
        print 'AIC criterion      % 9.4f   Skew                % 9.4f' % tuple([aic, skew])
        print 'BIC criterion      % 9.4f   Kurtosis            % 9.4f' % tuple([bic, kurtosis])
        print '=============================================================================='

    def predict(self, x=None):
        '''return prediction
        todo: add prediction error, confidence interval'''
        if x is None:
            x = self.design
            yest = np.dot(x, self.b)
            if (not self.mask is None) and self.mask.any():
                # (not self.mask is None) for shorthand, not used currently
                # (mask is not ma.nomask) is not necessary b/c no shrink
                output = ma.masked_all_like(self.ymasked)
                output[~self.mask, :] = yest  # does not remove mask for unmasked
                return output
            else:
                return yest
        else:
            x = ma.fix_invalid(x)
            if self.addconst:
                x = ma.column_stack((ma.ones(x.shape[0]), x))
            output = ma.dot(x, self.b)
            #maybe detend
            mask = output.mask
            if (mask is not ma.nomask) and (not mask.any()):
                #difficult to read 2 negatives
                output = output.data
            return output


def cookbook_example():
    xxsingular = False#True
    x = np.linspace(0, 15, 40)
    a,b,c = 3.1, 42, -304.2
    y_true = a*x**2 + b*x + c
    y_meas = y_true + 100.01*np.random.standard_normal( y_true.shape )
    if xxsingular:
        xx = np.c_[x**2,x,2*x,np.ones(x.shape[0])]
    else:
        xx = np.c_[x**2,x,np.ones(x.shape[0])]
    x_varnm = ['const', 'x1','x2','x3','x4']
    k = xx.shape[1]
    #m = Ols(y_meas,xx,y_varnm = 'y',x_varnm = x_varnm[:k-1],addconst = False)
    m = Ols(y_meas,xx,y_varnm = 'y',x_varnm = x_varnm[:k],addconst = False)
    m.summary()

    #matplotlib plotting
    plt.title('Linear Regression Example')
    plt.plot(x,y_true,'g.--')
    plt.plot(x,y_meas,'k.')
    plt.plot(x,y_meas-m.e.flatten(),'r.-')
    plt.legend(['original','plus noise', 'regression'], loc='lower right')

    np.testing.assert_almost_equal(y_meas[:,np.newaxis]-m.e,m.yhat)
    return m


if __name__ == '__main__':
    import numpy as np
    #from olsexample import ols

    def generate_data(nobs):
        x = np.random.randn(nobs,2)
        btrue = np.array([[5,1,2]]).T
        y = np.dot(x, btrue[1:,:]) + btrue[0,:] + 0.5 * np.random.randn(nobs,1)
        return y,x

    y,x = generate_data(15)

    #benchmark no masked arrays, and one 2D array for x
    est = Ols(y[1:,:],x[1:,:])        # initialize and estimate with ols, constant added by default
    _est = _ini.Ols(y[1:,:],x[1:,:])  # initialize and estimate with ols, constant added by default
    print 'ols estimate'
    est.estimate()
    print est.b.T
    print _est.b.T
    #print np.array([[5,1,2]])  # true coefficients
    ynew,xnew = generate_data(3)
    ypred = est.predict(xnew)
    print '   ytrue       ypred       error'
    print np.c_[ynew, ypred, ynew - ypred]
    ypred = _est.predict(xnew)
    print np.c_[ynew, ypred, ynew - ypred]
    assert not isinstance(est.yhat, ma.MaskedArray)
    assert not isinstance(ypred, ma.MaskedArray)

    #case masked array y
    ym = y.copy()
    ym[0,:] = np.nan
    ym = ma.masked_array(ym, np.isnan(ym))
    estm1 = Ols(ym,x)
    _estm1 = _ini.Ols(ym,x)
    print estm1.b.T
    print estm1.yhat.shape
    print _estm1.b.T
    print _estm1.yhat.shape
    print 'yhat'
    print estm1.yhat[:10,:]
    print _estm1.yhat[:10,:]
    assert_almost_equal(estm1.yhat[1:,:], est.yhat)
    assert isinstance(estm1.yhat, ma.MaskedArray)
    assert not isinstance(estm1.predict(x[:3,]), ma.MaskedArray)  # not sure about this one

    #masked y and several x args, addconst=False
    estm2 = Ols(ym,np.ones(ym.shape),x[:,0],x[:,1],addconst=False)
    _estm2 = _ini.Ols(ym,np.ones(ym.shape),x[:,0],x[:,1],addconst=False)
    print estm2.b.T
    print _estm2.b.T
    assert_almost_equal(estm2.b, estm1.b)
    assert_almost_equal(estm2.yhat, estm1.yhat)
    assert isinstance(estm2.yhat, ma.MaskedArray)

    #masked y and several x args,
    estm3 = Ols(ym,x[:,0],x[:,1])
    _estm3 = _ini.Ols(ym,x[:,0],x[:,1])
    print estm3.b.T
    print _estm3.b.T
    assert_almost_equal(estm3.b, estm1.b)
    assert_almost_equal(estm3.yhat, estm2.yhat)
    assert isinstance(estm3.yhat, ma.MaskedArray)

    #masked array in y and one x variable
    x_0 = x[:,0].copy()  # is copy necessary?
    x_0[0] = np.nan
    x_0 = ma.masked_array(x_0, np.isnan(x_0))
    estm4 = Ols(ym,x_0,x[:,1])
    _estm4 = _ini.Ols(ym,x_0,x[:,1])
    print estm4.b.T
    print _estm4.b.T
    assert_almost_equal(estm4.b, estm1.b)
    assert_almost_equal(estm4.yhat, estm1.yhat)
    assert isinstance(estm4.yhat, ma.MaskedArray)

    #masked array in one x variable, but not in y
    x_0 = x[:,0].copy()  # is copy necessary?
    x_0[0] = np.nan
    x_0 = ma.masked_array(x_0, np.isnan(x_0))
    estm5 = Ols(y,x_0,x[:,1])        #, addconst=False)
    _estm5 = _ini.Ols(y,x_0,x[:,1])  #, addconst=False)
    print estm5.b.T
    print _estm5.b.T
    assert_almost_equal(estm5.b, estm1.b)
    assert_almost_equal(estm5.yhat, estm2.yhat)
    assert isinstance(estm5.yhat, ma.MaskedArray)
    #assert np.all(estm5.yhat == estm1.yhat)

    #example polynomial
    print 'example with one polynomial x added'
    estv = Ols(y,np.vander(x[:,0],3), x[:,1], addconst=False)
    _estv = _ini.Ols(y,np.vander(x[:,0],3), x[:,1], addconst=False)
    print estv.b.T
    print _estv.b.T
    print estv.yhat
    print _estv.yhat
    assert not isinstance(estv.yhat, ma.MaskedArray)

    m = cookbook_example()

From marko.loparic at gmail.com  Mon Jan 19 15:33:43 2009
From: marko.loparic at gmail.com (Marko Loparic)
Date: Mon, 19 Jan 2009 21:33:43 +0100
Subject: Re: [SciPy-user] python (against java) advocacy for scientific projects
In-Reply-To: <4974A6E4.70000@molden.no>
References: <4974A6E4.70000@molden.no>
Message-ID:

Thanks Sturla and all the others for this very interesting discussion!

> There is nothing that says a program written in
> Python must be 'pure Python'. If you migrate that offending 5% to
> Fortran or C, you would beat Java in terms of speed, and still retain
> all the advantages of Python. That is why we don't have performance
> problems when using Python for HPC. We don't use Python all the time; we
> use Python where it is convenient.

Specifically on this point, playing the devil's advocate, one could argue that using java we could also migrate the offending routine to C. Is there an element to argue that the Python/C mix is simpler or more powerful than the same for java/C?

I also saw somewhere the argument that python tools containing C/C++ routines like numpy and wxpython are more naturally or easily made for python than for java. Is that really true?
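As one illustration of how thin the Python-to-C seam can be, the standard library's ctypes can call straight into an existing C library with no wrapper code at all. A minimal sketch, assuming a Unix-like system where find_library can locate libm:

import ctypes
import ctypes.util

libm = ctypes.CDLL(ctypes.util.find_library('m'))  # load the C math library
libm.cos.restype = ctypes.c_double                 # declare the C signature
libm.cos.argtypes = [ctypes.c_double]
print libm.cos(0.0)                                # prints 1.0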
From aisaac at american.edu Mon Jan 19 15:39:06 2009 From: aisaac at american.edu (Alan G Isaac) Date: Mon, 19 Jan 2009 15:39:06 -0500 Subject: [SciPy-user] Ols for np.arrays and masked arrays In-Reply-To: <1cd32cbb0901191230o2db1b78bpd6482c2c34749b6c@mail.gmail.com> References: <1cd32cbb0901160852g49b70c1eg67b45024e5aaf832@mail.gmail.com> <47999FC3-AC98-4B6D-839C-CF788BF9D125@gmail.com> <1cd32cbb0901161653p741bf3a0v5e9402f2c748c27@mail.gmail.com> <1cd32cbb0901161813s15d55380x590f633fedbd7855@mail.gmail.com> <4974AFF7.7050902@gmail.com> <1cd32cbb0901191230o2db1b78bpd6482c2c34749b6c@mail.gmail.com> Message-ID: <4974E4EA.2060001@american.edu> On 1/19/2009 3:30 PM josef.pktd at gmail.com apparently wrote: > The ols estimation part was just a quick example for a regression. The > main purpose of this is to find a good interface. > > My basic idea was to create a general regression class, that does the > initialization and common calculation and the specific estimation is > implemented in the subclasses, e.g. OLS, GLS, Ridge,...(?). I started thinking about this for a single equation awhile back: http://code.google.com/p/econpy/source/browse/trunk/pytrix/ls.py (See the OLS class.) I will return to this one day, probably within the next couple months. Alan Isaac From robert.kern at gmail.com Mon Jan 19 15:41:46 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 19 Jan 2009 14:41:46 -0600 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: References: <4974A6E4.70000@molden.no> Message-ID: <3d375d730901191241r67176133k5edb7460d9f5caea@mail.gmail.com> On Mon, Jan 19, 2009 at 14:33, Marko Loparic wrote: > Thanks Sturla and all the others for this very interesting discussion! > >> There is nothing that says a program written in >> Python must be 'pure Python'. If you migrate that offending 5% to >> Fortran or C, you would beat Java in terms of speed, and still retain >> all the advantages of Python. That is why we don't have performance >> problems when using Python for HPC. We don't use Python all the time; we >> use Python where it is convenient. > > Specifically on this point, playing the devil's advocate, one could > argue that using java we could also migrate the offending routine to > C. Is there an element to argue that the Python/C mix is simpler or > more powerful than the same for java/C? > > I saw also somewhere the argument that python tools containing C/C++ > routines like numpy and wxpython are more naturally or easily made for > python than for java. Is that really true? The JNI is notoriously difficult. The Python C API is relatively straightforward, and there are tools (SWIG, Cython, f2py) that make it even easier. Or you can avoid it entirely with ctypes. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From contact at pythonxy.com Mon Jan 19 16:06:56 2009 From: contact at pythonxy.com (Pierre Raybaut) Date: Mon, 19 Jan 2009 22:06:56 +0100 Subject: [SciPy-user] [ Python(x,y) ] New release : 2.1.10 Message-ID: <4974EB70.1090406@pythonxy.com> Hi all, Release 2.1.10 is now available on http://www.pythonxy.com: - All-in-One Installer ("Full Edition"), - Plugin Installer -- to be downloaded with xyweb, - Update Changes history Version 2.1.10 (01-17-2009) * Updated: o Python 2.5.4 o Pydev 1.4.2 o SciTE 1.77.1 - Code completion is now available (see http://www.pythonxy.com/_tools/img.php?img=/_images/SciTE.png) thanks to the added file 'python.api' which was built using Python(x,y) with recommended installation settings (you may update this file to take into account your own installation settings thanks to a start menu shortcut) o WinMerge 2.10.4 o xy 1.0.19 o PyQt4 4.4.3.6 (minor update: pyuic4.bat) o pydicom 0.9.2 o Default Python path is now C:\Python25 -- if you want to change Python path, you must of course reinstall Python(x,y) * Corrected: o Issue 60: xyhome does not start on XP/Vista 64 bits o Issue 61: scipy.weave does not work well with Python default installation folder o Issue 62: after closing xyhome, pythonw.exe process is still alive o Issue 64: error message 'Array variable subscript badly formatted' at the end of Python(x,y) installation o Issue 68: PyQt4: pyuic4.bat is not modified according to Python install location Regards, Pierre Raybaut From sturla at molden.no Mon Jan 19 16:16:57 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 19 Jan 2009 22:16:57 +0100 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: References: <4974A6E4.70000@molden.no> Message-ID: <4974EDC9.6030603@molden.no> On 1/19/2009 9:33 PM, Marko Loparic wrote: > Specifically on this point, playing the devil's advocate, one could > argue that using java we could also migrate the offending routine to > C. Is there an element to argue that the Python/C mix is simpler or > more powerful than the same for java/C? You can use JNI with Java, and obtain the same effect. But as a 'glue language' Java is inferior to Python (i.e. Java is more verbose, and statically typed). And Java has shortcomings for scientific computing, such as no operator overloading and no complex number primitive. This is why projects like NumPy would not be possible with Java. > I saw also somewhere the argument that python tools containing C/C++ > routines like numpy and wxpython are more naturally or easily made for > python than for java. Is that really true? NumPy: Python has operator overloading. wxPython: No. Swig emits JNI code as well. And there is a wxWidgets wrapper for Java. S.M. From wbaxter at gmail.com Mon Jan 19 16:31:48 2009 From: wbaxter at gmail.com (Bill Baxter) Date: Tue, 20 Jan 2009 06:31:48 +0900 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: <4974EDC9.6030603@molden.no> References: <4974A6E4.70000@molden.no> <4974EDC9.6030603@molden.no> Message-ID: On Tue, Jan 20, 2009 at 6:16 AM, Sturla Molden wrote: > On 1/19/2009 9:33 PM, Marko Loparic wrote: > >> Specifically on this point, playing the devil's advocate, one could >> argue that using java we could also migrate the offending routine to >> C. Is there an element to argue that the Python/C mix is simpler or >> more powerful than the same for java/C? > > You can use JNI with Java, and obtain the same effect. But as a 'glue > language' Java is inferior to Python (i.e. 
Java is more verbose, and > statically typed). Well, some would consider static typing a blessing as it lets you catch a lot of dumb errors in advance instead of having to discover them by running the code. With a statically typed language you can get by without unit tests (even though really you *should* have them anyway). But with a dynamic language like Python they become much more critical. If you don't have 100% test coverage of all pathways in your code, it's very easy to have a simple typo or other silly error lurking, waiting to bite you. > And Java has shortcomings for scientific computing, > such as no operator overloading and no complex number primitive. No operator overloading is definitely a big minus for Java. --bb From matthieu.brucher at gmail.com Mon Jan 19 16:31:26 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Mon, 19 Jan 2009 22:31:26 +0100 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: <4974EDC9.6030603@molden.no> References: <4974A6E4.70000@molden.no> <4974EDC9.6030603@molden.no> Message-ID: 2009/1/19 Sturla Molden : > On 1/19/2009 9:33 PM, Marko Loparic wrote: > >> Specifically on this point, playing the devil's advocate, one could >> argue that using java we could also migrate the offending routine to >> C. Is there an element to argue that the Python/C mix is simpler or >> more powerful than the same for java/C? > > You can use JNI with Java, and obtain the same effect. But as a 'glue > language' Java is inferior to Python (i.e. Java is more verbose, and > statically typed). And Java has shortcomings for scientific computing, > such as no operator overloading and no complex number primitive. This is > why projects like NumPy would not be possible with Java. And data must be copied between the JVM and the C code. Matthieu -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From sturla at molden.no Mon Jan 19 17:21:52 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 19 Jan 2009 23:21:52 +0100 (CET) Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: References: <4974A6E4.70000@molden.no> <4974EDC9.6030603@molden.no> Message-ID: <0be1a23b80eaf9cff867df0d1c4105cf.squirrel@webmail.uio.no> > 2009/1/19 Sturla Molden : > And data must be copied between the JVM and the C code. No, you can get a pointer to the raw data: JNIEXPORT void JNICALL Java_ArrayExample_manipulateArray (JNIEnv *env, jdoubleArray array) { jdouble *data = (*env)->GetDoubleArrayElements(env, array, 0); jlen len = (*env)->GetArrayLength(env, array); foobar(data, &len); /* call Fortran */ (*env)->ReleaseDoubleArrayElements(env, array, Data, 0); } But if you simulate a 2D array with an array of arrays, it will not be a contiguous region and you possibly have to copy the data (or fake it similary in C with a pointer of an array of pointers, cf. Numerical Receipes). Sturla Molden From simpson at math.toronto.edu Mon Jan 19 20:42:19 2009 From: simpson at math.toronto.edu (Gideon Simpson) Date: Mon, 19 Jan 2009 20:42:19 -0500 Subject: [SciPy-user] profiling Message-ID: <37E8153A-6E94-4D03-B84A-73F9843CF53B@math.toronto.edu> Are there any simple tools, built into SciPy or elsewhere, for profiling scripts? I'd like to be able to identify bottlenecks. 
-gideon From argriffi at ncsu.edu Mon Jan 19 20:44:52 2009 From: argriffi at ncsu.edu (Alex Griffing) Date: Mon, 19 Jan 2009 20:44:52 -0500 Subject: [SciPy-user] profiling In-Reply-To: <37E8153A-6E94-4D03-B84A-73F9843CF53B@math.toronto.edu> References: <37E8153A-6E94-4D03-B84A-73F9843CF53B@math.toronto.edu> Message-ID: <49752C94.5020308@ncsu.edu> Gideon Simpson wrote: > Are there any simple tools, built into SciPy or elsewhere, for > profiling scripts? I'd like to be able to identify bottlenecks. > > -gideon > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > Here are some that are not specific to scipy: http://docs.python.org/library/profile.html From strawman at astraw.com Mon Jan 19 23:08:03 2009 From: strawman at astraw.com (Andrew Straw) Date: Mon, 19 Jan 2009 20:08:03 -0800 Subject: [SciPy-user] profiling In-Reply-To: <37E8153A-6E94-4D03-B84A-73F9843CF53B@math.toronto.edu> References: <37E8153A-6E94-4D03-B84A-73F9843CF53B@math.toronto.edu> Message-ID: <49754E23.4050705@astraw.com> It's a pity I can't find this written up any better, but use lsprofcalltree.py to convert the results from cProfile to kcachegrind. Here's a howto for another project which is somewhat relevant, especially the patch to the 'if __name__ == "__main__":' section to show you how to use it: http://lists.baseurl.org/pipermail/yum-devel/2007-January/003045.html You'll have to download lsprofcalltree from somewhere, but I highly recommend this approach. -Andrew Gideon Simpson wrote: > Are there any simple tools, built into SciPy or elsewhere, for > profiling scripts? I'd like to be able to identify bottlenecks. > > -gideon > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user From robert.kern at gmail.com Mon Jan 19 23:12:41 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 19 Jan 2009 22:12:41 -0600 Subject: [SciPy-user] profiling In-Reply-To: <49754E23.4050705@astraw.com> References: <37E8153A-6E94-4D03-B84A-73F9843CF53B@math.toronto.edu> <49754E23.4050705@astraw.com> Message-ID: <3d375d730901192012v2f205e8ja18b8976c04ad73c@mail.gmail.com> On Mon, Jan 19, 2009 at 22:08, Andrew Straw wrote: > It's a pity I can't find this written up any better, but use > lsprofcalltree.py to convert the results from cProfile to kcachegrind. > Here's a howto for another project which is somewhat relevant, > especially the patch to the 'if __name__ == "__main__":' section to show > you how to use it: > > http://lists.baseurl.org/pipermail/yum-devel/2007-January/003045.html > > You'll have to download lsprofcalltree from somewhere, but I highly > recommend this approach. It's been packaged up officially here: http://pypi.python.org/pypi/pyprof2calltree/1.1.0 -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
  -- Umberto Eco

From robert.kern at gmail.com Mon Jan 19 23:13:17 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 19 Jan 2009 22:13:17 -0600
Subject: [SciPy-user] profiling
In-Reply-To: <3d375d730901192012v2f205e8ja18b8976c04ad73c@mail.gmail.com>
References: <37E8153A-6E94-4D03-B84A-73F9843CF53B@math.toronto.edu>
	<49754E23.4050705@astraw.com>
	<3d375d730901192012v2f205e8ja18b8976c04ad73c@mail.gmail.com>
Message-ID: <3d375d730901192013s4cb0bdfdl611cb0900ad7974c@mail.gmail.com>

On Mon, Jan 19, 2009 at 22:12, Robert Kern wrote:
> On Mon, Jan 19, 2009 at 22:08, Andrew Straw wrote:
>> It's a pity I can't find this written up any better, but use
>> lsprofcalltree.py to convert the results from cProfile to kcachegrind.
>> Here's a howto for another project which is somewhat relevant,
>> especially the patch to the 'if __name__ == "__main__":' section to show
>> you how to use it:
>>
>> http://lists.baseurl.org/pipermail/yum-devel/2007-January/003045.html
>>
>> You'll have to download lsprofcalltree from somewhere, but I highly
>> recommend this approach.
>
> It's been packaged up officially here:
>
> http://pypi.python.org/pypi/pyprof2calltree/1.1.0

Notably, you don't have to modify your script at all anymore.
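If I remember the options correctly, the whole round trip is now just
something like this (untested, so check pyprof2calltree --help for the
exact flags):

  $ python -m cProfile -o myscript.prof myscript.py
  $ pyprof2calltree -i myscript.prof -k   # convert and open kcachegrind

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From forrest.bao at gmail.com Mon Jan 19 23:18:56 2009
From: forrest.bao at gmail.com (Forrest Sheng Bao)
Date: Mon, 19 Jan 2009 22:18:56 -0600
Subject: [SciPy-user] FFT indexes with zero-padding
Message-ID: <889df5f00901192018n630b28ekc16b3d5a56b302c4@mail.gmail.com>

Hi,

I am thinking about a question regarding the indexes of the FFT result
with zero-padding.

Suppose there is no zero-padding, i.e. the length of the signal is a
power of 2, like 4096. Then the index corresponding to frequency f should
be f/fs*N, where fs is the sampling rate and N is the number of points.

But what if the length of the signal is not a power of 2? Like 5000? How
does the scipy.signal module handle this?

For example, I have 5000 samples and am doing a 5000-point FFT. The
sampling rate is 200 Hz. Is the index for 2 Hz still 2/200 * 5000 = 50?

Cheers,
Forrest

--
Forrest Sheng Bao, B.S. EE
Ph.D. student/Teaching Assistant, Dept. of Computer Science
M.Sc. student/Research Assistant, Dept. of Electrical & Computer Engineering
Rm 115, Experimental Sciences Building
Texas Tech University, Lubbock, Texas, USA
http://narnia.cs.ttu.edu
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From robert.kern at gmail.com Mon Jan 19 23:22:52 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 19 Jan 2009 22:22:52 -0600
Subject: [SciPy-user] FFT indexes with zero-padding
In-Reply-To: <889df5f00901192018n630b28ekc16b3d5a56b302c4@mail.gmail.com>
References: <889df5f00901192018n630b28ekc16b3d5a56b302c4@mail.gmail.com>
Message-ID: <3d375d730901192022t66f20185sbaa5effee308075c@mail.gmail.com>

On Mon, Jan 19, 2009 at 22:18, Forrest Sheng Bao wrote:
> Hi,
>
> I am thinking about a question regarding the indexes of the FFT result
> with zero-padding.
>
> Suppose there is no zero-padding, i.e. the length of the signal is a
> power of 2, like 4096. Then the index corresponding to frequency f should
> be f/fs*N, where fs is the sampling rate and N is the number of points.
>
> But what if the length of the signal is not a power of 2? Like 5000? How
> does the scipy.signal module handle this?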
>
> For example, I have 5000 samples and am doing a 5000-point FFT. The
> sampling rate is 200 Hz. Is the index for 2 Hz still 2/200 * 5000 = 50?

In [16]: numpy.fft.fftfreq?
Type:           function
Base Class:
String Form:
Namespace:      Interactive
File:           /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy-1.2.0rc2-py2.5-macosx-10.3-fat.egg/numpy/fft/helper.py
Definition:     numpy.fft.fftfreq(n, d=1.0)
Docstring:
    fftfreq(n, d=1.0) -> f

    DFT sample frequencies

    The returned float array contains the frequency bins in
    cycles/unit (with zero at the start) given a window length n and a
    sample spacing d:

      f = [0,1,...,n/2-1,-n/2,...,-1]/(d*n)         if n is even
      f = [0,1,...,(n-1)/2,-(n-1)/2,...,-1]/(d*n)   if n is odd
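So nothing special happens when n is not a power of 2; the bin spacing
is just fs/N, here 200./5000 = 0.04 Hz, which puts 2 Hz at index 50.
Untested here, but this is what fftfreq should give:

>>> import numpy as np
>>> freqs = np.fft.fftfreq(5000, d=1.0/200)
>>> freqs[50]
2.0

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From almar.klein at gmail.com Tue Jan 20 04:05:27 2009
From: almar.klein at gmail.com (Almar Klein)
Date: Tue, 20 Jan 2009 10:05:27 +0100
Subject: [SciPy-user] python (against java) advocacy for scientific projects
In-Reply-To: <0be1a23b80eaf9cff867df0d1c4105cf.squirrel@webmail.uio.no>
References: <4974A6E4.70000@molden.no> <4974EDC9.6030603@molden.no>
	<0be1a23b80eaf9cff867df0d1c4105cf.squirrel@webmail.uio.no>
Message-ID:

For what it's worth,

I've once tried to do scientific programming in C#. I know, it's not
Java, but I guess it's similar to some extent when compared to Python.

In scientific projects, there is usually a lot of prototyping and quick
testing of scripts. That makes an interpreted language much more useful
than a compiled one. That's one of the reasons why Matlab is so suitable
for scientific programming, or better yet: Python!

Cheers,
Almar

2009/1/19 Sturla Molden

> > 2009/1/19 Sturla Molden :
>
> > And data must be copied between the JVM and the C code.
>
> No, you can get a pointer to the raw data:
>
> JNIEXPORT void JNICALL Java_ArrayExample_manipulateArray
>   (JNIEnv *env, jobject obj, jdoubleArray array)
> {
>     jdouble *data = (*env)->GetDoubleArrayElements(env, array, 0);
>     jsize len = (*env)->GetArrayLength(env, array);
>     foobar(data, &len); /* call Fortran */
>     (*env)->ReleaseDoubleArrayElements(env, array, data, 0);
> }
>
> But if you simulate a 2D array with an array of arrays, it will not be a
> contiguous region and you possibly have to copy the data (or fake it
> similarly in C with an array of pointers, cf. Numerical Recipes).
>
> Sturla Molden
>
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From robert.kern at gmail.com Tue Jan 20 04:09:02 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 20 Jan 2009 03:09:02 -0600
Subject: [SciPy-user] python (against java) advocacy for scientific projects
In-Reply-To:
References: <4974A6E4.70000@molden.no> <4974EDC9.6030603@molden.no>
	<0be1a23b80eaf9cff867df0d1c4105cf.squirrel@webmail.uio.no>
Message-ID: <3d375d730901200109o6fa8c2d5tdcf48017d3d56183@mail.gmail.com>

On Tue, Jan 20, 2009 at 03:05, Almar Klein wrote:
> For what it's worth,
>
> I've once tried to do scientific programming in C#. I know, it's not
> Java, but I guess it's similar to some extent when compared to Python.

What was your experience with it?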
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From cournape at gmail.com Tue Jan 20 04:38:18 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 20 Jan 2009 18:38:18 +0900 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: References: Message-ID: <5b8d13220901200138y1a43484cieeb4cbce5872373f@mail.gmail.com> On Mon, Jan 19, 2009 at 5:01 PM, Marko Loparic wrote: > Hello, > > Could you suggest links justifying the use of python instead of java > for a scientific project? > > I work for a R&D departemnt of a large company. We develop > mathematical models, some of them in python. I think it depends on the context - lame answer, I know :) Of course, for prototyping, there is little doubt that python is essentially much better equipped than java, because of fundamental language features such as dynamicity, concisness, etc... You say that your team already use python, so I assume that knowing python is not a problem. I love python for scientific programming, I think it is a huge step compared to similar things, like matlab and co. But there are many researchers I will never recommend python, yet: it is not well integrated (and never will be as well as matlab, if only because each package in the python stack are developed by different people with overlapping but different goals), it is more difficult, and it is different. Those aspects may be important or not. They don't matter to me. Concerning the speed issue, I think it is very misleading to say there is no speed problem in python. There are still too many cases where I need to code into cython and/or C for acceptable speed of some algorithms. As good as those tools are (they are certainly better than the equivalent in matlab, for example), they are fundamentally a failure of python in my mind. People in Lisp or OCAML communities almost never code in another language, at least not as often as we do in python - things like the Stalin compiler for scheme can generate code as good as optimized C, we have nothing remotely comparable in python. Java has gained a lot of speed the last few years, and is now relatively competitive with C. I know, those comparisons are always flawed, but then such is a comparison saying python is as fast as java. Some fundamental aspects of python -like function calls - are much slower in python than in java. There is really a fundamental tradeof between power, expressiveness and availability of tools/community to the task. When I started my PhD and looked for something different from matlab, I took some time considering both Ocaml and python. I thought Ocaml was a better language - and still think so, although I did not realize at that point of powerful dynamic typing is. But python is much more readable - and scientific code, at least in academia, is as much a communication tool as an implementation tool IMHO. And python is more known, has bigger community, is simpler - not all researchers are computer scientists. Java is at the other end of the spectrum compared to Ocaml, in some way - depending on the situation, I can imagine that I would have to chose Java (or god forbids, C++). In other words, python is not the best language, is not the fastest language, is not the coolest, does not have all the best numerical algorithms. 
But it is a pretty damn good tradeof between all those points, the best I know of today, at least for my use of it, cheers, David From fredmfp at gmail.com Tue Jan 20 05:33:53 2009 From: fredmfp at gmail.com (fred) Date: Tue, 20 Jan 2009 11:33:53 +0100 Subject: [SciPy-user] ndimage convolve vs. RAM issue... In-Reply-To: <49706838.50808@gmail.com> References: <49706838.50808@gmail.com> Message-ID: <4975A891.6080206@gmail.com> fred a ?crit : > Hi all, > > On a bi-xeon quad core (debian 64 bits) with 8 GB of RAM, if I want to > convolve a 102*122*143 float array (~7 MB) with a kernel of 77*77*41 > cells (~1 MB), I get a MemoryError in correlate: > > File "/usr/lib/python2.5/site-packages/scipy/ndimage/filters.py", line > 331, in convolve > origin, True) > File "/usr/lib/python2.5/site-packages/scipy/ndimage/filters.py", line > 312, in _correlate_or_convolve > _nd_image.correlate(input, weights, output, mode, cval, origins) > MemoryError Nobody can help me on this issue ? I really need some help, since ndimage.convolve is _very_ efficient ;-) TIA Cheers, -- Fred From gael.varoquaux at normalesup.org Tue Jan 20 05:35:12 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 20 Jan 2009 11:35:12 +0100 Subject: [SciPy-user] ndimage convolve vs. RAM issue... In-Reply-To: <4975A891.6080206@gmail.com> References: <49706838.50808@gmail.com> <4975A891.6080206@gmail.com> Message-ID: <20090120103512.GB6595@phare.normalesup.org> On Tue, Jan 20, 2009 at 11:33:53AM +0100, fred wrote: > I really need some help, since ndimage.convolve is _very_ efficient ;-) Did you try fftconvolve? Ga?l From hep.sebastien.binet at gmail.com Tue Jan 20 05:50:08 2009 From: hep.sebastien.binet at gmail.com (Sebastien Binet) Date: Tue, 20 Jan 2009 11:50:08 +0100 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: <200901191458.19218.lists_ravi@lavabit.com> References: <4974A6E4.70000@molden.no> <200901191458.19218.lists_ravi@lavabit.com> Message-ID: <200901201150.08598.binet@cern.ch> hi, [snip] > 3. The lack of a real JIT compiler is a serious issue if the use cases > involve more than linear algebra and differential equation solvers. In many > such cases, for-loops and/or while-loops are the only reasonable solutions, > both of which, very often, execute much faster under Matlab or Java. Some > operations are simply not vectorizable if you wish to have maintainable > code, e.g., large groups of interacting state machines. there is already *some* support for JITing stuff, and integrated with numpy. Look at the nice code from Ilan: http://www.enthought.com/~ischnell/mkufunc.html which uses PyPy to translate relatively non-dynamic python code (aka RPython) into C. On the same note, I always wondered if one could not sidestep the for-loop overhead with an ad hoc Context manager which would suppress/shortcut the dynamic nature of python for very localised pieces of code: with NotDynamic() as ctx: for i in xrange(10): ... where all the usual dynamic type checking would be done once (to discover/infer the types) and then cached for subsequent loops... cheers, sebastien. -- ######################################### # Dr. 
Sebastien Binet
# Laboratoire de l'Accelerateur Lineaire
# Universite Paris-Sud XI
# Batiment 200
# 91898 Orsay
#########################################

From almar.klein at gmail.com Tue Jan 20 05:51:15 2009
From: almar.klein at gmail.com (Almar Klein)
Date: Tue, 20 Jan 2009 11:51:15 +0100
Subject: [SciPy-user] python (against java) advocacy for scientific projects
In-Reply-To: <3d375d730901200109o6fa8c2d5tdcf48017d3d56183@mail.gmail.com>
References: <4974A6E4.70000@molden.no> <4974EDC9.6030603@molden.no>
	<0be1a23b80eaf9cff867df0d1c4105cf.squirrel@webmail.uio.no>
	<3d375d730901200109o6fa8c2d5tdcf48017d3d56183@mail.gmail.com>
Message-ID:

> > I've once tried to do scientific programming in C#. I know, it's not
> > Java, but I guess it's similar to some extent when compared to Python.
>
> What was your experience with it?

Well, I liked C# a lot, but NOT for scientific computing, as the
compile-run step takes too much time in that case. Plus I missed the
vast amount of functions in Matlab / Python+Numpy+Scipy.

I wrote my experience down if you're interested:
http://sites.google.com/site/almarklein/quest

Cheers,
Almar
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From fredmfp at gmail.com Tue Jan 20 06:22:17 2009
From: fredmfp at gmail.com (fred)
Date: Tue, 20 Jan 2009 12:22:17 +0100
Subject: [SciPy-user] ndimage convolve vs. RAM issue...
In-Reply-To: <20090120103512.GB6595@phare.normalesup.org>
References: <49706838.50808@gmail.com> <4975A891.6080206@gmail.com>
	<20090120103512.GB6595@phare.normalesup.org>
Message-ID: <4975B3E9.8050507@gmail.com>

Gael Varoquaux a écrit :
> On Tue, Jan 20, 2009 at 11:33:53AM +0100, fred wrote:
>> I really need some help, since ndimage.convolve is _very_ efficient ;-)
>
> Did you try fftconvolve?

Yep. On a smaller kernel:

data:   600x800x720
kernel: 361
ndimage.convolve:   184 s
signal.fftconvolve: MemoryError

Another one:

data:   300x400x360
kernel: 361
ndimage.convolve:   22 s
signal.fftconvolve: 37 s

Besides this, ndimage.convolve can handle NaN; signal.fftconvolve cannot.

Cheers,

--
Fred

From dlrt2 at ast.cam.ac.uk Tue Jan 20 06:30:21 2009
From: dlrt2 at ast.cam.ac.uk (David Trethewey)
Date: Tue, 20 Jan 2009 11:30:21 +0000
Subject: [SciPy-user] optimize.leastsq
Message-ID: <4975B5CD.9060002@ast.cam.ac.uk>

I'm using the following code to fit a gaussian to a histogram of some data.
#fit gaussian
fitfunc = lambda p, x: (p[0]**2)*exp(-(x-p[1])**2/(2*p[2]**2)) # Target function
errfunc = lambda p, x, y: fitfunc(p,x) - y # Distance to the target function
doublegauss = lambda q,x: (q[0]**2)*exp(-(x-q[1])**2/(2*q[2]**2)) + (q[3]**2)*exp(-(x-q[4])**2/(2*q[5]**2))
doublegausserr = lambda q,x,y: doublegauss(q,x) - y
# initial guess
p0 = [10.0,-2,0.5]
# find parameters of single gaussian
p1,success = optimize.leastsq(errfunc, p0[:], args = (hista[1],hista[0]))
errors_sq = errfunc(p1,hista[1],hista[0])**2

I have the error message

Traceback (most recent call last):
  File "M31FeHfit_totalw108.py", line 116, in <module>
    p1,success = optimize.leastsq(errfunc, p0[:], args = (hista[1],hista[0]))
  File "C:\Python25\lib\site-packages\scipy\optimize\minpack.py", line 264, in leastsq
    m = check_func(func,x0,args,n)[0]
  File "C:\Python25\lib\site-packages\scipy\optimize\minpack.py", line 11, in check_func
    res = atleast_1d(thefunc(*((x0[:numinputs],)+args)))
  File "M31FeHfit_totalw108.py", line 110, in <lambda>
    errfunc = lambda p, x, y: fitfunc(p,x) - y # Distance to the target function
ValueError: shape mismatch: objects cannot be broadcast to a single shape

Anyone know why this happens? Curiously I have had it work before, but
not with my current versions of scipy and python etc.

David

From fredmfp at gmail.com Tue Jan 20 07:17:43 2009
From: fredmfp at gmail.com (fred)
Date: Tue, 20 Jan 2009 13:17:43 +0100
Subject: [SciPy-user] ndimage convolve vs. RAM issue...
In-Reply-To: <49706838.50808@gmail.com>
References: <49706838.50808@gmail.com>
Message-ID: <4975C0E7.1060306@gmail.com>

fred a écrit :
> Hi all,
>
> On a bi-xeon quad core (debian 64 bits) with 8 GB of RAM, if I want to
> convolve a 102*122*143 float array (~7 MB) with a kernel of 77*77*41
> cells (~1 MB), I get a MemoryError in correlate:
>
> File "/usr/lib/python2.5/site-packages/scipy/ndimage/filters.py", line
> 331, in convolve
>     origin, True)
> File "/usr/lib/python2.5/site-packages/scipy/ndimage/filters.py", line
> 312, in _correlate_or_convolve
>     _nd_image.correlate(input, weights, output, mode, cval, origins)
> MemoryError

Can someone give me an explanation, if not a solution (I have one,
called multi-processing ;-))

Cheers,

--
Fred

From sturla at molden.no Tue Jan 20 07:30:43 2009
From: sturla at molden.no (Sturla Molden)
Date: Tue, 20 Jan 2009 13:30:43 +0100
Subject: [SciPy-user] python (against java) advocacy for scientific projects
In-Reply-To:
References: <4974A6E4.70000@molden.no> <4974EDC9.6030603@molden.no>
	<0be1a23b80eaf9cff867df0d1c4105cf.squirrel@webmail.uio.no>
Message-ID: <4975C3F3.3020605@molden.no>

On 1/20/2009 10:05 AM, Almar Klein wrote:
> I've once tried to do scientific programming in C#. I know, it's not
> Java, but I guess it's similar to some extent when compared to Python.

Not at all. C# is far better for scientific programming than Java (and
F# is even better than C#).

S.M.

From scotta_2002 at yahoo.com Tue Jan 20 07:34:01 2009
From: scotta_2002 at yahoo.com (Scott Askey)
Date: Tue, 20 Jan 2009 04:34:01 -0800 (PST)
Subject: [SciPy-user] integrate.odeint and simultaneous equations
Message-ID: <200658.60894.qm@web36501.mail.mud.yahoo.com>

Do ode and odeint work in multiple dimensions? I could not find any
examples with more than one degree of freedom, and from the docstring it
was not obvious how to solve simultaneous ODEs.

The code for modelling a 2D simple harmonic oscillator or spherical
pendulum would give me the insight I need. I found and understand the
following 1D harmonic oscillator model from the scipy cookbook.
V/R Scott

from scipy import *
from pylab import *

deriv = lambda y,t : array([y[1], -y[0] - .1*y[1]])  # xdot, x2dot

# Integration parameters
start = 0
end = 10
numsteps = 10000
time = linspace(start, end, numsteps)

from scipy import integrate

y0 = array([0.0005, 0.2])  # x, x_dot
y = integrate.odeint(deriv, y0, time)

plot(time, y[:,0])  # plot x against time
show()
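My naive guess for the multi-dimensional case is to just stack all the
coordinates and velocities into one state vector, something like the
untested sketch below (spring constants made up by me). Is that the
intended usage?

# untested sketch: two uncoupled harmonic oscillators, i.e. a "2D" oscillator
# state vector y = [x1, x2, x1_dot, x2_dot]
from scipy import array, linspace
from scipy import integrate

def deriv2d(y, t):
    x1, x2, v1, v2 = y
    # unit masses, made-up spring constants k1=1.0, k2=4.0
    return array([v1, v2, -1.0*x1, -4.0*x2])

y0 = array([1.0, 0.5, 0.0, 0.0])          # initial positions and velocities
time = linspace(0, 10, 1000)
y = integrate.odeint(deriv2d, y0, time)   # y has shape (1000, 4)

From sturla at molden.no Tue Jan 20 07:35:22 2009
From: sturla at molden.no (Sturla Molden)
Date: Tue, 20 Jan 2009 13:35:22 +0100
Subject: [SciPy-user] python (against java) advocacy for scientific projects
In-Reply-To: <200901201150.08598.binet@cern.ch>
References: <4974A6E4.70000@molden.no> <200901191458.19218.lists_ravi@lavabit.com>
	<200901201150.08598.binet@cern.ch>
Message-ID: <4975C50A.9080902@molden.no>

On 1/20/2009 11:50 AM, Sebastien Binet wrote:
> On the same note, I always wondered if one could not sidestep the for-loop
> overhead with an ad hoc Context manager which would suppress/shortcut the
> dynamic nature of python for very localised pieces of code:
> with NotDynamic() as ctx:
>     for i in xrange(10):

Yes, and one could also use a decorator on a function to achieve a
similar effect.

@nativecompiled
def foobar():

And in Python 3.0 there are optional type annotations which could be
exploited. Cython or RPython could be used as compiler.

S.M.

From sturla at molden.no Tue Jan 20 07:53:16 2009
From: sturla at molden.no (Sturla Molden)
Date: Tue, 20 Jan 2009 13:53:16 +0100
Subject: [SciPy-user] python (against java) advocacy for scientific projects
In-Reply-To: <5b8d13220901200138y1a43484cieeb4cbce5872373f@mail.gmail.com>
References: <5b8d13220901200138y1a43484cieeb4cbce5872373f@mail.gmail.com>
Message-ID: <4975C93C.5030907@molden.no>

On 1/20/2009 10:38 AM, David Cournapeau wrote:
> People in Lisp or OCAML communities
> almost never code in another language, at least not as often as we do
> in python

Which, for Common Lisp, is due to optional static typing. To some
extent, a 'fast' Common Lisp like SBCL or CMUCL has more in common with
Cython than Python.

But for a purely dynamic language like Python, the Java VM is more
interesting. The speed of this VM/JIT is not due to Java's static
typing. Hotspot was originally developed for StrongTalk, a JIT compiled
implementation of Smalltalk (a dynamic language). Sun bought the company
that created StrongTalk to use the StrongTalk VM for Java.

In addition to Smalltalk, there are also very efficient implementations
of Scheme (e.g. Stalin, Ikarus, Bigloo, Larceny). Again this proves
that it is possible to create fast implementations of dynamic languages.
It just has not been done yet for Python.

S.M.

From david at ar.media.kyoto-u.ac.jp Tue Jan 20 08:10:12 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 20 Jan 2009 22:10:12 +0900
Subject: [SciPy-user] python (against java) advocacy for scientific projects
In-Reply-To: <4975C93C.5030907@molden.no>
References: <5b8d13220901200138y1a43484cieeb4cbce5872373f@mail.gmail.com>
	<4975C93C.5030907@molden.no>
Message-ID: <4975CD34.7020508@ar.media.kyoto-u.ac.jp>

Sturla Molden wrote:
> On 1/20/2009 10:38 AM, David Cournapeau wrote:
>
>
>> People in Lisp or OCAML communities
>> almost never code in another language, at least not as often as we do
>> in python
>>
>
> Which, for Common Lisp, is due to optional static typing. To some
> extent, a 'fast' Common Lisp like SBCL or CMUCL has more in common with
> Cython than Python.
>
> But for a purely dynamic language like Python, the Java VM is more
> interesting.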
The speed of this VM/JIT is not due to Java's static > typing. Hotspot was originally developed for StrongTalk, a JIT compiled > implementation of Smalltalk (a dynamic language). Sun bought the company > who created StrongTalk to use the StrongTalk VM for Java. > > In addition to Smalltalk, there are also very efficient implementations > of Scheme (e.g. Staling, Ikarus, Bigloo, Larency). Again this proves > that it is possible to create fast implementations of dynamic languages. > It just has not been done yet for Python. > Yes, I did not want to imply it was not possible to do a fast python implementation, only that there isn't any today for any production usage. But going into C/Cython/Etc... when one needs speed seems very consensual in python community, and I am always a bit surprised by this. Maybe one of those examples where something is just good enough at some point in history, which prevents more progress until it is too late. Typically, I have a hard time imagining smalltalk being very useful for anything but prototyping 30 years ago, and I would guess that things like self were simply mandatory to make smalltalk usable in bigger projects. But I was not born 30 years ago, so this can just be one more proof of my lack of imagination, David From sturla at molden.no Tue Jan 20 08:54:37 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 20 Jan 2009 14:54:37 +0100 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: <4975CD34.7020508@ar.media.kyoto-u.ac.jp> References: <5b8d13220901200138y1a43484cieeb4cbce5872373f@mail.gmail.com> <4975C93C.5030907@molden.no> <4975CD34.7020508@ar.media.kyoto-u.ac.jp> Message-ID: <4975D79D.4040708@molden.no> On 1/20/2009 2:10 PM, David Cournapeau wrote: > Yes, I did not want to imply it was not possible to do a fast python > implementation, only that there isn't any today for any production > usage. But going into C/Cython/Etc... when one needs speed seems very > consensual in python community, and I am always a bit surprised by this. I remember that using x86 assembly was consensual in the Turbo Pascal (later Borland Delphi) community as well. Delphi could not deal with the floating point unit properly, and Borland did not care, so typically all numerics in Delphi programs were done in assembly. And the Delphi community did not seem to care. Even more strange, Borland did have a C/C++ compiler as well (Turbo C, later C++ Builder), which created object files binary compatible with Delphi (and incompatible with Microsoft C). But even still, the Delphi community preferred assembly to C for speeding up floating point operations. Sometimes it can be hard to understand human behaviour. > But I was not born 30 years ago, so this can just be one more > proof of my lack of imagination, I was born 30 year ago, but at that time we could not afford a television set. And where I lived, FM radio broadcasts were still in mono only. Using satellite antennas for television was a felony. And it was prohibited for any store to be open after 4 PM. Needless to say, I had very little knowledge of computers at that time. S.M. 
From sturla at molden.no Tue Jan 20 09:17:17 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 20 Jan 2009 15:17:17 +0100 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: <200901191458.19218.lists_ravi@lavabit.com> References: <4974A6E4.70000@molden.no> <200901191458.19218.lists_ravi@lavabit.com> Message-ID: <4975DCED.1060901@molden.no> On 1/19/2009 8:58 PM, Ravi wrote: > The advice from Mr. Molden is well-argued, but he does gloss over a few of > the difficulties. These serious problems are also present in Matlab & Java for > the most part. (this was actually written yesterday, but posted incorrectly.) If you insist on using titulation, that is Dr. Molden to you. :-P There are a number of things to consider when comparing Python/NumPy with Matlab. But I was not comparing Python with Matlab. I was comparing Python with Java. I retain that Java is not fit for scientific computing. There are no complex number primitive, no flexible array primitive, and no operator overloading. Try to pass an array slice to a function: It's not possible. One has to implement an array class to do that, and you end up with syntax like arr.set(idx, value) arr.set(idx, array.add(arr1,arr2)) foobar(arr.get(idx)) instead of: arr[idx] = value arr[idx] = arr1 + arr2 foobar(arr[idx]) Because Java is statically typed (not duck-typed like Python and Matlab), you end up with ugly C++ like templates for generic functions. C++ template metaprogramming is fantastic if you want to write unmaintainable code. Hey it's even proven to be a Turing complete 'language'! But why go through all of that pain just to match the performance of good old Fortran? I known an easier way ... just write Fortran instead. S.M. From matthieu.brucher at gmail.com Tue Jan 20 09:26:42 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Tue, 20 Jan 2009 15:26:42 +0100 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: <4975DCED.1060901@molden.no> References: <4974A6E4.70000@molden.no> <200901191458.19218.lists_ravi@lavabit.com> <4975DCED.1060901@molden.no> Message-ID: > C++ template metaprogramming is fantastic if you want to write > unmaintainable code. Hey it's even proven to be a Turing complete > 'language'! But why go through all of that pain just to match the > performance of good old Fortran? I known an easier way ... just write > Fortran instead. This may be off track, but I'd like to make this opposite argument. I'm developping a generic framework for HPC, and the generic here is C++ template-based. 100% static, optimized by the compiler, ... With Fortran, I would have to rewrite the main computation routine (more or less one thousand lines before adding memory optimizations, not counting the model specific code) for each model I'd like to implement (at least 5 are listed ATM). I don't think I would be able to write this in Fortran as easily as in C++. OK, I'm not a Fortran expert, but with this framework, I'm able to debug only one computation function and to optimize it for every model in an easy way, contrary to Fortran where I would have to modify every model, hoping that I would not add any typo. Matthieu -- Information System Engineer, Ph.D. 
Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From lists_ravi at lavabit.com Tue Jan 20 09:27:14 2009 From: lists_ravi at lavabit.com (Ravi) Date: Tue, 20 Jan 2009 09:27:14 -0500 Subject: [SciPy-user] python (against java) advocacy for scientific projects Message-ID: <200901200927.15074.lists_ravi@lavabit.com> On Monday 19 January 2009 16:11:34 Sturla Molden wrote: > I retain that Java is not fit for scientific computing. There are no > complex number primitive, no flexible array primitive, and no operator > overloading. [snip] > The same for complex numbers. I quite agree. I don't believe that Java is suited for matrix-based computing. However, the JIT is important for scientific computing that is not primarily matrix-based: discrete mathematics & other combinatorial problems are good examples. I am aware of psyco, but it does not work with x86_64 & none of the clusters I work with have 32-bit python installed any longer; this is pretty typical of large companies' R&D departments (like mine or likely, the OP's). > And because it is statically typed (not > duck-typed like Python and Matlab), you end up with ugly C++ like > templates for generic functions. This is both an advantage and a disadvantage. Bill Baxter pointed out the disadvantages on the list. > C++ is used for scientific computing, particularly by younger > scientists. But it remains that the majority of hard-core computational > scientists prefer Fortran over C++ when native compilation is required. Really? See www.cern.ch, wci.llnl.gov, etc. for hard-core computational scientists who prefer C++ over Fortran, many whom have been around for a few decades. You could complain that they have been using c++ only for 5-10 years, but then C++98 is only 10 years old, and reasonably conforming C++ compilers are only 5 or so years old. If you complain that C++ is such a complex language that it took 5 years for the majority of compilers to get it right, then I'd point to Fortran95 and ask for the length of time for freely available compilers to become reasonably conformant. All such language changes take a while to get implemented. > I guess C++ templates is fine if you like bloatware. And C++ template > metaprogramming is fantastic if you want to write unmaintainable code. Really? Try writing just a fixed-point radix-8 FFT which handles complex input vectors up to, say, 64K in length with flexible rounding/clipping strategies with Fortran/python. I bet one could not write one that is even half as maintainable and half the performance of the C++ version. Or, for that matter, try writing something like Macaulay2 (or any nontrivial group-theoretic algorithms) on Fortran/python. Code maintainability works by using clearly defined idioms. 5 or 10 years ago, no such idioms had been developed (apart from the STL) for template metaprogramming. The story is now different; check out boost.fusion, nt2.sourceforge.net or the eigen library. Similar idioms/patterns are now still under development for python generators (or the cool stuff from Twisted). As with any tool (like C++ or linear algebra), you have to learn how to use it. > Hey it's even proven to be a Turing complete 'language'. But why go > through all of that pain just to match the performance of good old > Fortran? I known an easier way ... just write Fortran instead. 
First, Fortran, as I pointed out above, is generally worthless for a lot
of computation-intensive problems that don't map to its native data types.

Second, Fortran is not magic; it simply uses optimized libraries
underneath, and the speed of Fortran-compiled code depends upon the
libraries, but you can beat those libraries from C++ (because template
metaprogramming can be used to provide more information to the compiler),
e.g., see http://eigen.tuxfamily.org/index.php?title=Benchmark

Third, computation speed now on CotS processors depends more on cache &
memory access optimization than anything else, which compilers can do
with C/C++ just as well as with Fortran; the days of Fortran being the
golden benchmark are long over. C/C++ (among others) have caught up. Note
that virtually all major compiler vendors (including Microsoft, Intel,
SGI & GCC) use the same code generation back-end for Fortran/C/C++, with
the only difference being the amount of information that can be passed
through the front-end; in this case, C++/C# can actually provide more
information to the back-end (for use in optimization) because of the
availability of compile-time scriptability.

Fourth, C++ can be easier to write than Fortran. You could object that
writing such a C++ library is difficult, but the point is that Eigen or
MTL needs to be written only once (just as you would write only once the
Fortran compiler where this knowledge is embedded for Fortran).

Fifth, try getting a decent Fortran compiler for homegrown embedded
systems.

Personally, I had a very difficult time switching from Fortran to C++,
but with the benefit of hindsight, I realize that my initial resistance
followed more from NIH and from familiarity with Fortran. At this point,
I haven't found an easier tool than the combination of python/C++/Qt/CMake.

> To compare Python with Matlab for scientific computing, here are at least
> some points to consider:

I completely agree here; I am betting huge at my current company on
switching successfully from Matlab to python. I was merely pointing out
the differences for the OP who works at a big company where the cost of
Matlab is not likely to be an issue.

Regards,
Ravi

From sturla at molden.no Tue Jan 20 10:06:57 2009
From: sturla at molden.no (Sturla Molden)
Date: Tue, 20 Jan 2009 16:06:57 +0100
Subject: [SciPy-user] python (against java) advocacy for scientific projects
In-Reply-To:
References: <4974A6E4.70000@molden.no> <200901191458.19218.lists_ravi@lavabit.com>
	<4975DCED.1060901@molden.no>
Message-ID: <4975E891.6070903@molden.no>

On 1/20/2009 3:26 PM, Matthieu Brucher wrote:
> I'm able to debug only one
> computation function and to optimize it for every model in an easy
> way, contrary to Fortran where I would have to modify every model,
> hoping that I would not add any typo.

You can use Python to generate Fortran code on the fly, and you can call
f2py from Python. There are examples of this in "Python Scripting for
Computational Science" (3rd edition) by H.P. Langtangen.
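http://www.springer.com/math/cse/book/978-3-540-73915-9

The idea in a nutshell, as an untested sketch (assuming f2py and a
Fortran compiler are installed; the subroutine is made up for the
example):

# generate Fortran source from Python, e.g. with a string template
src = """
subroutine scale_add(x, y, n, a)
  integer, intent(in) :: n
  real(8), intent(in) :: x(n), a
  real(8), intent(inout) :: y(n)
  integer :: i
  do i = 1, n
     y(i) = a*x(i) + y(i)
  end do
end subroutine scale_add
"""
open('fastloop.f90', 'w').write(src)

# compile it into an extension module with f2py, then import and use it
import os
os.system('f2py -c -m fastloop fastloop.f90')
import fastloop
print fastloop.scale_add.__doc__  # check the generated signature

Sturla Molden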
From josef.pktd at gmail.com Tue Jan 20 10:34:27 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Tue, 20 Jan 2009 10:34:27 -0500
Subject: [SciPy-user] optimize.leastsq
In-Reply-To: <4975B5CD.9060002@ast.cam.ac.uk>
References: <4975B5CD.9060002@ast.cam.ac.uk>
Message-ID: <1cd32cbb0901200734j37e2e02i459d1b17610e24a2@mail.gmail.com>

On Tue, Jan 20, 2009 at 6:30 AM, David Trethewey wrote:
> I'm using the following code to fit a gaussian to a histogram of some data.
>
> #fit gaussian
> fitfunc = lambda p, x: (p[0]**2)*exp(-(x-p[1])**2/(2*p[2]**2)) # Target function
> errfunc = lambda p, x, y: fitfunc(p,x) - y # Distance to the target function
> doublegauss = lambda q,x: (q[0]**2)*exp(-(x-q[1])**2/(2*q[2]**2)) + (q[3]**2)*exp(-(x-q[4])**2/(2*q[5]**2))
> doublegausserr = lambda q,x,y: doublegauss(q,x) - y
> # initial guess
> p0 = [10.0,-2,0.5]
> # find parameters of single gaussian
> p1,success = optimize.leastsq(errfunc, p0[:], args = (hista[1],hista[0]))
> errors_sq = errfunc(p1,hista[1],hista[0])**2
>
> I have the error message
>
> Traceback (most recent call last):
>   File "M31FeHfit_totalw108.py", line 116, in <module>
>     p1,success = optimize.leastsq(errfunc, p0[:], args = (hista[1],hista[0]))
>   File "C:\Python25\lib\site-packages\scipy\optimize\minpack.py", line 264, in leastsq
>     m = check_func(func,x0,args,n)[0]
>   File "C:\Python25\lib\site-packages\scipy\optimize\minpack.py", line 11, in check_func
>     res = atleast_1d(thefunc(*((x0[:numinputs],)+args)))
>   File "M31FeHfit_totalw108.py", line 110, in <lambda>
>     errfunc = lambda p, x, y: fitfunc(p,x) - y # Distance to the target function
> ValueError: shape mismatch: objects cannot be broadcast to a single shape
>
> Anyone know why this happens? Curiously I have had it work before, but
> not with my current versions of scipy and python etc.
>
> David

Check the dimensions of hista[1] and hista[0]. I can run your part of
the code without problems.

If you want to estimate the parameters of a (parametric) distribution,
then using maximum likelihood estimation would be more appropriate
than using least squares on the histogram.

Josef
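If hista comes from numpy.histogram, that is the first thing to check:
depending on the version and options, the second element can be the array
of bin edges, which has one more element than the array of counts. A
minimal sketch of the usual fix (untested, assuming numpy is imported as
np and the other names come from your script):

counts, edges = np.histogram(data)      # edges can have len(counts)+1 entries
centers = 0.5*(edges[:-1] + edges[1:])  # bin midpoints, same length as counts
p1, success = optimize.leastsq(errfunc, p0[:], args=(centers, counts))

From fredmfp at gmail.com Tue Jan 20 11:29:38 2009
From: fredmfp at gmail.com (fred)
Date: Tue, 20 Jan 2009 17:29:38 +0100
Subject: [SciPy-user] ndimage convolve vs. RAM issue...
In-Reply-To: <4975C0E7.1060306@gmail.com>
References: <49706838.50808@gmail.com> <4975C0E7.1060306@gmail.com>
Message-ID: <4975FBF2.4090301@gmail.com>

fred a écrit :
>
> Can someone give me an explanation, if not a solution (I have one,
> called multi-processing ;-))

Stupid me.

I tested the wrong example.

It does not work :-(((((((((((

Cheers,

--
Fred

From cournape at gmail.com Tue Jan 20 13:13:27 2009
From: cournape at gmail.com (David Cournapeau)
Date: Wed, 21 Jan 2009 03:13:27 +0900
Subject: [SciPy-user] python (against java) advocacy for scientific projects
In-Reply-To: <200901200927.15074.lists_ravi@lavabit.com>
References: <200901200927.15074.lists_ravi@lavabit.com>
Message-ID: <5b8d13220901201013v5bdfed50m380081a8abe0f69@mail.gmail.com>

On Tue, Jan 20, 2009 at 11:27 PM, Ravi wrote:
>
> Really? Try writing just a fixed-point radix-8 FFT which handles complex input
> vectors up to, say, 64K in length with flexible rounding/clipping strategies
> with Fortran/python. I bet one could not write one that is even half as
> maintainable and half the performance of the C++ version.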
Or, for that matter, > try writing something like Macaulay2 (or any nontrivial group-theoretic > algorithms) on Fortran/python. The FFT reference is FFTW. It uses neither C++ or fortran. It does not have rounding /clipping strategies that I know of, but is certainly as flexible as you can make in C++. Multiple sizes and dimensions, multiple strategies and architectures. > > Code maintainability works by using clearly defined idioms. That's really only part of the story. Code maintainability also requires the idioms to be well shared and understood by the community - which C++ makes really hard to ensure because it is such a complex beast. C++ is unmaintainable without a strong set of coding rules, which only really works in companies, or when you have an already strong framework (in open source, it is quite striking that C++ is seldom used, except for complex GUI programs). I have no reason to doubt your experience that template leads to maintainable code - but it is exactly the contrary in my experience, and often for code which is supposed to be state of the art (boost). > > First, Fortran, as I pointed out above, is generally worthless for a lot of > computation-intensive problems that don't map to its native data types. > > Second, Fortran is not magic; it simply uses optimized libraries underneath > and the speed of Fortran compiled code depends upon the libraries Part of the fortran speed comes from the fact that fortran does not have pointer. Pointers cause huge problems for optimization. And meta-programming as done in C++ is nothing new; there are similar schemes with much better syntax, and much more powerful in more high level language - for example scheme + staline, ocaml + code generator, faust for real time signal processing, etc... C++ templates are to those systems what punch card is to python. > but you can > beat those libraries from C++ (because template metaprogramming can be used to > provide more information to the compiler), e.g., see > http://eigen.tuxfamily.org/index.php?title=Benchmark I think something like eigen will not suit python developers much. First, it has dreadful compilation time (like everything template-based), and their performance numbers, I never could reproduce them. I have never seen such a difference between MKL and ATLAS as shown on their benchmark - since they don't give enough information, it is hard to tell which atlas they used, but in my experience, ATLAS (and of course MKL) was always much faster than eigen, on both mac os X (with accelerate, which is mostly customized atlas, at least at its code) and Linux, with the benchmark they provide. At this point, I don't understand what they are measuring. I also note that they are so much faster than blitz, which itself was supposed to match fortran speed. This puzzles me as a fundamental contradiction somewhere :) > > Third, computation speed now on CotS processors depends more on cache & memory > access optimization than anything else, which compilers can do with C/C++ just > as well as with Fortran; No, they can't. At least in standard C++, you can't provide enough informations about pointers. But even then, it is often only 2 or 3 times slower - which rarely matters for scientific programming, except for the biggest simulations. 
That's something that many C++ developers don't seem to understand for some reason; I remember that one eigen developer asked me once whether I would prefer coding in 3 days something which runs in 3 hours or running in 3 days something which took 3 hours to program - we both had an obvious answer to this question, and you can guess it was not the same for both of us. For real time programming (for signal processing kind of stuff for example), this may matter, and indeed, C++ may be the best available tool for this - it is certainly the de facto language for "real time" music softwares, for example. > You could object that writing > such a C++ library is difficult, but the point is that Eigen or MTL needs to > be written only once (just as you would write only once the Fortran compiler > where this knowledge is embedded for Fortran). But the point is that it is difficult for no reason but a dreadful syntax. Something like eigen could be done in a higher level language. To everyone his own interet, I guess, but I don't understand the joy of spending time coding and debugging template code. It is just awful - the compiler often cannot tell you even the line which has a syntax error. Something like fftw, wich a code generator written in a high level language is a much better example of meta programming IMHO. It is readable, flexible, and portable, at least in comparison to anything C++ has to offer today. David From matthieu.brucher at gmail.com Tue Jan 20 13:33:19 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Tue, 20 Jan 2009 19:33:19 +0100 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: <5b8d13220901201013v5bdfed50m380081a8abe0f69@mail.gmail.com> References: <200901200927.15074.lists_ravi@lavabit.com> <5b8d13220901201013v5bdfed50m380081a8abe0f69@mail.gmail.com> Message-ID: >> Code maintainability works by using clearly defined idioms. > > That's really only part of the story. Code maintainability also > requires the idioms to be well shared and understood by the community > - which C++ makes really hard to ensure because it is such a complex > beast. C++ is unmaintainable without a strong set of coding rules, > which only really works in companies, or when you have an already > strong framework (in open source, it is quite striking that C++ is > seldom used, except for complex GUI programs). > > I have no reason to doubt your experience that template leads to > maintainable code - but it is exactly the contrary in my experience, > and often for code which is supposed to be state of the art (boost). It leads to maintainable code. It's as for C and Fortran, as for any language, there must be rules (the higher-level, the more the rules). There may be more rules than for C and Fortran 90, but both of them can lead to horrible code. And in the research domain, it's more horrible than good code. >> First, Fortran, as I pointed out above, is generally worthless for a lot of >> computation-intensive problems that don't map to its native data types. >> >> Second, Fortran is not magic; it simply uses optimized libraries underneath >> and the speed of Fortran compiled code depends upon the libraries > > Part of the fortran speed comes from the fact that fortran does not > have pointer. You're wrong. Pointers are there since the begining. It's Fortran nasis. Fortran 77 code is full of pointers when using dynamic allocation. 
Fortran is simpler than C and C++, but mainly they do not state the same things, leading to different optimization strategies. For instance, Fortran forbids arguments aliases, C and C++ allow them. This enables Fortran to achieve more optimizations. >> Third, computation speed now on CotS processors depends more on cache & memory >> access optimization than anything else, which compilers can do with C/C++ just >> as well as with Fortran; > > No, they can't. At least in standard C++, you can't provide enough > informations about pointers. But even then, it is often only 2 or 3 > times slower - which rarely matters for scientific programming, except > for the biggest simulations. That's something that many C++ developers > don't seem to understand for some reason; I remember that one eigen > developer asked me once whether I would prefer coding in 3 days > something which runs in 3 hours or running in 3 days something which > took 3 hours to program - we both had an obvious answer to this > question, and you can guess it was not the same for both of us. Fortran can optimize better than C++ only in some circumstances. Usually, it can't. >> You could object that writing >> such a C++ library is difficult, but the point is that Eigen or MTL needs to >> be written only once (just as you would write only once the Fortran compiler >> where this knowledge is embedded for Fortran). > > But the point is that it is difficult for no reason but a dreadful > syntax. Something like eigen could be done in a higher level language. > To everyone his own interet, I guess, but I don't understand the joy > of spending time coding and debugging template code. It is just awful > - the compiler often cannot tell you even the line which has a syntax > error. It's worse for C or Fortran macros. >From my point of view, template errors can not that hard to debug. > Something like fftw, wich a code generator written in a high level > language is a much better example of meta programming IMHO. It is > readable, flexible, and portable, at least in comparison to anything > C++ has to offer today. I don't think so. You get the generated code, and you have to find out what generated the code that didn't compile. It's like for C++ templates: if you don't know the language, you can't understand. And you need 2 languages. With C++, it's only one. Matthieu -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From sturla at molden.no Tue Jan 20 14:48:27 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 20 Jan 2009 20:48:27 +0100 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: References: <200901200927.15074.lists_ravi@lavabit.com> <5b8d13220901201013v5bdfed50m380081a8abe0f69@mail.gmail.com> Message-ID: <49762A8B.6050905@molden.no> On 1/20/2009 7:33 PM, Matthieu Brucher wrote: > You're wrong. Pointers are there since the begining. It's Fortran > nasis. Fortran 77 code is full of pointers when using dynamic > allocation. Cray pointers are not standard Fortran. Apart from that, Fortran 77 does not have pointers. There is no dynamic allocation with Fortran 77, except if you use non-standard Cray pointers. Fortran 90 (and later) have pointers, but pointers can only point to variables declared as targets (or dynamically allocated memory). 
What Fortran disallows is pointer aliasing, except for variables explicitly declared as 'pointer' or 'target'. This way, a Fortran compiler always knows what could be aliased and what is not. Fortran 90 does not need pointers for dynamic allocation. Memory can also be allocated to allocatable arrays, which may or may not be aliased by pointers, depending on declaration. If you don't use pointers, nothing can be aliased - and the compiler will just assume this is true. Fortran pointers are not simply memory adresses. They are 'doped array structures', with dimensions, bounds and strides, very similar to NumPy's view arrays. If you pass a C pointer to a Fortran method that expects a Fortran pointer, it will usually fail. ISO C (not ANSI C) has a 'restrict' keyword that informs the compiler it can treat a pointer as unaliased. ANSI C and ISO C++ can be just as efficient as Fortran. This is due to non-standard compiler pragmas, which informs the compiler about pointer aliasing. Speed differences within an order of magnitude seldom counts. This can easily be solved by using more hardware or waiting a bit longer. The time spent coding is much more important, at least for scientific projects. Though for commercial work it will be different, as you have customers and competitors to consider. For code that involves arrays and loops, it will be easier to program in Fortran than C++. If you are going to make calls to the OS, C will easier than Fortran. The OS was written in C, and you just have to include the header and link the appropriate library. Sturla Molden From lists_ravi at lavabit.com Tue Jan 20 16:15:17 2009 From: lists_ravi at lavabit.com (Ravi) Date: Tue, 20 Jan 2009 16:15:17 -0500 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: <5b8d13220901201013v5bdfed50m380081a8abe0f69@mail.gmail.com> References: <200901200927.15074.lists_ravi@lavabit.com> <5b8d13220901201013v5bdfed50m380081a8abe0f69@mail.gmail.com> Message-ID: <200901201615.17634.lists_ravi@lavabit.com> On Tuesday 20 January 2009 13:13:27 David Cournapeau wrote: > On Tue, Jan 20, 2009 at 11:27 PM, Ravi wrote: > > Really? Try writing just a fixed-point radix-8 FFT [snip] > The FFT reference is FFTW. It uses neither C++ or fortran. It does not > have rounding /clipping strategies that I know of, but is certainly as > flexible as you can make in C++. Please notice that I specifically mentioned *fixed-point* FFTs. The area I work in is an intersection of algebraic geometry, signal processing and discrete mathematics. FFTW has no idea how to model 13-bit fixed point values, and certainly does not handle minimization of error propagation by choice of rounding vs. truncation in intermediate steps (rounding does not always lead to better error propagation compared to truncation, which is computationally much less intensive). > > Code maintainability works by using clearly defined idioms. > > That's really only part of the story. Code maintainability also > requires the idioms to be well shared and understood by the community > - which C++ makes really hard to ensure because it is such a complex > beast. The second part is not really true. Of course C++ is a very young language with features that were completely unappreciated in the beginning by its target audience: C programmers looking for something more scalable. Well understood and shared idioms do not just appear on the scene. A significant body of work and experience is required before such idioms percolate down to the journeyman programmer. 
C++ reached that stage only circa 2005. (For a simple example, see how much work has been going on in the ipython-dev lists regarding asynchronous operations; this is not because asynchronous operations are not inherently difficult to understand - just that the standard idioms for handling asynchronous events are not yet commonly understood outside of a very small community (and even those idioms are still under refinement)). > C++ is unmaintainable without a strong set of coding rules, > which only really works in companies, or when you have an already > strong framework (in open source, it is quite striking that C++ is > seldom used, except for complex GUI programs). Of course you have coding rules, but you have such rules even in small C projects. Boost does not really having many coding rules other than naming conventions and boost is widely deployed. Please read the CERN ROOT information page for the reason they switched from Fortran to C++ (speed & scalability). C++ is not the best language for every task; my only claim was that C++ is just as good as Fortran for a lot of tasks and even better. After all, I participate in this list because I use python just as much as C++. > I have no reason to doubt your experience that template leads to > maintainable code - but it is exactly the contrary in my experience, > and often for code which is supposed to be state of the art (boost). This is the fundamental misunderstanding. People treat C++ as an extension of C and then templates tie them into knots. I had the very same problem until I used some functional languages (Common Lisp, in my case) and realized that C++ is an new object-oriented language than has certain C features. This coincided with the time I became frustrated with Fortran and wished that I had a hybrid between C & Lisp and then it became clear to me that C++ is very near that. > > First, Fortran, as I pointed out above, is generally worthless for a lot > > of computation-intensive problems that don't map to its native data > > types. > > > > Second, Fortran is not magic; it simply uses optimized libraries > > underneath and the speed of Fortran compiled code depends upon the > > libraries > > Part of the fortran speed comes from the fact that fortran does not > have pointer. Not true for Fortran95 as pointed out by Mattheiu & Sturla already. > I think something like eigen will not suit python developers much. > First, it has dreadful compilation time (like everything > template-based), and their performance numbers, I never could > reproduce them. I have never seen such a difference between MKL and > ATLAS as shown on their benchmark - since they don't give enough > information, it is hard to tell which atlas they used, but in my > experience, ATLAS (and of course MKL) was always much faster than > eigen, on both mac os X (with accelerate, which is mostly customized > atlas, at least at its code) and Linux, with the benchmark they > provide. At this point, I don't understand what they are measuring. I used to work for a certain major competitor to the producers of MKL. ATLAS cad FFTW can both be beaten by a significant margin. In fact, with a certain compiler from the major competitor and our own libraries, we could beat Fortran performance (from the same competitor's compiler) on L2 & L3 BLAS from C/C++/Fortran. > I also note that they are so much faster than blitz, which itself was > supposed to match fortran speed. 
This puzzles me as a fundamental > contradiction somewhere :) Never used blitz seriously because of the painful interface; so, no comment. > > Third, computation speed now on CotS processors depends more on cache & > > memory access optimization than anything else, which compilers can do > > with C/C++ just as well as with Fortran; > > No, they can't. At least in standard C++, you can't provide enough > informations about pointers. But even then, it is often only 2 or 3 > times slower - which rarely matters for scientific programming, except > for the biggest simulations. Unfortunately, at least in my line of work, these "biggest simulations" are very common ones. One example from my past is LDPC code searches, where sometimes one has to resort to using FPGAs when we could not speed up computations any more; the 3 months we lost programming the FPGAs were amply repaid within a few weeks. > But the point is that it is difficult for no reason but a dreadful > syntax. Something like eigen could be done in a higher level language. > To everyone his own interet, I guess, but I don't understand the joy > of spending time coding and debugging template code. It is just awful > - the compiler often cannot tell you even the line which has a syntax > error. I partly agree (and assert that you need to use better compilers, like Comeau). I wish it were possible to write DSELs easily in some other language (preferably some enhancement of OCaml), but I haven't yet found such a language that has sufficient mindshare in my area of work :-( > Something like fftw, wich a code generator written in a high level > language is a much better example of meta programming IMHO. It is > readable, flexible, and portable, at least in comparison to anything > C++ has to offer today. Completely agreed, but tool availability is a big problem. In my case, I quote the zen of python: practicality beats purity :-) and so I stick with C++/python. Just in case the main point was lost: (1) C++ does not fill every niche but has its place when used with Python. (2) Fortran is not a replacement for C++. Regards, Ravi From robert.kern at gmail.com Tue Jan 20 16:54:54 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 20 Jan 2009 15:54:54 -0600 Subject: [SciPy-user] optimize.leastsq In-Reply-To: <1cd32cbb0901200734j37e2e02i459d1b17610e24a2@mail.gmail.com> References: <4975B5CD.9060002@ast.cam.ac.uk> <1cd32cbb0901200734j37e2e02i459d1b17610e24a2@mail.gmail.com> Message-ID: <3d375d730901201354u5c0a3076g3380d4b78eb15e39@mail.gmail.com> On Tue, Jan 20, 2009 at 09:34, wrote: > On Tue, Jan 20, 2009 at 6:30 AM, David Trethewey wrote: >> I'm using the following code to fit a gaussian to a histogram of some data. 
>> >> #fit gaussian >> fitfunc = lambda p, x: (p[0]**2)*exp(-(x-p[1])**2/(2*p[2]**2)) # >> Target function >> errfunc = lambda p, x, y: fitfunc(p,x) -y # Distance to the >> target function >> doublegauss = lambda q,x: (q[0]**2)*exp(-(x-q[1])**2/(2*q[2]**2)) + >> (q[3]**2)*exp(-(x-q[4])**2/(2*q[5]**2)) >> doublegausserr = lambda q,x,y: doublegauss(q,x) - y >> # initial guess >> p0 = [10.0,-2,0.5] >> # find parameters of single gaussian >> p1,success = optimize.leastsq(errfunc, p0[:], args = >> (hista[1],hista[0])) >> errors_sq = errfunc(p1,hista[1],hista[0])**2 >> >> >> I have the error message >> >> Traceback (most recent call last): >> File "M31FeHfit_totalw108.py", line 116, in >> p1,success = optimize.leastsq(errfunc, p0[:], args = >> (hista[1],hista[0])) >> File "C:\Python25\lib\site-packages\scipy\optimize\minpack.py", line >> 264, in leastsq >> m = check_func(func,x0,args,n)[0] >> File "C:\Python25\lib\site-packages\scipy\optimize\minpack.py", line >> 11, in check_func >> res = atleast_1d(thefunc(*((x0[:numinputs],)+args))) >> File "M31FeHfit_totalw108.py", line 110, in >> errfunc = lambda p, x, y: fitfunc(p,x) -y # Distance to the >> target function >> ValueError: shape mismatch: objects cannot be broadcast to a single shape >> >> Anyone know why this happens? Curiously I have had it work before but >> not with my current versions of scipy and python etc. >> >> David > > Check the dimensions hista[1],hista[0]. I can run your part of the > code without problems. > > If you want to estimate the parameters of a (parametric) distribution, > then using maximum likelihood estimation would be more appropriate > than using least squares on the histogram. Right. You can't just take the value of the PDF and compare it to the (density-normalized) value of the histogram. You have to integrate the PDF over each bin and compare that value to the mass-normalized value of the histogram. Least-squares still isn't quite appropriate for this task, not least because the amount of weight that you should apply to each data point is non-uniform. If you are doing the histogramming yourself from the raw data, you might be better off doing a maximum likelihood fit on the raw data like the .fit() method of the rv_continuous distribution objects in scipy.stats. If the data you have is already pre-histogrammed or discretized, though, you need a different formulation of ML. For given parameters, integrate the PDF over the bins of your histogram. This will give you the probability of a single sample falling into each bin. If you have N samples from this distribution (N being the number of data points that went into the real histogram), this defines a multinomial distribution over the bins. You can evaluate the log-likelihood of getting your real histogram given those PDF parameters using the multinomial distribution. I've actually had a fair bit of success with the latter when estimating Weibull distributions where the typical techniques failed to be robust. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
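For concreteness, a minimal sketch of the binned maximum likelihood recipe described above (the sample data, the bin count and the use of optimize.fmin here are invented for illustration, not taken from the original code):

# ---------------------------------------------------------
import numpy as np
from scipy import stats, optimize

# fake raw data standing in for the real measurements
np.random.seed(0)
data = stats.norm.rvs(loc=-2.0, scale=0.5, size=1000)

# pretend that only the histogram is available
counts, edges = np.histogram(data, bins=30)

def negloglike(params):
    mu, sigma = params
    if sigma <= 0:
        return np.inf
    # integrate the PDF over each bin: probability of one sample per bin
    p = np.diff(stats.norm.cdf(edges, loc=mu, scale=sigma))
    p = np.clip(p, 1e-300, 1.0)  # guard against log(0)
    # multinomial log-likelihood of the observed counts, dropping the
    # combinatorial term, which is constant in (mu, sigma)
    return -np.sum(counts*np.log(p))

mu_hat, sigma_hat = optimize.fmin(negloglike, [0.0, 1.0])
print mu_hat, sigma_hat
# ---------------------------------------------------------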
From sturla at molden.no Tue Jan 20 17:36:30 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 20 Jan 2009 23:36:30 +0100 (CET) Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: <200901201615.17634.lists_ravi@lavabit.com> References: <200901200927.15074.lists_ravi@lavabit.com> <5b8d13220901201013v5bdfed50m380081a8abe0f69@mail.gmail.com> <200901201615.17634.lists_ravi@lavabit.com> Message-ID: <98d666ac9225d74a5a98e990031bf06c.squirrel@webmail.uio.no> > Of course you have coding rules, but you have such rules even in small C > projects. Boost does not really have many coding rules other than naming > conventions, and boost is widely deployed. Please read the CERN ROOT > information page for the reason they switched from Fortran to C++ (speed & > scalability). Actually they switched from "FORTRAN" (spelled with capitals) to C++ for maintainability. I am not sure if that means Fortran 77 or FORTRAN IV, but certainly not Fortran 90, 95 or 2003. The choice had just as much to do with abundance of qualified developers as merits of the languages. There is also a similar story of NASA, who tried to move spacecraft navigation code from Fortran 77 to C++ in 1996, and failed miserably. CERN ROOT is interesting though. It has a Python front end, and is LGPL licensed. For those who don't know, ROOT is a data analysis framework written for LHC (the new Doomsday machine), to deal with the enormous data sets it generates (I've heard it is about ~10 terabytes per run). But ROOT can be of general interest to scientists outside CERN as well. http://root.cern.ch/ Sturla Molden
From daniele at grinta.net Tue Jan 20 17:58:23 2009 From: daniele at grinta.net (Daniele Nicolodi) Date: Tue, 20 Jan 2009 23:58:23 +0100 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: <98d666ac9225d74a5a98e990031bf06c.squirrel@webmail.uio.no> References: <200901200927.15074.lists_ravi@lavabit.com> <5b8d13220901201013v5bdfed50m380081a8abe0f69@mail.gmail.com> <200901201615.17634.lists_ravi@lavabit.com> <98d666ac9225d74a5a98e990031bf06c.squirrel@webmail.uio.no> Message-ID: <4976570F.2050300@grinta.net> Sturla Molden wrote: > CERN ROOT is interesting though. It has a Python front end, and is LGPL > licensed. For those who don't know, ROOT is a data analysis framework > written for LHC (the new Doomsday machine), to deal with the enormous data > sets it generates (I've heard it is about ~10 terabytes per run). But ROOT > can be of general interest to scientists outside CERN as well. > http://root.cern.ch/ I had a short exposure to the ROOT codebase some years ago. While I recognise that the project probably reached its goals, despite those being quite ambitious, the quality of the API and of the code is far from perfect. The project suffers a lot from the choice of being developed in C++. At the time when it was started, C++ and its standard library were far from being standard across different compilers and platforms. For this reason a lot of wheels have been reinvented in ROOT. Judging from the outside I think that at the time the project started the only reason to use C++ was that it was the language chosen for teaching at university level courses. Using C++ it was possible to hire fairly inexpensive PhD students for the development... ROOT uses a C++ interpreter to offer something similar to an ipython or matlab console. It is simply a nightmare to work with.
And personally I think that the C++ data analysis routines a physicist can write in a hurry while working on an interesting experiment are probably the worst code you can find (and I can tell, being among the ones that wrote that kind of code...). Cheers. -- Daniele
From cournape at gmail.com Tue Jan 20 20:10:41 2009 From: cournape at gmail.com (David Cournapeau) Date: Wed, 21 Jan 2009 10:10:41 +0900 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: <200901201615.17634.lists_ravi@lavabit.com> References: <200901200927.15074.lists_ravi@lavabit.com> <5b8d13220901201013v5bdfed50m380081a8abe0f69@mail.gmail.com> <200901201615.17634.lists_ravi@lavabit.com> Message-ID: <5b8d13220901201710n7c3dd819w4c5f1288cf47d791@mail.gmail.com> On Wed, Jan 21, 2009 at 6:15 AM, Ravi wrote: > On Tuesday 20 January 2009 13:13:27 David Cournapeau wrote: [snip] >> C++ is unmaintainable without a strong set of coding rules, >> which only really works in companies, or when you have an already >> strong framework (in open source, it is quite striking that C++ is >> seldom used, except for complex GUI programs). > > Of course you have coding rules, but you have such rules even in small C > projects. Of course, you have some conventions, but when you compare PEP 7 or the Linux kernel coding standard for C with the C++ coding standards of mozilla, google and co, you see they are an order of magnitude simpler for C than for C++.
You could argue that the projects are much simpler, too :) > I partly agree (and assert that you need to use better compilers, like > Comeau). I wish it were possible to write DSELs easily in some other language > (preferably some enhancement of OCaml), but I haven't yet found such a > language that has sufficient mindshare in my area of work :-( Yes, this last parameter is almost always the one which matters the most at the end. Certainly, a big reason for C++ success was that it could capitalize on C mindshare. David
From david at ar.media.kyoto-u.ac.jp Tue Jan 20 21:28:07 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 21 Jan 2009 11:28:07 +0900 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: <98d666ac9225d74a5a98e990031bf06c.squirrel@webmail.uio.no> References: <200901200927.15074.lists_ravi@lavabit.com> <5b8d13220901201013v5bdfed50m380081a8abe0f69@mail.gmail.com> <200901201615.17634.lists_ravi@lavabit.com> <98d666ac9225d74a5a98e990031bf06c.squirrel@webmail.uio.no> Message-ID: <49768837.8090202@ar.media.kyoto-u.ac.jp> Sturla Molden wrote: > CERN ROOT is interesting though. It has a Python front end, and is LGPL > licensed. For those who don't know, ROOT is a data analysis framework > written for LHC (the new Doomsday machine), to deal with the enormous data > sets it generates (I've heard it is about ~10 terabytes per run). At the current stage of technology, dealing with this kind of data requires so much planning, work and so many tradeoffs that the choices made there simply cannot be used as a general rule. I mean, those projects are so big, with so many people involved, that trying to deduce anything worthwhile from their choice of language does not sound convincing at all; actually, I would not be surprised if technical matters such as language choice are of secondary interest/importance compared to things like what people in this community are familiar with, etc... It is like all those talks about Ada vs C vs whatever for reliable code - mostly conjecture to make the point people were intending to make anyway. It has been consistently shown that technical matters were just the symptoms of bigger organizational problems. On python ML, people throw at each other technical explanations about the failure of Ariane 5 - on a related problem space, I find the following much more eye-opening (ad-filled page): http://www.fastcompany.com/magazine/06/writestuff.html We are developers, so we like to think technology is what matters. I think we all know at some level it does not, but we just can't admit it :) I hate C++, but I know a lot of very fine software was written with it. A lot of working software is written on windows, with excel and visual basic. cheers, David
From warren.weckesser at gmail.com Wed Jan 21 00:00:18 2009 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Tue, 20 Jan 2009 23:00:18 -0600 Subject: [SciPy-user] integrate.odeint and simultaniuos equations In-Reply-To: <200658.60894.qm@web36501.mail.mud.yahoo.com> References: <200658.60894.qm@web36501.mail.mud.yahoo.com> Message-ID: <114880320901202100g5edfb9fbw1aa64f2efb7b843b@mail.gmail.com> Scott, I added an example with two degrees of freedom to the SciPy wiki, in this "cookbook" entry: http://www.scipy.org/Cookbook/CoupledSpringMassSystem A system with two degrees of freedom (and no constraints) will result in a four dimensional state space; you will have a system of four first order differential equations.
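As a concrete sketch of that reduction (this is not the cookbook code itself; the masses, spring constants and initial conditions below are invented for illustration):

# ---------------------------------------------------------
import numpy as np
from scipy import integrate

# state y = [x1, v1, x2, v2] for two masses coupled by springs
m1, m2 = 1.0, 1.5    # masses (made up)
k1, k2 = 8.0, 40.0   # spring constants (made up)

def deriv(y, t):
    x1, v1, x2, v2 = y
    a1 = (-k1*x1 + k2*(x2 - x1))/m1  # mass 1: wall spring plus coupling spring
    a2 = -k2*(x2 - x1)/m2            # mass 2: coupling spring only
    return [v1, a1, v2, a2]          # four first order equations

t = np.linspace(0., 10., 1001)
y0 = [0.5, 0.0, 2.25, 0.0]           # initial positions and velocities
y = integrate.odeint(deriv, y0, t)   # y has shape (len(t), 4)
# ---------------------------------------------------------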
This is what Bastian Weber pointed out at the end of his response to your first email about this. Best regards, Warren On Tue, Jan 20, 2009 at 6:34 AM, Scott Askey wrote: > Do ode and odeint work in multiple dimensions? > > I could not any examples with more than one degree of freedom. And from > the doc string it how to solve simultaneous ode's was not obvious. The > code for modelling a 2d simple harmonic oscillator or spherical pendulum > would give me the insight I need. > > I found and understand the following 1 D harmonic oscillator model from the > scipy cookbook. > > V/R > > Scott > > > > from scipy import * > from pylab import * > deriv = lambda y,t : array([y[1],-y[0]-.1*y[1]])#xdot,x2dot > # Integration parameters > start=0 > end=10 > numsteps=10000 > time=linspace(start,end,numsteps) > from scipy import integrate > y0=array([0.0005,0.2]) #x,x_dot > y=integrate.odeint(deriv,y0,time) > plot(time,y[:,0])#x,xdot > show() > > > > > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dlrt2 at ast.cam.ac.uk Wed Jan 21 03:55:38 2009 From: dlrt2 at ast.cam.ac.uk (David Trethewey) Date: Wed, 21 Jan 2009 08:55:38 +0000 Subject: [SciPy-user] optimize.leastsq In-Reply-To: <3d375d730901201354u5c0a3076g3380d4b78eb15e39@mail.gmail.com> References: <4975B5CD.9060002@ast.cam.ac.uk> <1cd32cbb0901200734j37e2e02i459d1b17610e24a2@mail.gmail.com> <3d375d730901201354u5c0a3076g3380d4b78eb15e39@mail.gmail.com> Message-ID: <4976E30A.7060904@ast.cam.ac.uk> Robert Kern wrote: > On Tue, Jan 20, 2009 at 09:34, wrote: > >> On Tue, Jan 20, 2009 at 6:30 AM, David Trethewey wrote: >> >>> I'm using the following code to fit a gaussian to a histogram of some data. >>> >>> #fit gaussian >>> fitfunc = lambda p, x: (p[0]**2)*exp(-(x-p[1])**2/(2*p[2]**2)) # >>> Target function >>> errfunc = lambda p, x, y: fitfunc(p,x) -y # Distance to the >>> target function >>> doublegauss = lambda q,x: (q[0]**2)*exp(-(x-q[1])**2/(2*q[2]**2)) + >>> (q[3]**2)*exp(-(x-q[4])**2/(2*q[5]**2)) >>> doublegausserr = lambda q,x,y: doublegauss(q,x) - y >>> # initial guess >>> p0 = [10.0,-2,0.5] >>> # find parameters of single gaussian >>> p1,success = optimize.leastsq(errfunc, p0[:], args = >>> (hista[1],hista[0])) >>> errors_sq = errfunc(p1,hista[1],hista[0])**2 >>> >>> >>> I have the error message >>> >>> Traceback (most recent call last): >>> File "M31FeHfit_totalw108.py", line 116, in >>> p1,success = optimize.leastsq(errfunc, p0[:], args = >>> (hista[1],hista[0])) >>> File "C:\Python25\lib\site-packages\scipy\optimize\minpack.py", line >>> 264, in leastsq >>> m = check_func(func,x0,args,n)[0] >>> File "C:\Python25\lib\site-packages\scipy\optimize\minpack.py", line >>> 11, in check_func >>> res = atleast_1d(thefunc(*((x0[:numinputs],)+args))) >>> File "M31FeHfit_totalw108.py", line 110, in >>> errfunc = lambda p, x, y: fitfunc(p,x) -y # Distance to the >>> target function >>> ValueError: shape mismatch: objects cannot be broadcast to a single shape >>> >>> Anyone know why this happens? Curiously I have had it work before but >>> not with my current versions of scipy and python etc. >>> >>> David >>> >> Check the dimensions hista[1],hista[0]. I can run your part of the >> code without problems. 
>> >> If you want to estimate the parameters of a (parametric) distribution, >> then using maximum likelihood estimation would be more appropriate >> than using least squares on the histogram. >> > > Right. You can't just take the value of the PDF and compare it to the > (density-normalized) value of the histogram. You have to integrate the > PDF over each bin and compare that value to the mass-normalized value > of the histogram. Least-squares still isn't quite appropriate for this > task, not least because the amount of weight that you should apply to > each data point is non-uniform. > > If you are doing the histogramming yourself from the raw data, you > might be better off doing a maximum likelihood fit on the raw data > like the .fit() method of the rv_continuous distribution objects in > scipy.tats. > > If the data you have is already pre-histogrammed or discretized, > though, you need a different formulation of ML. For given parameters, > integrate the PDF over the bins of your histogram. This will give you > the probability of a single sample falling into each bin. If you have > N samples from this distribution (N being the number of data points > that went into the real histogram), this defines a multinomial > distribution over the bins. You can evaluate the log-likelihood of > getting your real histogram given those PDF parameters using the > multinomial distribution. > > I've actually had a far bit of success with the latter when estimating > Weibull distributions when the typical techniques failed to be robust. > > I am doing the histogramming from the raw data, so sounds like a maximum likelihood fit would be better. What I have is a series of velocity and Fe/H measurements for a series of stars in the Andromeda galaxy, and the idea is to find a gaussian and double gaussian fit, and have a look to see whether the double gaussian is significantly better, to detect whether there are two distinct populations within the stars. David David From dlrt2 at ast.cam.ac.uk Wed Jan 21 05:02:14 2009 From: dlrt2 at ast.cam.ac.uk (David Trethewey) Date: Wed, 21 Jan 2009 10:02:14 +0000 Subject: [SciPy-user] optimize.leastsq In-Reply-To: <4976E30A.7060904@ast.cam.ac.uk> References: <4975B5CD.9060002@ast.cam.ac.uk> <1cd32cbb0901200734j37e2e02i459d1b17610e24a2@mail.gmail.com> <3d375d730901201354u5c0a3076g3380d4b78eb15e39@mail.gmail.com> <4976E30A.7060904@ast.cam.ac.uk> Message-ID: <4976F2A6.4040601@ast.cam.ac.uk> So what I'm trying to work out now is how to use the .fit() method of rv_continuous for a single gaussian and a double gaussian. David David Trethewey wrote: > Robert Kern wrote: > >> On Tue, Jan 20, 2009 at 09:34, wrote: >> >> >>> On Tue, Jan 20, 2009 at 6:30 AM, David Trethewey wrote: >>> >>> >>>> I'm using the following code to fit a gaussian to a histogram of some data. 
>>>> >>>> #fit gaussian >>>> fitfunc = lambda p, x: (p[0]**2)*exp(-(x-p[1])**2/(2*p[2]**2)) # >>>> Target function >>>> errfunc = lambda p, x, y: fitfunc(p,x) -y # Distance to the >>>> target function >>>> doublegauss = lambda q,x: (q[0]**2)*exp(-(x-q[1])**2/(2*q[2]**2)) + >>>> (q[3]**2)*exp(-(x-q[4])**2/(2*q[5]**2)) >>>> doublegausserr = lambda q,x,y: doublegauss(q,x) - y >>>> # initial guess >>>> p0 = [10.0,-2,0.5] >>>> # find parameters of single gaussian >>>> p1,success = optimize.leastsq(errfunc, p0[:], args = >>>> (hista[1],hista[0])) >>>> errors_sq = errfunc(p1,hista[1],hista[0])**2 >>>> >>>> >>>> I have the error message >>>> >>>> Traceback (most recent call last): >>>> File "M31FeHfit_totalw108.py", line 116, in >>>> p1,success = optimize.leastsq(errfunc, p0[:], args = >>>> (hista[1],hista[0])) >>>> File "C:\Python25\lib\site-packages\scipy\optimize\minpack.py", line >>>> 264, in leastsq >>>> m = check_func(func,x0,args,n)[0] >>>> File "C:\Python25\lib\site-packages\scipy\optimize\minpack.py", line >>>> 11, in check_func >>>> res = atleast_1d(thefunc(*((x0[:numinputs],)+args))) >>>> File "M31FeHfit_totalw108.py", line 110, in >>>> errfunc = lambda p, x, y: fitfunc(p,x) -y # Distance to the >>>> target function >>>> ValueError: shape mismatch: objects cannot be broadcast to a single shape >>>> >>>> Anyone know why this happens? Curiously I have had it work before but >>>> not with my current versions of scipy and python etc. >>>> >>>> David >>>> >>>> >>> Check the dimensions hista[1],hista[0]. I can run your part of the >>> code without problems. >>> >>> If you want to estimate the parameters of a (parametric) distribution, >>> then using maximum likelihood estimation would be more appropriate >>> than using least squares on the histogram. >>> >>> >> Right. You can't just take the value of the PDF and compare it to the >> (density-normalized) value of the histogram. You have to integrate the >> PDF over each bin and compare that value to the mass-normalized value >> of the histogram. Least-squares still isn't quite appropriate for this >> task, not least because the amount of weight that you should apply to >> each data point is non-uniform. >> >> If you are doing the histogramming yourself from the raw data, you >> might be better off doing a maximum likelihood fit on the raw data >> like the .fit() method of the rv_continuous distribution objects in >> scipy.tats. >> >> If the data you have is already pre-histogrammed or discretized, >> though, you need a different formulation of ML. For given parameters, >> integrate the PDF over the bins of your histogram. This will give you >> the probability of a single sample falling into each bin. If you have >> N samples from this distribution (N being the number of data points >> that went into the real histogram), this defines a multinomial >> distribution over the bins. You can evaluate the log-likelihood of >> getting your real histogram given those PDF parameters using the >> multinomial distribution. >> >> I've actually had a far bit of success with the latter when estimating >> Weibull distributions when the typical techniques failed to be robust. >> >> >> > I am doing the histogramming from the raw data, so sounds like a maximum > likelihood fit would be better. 
What I have is a series of velocity and > Fe/H measurements for a series of stars in the Andromeda galaxy, and the > idea is to find a gaussian and double gaussian fit, and have a look to > see whether the double gaussian is significantly better, to detect > whether there are two distinct populations within the stars. > > David > > David > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From bastian.weber at gmx-topmail.de Wed Jan 21 05:44:39 2009 From: bastian.weber at gmx-topmail.de (Bastian Weber) Date: Wed, 21 Jan 2009 11:44:39 +0100 Subject: [SciPy-user] integrate.odeint and simultaniuos equations In-Reply-To: <114880320901202100g5edfb9fbw1aa64f2efb7b843b@mail.gmail.com> References: <200658.60894.qm@web36501.mail.mud.yahoo.com> <114880320901202100g5edfb9fbw1aa64f2efb7b843b@mail.gmail.com> Message-ID: <4976FC97.40804@gmx-topmail.de> Hi Warren, > I added an example with two degrees of freedom to the SciPy wiki, in > this "cookbook" entry: > http://www.scipy.org/Cookbook/CoupledSpringMassSystem What a great job! I am really impressed. Btw: what does happen if you run that two_springs_plot.py on a machine without a working LaTex installation? I am curious because it uses $x_1$ in the legend but I could not find something like: matplotlib.rc('text', usetex=True). Best Regards, Bastian. From wbaxter at gmail.com Wed Jan 21 05:58:09 2009 From: wbaxter at gmail.com (Bill Baxter) Date: Wed, 21 Jan 2009 19:58:09 +0900 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: <200901201615.17634.lists_ravi@lavabit.com> References: <200901200927.15074.lists_ravi@lavabit.com> <5b8d13220901201013v5bdfed50m380081a8abe0f69@mail.gmail.com> <200901201615.17634.lists_ravi@lavabit.com> Message-ID: On Wed, Jan 21, 2009 at 6:15 AM, Ravi wrote: >> But the point is that it is difficult for no reason but a dreadful >> syntax. Something like eigen could be done in a higher level language. >> To everyone his own interet, I guess, but I don't understand the joy >> of spending time coding and debugging template code. It is just awful >> - the compiler often cannot tell you even the line which has a syntax >> error. > > I partly agree (and assert that you need to use better compilers, like > Comeau). I wish it were possible to write DSELs easily in some other language > (preferably some enhancement of OCaml), but I haven't yet found such a > language that has sufficient mindshare in my area of work :-( These days I use Python for stuff that doesn't need to run fast, and the D programming language for the rest. It would please me very much if I never had to write another line of C++ in all my living days. And that goes triple for C++ template code. Templates in D are a joy compared to C++ templates. They're actually usable for meta-programming without turning your code into a spaghetti mess of little helper structs and macros. D's also got built-in GC so you don't have to micromanage your memory. And it's got familiar syntax so you don't have to turn your brain inside out just to figure out how to iterate over a list. You can also call C or Fortran code directly just by rewriting the function prototypes (kinda like ctypes lets you do for python). I've been pretty happy with it. But it is still a little raw around the edges at times, as is probably the case with pretty much any non-mainstream language. 
--bb
From josef.pktd at gmail.com Wed Jan 21 06:58:49 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 21 Jan 2009 06:58:49 -0500 Subject: [SciPy-user] optimize.leastsq In-Reply-To: <4976F2A6.4040601@ast.cam.ac.uk> References: <4975B5CD.9060002@ast.cam.ac.uk> <1cd32cbb0901200734j37e2e02i459d1b17610e24a2@mail.gmail.com> <3d375d730901201354u5c0a3076g3380d4b78eb15e39@mail.gmail.com> <4976E30A.7060904@ast.cam.ac.uk> <4976F2A6.4040601@ast.cam.ac.uk> Message-ID: <1cd32cbb0901210358p2ffc39cewed32a77d97249f56@mail.gmail.com> On Wed, Jan 21, 2009 at 5:02 AM, David Trethewey wrote: > So what I'm trying to work out now is how to use the .fit() method of > rv_continuous for a single gaussian and a double gaussian. > > David > The maximum likelihood estimator for the single gaussian is given by the mean and variance of your data set, but also stats.norm.fit works well. Your double gaussian is a mixture of gaussians and is not directly available as a stats distribution. I wrote a subclass for this case as an example, but I have to find it later, and I didn't try out the fit method. Fitting mixtures of gaussians can also be done (in a more sophisticated way) with the EM algorithm in the learn scikits package. One more possibility, if you are not sure about the distributional assumption, is to use stats.kde, a gaussian kernel density estimator. For bimodal distributions the smoothing parameter has to be changed; you can find some examples on this mailing list. I'm not sure what to use or where to find a statistical test for the mixture versus unimodal distribution. Josef
From david at ar.media.kyoto-u.ac.jp Wed Jan 21 06:52:43 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 21 Jan 2009 20:52:43 +0900 Subject: [SciPy-user] optimize.leastsq In-Reply-To: <1cd32cbb0901210358p2ffc39cewed32a77d97249f56@mail.gmail.com> References: <4975B5CD.9060002@ast.cam.ac.uk> <1cd32cbb0901200734j37e2e02i459d1b17610e24a2@mail.gmail.com> <3d375d730901201354u5c0a3076g3380d4b78eb15e39@mail.gmail.com> <4976E30A.7060904@ast.cam.ac.uk> <4976F2A6.4040601@ast.cam.ac.uk> <1cd32cbb0901210358p2ffc39cewed32a77d97249f56@mail.gmail.com> Message-ID: <49770C8B.40504@ar.media.kyoto-u.ac.jp> josef.pktd at gmail.com wrote: > I'm not sure what to use or where to find a statistical test for the > mixture versus unimodal distribution. > A simple method is to use the Bayesian Information Criterion if you want to test a model with one Gaussian against two: you estimate the maximum likelihood estimator for both models, and compare the BIC for both sets of parameters. It is often used, at least in the machine learning community, to 'tweak' the number of Gaussians in your mixture. It is implemented in scikits.learn.machine.em. Note that BIC is not really a statistical test per se, though, and that it does not actually 'test' for unimodality (if you have an unimodal but very skewed distribution, for example, BIC will most likely tell you that the mixture with two components is 'best'). I don't know much about non parametric testing, so don't have anything to say if the data are significantly non Gaussian. David
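For a rough illustration of the BIC comparison described above (this is not the scikits.learn.machine.em API; the sample, the starting values and the use of optimize.fmin are made up), both models can be fit by maximum likelihood and k*log(n) - 2*log(L) compared directly:

# ---------------------------------------------------------
import numpy as np
from scipy import stats, optimize

# fake sample standing in for the real data: two normal populations
np.random.seed(1)
x = np.concatenate([stats.norm.rvs(loc=-2.0, scale=0.5, size=300),
                    stats.norm.rvs(loc=1.0, scale=0.8, size=200)])
n = len(x)

# model 1: a single gaussian; the ML estimates are just mean and std
mu, sig = x.mean(), x.std()
loglike1 = np.sum(np.log(stats.norm.pdf(x, loc=mu, scale=sig)))
bic1 = 2*np.log(n) - 2*loglike1  # k = 2 parameters

# model 2: mixture of two gaussians, fitted by brute force ML
def negloglike(p):
    w, mu1, s1, mu2, s2 = p
    if not 0.0 < w < 1.0 or s1 <= 0 or s2 <= 0:
        return np.inf
    dens = (w*stats.norm.pdf(x, loc=mu1, scale=s1) +
            (1.0 - w)*stats.norm.pdf(x, loc=mu2, scale=s2))
    return -np.sum(np.log(dens))

p_hat = optimize.fmin(negloglike, [0.5, -1.0, 1.0, 1.0, 1.0],
                      maxiter=2000, maxfun=4000)
bic2 = 5*np.log(n) + 2*negloglike(p_hat)  # k = 5 parameters

print bic1, bic2  # the model with the smaller BIC is preferred
# ---------------------------------------------------------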
From hep.sebastien.binet at gmail.com Wed Jan 21 07:54:50 2009 From: hep.sebastien.binet at gmail.com (Sebastien Binet) Date: Wed, 21 Jan 2009 13:54:50 +0100 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: <4976570F.2050300@grinta.net> References: <200901200927.15074.lists_ravi@lavabit.com> <98d666ac9225d74a5a98e990031bf06c.squirrel@webmail.uio.no> <4976570F.2050300@grinta.net> Message-ID: <200901211354.51064.binet@cern.ch> On Tuesday 20 January 2009 23:58:23 Daniele Nicolodi wrote: > Sturla Molden wrote: > > CERN ROOT is interesting though. It has a Python front end, and is LGPL > > licensed. For those who don't know, ROOT is a data analysis framework > > written for LHC (the new Doomsday machine), to deal with the enormous > > data sets it generates (I've heard it is about ~10 terabytes per run). > > But ROOT can be of general interest to scientists outside CERN as well. > > http://root.cern.ch/ > > I had a short exposure to the ROOT codebase some years ago. While I > recognise that the project probably reached its goals, despite those > being quite ambitious, the quality of the API and of the code is far > from perfect. being a user of ROOT as a core sw developer of one of the LHC experiments, I can't agree more. the ROOT team had great ideas, eg: generating and using reflection information to "automatically" persistify C++ objects. The main problem is that ROOT development started in the early 90's when C++ was still young, so there is a lot of cruft and esoteric (by today's C++ standards) code out there. Furthermore, CINT (the C/C++ interpreter) doesn't encourage good C++ writing so you end up with crappy code written by hurried physicists, sometimes even in production. Many physicists learned C++ with CINT as it is so easy (no compilation needed)... and they caught really REALLY bad habits. Not to mention all the corner cases, death traps and other surprises you can run into when using C++ or, at the other end of the spectrum, when people get carried away and try to use complicated idioms which (even when/if used right) will backfire b/c the new guy who has the pleasure to maintain that voodoo code will just make a total mess. Thanks to some "crazy" people ;) , we do have python bindings to ROOT so it eases the pain, but still: IMHO C++ is a bad choice and really hurts LHC software. No wonder the next big accelerator (ILC) mostly dropped C++ and went for Fortran (MonteCarlo code,...) +java (control framework). cheers, sebastien. -- ######################################### # Dr. Sebastien Binet # Laboratoire de l'Accelerateur Lineaire # Universite Paris-Sud XI # Batiment 200 # 91898 Orsay #########################################
From sturla at molden.no Wed Jan 21 08:04:19 2009 From: sturla at molden.no (Sturla Molden) Date: Wed, 21 Jan 2009 14:04:19 +0100 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: References: <200901200927.15074.lists_ravi@lavabit.com> <5b8d13220901201013v5bdfed50m380081a8abe0f69@mail.gmail.com> <200901201615.17634.lists_ravi@lavabit.com> Message-ID: <49771D53.5030504@molden.no> On 1/21/2009 11:58 AM, Bill Baxter wrote: > These days I use Python for stuff that doesn't need to run fast, and > the D programming language for the rest. It would please me very much > if I never had to write another line of C++ in all my living days. I wish I had never known C++, because then I would not have wasted so much time learning and using it.
If I need speed, I will henceforth resort to Fortran and interface with f2py. With the ISO C bindings of Fortran 2003 (similar to ctypes), C libraries can be called from Fortran with very little effort. S.M. From warren.weckesser at gmail.com Wed Jan 21 08:28:26 2009 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Wed, 21 Jan 2009 07:28:26 -0600 Subject: [SciPy-user] integrate.odeint and simultaniuos equations In-Reply-To: <4976FC97.40804@gmx-topmail.de> References: <200658.60894.qm@web36501.mail.mud.yahoo.com> <114880320901202100g5edfb9fbw1aa64f2efb7b843b@mail.gmail.com> <4976FC97.40804@gmx-topmail.de> Message-ID: <114880320901210528h50673908o5eae4bddb88dd0ec@mail.gmail.com> Hi Bastian, On Wed, Jan 21, 2009 at 4:44 AM, Bastian Weber wrote: > Hi Warren, > > > I added an example with two degrees of freedom to the SciPy wiki, in > > this "cookbook" entry: > > http://www.scipy.org/Cookbook/CoupledSpringMassSystem > > > What a great job! I am really impressed. Thanks! > > > Btw: what does happen if you run that two_springs_plot.py on a machine > without a working LaTex installation? I am curious because it uses $x_1$ > in the legend but I could not find something like: > > matplotlib.rc('text', usetex=True). > It should still work--matplotlib has its own TeX renderer: http://matplotlib.sourceforge.net/users/mathtext.html > Best Regards, > Bastian. > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nwagner at iam.uni-stuttgart.de Wed Jan 21 11:26:40 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Wed, 21 Jan 2009 17:26:40 +0100 Subject: [SciPy-user] integrate.odeint and simultaniuos equations In-Reply-To: <4976FC97.40804@gmx-topmail.de> References: <200658.60894.qm@web36501.mail.mud.yahoo.com> <114880320901202100g5edfb9fbw1aa64f2efb7b843b@mail.gmail.com> <4976FC97.40804@gmx-topmail.de> Message-ID: On Wed, 21 Jan 2009 11:44:39 +0100 Bastian Weber wrote: > Hi Warren, > >> I added an example with two degrees of freedom to the >>SciPy wiki, in >> this "cookbook" entry: >> http://www.scipy.org/Cookbook/CoupledSpringMassSystem > > > What a great job! I am really impressed. > > Btw: what does happen if you run that >two_springs_plot.py on a machine > without a working LaTex installation? I am curious >because it uses $x_1$ > in the legend but I could not find something like: > > matplotlib.rc('text', usetex=True). > > > Best Regards, > Bastian. > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user Hi Bastian, if you are interested in general MDOF systems (n > 2) you can also try the attached example. Cheers Nils -------------- next part -------------- A non-text attachment was scrubbed... 
Name: mdof.py Type: text/x-python Size: 1781 bytes Desc: not available URL: From dlrt2 at ast.cam.ac.uk Wed Jan 21 12:17:59 2009 From: dlrt2 at ast.cam.ac.uk (David Trethewey) Date: Wed, 21 Jan 2009 17:17:59 +0000 Subject: [SciPy-user] optimize.leastsq In-Reply-To: <1cd32cbb0901210358p2ffc39cewed32a77d97249f56@mail.gmail.com> References: <4975B5CD.9060002@ast.cam.ac.uk> <1cd32cbb0901200734j37e2e02i459d1b17610e24a2@mail.gmail.com> <3d375d730901201354u5c0a3076g3380d4b78eb15e39@mail.gmail.com> <4976E30A.7060904@ast.cam.ac.uk> <4976F2A6.4040601@ast.cam.ac.uk> <1cd32cbb0901210358p2ffc39cewed32a77d97249f56@mail.gmail.com> Message-ID: <497758C7.2070803@ast.cam.ac.uk> How exactly would the EM algorithm be used? The homepage http://pypi.python.org/pypi/scikits.learn seems to be down at the moment. David josef.pktd at gmail.com wrote: > On Wed, Jan 21, 2009 at 5:02 AM, David Trethewey wrote: > >> So what I'm trying to work out now is how to use the .fit() method of >> rv_continuous for a single gaussian and a double gaussian. >> >> David >> >> > > The maximum likelihood estimator for the single gaussian is given by > the mean and variance of your data set, but also stats.norm.fit works > well. > > Your double gaussian is a mixture of gaussians and is not directly in > stats distribution. I wrote a subclass for this case as an example, > but I have to find it later, and I didn't try out the fit method. > Fitting mixtures of gaussians can also be done (in a more > sophisticated way) with the EM algorithm in the learn scikits package. > > One more possibility, if you are not sure about the distributional > assumption is to use stats.kde, a gaussian kernel density estimation. > For bimodal distributions the smoothing parameter has to be changed, > you find some examples in this mailing list. > > I'm not sure what to use or where to find a statistical test, for the > mixture versus unimodal distribution. > > Josef > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From bsouthey at gmail.com Wed Jan 21 12:27:39 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Wed, 21 Jan 2009 11:27:39 -0600 Subject: [SciPy-user] optimize.leastsq In-Reply-To: <4976E30A.7060904@ast.cam.ac.uk> References: <4975B5CD.9060002@ast.cam.ac.uk> <1cd32cbb0901200734j37e2e02i459d1b17610e24a2@mail.gmail.com> <3d375d730901201354u5c0a3076g3380d4b78eb15e39@mail.gmail.com> <4976E30A.7060904@ast.cam.ac.uk> Message-ID: <49775B0B.3020605@gmail.com> David Trethewey wrote: > Robert Kern wrote: > >> On Tue, Jan 20, 2009 at 09:34, wrote: >> >> >>> On Tue, Jan 20, 2009 at 6:30 AM, David Trethewey wrote: >>> >>> >>>> I'm using the following code to fit a gaussian to a histogram of some data. 
>>>> >>>> #fit gaussian >>>> fitfunc = lambda p, x: (p[0]**2)*exp(-(x-p[1])**2/(2*p[2]**2)) # >>>> Target function >>>> errfunc = lambda p, x, y: fitfunc(p,x) -y # Distance to the >>>> target function >>>> doublegauss = lambda q,x: (q[0]**2)*exp(-(x-q[1])**2/(2*q[2]**2)) + >>>> (q[3]**2)*exp(-(x-q[4])**2/(2*q[5]**2)) >>>> doublegausserr = lambda q,x,y: doublegauss(q,x) - y >>>> # initial guess >>>> p0 = [10.0,-2,0.5] >>>> # find parameters of single gaussian >>>> p1,success = optimize.leastsq(errfunc, p0[:], args = >>>> (hista[1],hista[0])) >>>> errors_sq = errfunc(p1,hista[1],hista[0])**2 >>>> >>>> >>>> I have the error message >>>> >>>> Traceback (most recent call last): >>>> File "M31FeHfit_totalw108.py", line 116, in >>>> p1,success = optimize.leastsq(errfunc, p0[:], args = >>>> (hista[1],hista[0])) >>>> File "C:\Python25\lib\site-packages\scipy\optimize\minpack.py", line >>>> 264, in leastsq >>>> m = check_func(func,x0,args,n)[0] >>>> File "C:\Python25\lib\site-packages\scipy\optimize\minpack.py", line >>>> 11, in check_func >>>> res = atleast_1d(thefunc(*((x0[:numinputs],)+args))) >>>> File "M31FeHfit_totalw108.py", line 110, in >>>> errfunc = lambda p, x, y: fitfunc(p,x) -y # Distance to the >>>> target function >>>> ValueError: shape mismatch: objects cannot be broadcast to a single shape >>>> >>>> Anyone know why this happens? Curiously I have had it work before but >>>> not with my current versions of scipy and python etc. >>>> >>>> David >>>> >>>> >>> Check the dimensions hista[1],hista[0]. I can run your part of the >>> code without problems. >>> >>> If you want to estimate the parameters of a (parametric) distribution, >>> then using maximum likelihood estimation would be more appropriate >>> than using least squares on the histogram. >>> >>> >> Right. You can't just take the value of the PDF and compare it to the >> (density-normalized) value of the histogram. You have to integrate the >> PDF over each bin and compare that value to the mass-normalized value >> of the histogram. Least-squares still isn't quite appropriate for this >> task, not least because the amount of weight that you should apply to >> each data point is non-uniform. >> >> If you are doing the histogramming yourself from the raw data, you >> might be better off doing a maximum likelihood fit on the raw data >> like the .fit() method of the rv_continuous distribution objects in >> scipy.tats. >> >> If the data you have is already pre-histogrammed or discretized, >> though, you need a different formulation of ML. For given parameters, >> integrate the PDF over the bins of your histogram. This will give you >> the probability of a single sample falling into each bin. If you have >> N samples from this distribution (N being the number of data points >> that went into the real histogram), this defines a multinomial >> distribution over the bins. You can evaluate the log-likelihood of >> getting your real histogram given those PDF parameters using the >> multinomial distribution. >> >> I've actually had a far bit of success with the latter when estimating >> Weibull distributions when the typical techniques failed to be robust. >> >> >> > I am doing the histogramming from the raw data, so sounds like a maximum > likelihood fit would be better. 
What I have is a series of velocity and > Fe/H measurements for a series of stars in the Andromeda galaxy, and the > idea is to find a gaussian and double gaussian fit, and have a look to > see whether the double gaussian is significantly better, to detect > whether there are two distinct populations within the stars. > > David > > > Being out of my area, my question is: what is the reasoning for needing a double gaussian fit? As Josef said, you can fit a mixture model (http://en.wikipedia.org/wiki/Mixture_model), in which case you can construct a test based on treating the single gaussian as a special case with one mixture component. You can use something like BIC (http://en.wikipedia.org/wiki/Bayesian_information_criterion) to compare the two while allowing for the difference in the number of parameters. Note the assumptions of the likelihood ratio test may not apply. Alternatively, you can model heterogeneous variance with a mixed model (http://en.wikipedia.org/wiki/Mixed_model); this approach is very flexible, e.g. modeling that different types of stars have different variances. Also you can allow for non-gaussian models with the above as well... Bruce
From josef.pktd at gmail.com Wed Jan 21 12:29:19 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 21 Jan 2009 12:29:19 -0500 Subject: [SciPy-user] optimize.leastsq In-Reply-To: <497758C7.2070803@ast.cam.ac.uk> References: <4975B5CD.9060002@ast.cam.ac.uk> <1cd32cbb0901200734j37e2e02i459d1b17610e24a2@mail.gmail.com> <3d375d730901201354u5c0a3076g3380d4b78eb15e39@mail.gmail.com> <4976E30A.7060904@ast.cam.ac.uk> <4976F2A6.4040601@ast.cam.ac.uk> <1cd32cbb0901210358p2ffc39cewed32a77d97249f56@mail.gmail.com> <497758C7.2070803@ast.cam.ac.uk> Message-ID: <1cd32cbb0901210929m47cf9a82k4ed16dfdabb2ce1@mail.gmail.com> On Wed, Jan 21, 2009 at 12:17 PM, David Trethewey wrote: > How exactly would the EM algorithm be used? The homepage > http://pypi.python.org/pypi/scikits.learn seems to be down at the moment. > see: http://www.ar.media.kyoto-u.ac.jp/members/david/softwares/em/index.html Note, when building the learn scikit, I always comment out manifold learning in the setup.py since it seems to require boost, which I don't have. The rest builds without problems. Josef
From cmac at mit.edu Wed Jan 21 12:44:02 2009 From: cmac at mit.edu (Christopher W. MacMinn) Date: Wed, 21 Jan 2009 12:44:02 -0500 Subject: [SciPy-user] integrate.odeint and event handling Message-ID: <08883C78-3BD8-41F8-B51D-19DF32DA8427@mit.edu> Hello - I was wondering if integrate.odeint offers any event handling capabilities. For example, say that I want to solve the simple ODE df/dt=f-2 with f(t=0)=1. Also, say that I want to stop the integration when f=0, maybe because I don't care about negative values of f, or maybe because what I really want to know is the value of t when f=0. (The analytical solution is f(t) = 2-exp(t), and f=0 at t=ln(2).) The MATLAB code below will produce the desired solution. It will stop integration when f=0, and I believe it will also integrate with some care near f=0. Additionally, MATLAB returns the vector of times at which the solution is evaluated, so I can easily grab the value of t when f=0.
% ---------------------------------------------------------
function [ts,fs] = ode_with_events()

    function dfdt = df(t,f)
        dfdt = f-2.;
    end

    function [value,isterminal,direction] = df_events(t,f)
        value = f;
        isterminal = 1;
        direction = 0;
    end

f0 = 1.;
t0 = 0.;
t_max = 5.;
options = odeset('events',@df_events);
[ts,fs] = ode45(@df,[t0,t_max],f0,options);

end
% ---------------------------------------------------------

The Python code below integrates the ODE just fine, but is there a way to get the "event" functionality described above?

# ---------------------------------------------------------
import numpy as np
from scipy import integrate

def ode_with_events():

    def df(f,t):
        return f-2.

    f0 = 1.
    t0 = 0.
    t_max = 5.
    ts = np.linspace(t0,t_max,100)
    fs = integrate.odeint(df,f0,ts)

    return ts,fs
# ---------------------------------------------------------

Thanks! Best, Chris
From rob.clewley at gmail.com Wed Jan 21 13:01:15 2009 From: rob.clewley at gmail.com (Rob Clewley) Date: Wed, 21 Jan 2009 13:01:15 -0500 Subject: [SciPy-user] integrate.odeint and event handling In-Reply-To: <08883C78-3BD8-41F8-B51D-19DF32DA8427@mit.edu> References: <08883C78-3BD8-41F8-B51D-19DF32DA8427@mit.edu> Message-ID: Chris, On Wed, Jan 21, 2009 at 12:44 PM, Christopher W. MacMinn wrote: > I was wondering if integrate.odeint offers any event handling > capabilities. No it doesn't, but you should try PyDSTool. Examples of simple event detection are in the PyDSTool/tests/ directory, and a description of the API and implementation on the wiki page http://www.cam.cornell.edu/~rclewley/cgi-bin/moin.cgi/Events and others linked. Also, see the recent thread on this list http://www.nabble.com/Event-handling-in-odeint-td20306029.html -Rob
From warren.weckesser at enthought.com Wed Jan 21 13:18:07 2009 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Wed, 21 Jan 2009 12:18:07 -0600 Subject: [SciPy-user] integrate.odeint and event handling In-Reply-To: <08883C78-3BD8-41F8-B51D-19DF32DA8427@mit.edu> References: <08883C78-3BD8-41F8-B51D-19DF32DA8427@mit.edu> Message-ID: <497766DF.5040305@enthought.com> Christopher W. MacMinn wrote: > Hello - > > I was wondering if integrate.odeint offers any event handling > capabilities. > > > > Thanks! > > Best, Chris > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > > Hi Chris, As Rob Clewley pointed out, odeint does not provide event detection. I don't think SciPy's ode class does, either. Rob's PyDSTool is one alternative (and it provides a lot of other nice tools to go along with the ODE solver); another is PySUNDIALS, as mentioned in the thread to which Rob provided a link. odeint is a wrapper for the LSODA solver in the Fortran ODEPACK library. This library also includes LSODAR, which is LSODA with root-finding (aka event detection). Does anyone want to take a stab at wrapping LSODAR? The wrapping of LSODA with odeint provides a good starting point, and an ODE solver with root-finding would be a great addition to SciPy. Warren -- Warren Weckesser Enthought, Inc. 515 Congress Avenue, Suite 2100 Austin, TX 78701 512-536-1057 x249
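Until something like LSODAR is wrapped, one workaround is to locate the event by bisection on top of plain odeint. A rough sketch using the df from the original post (the tolerance and the sign-change test below are made up for illustration; this refines the crossing time by repeated short integrations, which a real root-finding integrator does far more efficiently):

# ---------------------------------------------------------
import numpy as np
from scipy import integrate

def df(f, t):
    return f - 2.

def event(f):  # the event fires where this changes sign
    return f

f0, t0, t_max = 1., 0., 5.
ts = np.linspace(t0, t_max, 100)
fs = integrate.odeint(df, f0, ts)[:, 0]

# find the first step on which the event function changes sign
idx = np.nonzero(np.sign(event(fs[1:])) != np.sign(event(fs[:-1])))[0]
if len(idx) > 0:
    i = idx[0]
    a, b, fa = ts[i], ts[i+1], fs[i]
    # bisect, re-integrating over the shrinking bracket each time
    while b - a > 1e-8:
        mid = 0.5*(a + b)
        fmid = integrate.odeint(df, fa, [a, mid])[1, 0]
        if np.sign(event(fmid)) == np.sign(event(fa)):
            a, fa = mid, fmid
        else:
            b = mid
    print 0.5*(a + b), np.log(2.)  # estimated event time vs. the exact ln(2)
# ---------------------------------------------------------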
From rob.clewley at gmail.com Wed Jan 21 13:22:13 2009 From: rob.clewley at gmail.com (Rob Clewley) Date: Wed, 21 Jan 2009 13:22:13 -0500 Subject: [SciPy-user] integrate.odeint and event handling In-Reply-To: <497766DF.5040305@enthought.com> References: <08883C78-3BD8-41F8-B51D-19DF32DA8427@mit.edu> <497766DF.5040305@enthought.com> Message-ID: > odeint is a wrapper for the LSODA solver in the Fortran ODEPACK > library. This library also includes LSODAR, which is LSODA with > root-finding (aka event detection). Does anyone want to take a stab at > wrapping LSODAR? The wrapping of LSODA with odeint provides a good > starting point, and an ODE solver with root-finding would be a great > addition to SciPy. > > Warren Ryan Gutenkunst already wrapped it while working on the SloppyCell package. See http://osdir.com/ml/python.scientific.devel/2005-07/msg00028.html with a link there to the code. I've never tried it myself or even looked at it, FYI :) -Rob
From rob.clewley at gmail.com Wed Jan 21 13:26:28 2009 From: rob.clewley at gmail.com (Rob Clewley) Date: Wed, 21 Jan 2009 13:26:28 -0500 Subject: [SciPy-user] integrate.odeint and event handling In-Reply-To: References: <08883C78-3BD8-41F8-B51D-19DF32DA8427@mit.edu> <497766DF.5040305@enthought.com> Message-ID: On Wed, Jan 21, 2009 at 1:22 PM, Rob Clewley wrote: >> odeint is a wrapper for the LSODA solver in the Fortran ODEPACK >> library. This library also includes LSODAR, which is LSODA with >> root-finding (aka event detection). Does anyone want to take a stab at >> wrapping LSODAR? The wrapping of LSODA with odeint provides a good >> starting point, and an ODE solver with root-finding would be a great >> addition to SciPy. >> >> Warren > > Ryan Gutenkunst already wrapped it while working on the SloppyCell package. See > > http://osdir.com/ml/python.scientific.devel/2005-07/msg00028.html > > with a link there to the code. I've never tried it myself or even > looked at it, FYI :) > -Rob > PS There's some mention of Ryan's lsodar.pyf in the trunk of scipy SVN, as per projects.scipy.org/scipy/scipy/browser/trunk/scipy/integrate/setup.py?rev=4763 but I don't know if it's still there. If it is, is the associated pyd now shipped with Scipy? I haven't installed a new version for months. -Rob
From nwagner at iam.uni-stuttgart.de Wed Jan 21 13:30:50 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Wed, 21 Jan 2009 19:30:50 +0100 Subject: [SciPy-user] integrate.odeint and event handling In-Reply-To: References: <08883C78-3BD8-41F8-B51D-19DF32DA8427@mit.edu> <497766DF.5040305@enthought.com> Message-ID: On Wed, 21 Jan 2009 13:26:28 -0500 Rob Clewley wrote: > On Wed, Jan 21, 2009 at 1:22 PM, Rob Clewley wrote: >>> odeint is a wrapper for the LSODA solver in the Fortran ODEPACK >>> library. This library also includes LSODAR, which is LSODA with >>> root-finding (aka event detection). Does anyone want to take a stab at >>> wrapping LSODAR? The wrapping of LSODA with odeint provides a good >>> starting point, and an ODE solver with root-finding would be a great >>> addition to SciPy. >>> >>> Warren >> >> Ryan Gutenkunst already wrapped it while working on the SloppyCell package. See >> >> http://osdir.com/ml/python.scientific.devel/2005-07/msg00028.html >> >> with a link there to the code.
I've never tried it >>myself or even >> looked at it, FYI :) >> -Rob >> > > PS There's some mention of Ryan's lsodar.pyf in the >trunk of scipy SVN, as per > > projects.scipy.org/scipy/scipy/browser/trunk/scipy/integrate/setup.py?rev=4763 > > but I don't know if it's still there. If it is, is the >associated pyd > now shipped with Scipy? I haven't installed a new >version for months. > -Rob > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user It might be a good addition for scikits.odes ? Nils From tomo.bbe at gmail.com Wed Jan 21 15:23:36 2009 From: tomo.bbe at gmail.com (James) Date: Wed, 21 Jan 2009 20:23:36 +0000 Subject: [SciPy-user] integrate.odeint and event handling In-Reply-To: <08883C78-3BD8-41F8-B51D-19DF32DA8427@mit.edu> References: <08883C78-3BD8-41F8-B51D-19DF32DA8427@mit.edu> Message-ID: <5a757d050901211223l7c7a2cd0ib12cc3258fe19e56@mail.gmail.com> Chris, The way I would go about this is more akin to how you would have to use ODEPACK in Fortran. The odeint function takes a list of output timesteps, but the Fortran routine is called once for each desired output and uses the previous call as the initial conditions for the next. So your example would read something like... (if not tested this btw...) # ------------------------------ import numpy as np from scipy import integrate def df(f,t): return f-2. def df_stop(f,t): return f < 0.0 f0 = 1. t0 = 0. t_max = 5. nout = 100 ts = np.linspace(t0,t_max,nout) fs = [f0,] df_continue = True i = 0 while df_continue: f = integrate.odeint(df,fs[i],[ts[i],ts[i+1]]) i+=1 if i==nout-1: df_continue = False elif df_stop(f[1][0],ts[i+1]): df_continue = False else: fs.append( f[1][0] ) fs = np.array( fs ) # ------------------------------ > > You could probably integrate the output time conditions into dt_stop by using a fixed timestep to make it a bit cleaner. Cheers, James On Wed, Jan 21, 2009 at 5:44 PM, Christopher W. MacMinn wrote: > Hello - > > I was wondering if integrate.odeint offers any event handling > capabilities. > > For example, say that I want to solve the simple ODE df/dt=f-2 with > f(t=0)=1. Also, say that I want to stop the integration when f=0, > maybe because I don't care about negative values of f, or maybe > because what I really want to know is the value of t when f=0. (The > analytical solution is f(t) = 2-exp(t), and f=0 at t=ln(2).) > > The MATLAB code below will produce the desired solution. It will stop > integration when f=0, and I believe it will also integrate with some > care near f=0. Additionally, MATLAB returns the vector of times at > which the solution is evaluated, so I can easily grab the value of t > when f=0. > > % --------------------------------------------------------- > function [ts,fs] = ode_with_events() > > function dfdt = df(t,f) > dfdt = f-2.; > end > > function [value,isterminal,direction] = df_events(t,f) > value = f; > isterminal = 1; > direction = 0; > end > > f0 = 1.; > t0 = 0.; > t_max = 5.; > options = odeset('events', at df_events); > [ts,fs] = ode45(@df,[t0,t_max],f0,options); > > end > % --------------------------------------------------------- > > > The Python code below integrates the ODE just fine, but is there a way > to get the "event" functionality described above? > > # --------------------------------------------------------- > import numpy as np > from scipy import integrate > > def ode_with_events(): > > def df(f,t): > return f-2. > > f0 = 1. > t0 = 0. > t_max = 5. 
> ts = np.linspace(t0,t_max,100) > fs = integrate.odeint(df,f0,ts) > > return ts,fs > # --------------------------------------------------------- > > > Thanks! > > Best, Chris > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rob.clewley at gmail.com Wed Jan 21 15:33:26 2009 From: rob.clewley at gmail.com (Rob Clewley) Date: Wed, 21 Jan 2009 15:33:26 -0500 Subject: [SciPy-user] integrate.odeint and event handling In-Reply-To: <5a757d050901211223l7c7a2cd0ib12cc3258fe19e56@mail.gmail.com> References: <08883C78-3BD8-41F8-B51D-19DF32DA8427@mit.edu> <5a757d050901211223l7c7a2cd0ib12cc3258fe19e56@mail.gmail.com> Message-ID: Let's be clear about the expected functionality of this posted code... > So your example would read something like... (I've not tested this btw...) > > # ------------------------------ > import numpy as np > from scipy import integrate > > def df(f,t): > return f-2. > > def df_stop(f,t): > return f < 0.0 > > f0 = 1. > t0 = 0. > t_max = 5. > nout = 100 > ts = np.linspace(t0,t_max,nout) > > > fs = [f0,] > df_continue = True > i = 0 > while df_continue: > f = integrate.odeint(df,fs[i],[ts[i],ts[i+1]]) > i+=1 > if i==nout-1: > df_continue = False > elif df_stop(f[1][0],ts[i+1]): > df_continue = False > else: > fs.append( f[1][0] ) > > fs = np.array( fs ) > > This won't stop integration at the actual time that the event occurred (the OP said he wants to stop when f=0 and I am assuming he means to some significant accuracy) - it only stops at some time after the event occurred, up to an error of the fixed step size. The whole point of the lsodar and pydstool routines is to be able to have an integration that stops precisely when an event occurs, up to a predetermined error tolerance. In this code, you would have to re-integrate between the last two time points (the one before and the one after the event) at much smaller time steps to discover where the event is more accurately. This is efficiently done in the other codes.
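Just to sketch what that refinement step would look like in plain scipy for the OP's example -- a scalar root-find on f(t), bracketed by the two fixed-step outputs that straddle the sign change (untested, and still much less efficient than event detection inside the solver itself):

# ------------------------------
import numpy as np
from scipy import integrate, optimize

def df(f, t):
    return f - 2.

def f_of_t(t):
    # value of f at time t, re-integrating from the initial condition f(0)=1
    return integrate.odeint(df, 1., [0., t])[1][0]

# suppose the fixed-step loop saw the sign change between these two outputs:
t_lo, t_hi = 0.65, 0.70
t_event = optimize.brentq(f_of_t, t_lo, t_hi)   # -> 0.6931... = ln(2)
# ------------------------------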
-Rob From tomo.bbe at gmail.com Wed Jan 21 15:44:23 2009 From: tomo.bbe at gmail.com (James) Date: Wed, 21 Jan 2009 20:44:23 +0000 Subject: [SciPy-user] integrate.odeint and event handling In-Reply-To: References: <08883C78-3BD8-41F8-B51D-19DF32DA8427@mit.edu> <5a757d050901211223l7c7a2cd0ib12cc3258fe19e56@mail.gmail.com> Message-ID: <5a757d050901211244i3859783fue3d71f1ea767be7f@mail.gmail.com> A good point, well made. I naively thought the OP just wanted to stop computation once f had gone negative, but thinking about it I suppose that is pretty pointless if you can't figure that point in time out accurately or efficiently. Sorry for the reading comprehension failure. On Wed, Jan 21, 2009 at 8:33 PM, Rob Clewley wrote: > Let's be clear about the expected functionality of this posted code... > > > So your example would read something like... (I've not tested this btw...) > > > > # ------------------------------ > > import numpy as np > > from scipy import integrate > > > > def df(f,t): > > return f-2. > > > > def df_stop(f,t): > > return f < 0.0 > > > > f0 = 1. > > t0 = 0. > > t_max = 5. > > nout = 100 > > ts = np.linspace(t0,t_max,nout) > > > > > > fs = [f0,] > > df_continue = True > > i = 0 > > while df_continue: > > f = integrate.odeint(df,fs[i],[ts[i],ts[i+1]]) > > i+=1 > > if i==nout-1: > > df_continue = False > > elif df_stop(f[1][0],ts[i+1]): > > df_continue = False > > else: > > fs.append( f[1][0] ) > > > > fs = np.array( fs ) > > > > > > This won't stop integration at the actual time that the event occurred > (the OP said he wants to stop when f=0 and I am assuming he means to > some significant accuracy) - it only stops at some time after the > event occurred, up to an error of the fixed step size. The whole point > of the lsodar and pydstool routines is to be able to have an > integration that stops precisely when an event occurs, up to a > predetermined error tolerance. In this code, you would have to > re-integrate between the last two time points (the one before and the > one after the event) at much smaller time steps to discover where the > event is more accurately. This is efficiently done in the other codes. > > -Rob > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ellisonbg.net at gmail.com Wed Jan 21 23:33:30 2009 From: ellisonbg.net at gmail.com (Brian Granger) Date: Wed, 21 Jan 2009 20:33:30 -0800 Subject: [SciPy-user] Build problems on OS X, 10.5 with g95 Message-ID: <6ce0ac130901212033n7d8c0bc9t478deff6e57367e2@mail.gmail.com> I am trying to build the latest 0.7 with g95 on OS X and get the following failure: A few minutes worth of stuff... then... "_PyErr_SetString", referenced from: _int_from_pyobj in _fftpackmodule.o _f2py_rout__fftpack_zfft in _fftpackmodule.o _f2py_rout__fftpack_zfft in _fftpackmodule.o _f2py_rout__fftpack_zfft in _fftpackmodule.o _f2py_rout__fftpack_drfft in _fftpackmodule.o _f2py_rout__fftpack_drfft in _fftpackmodule.o _f2py_rout__fftpack_drfft in _fftpackmodule.o _f2py_rout__fftpack_zrfft in _fftpackmodule.o _f2py_rout__fftpack_zrfft in _fftpackmodule.o _f2py_rout__fftpack_zrfft in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _init_fftpack in _fftpackmodule.o _array_from_pyobj in fortranobject.o _fortran_setattr in fortranobject.o _fortran_setattr in fortranobject.o "_PyExc_ValueError", referenced from: _PyExc_ValueError$non_lazy_ptr in fortranobject.o "_PyString_AsString", referenced from: _array_from_pyobj in fortranobject.o ld: symbol(s) not found error: Command "/usr/local/g95/bin/g95 -shared -shared build/temp.macosx-10.5-i386-2.5/build/src.macosx-10.5-i386-2.5/scipy/fftpack/_fftpackmodule.o build/temp.macosx-10.5-i386-2.5/scipy/fftpack/src/zfft.o build/temp.macosx-10.5-i386-2.5/scipy/fftpack/src/drfft.o build/temp.macosx-10.5-i386-2.5/scipy/fftpack/src/zrfft.o build/temp.macosx-10.5-i386-2.5/scipy/fftpack/src/zfftnd.o build/temp.macosx-10.5-i386-2.5/build/src.macosx-10.5-i386-2.5/fortranobject.o -Lbuild/temp.macosx-10.5-i386-2.5 -ldfftpack -o build/lib.macosx-10.5-i386-2.5/scipy/fftpack/_fftpack.so" failed with exit status 1 Ring any bells?
Thanks, Brian From robert.kern at gmail.com Wed Jan 21 23:37:58 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 21 Jan 2009 22:37:58 -0600 Subject: [SciPy-user] Build problems on OS X, 10.5 with g95 In-Reply-To: <6ce0ac130901212033n7d8c0bc9t478deff6e57367e2@mail.gmail.com> References: <6ce0ac130901212033n7d8c0bc9t478deff6e57367e2@mail.gmail.com> Message-ID: <3d375d730901212037v5e2e6675obc3358cd0b07239a@mail.gmail.com> On Wed, Jan 21, 2009 at 22:33, Brian Granger wrote: > I am trying to build the latest 0.7 with g95 on OS X and get the > following failure: g95 is not supported yet on OS X. It needs to have some of the OS X modifications from the GnuFCompiler ported over. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From ellisonbg.net at gmail.com Wed Jan 21 23:41:14 2009 From: ellisonbg.net at gmail.com (Brian Granger) Date: Wed, 21 Jan 2009 20:41:14 -0800 Subject: [SciPy-user] Build problems on OS X, 10.5 with g95 In-Reply-To: <3d375d730901212037v5e2e6675obc3358cd0b07239a@mail.gmail.com> References: <6ce0ac130901212033n7d8c0bc9t478deff6e57367e2@mail.gmail.com> <3d375d730901212037v5e2e6675obc3358cd0b07239a@mail.gmail.com> Message-ID: <6ce0ac130901212041u595b00feu811eea6db9c665e1@mail.gmail.com> Oh, good to know. What is the current recommended way of getting gfortran? I used to get it off the OS X HPC site: http://hpc.sourceforge.net/ But I know there are other versions floating around. Thanks for the quick reply though. Brian On Wed, Jan 21, 2009 at 8:37 PM, Robert Kern wrote: > On Wed, Jan 21, 2009 at 22:33, Brian Granger wrote: >> I am trying to build the latest 0.7 with g95 on OS X and get the >> following failure: > > g95 is not supported yet on OS X. It needs to have some of > the OS X modifications from the GnuFCompiler ported over. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From robert.kern at gmail.com Wed Jan 21 23:44:15 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 21 Jan 2009 22:44:15 -0600 Subject: [SciPy-user] Build problems on OS X, 10.5 with g95 In-Reply-To: <6ce0ac130901212041u595b00feu811eea6db9c665e1@mail.gmail.com> References: <6ce0ac130901212033n7d8c0bc9t478deff6e57367e2@mail.gmail.com> <3d375d730901212037v5e2e6675obc3358cd0b07239a@mail.gmail.com> <6ce0ac130901212041u595b00feu811eea6db9c665e1@mail.gmail.com> Message-ID: <3d375d730901212044v57f2b786q462494ec197034b1@mail.gmail.com> On Wed, Jan 21, 2009 at 22:41, Brian Granger wrote: > Oh, good to know. What is the current recommended way of getting > gfortran? I used to get it off the OS X HPC site: > > http://hpc.sourceforge.net/ > > But I know there are other versions floating around. I strongly recommend avoiding the HPC binaries and using these: http://r.research.att.com/tools/ -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco From ellisonbg.net at gmail.com Wed Jan 21 23:46:39 2009 From: ellisonbg.net at gmail.com (Brian Granger) Date: Wed, 21 Jan 2009 20:46:39 -0800 Subject: [SciPy-user] Build problems on OS X, 10.5 with g95 In-Reply-To: <3d375d730901212044v57f2b786q462494ec197034b1@mail.gmail.com> References: <6ce0ac130901212033n7d8c0bc9t478deff6e57367e2@mail.gmail.com> <3d375d730901212037v5e2e6675obc3358cd0b07239a@mail.gmail.com> <6ce0ac130901212041u595b00feu811eea6db9c665e1@mail.gmail.com> <3d375d730901212044v57f2b786q462494ec197034b1@mail.gmail.com> Message-ID: <6ce0ac130901212046j324c0235u3517fa1bd4c589bd@mail.gmail.com> > I strongly recommend avoiding the HPC binaries and using these: > > http://r.research.att.com/tools/ Thanks, I hadn't seen this. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From david at ar.media.kyoto-u.ac.jp Thu Jan 22 00:22:42 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 22 Jan 2009 14:22:42 +0900 Subject: [SciPy-user] optimize.leastsq In-Reply-To: <497758C7.2070803@ast.cam.ac.uk> References: <4975B5CD.9060002@ast.cam.ac.uk> <1cd32cbb0901200734j37e2e02i459d1b17610e24a2@mail.gmail.com> <3d375d730901201354u5c0a3076g3380d4b78eb15e39@mail.gmail.com> <4976E30A.7060904@ast.cam.ac.uk> <4976F2A6.4040601@ast.cam.ac.uk> <1cd32cbb0901210358p2ffc39cewed32a77d97249f56@mail.gmail.com> <497758C7.2070803@ast.cam.ac.uk> Message-ID: <497802A2.8050407@ar.media.kyoto-u.ac.jp> David Trethewey wrote: > How exactly would the EM algorithm be used? The homepage > http://pypi.python.org/pypi/scikits.learn seems to be down at the moment. > Hi David, I have not updated the webpage, but the package has an example on how to use the BIC for comparing models: http://projects.scipy.org/scipy/scikits/browser/trunk/learn/scikits/learn/machine/em/examples/basic_example3.py Note that it uses artificial data, so the model is well specified. It does not work as well for real data :) cheers, David From nadavh at visionsense.com Thu Jan 22 06:44:38 2009 From: nadavh at visionsense.com (Nadav Horesh) Date: Thu, 22 Jan 2009 13:44:38 +0200 Subject: [SciPy-user] Chirp Z transform Message-ID: <710F2847B0018641891D9A216027636029C3D7@ex3.envision.co.il> Chirp Z transform is a generalization of the Fourier transform. Attached here is a module for the chirp z transform, written by Paul Kienzle and me. We tried to follow scipy's coding-style directions. Is it possible (and how) to make it a part of the scipy project?
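For orientation, the chirp z-transform in the usual convention evaluates X[k] = sum_n x[n] * a**(-n) * w**(n*k), i.e. the z-transform along the spiral points z_k = a * w**(-k). Here is a direct O(N*M) sketch of that definition -- my own reference code for checking conventions, not taken from the attached module:

import numpy as np

def czt_direct(x, m=None, w=None, a=1.0):
    # Slow but obviously-correct chirp z-transform, for reference only.
    x = np.asarray(x, dtype=complex)
    n = len(x)
    if m is None:
        m = n
    if w is None:
        w = np.exp(-2j * np.pi / m)   # default points = plain DFT
    nn = np.arange(n)
    k = np.arange(m)
    return np.dot(x * a ** (-nn), w ** np.outer(nn, k))

# sanity check: with the defaults it reproduces the FFT
x = np.random.rand(64)
assert np.allclose(czt_direct(x), np.fft.fft(x))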
Nadav. -------------- next part -------------- A non-text attachment was scrubbed... Name: czt.py Type: text/x-python Size: 15521 bytes Desc: czt.py URL: From dlrt2 at ast.cam.ac.uk Thu Jan 22 10:03:41 2009 From: dlrt2 at ast.cam.ac.uk (David Trethewey) Date: Thu, 22 Jan 2009 15:03:41 +0000 Subject: [SciPy-user] optimize.leastsq In-Reply-To: <497802A2.8050407@ar.media.kyoto-u.ac.jp> References: <4975B5CD.9060002@ast.cam.ac.uk> <1cd32cbb0901200734j37e2e02i459d1b17610e24a2@mail.gmail.com> <3d375d730901201354u5c0a3076g3380d4b78eb15e39@mail.gmail.com> <4976E30A.7060904@ast.cam.ac.uk> <4976F2A6.4040601@ast.cam.ac.uk> <1cd32cbb0901210358p2ffc39cewed32a77d97249f56@mail.gmail.com> <497758C7.2070803@ast.cam.ac.uk> <497802A2.8050407@ar.media.kyoto-u.ac.jp> Message-ID: <49788ACD.1000805@ast.cam.ac.uk> Managed to get this working with my data. Being able to do a 2-d fit with both metallicity and velocity information used is certainly interesting, although it doesn't seem to be too good at detecting subpopulations within my stellar stream which is what I'm trying to do. David David Cournapeau wrote: > David Trethewey wrote: > >> How exactly would the EM algorithm be used? The homepage >> http://pypi.python.org/pypi/scikits.learn seems to be down at the moment. >> >> > > Hi David, > > I have not updated the webpage, but the package has an example on > how to use the BIC for comparing models: > > http://projects.scipy.org/scipy/scikits/browser/trunk/learn/scikits/learn/machine/em/examples/basic_example3.py > > Note that it uses artificial data, so the model is well specified. It > does not work as well for real data :) > > cheers, > > David > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From stefan at sun.ac.za Thu Jan 22 10:05:09 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Thu, 22 Jan 2009 17:05:09 +0200 Subject: [SciPy-user] Chirp Z transform In-Reply-To: <710F2847B0018641891D9A216027636029C3D7@ex3.envision.co.il> References: <710F2847B0018641891D9A216027636029C3D7@ex3.envision.co.il> Message-ID: <9457e7c80901220705k621cf71bs52e2349659075661@mail.gmail.com> Hi Nadav 2009/1/22 Nadav Horesh : > Chirp Z transform is a generalization of the Fourier transform. > Attached here is a module for the chirp z transform, written by Paul Kienzle and me. We tried to follow scipy's coding-style directions. Is it possible (and how) to make it a part of the scipy project? Thanks for working on this; I, for one, would like to see it in SciPy. Recently I referred you to another implementation at http://www.mail-archive.com/numpy-discussion@scipy.org/msg01812.html Your version is much more complete, but the following struck me as slightly strange: data = np.random.random(10000) a = czt.czt(data, w=np.exp(-2*1j*np.pi/float(len(data)))) b = chirpz_s.chirpz(data, 1, np.exp(-2*1j*np.pi/float(len(data))), len(data)) target = np.fft.fft(data) err_a = np.sum(np.abs(a - target)) err_b = np.sum(np.abs(b - target)) In [152]: err_a / err_b Out[152]: 1.6138562461610748 The only reason I mention this is because you speak about the inaccuracy in the docstring. The errors are, on average, in the vicinity of 1e-10 vs. 5e-11 respectively, so I'm probably on a wild goose chase.
Regards Stéfan From josef.pktd at gmail.com Thu Jan 22 10:47:46 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 22 Jan 2009 10:47:46 -0500 Subject: [SciPy-user] optimize.leastsq In-Reply-To: <49788ACD.1000805@ast.cam.ac.uk> References: <4975B5CD.9060002@ast.cam.ac.uk> <1cd32cbb0901200734j37e2e02i459d1b17610e24a2@mail.gmail.com> <3d375d730901201354u5c0a3076g3380d4b78eb15e39@mail.gmail.com> <4976E30A.7060904@ast.cam.ac.uk> <4976F2A6.4040601@ast.cam.ac.uk> <1cd32cbb0901210358p2ffc39cewed32a77d97249f56@mail.gmail.com> <497758C7.2070803@ast.cam.ac.uk> <497802A2.8050407@ar.media.kyoto-u.ac.jp> <49788ACD.1000805@ast.cam.ac.uk> Message-ID: <1cd32cbb0901220747q4f150676o29e502b34439e9e7@mail.gmail.com> On Thu, Jan 22, 2009 at 10:03 AM, David Trethewey wrote: > Managed to get this working with my data. Being able to do a 2-d fit > with both metallicity and velocity information used is certainly > interesting, although it doesn't seem to be too good at detecting > subpopulations within my stellar stream which is what I'm trying to do. > From my experience with hidden Markov models (estimated with ML not EM), I know that good starting values for the location parameters are necessary to get reliable results. I think that the global properties of the likelihood function are not very "nice". What are you using as starting values? I would try to get the suspected number of clusters and cluster centers from visual inspection of the 2D histogram (or from stats.kde) and use these as starting values. The variance I would set so that initially the individual distributions have only a small overlap. Josef From cournape at gmail.com Thu Jan 22 11:04:06 2009 From: cournape at gmail.com (David Cournapeau) Date: Fri, 23 Jan 2009 01:04:06 +0900 Subject: [SciPy-user] optimize.leastsq In-Reply-To: <1cd32cbb0901220747q4f150676o29e502b34439e9e7@mail.gmail.com> References: <4975B5CD.9060002@ast.cam.ac.uk> <1cd32cbb0901200734j37e2e02i459d1b17610e24a2@mail.gmail.com> <3d375d730901201354u5c0a3076g3380d4b78eb15e39@mail.gmail.com> <4976E30A.7060904@ast.cam.ac.uk> <4976F2A6.4040601@ast.cam.ac.uk> <1cd32cbb0901210358p2ffc39cewed32a77d97249f56@mail.gmail.com> <497758C7.2070803@ast.cam.ac.uk> <497802A2.8050407@ar.media.kyoto-u.ac.jp> <49788ACD.1000805@ast.cam.ac.uk> <1cd32cbb0901220747q4f150676o29e502b34439e9e7@mail.gmail.com> Message-ID: <5b8d13220901220804w2e5f2f0axc33b5ea9584f5c52@mail.gmail.com> On Fri, Jan 23, 2009 at 12:47 AM, wrote: > On Thu, Jan 22, 2009 at 10:03 AM, David Trethewey wrote: >> Managed to get this working with my data. Being able to do a 2-d fit >> with both metallicity and velocity information used is certainly >> interesting, although it doesn't seem to be too good at detecting >> subpopulations within my stellar stream which is what I'm trying to do. >> > > From my experience with hidden Markov models (estimated with ML not > EM), Well, EM, at least as implemented in the learn scikits, is fundamentally a likelihood based method (EM is built so that increasing its objective function forces an increase of the likelihood). > I know that good starting values for the location parameters are > necessary to get reliable results. I think that the global properties > of the likelihood function are not very "nice". Indeed, for mixtures with more than 1 component, the likelihood function is not concave anymore. EM can only find a local maximum of the likelihood.
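One cheap way to choose the starting means by hand is a farthest-point sweep over the data -- a sketch of my own, just as an illustration, not something the toolbox implements:

import numpy as np

def farthest_point_means(data, k):
    # data: (n_samples, n_features). Greedily pick k samples that are far
    # apart from each other, to use as initial means for EM.
    means = [data[np.random.randint(len(data))]]
    for _ in range(k - 1):
        # squared distance of every sample to its closest chosen mean
        d = np.min([np.sum((data - m) ** 2, axis=1) for m in means], axis=0)
        means.append(data[np.argmax(d)])
    return np.array(means)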
The starting values are indeed important, there are various heuristics which can help, but none of them are implemented in the toolbox. Generally, one trick is to make sure the initial means are as far as possible from each other - this is not always easy to do automatically, although in that particular case, if the data are 2 d with 2 components, this can be done by hand quite easily. David From michael.abshoff at googlemail.com Thu Jan 22 11:30:15 2009 From: michael.abshoff at googlemail.com (Michael Abshoff) Date: Thu, 22 Jan 2009 08:30:15 -0800 Subject: [SciPy-user] Build problems on OS X, 10.5 with g95 In-Reply-To: <6ce0ac130901212046j324c0235u3517fa1bd4c589bd@mail.gmail.com> References: <6ce0ac130901212033n7d8c0bc9t478deff6e57367e2@mail.gmail.com> <3d375d730901212037v5e2e6675obc3358cd0b07239a@mail.gmail.com> <6ce0ac130901212041u595b00feu811eea6db9c665e1@mail.gmail.com> <3d375d730901212044v57f2b786q462494ec197034b1@mail.gmail.com> <6ce0ac130901212046j324c0235u3517fa1bd4c589bd@mail.gmail.com> Message-ID: <49789F17.9090306@gmail.com> Brian Granger wrote: >> I strongly recommend avoiding the HPC binaries and using these: >> >> http://r.research.att.com/tools/ > > Thanks, I hadn't seen this. Yeah, I have been using that one to build 32 and 64 bit Scipy builds on OSX for Sage and it works really well. Given that Scipy 0.7.rc1 was supposed to be out for a while and we are sitting here at a Sage Days itching to upgrade Scipy in Sage (finally!) what are the chances of the rc coming out soon? We will pull svn from the 0.7 branch later today anyway, but I was just curious since the rc has been imminent for a couple weeks now :) >> -- >> Robert Kern Cheers, Michael >> >> "I have come to believe that the whole world is an enigma, a harmless >> enigma that is made terrible by our own mad attempt to interpret it as >> though it had an underlying truth." >> -- Umberto Eco >> _______________________________________________ >> SciPy-user mailing list >> SciPy-user at scipy.org >> http://projects.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From icy.flame.gm at gmail.com Fri Jan 23 17:03:39 2009 From: icy.flame.gm at gmail.com (iCy-fLaME) Date: Fri, 23 Jan 2009 22:03:39 +0000 Subject: [SciPy-user] A good way to test an array is zero to within numerical accuracy? Message-ID: What would be a good way to test an array is zero everywhere to within numerical accuracy? Such that, plus minus 0x1 for the float's significand is as good as 0x0. The test must be able to cope with both single and double floats, because they are given at run time. Thanks in advance. iCy From lukasz.klopotowski at ifpan.edu.pl Fri Jan 23 17:00:34 2009 From: lukasz.klopotowski at ifpan.edu.pl (Lukasz Klopotowski) Date: Fri, 23 Jan 2009 23:00:34 +0100 Subject: [SciPy-user] Fitting a function, which is an integral References: 6ce0ac130901212041u595b00feu811eea6db9c665e1@mail.gmail.com Message-ID: <497A3E02.90304@ifpan.edu.pl> Hi! I need to fit a function, which is an integral. For example, I do: >>> from scipy import * >>> from scipy import integrate >>> from scipy import optimize >>> def pfun(p): ... def fun(x): ... return p[0]+p[1]*x ... return fun ... >>> def caleczka(p,x): ... return integrate.quad(pfun(p),0,x) ... >>> def errcal(p,x,y): ... return caleczka(p,x)-y ... 
and after: >>> optimize.leastsq(errcal, [1,2], (ix, iy)) I get: Traceback (most recent call last): File "<stdin>", line 1, in <module> File "C:\Python25\Lib\site-packages\scipy\optimize\minpack.py", line 266, in leastsq m = check_func(func,x0,args,n)[0] File "C:\Python25\Lib\site-packages\scipy\optimize\minpack.py", line 12, in check_func res = atleast_1d(thefunc(*((x0[:numinputs],)+args))) File "<stdin>", line 2, in errcal File "<stdin>", line 2, in caleczka File "C:\Python25\Lib\site-packages\scipy\integrate\quadpack.py", line 185, in quad retval = _quad(func,a,b,args,full_output,epsabs,epsrel,limit,points) File "C:\Python25\Lib\site-packages\scipy\integrate\quadpack.py", line 233, in _quad if (b != Inf and a != -Inf): ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all() Could someone take a look and point me in the right direction? Thanks in advance Lukasz From robert.kern at gmail.com Fri Jan 23 17:20:27 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 23 Jan 2009 16:20:27 -0600 Subject: [SciPy-user] Fitting a function, which is an integral In-Reply-To: <497A3E02.90304@ifpan.edu.pl> References: <497A3E02.90304@ifpan.edu.pl> Message-ID: <3d375d730901231420u7f0e24b2pa8832bffb948a3@mail.gmail.com> On Fri, Jan 23, 2009 at 16:00, Lukasz Klopotowski wrote: > Hi! > > I need to fit a function, which is an integral. For example, I do: > >>> from scipy import * > >>> from scipy import integrate > >>> from scipy import optimize > >>> def pfun(p): > ... def fun(x): > ... return p[0]+p[1]*x > ... return fun > ... > >>> def caleczka(p,x): > ... return integrate.quad(pfun(p),0,x) > ... > >>> def errcal(p,x,y): > ... return caleczka(p,x)-y > ... > > and after: > > >>> optimize.leastsq(errcal, [1,2], (ix, iy)) > > I get: > > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > File "C:\Python25\Lib\site-packages\scipy\optimize\minpack.py", line > 266, in leastsq > m = check_func(func,x0,args,n)[0] > File "C:\Python25\Lib\site-packages\scipy\optimize\minpack.py", line > 12, in check_func > res = atleast_1d(thefunc(*((x0[:numinputs],)+args))) > File "<stdin>", line 2, in errcal > File "<stdin>", line 2, in caleczka > File "C:\Python25\Lib\site-packages\scipy\integrate\quadpack.py", line > 185, in quad > retval = _quad(func,a,b,args,full_output,epsabs,epsrel,limit,points) > File "C:\Python25\Lib\site-packages\scipy\integrate\quadpack.py", line > 233, in _quad > if (b != Inf and a != -Inf): > ValueError: The truth value of an array with more than one element is > ambiguous. Use a.any() or a.all() > > Could someone take a look and point me in the right direction? The limit arguments to integrate.quad() cannot be arrays. They must be scalars. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From cdcasey at gmail.com Fri Jan 23 18:13:02 2009 From: cdcasey at gmail.com (chris) Date: Fri, 23 Jan 2009 17:13:02 -0600 Subject: [SciPy-user] module_test replacement Message-ID: I've inherited some code that uses module_test and module_test_suite from scipy_test. As these things no longer exist, is there a functional equivalent I can use for a simple refactor? Or perhaps a nice workaround?
Thanks, -Chris From robert.kern at gmail.com Fri Jan 23 18:19:22 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 23 Jan 2009 17:19:22 -0600 Subject: [SciPy-user] module_test replacement In-Reply-To: References: Message-ID: <3d375d730901231519i110a64c5r8dc000b3dcce87f@mail.gmail.com> On Fri, Jan 23, 2009 at 17:13, chris wrote: > I've inherited some code that uses module_test and module_test_suite > from scipy_test. As these things no longer exist, is there a > functional equivalent I can use for a simple refactor? Or perhaps a > nice workaround? Just delete the test() and test_suite() functions that use them and use nose as the test runner. Many of the test methods still use "check_*" instead of "test_*" so you can configure nose to collect those, too, or you can just search and replace to change them to "test_*". -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From rowen at u.washington.edu Fri Jan 23 18:29:07 2009 From: rowen at u.washington.edu (Russell E. Owen) Date: Fri, 23 Jan 2009 15:29:07 -0800 Subject: [SciPy-user] A good way to test an array is zero to within numerical accuracy? References: Message-ID: I recommend numpy.allclose -- Russell In article , iCy-fLaME wrote: > What would be a good way to test an array is zero everywhere to within > numerical accuracy? Such that, plus minus 0x1 for the float's > significand is as good as 0x0. > > The test must be able to cope with both single and double floats, > because they are given at run time. > > Thanks in advance. > > > > iCy From cdcasey at gmail.com Fri Jan 23 18:58:32 2009 From: cdcasey at gmail.com (chris) Date: Fri, 23 Jan 2009 17:58:32 -0600 Subject: [SciPy-user] module_test replacement In-Reply-To: <3d375d730901231519i110a64c5r8dc000b3dcce87f@mail.gmail.com> References: <3d375d730901231519i110a64c5r8dc000b3dcce87f@mail.gmail.com> Message-ID: Thanks, Robert. That really made things simple. -Chris On Fri, Jan 23, 2009 at 5:19 PM, Robert Kern wrote: > On Fri, Jan 23, 2009 at 17:13, chris wrote: >> I've inherited some code that uses module_test and module_test_suite >> from scipy_test. As these things no longer exist, is there a >> functional equivalent I can use for a simple refactor? Or perhaps a >> nice workaround? > > Just delete the test() and test_suite() functions that use them and > use nose as the test runner. Many of the test methods still use > "check_*" instead of "test_*" so you can configure nose to collect > those, too, or you can just search and replace to change them to > "test_*". > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From sturla at molden.no Fri Jan 23 23:13:22 2009 From: sturla at molden.no (Sturla Molden) Date: Sat, 24 Jan 2009 05:13:22 +0100 (CET) Subject: [SciPy-user] The fastest kd-tree known to man? Message-ID: <03803375ca5842570a842564f51f9cd1.squirrel@webmail.uio.no> Yesterday evening I was experimenting with Anne Archibald's cKDTree from the latest SciPy superpack. The speed of this implementation is really amazing. So I decided to try to make something even faster. 
Building on Anne's code, I tried two variations: 1. Use multiprocessing (the official backport from Python 2.6) for parallel queries. All the data was stored in shared memory (allocated using multiprocessing.RawArray). 2. Modify the cKDTree C-code with OpenMP pragmas, and let the compiler do the rest. Here are some results from my dual core laptop: http://folk.uio.no/sturlamo/kdtree/bench1.png http://folk.uio.no/sturlamo/kdtree/bench2.png Black: single-threaded cKDTree from SciPy superpack rc2 Blue: cKDTree + multiprocessing + shared memory Red: cKDTree + OpenMP As you can see, the OpenMP'd version is the fastest. The overhead from using OpenMP seems to be negligible. It is even the faster option for the smallest data sets. Not bad for a single line of code: #pragma omp parallel for schedule(guided) private(__pyx_v_c) right above the for loop on line 1870 in http://www.scipy.org/scipy/scipy/browser/trunk/scipy/spatial/ckdtree.c?rev=4957 Using multiprocessing incurs some more overhead, on the order of a few seconds. But for more substantial work it scales almost as well as OpenMP. Some overhead is expected when working with Python. The code and results (incl. Windows binaries) are in the file: http://folk.uio.no/sturlamo/kdtree/parallel numpy.zip Which should have MD5 checksum 075f045592a9500c5ea3e48975094f71 *parallel numpy.zip (Yes that is an empty space in the file name.) The zipfile includes: - parallel_kdtree.py: parallel implementation using multiprocessing - multiprocessing_utils.py: a few useful helper functions for making numpy and multiprocessing cooperate. It could be useful for the scipy cookbook. - ckdtree.c: source file with OpenMP pragma - Win32 binary folder: Compiled with gcc 4.4.0 ('mingw' binary from gfortran). The pthread DLL is needed to run OpenMP and comes from the gfortran 'mingw' distro. And as with all prebuilt binaries: I am a nice guy and don't write malware, but you run it at your own risk. This code is for testing purposes only. As for the cookbook, I think it is time to remove my two entries there. Both of them are obsolete by now.
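If you want to try the multiprocessing variant without downloading the zipfile, the basic pattern looks like this -- a simplified sketch: no shared memory here, so each worker process builds its own copy of the tree instead of sharing the arrays:

import numpy as np
import multiprocessing as mp
from scipy.spatial import cKDTree

_tree = None

def _init(data):
    # each worker builds its own tree; parallel_kdtree.py avoids this
    # copying by sharing the arrays through multiprocessing.RawArray
    global _tree
    _tree = cKDTree(data)

def _query(chunk):
    return _tree.query(chunk)

if __name__ == '__main__':
    data = np.random.rand(100000, 3)
    chunks = np.array_split(np.random.rand(10000, 3), mp.cpu_count())
    pool = mp.Pool(initializer=_init, initargs=(data,))
    dists, idx = zip(*pool.map(_query, chunks))
    d = np.concatenate(dists)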
Regards, Sturla Molden From christian.oreilly at polymtl.ca Sat Jan 24 03:30:44 2009 From: christian.oreilly at polymtl.ca (Christian O'Reilly) Date: Sat, 24 Jan 2009 03:30:44 -0500 Subject: [SciPy-user] ImportError: No module named factorial Message-ID: <89d1750e0901240030q10a70e54pcf1ea592faa4adf5@mail.gmail.com> Hi, I'm trying to compile a windows executable (with py2exe) with code using various components of the scipy library, and I get several import problems. Some of them are documented on the net, but I found myself trying to find a workaround for the error "ImportError: No module named factorial" coming from the file scipy.interpolate.polyint. The only way I found to fix it was to change line 2 of this file from "from scipy import factorial" to "from scipy.misc.common import factorial". I hope it may help, -- Christian O'Reilly PhD student in biomedical engineering Laboratoire Scribens École Polytechnique de Montréal -------------- next part -------------- An HTML attachment was scrubbed... URL: From tonyyu at MIT.EDU Sat Jan 24 12:03:25 2009 From: tonyyu at MIT.EDU (Tony S Yu) Date: Sat, 24 Jan 2009 12:03:25 -0500 Subject: [SciPy-user] scipy 0.7 changes behavior of sparse.spdiags Message-ID: <65237075-B6F3-4EBD-B4F8-153485B57B68@mit.edu> Thanks to all the scipy developers for their work on the new scipy release. I just upgraded scipy 0.6 to scipy 0.7rc2. I think it's worth proclaiming very loudly (or at least mentioning in the release notes) that the behavior of sparse.spdiags has changed since 0.6. See example below.
Cheers, -Tony >>> import numpy as np >>> from scipy import sparse >>> data = np.array([[1,2,3,4],[1,2,3,4],[1,2,3,4]]) >>> diags = np.array([0,-1,2]) >>> sparse.spdiags(data, diags, 4, 4).todense() In scipy 0.7rc2 ------------------- matrix([[1, 0, 3, 0], [1, 2, 0, 4], [0, 2, 3, 0], [0, 0, 3, 4]]) In scipy 0.6 ------------------- matrix([[1, 0, 1, 0], [1, 2, 0, 2], [0, 2, 3, 0], [0, 0, 3, 4]]) From vginer at gmail.com Sat Jan 24 12:31:26 2009 From: vginer at gmail.com (Vicent) Date: Sat, 24 Jan 2009 18:31:26 +0100 Subject: [SciPy-user] How to start with SciPy and NumPy Message-ID: <50ed08f40901240931w6e1d637dk87882223bd241582@mail.gmail.com> I am a Python beginner. At this moment, I am starting to develop a numerical algorithm, which is supposed to do a lot of calculations, but it doesn't use matrices at all, for example. So, at this moment, as I don't know much about NumPy and SciPy, and although it is said that they are very useful enhancers for developing scientific Python applications, I don't see the point of using them...
URL: From nwagner at iam.uni-stuttgart.de Sat Jan 24 13:10:31 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Sat, 24 Jan 2009 19:10:31 +0100 Subject: [SciPy-user] minization of multivariable function In-Reply-To: References: Message-ID: On Sat, 24 Jan 2009 19:03:51 +0100 Philippe TANN wrote: > > Hello, > > I have some problems when I want to minimize a >multivariable function with module fmin. Indeed, when I >enter my function to minimize, I have indefinitely this >kind of messages many times without the values where my >function is minimal: > > Optimization terminated successfully. > Current function value: 0.000017 > Iterations: 30 > Function evaluations: 80 > May you help me to solve this problem? > Thank you in advance, > PhT > > _________________________________________________________________ > T?l?phonez gratuitement ? tous vos proches avec Windows >Live Messenger? !? T?l?chargez-le maintenant ! > http://www.windowslive.fr/messenger/1.asp Please can you provide an example ? Nils From philippetann at hotmail.com Sat Jan 24 14:16:38 2009 From: philippetann at hotmail.com (Philippe TANN) Date: Sat, 24 Jan 2009 20:16:38 +0100 Subject: [SciPy-user] minization of multivariable function In-Reply-To: References: Message-ID: Here is my program: I would like to estimate parameters by using the generalized method of moments. The parameters must be the values where the function is minimal. When I run my program, there is no syntax error but when I use the statement optiGMM(), the program is running indefinitely by showing several times the same message.: Optimization terminated successfully. Current function value: 0.000015 Iterations: 32 Function evaluations: 87 Optimization terminated successfully. Current function value: 0.000014 Iterations: 23 Function evaluations: 72 Optimization terminated successfully. Current function value: 0.000016 Iterations: 34 Function evaluations: 82 Optimization terminated successfully. Current function value: 0.000012 Iterations: 35 Function evaluations: 91 Optimization terminated successfully. Current function value: 0.000014 Iterations: 34 Function evaluations: 94 > From: nwagner at iam.uni-stuttgart.de > To: scipy-user at scipy.org > Date: Sat, 24 Jan 2009 19:10:31 +0100 > Subject: Re: [SciPy-user] minization of multivariable function > > On Sat, 24 Jan 2009 19:03:51 +0100 > Philippe TANN wrote: > > > > Hello, > > > > I have some problems when I want to minimize a > >multivariable function with module fmin. Indeed, when I > >enter my function to minimize, I have indefinitely this > >kind of messages many times without the values where my > >function is minimal: > > > > Optimization terminated successfully. > > Current function value: 0.000017 > > Iterations: 30 > > Function evaluations: 80 > > May you help me to solve this problem? > > Thank you in advance, > > PhT > > > > _________________________________________________________________ > > T?l?phonez gratuitement ? tous vos proches avec Windows > >Live Messenger ! T?l?chargez-le maintenant ! > > http://www.windowslive.fr/messenger/1.asp > > > > Please can you provide an example ? 
> > Nils > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user _________________________________________________________________ D?couvrez toutes les possibilit?s de communication avec vos proches http://www.microsoft.com/windows/windowslive/default.aspx -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: Heston GMM.py URL: From josef.pktd at gmail.com Sat Jan 24 15:06:57 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 24 Jan 2009 15:06:57 -0500 Subject: [SciPy-user] minization of multivariable function In-Reply-To: References: Message-ID: <1cd32cbb0901241206j706fe0f0jd4fbfbb834cf1655@mail.gmail.com> On Sat, Jan 24, 2009 at 2:16 PM, Philippe TANN wrote: > Here is my program: I would like to estimate parameters by using the > generalized method of moments. The parameters must be the values where the > function is minimal. > When I run my program, there is no syntax error but when I use the statement > optiGMM(), the program is running indefinitely by showing several times the > same message.: > Optimization terminated successfully. > Current function value: 0.000015 > Iterations: 32 > Function evaluations: 87 > Optimization terminated successfully. > Current function value: 0.000014 > Iterations: 23 > Function evaluations: 72 > Optimization terminated successfully. > Current function value: 0.000016 > Iterations: 34 > Function evaluations: 82 > Optimization terminated successfully. > Current function value: 0.000012 > Iterations: 35 > Function evaluations: 91 > Optimization terminated successfully. > Current function value: 0.000014 > Iterations: 34 > Function evaluations: 94 > > > > >> From: nwagner at iam.uni-stuttgart.de >> To: scipy-user at scipy.org >> Date: Sat, 24 Jan 2009 19:10:31 +0100 >> Subject: Re: [SciPy-user] minization of multivariable function >> >> On Sat, 24 Jan 2009 19:03:51 +0100 >> Philippe TANN wrote: >> > >> > Hello, >> > >> > I have some problems when I want to minimize a >> >multivariable function with module fmin. Indeed, when I >> >enter my function to minimize, I have indefinitely this >> >kind of messages many times without the values where my >> >function is minimal: >> > >> > Optimization terminated successfully. >> > Current function value: 0.000017 >> > Iterations: 30 >> > Function evaluations: 80 >> > May you help me to solve this problem? >> > Thank you in advance, >> > PhT >> > I just gave it a quick look. I finishes if I put maxiter in both of your fmin You have a redundant nested fmin in matpoids, matpoids is solving the same problem each time. The printout "Optimization terminated successfully" comes from your fmin in matpoids, not from your fmin in optiGMM def g(xi): xi0=[0.1, 0.5, 0.3] W=matpoids(xi0) # here you call matpoids with the same values each time L=Heston(xi[0], xi[1], xi[2]) G=numpy.matrix(conditions(xi[0], xi[1], xi[2], L)) return abs(float(G*(W*(G.T)))) Your code has a lot of loops and looks not very "efficiently" programmed, but I didn't try to read in detail. Make sure you don't have redundant calculations inside your objective function for fmin, otherwise you might have to wait for a long time for your results. Move calculations outside and put required parameters in args=() when calling fmin. 
with maxiter = 3 in both fmin, the program ends after a few minutes if I do xi0=[0.1, 0.5, 0.3] print optiGMM(xi0) Josef From david_baddeley at yahoo.com.au Sat Jan 24 15:08:41 2009 From: david_baddeley at yahoo.com.au (David Baddeley) Date: Sat, 24 Jan 2009 12:08:41 -0800 (PST) Subject: [SciPy-user] How to start with SciPy and NumPy References: Message-ID: <866667.90059.qm@web33004.mail.mud.yahoo.com> Hi Vincent, if you're new to both python and numerical programming I'd suggest you make yourself familiar with basic python first and then move on to the numerical stuff - it'll probably be easier that way. To answer your question, there are two main ways in which Numpy and Scipy help with numeric programming. The first (and simplest) of these is by providing lots of pre-rolled algorithms to do useful things (e.g. computing bessel functions, fourier transforms, and much more). The second, and arguably more important (at least when it comes to performance) is to facilitate vectorisation, which is best illustrated with an example. Say you wanted to compute the sin of a range of numbers between 0 and 2pi, with a spacing of .1 (i.e. 0, 0.1, 0.2 ..... 2pi). The standard python code (could be simplified somwhat by using list comprehensions, but if you're new to python that'd probably be more than a little confusing) would be as follows: #define some x values x = [ ] for i in range(2*pi/0.1): x.append(0.1*i) #calculate the corresponding y values y = [ ] for x_ in x: y.append(sin(x_)) The equivalent code using numpy/scipy would be: x = numpy.arange(0, 2*pi, 0.1) y = numpy.sin(x) This is much closer to the underlying maths, making it quicker to program and more readable, and also much faster. The reason for the speed increase is that python is an interpreted language and the for loops above are slow. Numpy effectively executes these under the hood in compiled c code which is much faster. An equally important factor is cost of allocating and navigating the python lists used for storage in the python example - as each data point is processed new memory needs to be allocated which is highly unlikely to be contiguous with the original. This probably doesn't fully answer your question, but should give you a starting point do do a little googling / more reading in the documentation. There's got to be some explanation of the benefits of vectorisation already our there - anyone got an idea where you'd find it? David ----- Original Message ---- Message: 2 Date: Sat, 24 Jan 2009 18:31:26 +0100 From: Vicent Subject: [SciPy-user] How to start with SciPy and NumPy To: scipy-user at scipy.org Message-ID: <50ed08f40901240931w6e1d637dk87882223bd241582 at mail.gmail.com> Content-Type: text/plain; charset="iso-8859-1" I am a Python beginner. At this moment, I am starting to develop and numerical algorithm, which is supposed to do lot of calculations, but it doesn't use matrices at all, for example. So, at this moment, as I don't know much about NumPy and SciPy, and although it is said that they are very useful and enhancers for developing scientific Python applications, I don't see the point to use them... For example, I don't know if there is a great advantage in using NumPy data types (like bool_, int_ and so on) instead of the associated Python data types (bool, int, etc.). 
When I read NumPy and SciPy documentation, I found lots of new features that I may use, but I haven't been able to find any quick explanation about how those modules improve Python performance (something like "Instead of doing *this* with standard Python, do *this* using NumPy and SciPy because it's faster/better/whatever"). For example, what is the difference between "random" from random module and "random" from numpy.random? Or are they the same? Before I thought that working with NumPy+SciPy would be mandatory for me, and so that I should have to adapt my code to all its special features, from the beggining. But, at this moment, my strategy would be working with "plain" Python, and when necessary, look for features I need in NumPy and SciPy. Is it OK? Can you give a light to me? Sorry if I seem too rude or hard, I didn't mean to. I am just a bit lost... Thank you in advance, and sorry for my English mistakes. -- Vicent -------------- next part -------------- An HTML attachment was scrubbed... URL: http://projects.scipy.org/pipermail/scipy-user/attachments/20090124/6a264fef/attachment-0001.html ------------------------------ _______________________________________________ SciPy-user mailing list SciPy-user at scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user End of SciPy-user Digest, Vol 65, Issue 54 ****************************************** Get the world's best email - http://nz.mail.yahoo.com/ From josef.pktd at gmail.com Sat Jan 24 15:46:55 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 24 Jan 2009 15:46:55 -0500 Subject: [SciPy-user] minization of multivariable function In-Reply-To: <1cd32cbb0901241206j706fe0f0jd4fbfbb834cf1655@mail.gmail.com> References: <1cd32cbb0901241206j706fe0f0jd4fbfbb834cf1655@mail.gmail.com> Message-ID: <1cd32cbb0901241246s5977ca4n324ed31eea3f9c7c@mail.gmail.com> Your weighting matrix doesn't look very good, the values are very small >>> Wm0 = matpoids([0.1, 0.5, 0.3]) Optimization terminated successfully. Current function value: 0.000012 Iterations: 33 Function evaluations: 86 >>> numpy.diag(Wm0) array([ 9.46553187e+05, 2.07547296e+10, 2.34384235e+10, 4.98660601e+14]) when I use the identity matrix as the weighting matrix, the optimization converges pretty fast with this result >>> Optimization terminated successfully. Current function value: 0.000000 Iterations: 21 Function evaluations: 51 [ 0.12207186 -0.0352489 0.36672972] Here are the changes, how I ran it: def g(xi,W): L=Heston(xi[0], xi[1], xi[2]) G=numpy.matrix(conditions(xi[0], xi[1], xi[2], L)) return abs(float(G*(W*(G.T)))) def optiGMM(xi1,W): return fmin(g, xi1,args=(W,),xtol=0.001, ftol=0.0001, maxiter=300, maxfun=None, full_output=0, disp=1, retall=0, callback=None) #xi0 = [0.1, 0.5, 0.3] xi0 = [ 0.1243751, -0.03429623, 0.37577618] #W=matpoids(xi0) W = numpy.eye(4) import time t = time.time() print optiGMM(xi0,W) print time.time() - t From v.gkinis at gfy.ku.dk Sat Jan 24 16:51:35 2009 From: v.gkinis at gfy.ku.dk (Vasileios Gkinis) Date: Sat, 24 Jan 2009 22:51:35 +0100 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: References: Message-ID: <497B8D67.4060702@gfy.ku.dk> An HTML attachment was scrubbed... 
URL: From gael.varoquaux at normalesup.org Sat Jan 24 17:32:37 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sat, 24 Jan 2009 23:32:37 +0100 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <50ed08f40901240931w6e1d637dk87882223bd241582@mail.gmail.com> References: <50ed08f40901240931w6e1d637dk87882223bd241582@mail.gmail.com> Message-ID: <20090124223237.GC11816@phare.normalesup.org> On Sat, Jan 24, 2009 at 06:31:26PM +0100, Vicent wrote: > For example, what is the difference between "random" from random module > and "random" from numpy.random? Or are they the same? Well, if you look at the number of distributions included in numpy.random and random, this will give you a clue. In addition to shipping many more distributions, numpy.random, just like all of numpy and scipy, works with arrays, rather than numbers, which allows you to vectorize part of the code (check out http://en.wikipedia.org/wiki/Vectorization_(computer_science) and http://en.wikipedia.org/wiki/Array_programming). You seem to believe that working with large chunks of numbers organized in arrays is useful only for linear algebra, but on the opposite, avoiding loops and working on arrays is the basis of a whole category of very successful languages such as Matlab, or IDL. Many old-school numerical developers despise these languages, but they have proven to be effective. > Before I thought that working with NumPy+SciPy would be mandatory for me, > and so that I should have to adapt my code to all its special features, > from the beginning. But, at this moment, my strategy would be working with > "plain" Python, and when necessary, look for features I need in NumPy and > SciPy. Is it OK? Well, you can choose to do scientific computing without using the major scientific libraries. You can do this in Python like in any other language. You have 15 years of scientific computing in Python to reinvent, and even more if you extend to other languages. I would advise you to use them, until you gain insight into how they are organized, and why people like them. Once you know them well, you can choose to do without, but at least your choice will be made on an educated basis. I, personally, think it would be foolish to do numerical work in Python without numpy. Yes, documentation showing the big picture is missing. The problem is that nobody seems to have time to write it. Maybe it is because it doesn't bring money, or academic credit in. We all need to survive. I reckon from your name that you might be speaking French. In which case, I just happen to have spent time writing a 12-page article trying to give the big picture on this problem: http://www.gnulinuxmag.com/index.php/2009/01/23/gnulinux-magazine-hs-n?40-janvierfevrier-2009-chez-votre-marchand-de-journaux By the way, for the non French-speaking people on this list, I will write an English version, but give me time. This one cost me a lot of weekends in the past 2 years.
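To make the numpy.random point concrete (both generators use the Mersenne Twister underneath, so the quality of a single draw is comparable -- the difference is in drawing whole arrays at once):

import random
import numpy as np

a = np.random.uniform(0, 1, 1000000)            # one vectorized call
b = [random.random() for i in range(1000000)]   # a million interpreter-level calls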
Gaël From cycomanic at gmail.com Sun Jan 25 00:47:32 2009 From: cycomanic at gmail.com (Jochen) Date: Sun, 25 Jan 2009 18:47:32 +1300 Subject: [SciPy-user] PyFFTW Message-ID: <1232862452.4196.18.camel@phy.auckland.ac.nz> Hi guys, I have written Python bindings for the fftw3 C-library, because I needed the extra speed for some of my simulations and I could not find any bindings which let me access the planning. I thought some other people might find them useful; you can find the code at http://pyfftw.berlios.de. For me it's almost twice as fast as scipy or numpy fftpack (version 0.6 and 1.1.1) using estimated plans. PyFFTW is written in ctypes as this seemed the easiest way to do it. However, this is the first time I have written anything in ctypes, and it's also the first time I've released some source code (I am not a programmer). The code definitely needs some testing, especially with respect to higher dimensional ffts, because I don't use ffts with a dimension higher than 2; the same goes for real2real transforms. I'm happy about any comments/criticism. Cheers Jochen From cmac at mit.edu Sun Jan 25 02:12:37 2009 From: cmac at mit.edu (Christopher MacMinn) Date: Sun, 25 Jan 2009 02:12:37 -0500 Subject: [SciPy-user] SciPy-user Digest, Vol 65, Issue 54 In-Reply-To: References: Message-ID: <95da30590901242312l1f0189eev3b27ea5c5143f41a@mail.gmail.com> > I just upgraded scipy 0.6 to scipy 0.7rc2. I think it's worth > proclaiming very loudly (or at least mentioning in the release notes) > that the behavior of sparse.spdiags has changed since 0.6. That's a pretty strange change. I imagine it caused you some headache before you figured it out. :) - C -------------- next part -------------- An HTML attachment was scrubbed... URL: From lorenzo.isella at gmail.com Sun Jan 25 04:40:39 2009 From: lorenzo.isella at gmail.com (Lorenzo Isella) Date: Sun, 25 Jan 2009 10:40:39 +0100 Subject: [SciPy-user] SciPy and GUI Message-ID: Dear All, I hope this is not too off-topic. Given your Python code, relying on SciPy for number-crunching, which tools would you use to create a GUI in order to allow someone else to use it, without his knowing much (or anything) about scipy and programming? I know Python is great for this, but I do not know of anything specific. Cheers Lorenzo -- I went to the race track once and bet on a horse that was so good that it took seven others to beat him! From gael.varoquaux at normalesup.org Sun Jan 25 05:05:56 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 25 Jan 2009 11:05:56 +0100 Subject: [SciPy-user] SciPy and GUI In-Reply-To: References: Message-ID: <20090125100556.GA29918@phare.normalesup.org> On Sun, Jan 25, 2009 at 10:40:39AM +0100, Lorenzo Isella wrote: > I hope this is not too off-topic. Given your Python code, relying on > SciPy for number-crunching, which tools would you use to create a GUI > in order to allow someone else to use it, without his knowing much (or > anything) about scipy and programming? I know Python is great for this, > but I do not know of anything specific. I would use traits (see http://code.enthought.com/projects/traits/documentation.php, and http://code.enthought.com/projects/traits/docs/html/tutorials/traits_ui_scientific_app.html for documentation and a tutorial). The pro of traits is that it is really easy to use, and enforces good software design. The cons are that it is still not as mainstream as we would like. As a result it is not installed on all computers. It is however shipped with both major scientific Python distributions (python(x,y) and ETS), as well as in ubuntu and debian, mandriva, and is currently being packaged for fedora.
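To give a flavor of it, a complete traits program (written from memory, untested) that pops up an auto-generated dialog for two parameters is just:

from enthought.traits.api import HasTraits, Float, Int

class Model(HasTraits):
    amplitude = Float(1.0)
    n_points = Int(100)

if __name__ == '__main__':
    Model().configure_traits()   # opens a dialog with editors for both traits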
Given your Python code, relying on > SciPy for number-crunching, which tools would you use to create a GUI > in order to allow someone else to use it, without his knowing much (or > anything) about scipy and programming? I know Python is great for this, > but I do not know of anything specific. > I would use wxPython to create the GUI (or maybe PyQt, now the Qt license has changed). You can either create MATLAB-like environments, like this http://mientki.ruhosting.nl/data_www/pylab_works/pw_animations_screenshots.html or LabVIEW-like environments, like http://mientki.ruhosting.nl/data_www/pylab_works/pw_manual.pdf and here is a demo of a program with an extensive set of VPython-5 applications http://mientki.ruhosting.nl/data_www/pylab_works/pw_application_vpython3.html cheers, Stef > Cheers > > Lorenzo > > From vginer at gmail.com Sun Jan 25 06:17:33 2009 From: vginer at gmail.com (Vicent) Date: Sun, 25 Jan 2009 12:17:33 +0100 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <866667.90059.qm@web33004.mail.mud.yahoo.com> References: <866667.90059.qm@web33004.mail.mud.yahoo.com> Message-ID: <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> On Sat, Jan 24, 2009 at 21:08, David Baddeley wrote: > Hi Vincent, > > if you're new to both python and numerical programming I'd suggest you make > yourself familiar with basic python first and then move on to the numerical > stuff - it'll probably be easier that way. Thank you for the advice. > To answer your question, there are two main ways in which Numpy and Scipy > help with numeric programming. The first (and simplest) of these is by > providing lots of pre-rolled algorithms to do useful things (e.g. computing > bessel functions, fourier transforms, and much more). Yes, I realize that. In that aspect, NumPy+SciPy are like any other Python packages, for me. Whenever I need something specific, I check whether a package for it already exists. > The second, and arguably more important (at least when it comes to > performance) is to facilitate vectorisation, which is best illustrated with > an example. [...] > > The equivalent code using numpy/scipy would be: > > x = numpy.arange(0, 2*pi, 0.1) > y = numpy.sin(x) > > > This is much closer to the underlying maths, making it quicker to program > and more readable, and also much faster. The reason for the speed increase > is that python is an interpreted language and the for loops above are slow. > Numpy effectively executes these under the hood in compiled c code which is > much faster. An equally important factor is the cost of allocating and > navigating the python lists used for storage in the python example - as each > data point is processed new memory needs to be allocated which is highly > unlikely to be contiguous with the original. I understand this advantage. Sorry if this was already explained in the online documentation, but I was not able to find it... So, let me ask, in order to know if I have understood it well: any time I want to perform a task over all the elements of a list, and those elements are of the same type, it is better to use a NumPy array instead of a list to store the data. Is that right? I have some questions related to this topic: (1) Is there any point in maintaining a list and then creating a temporary NumPy array just to perform calculations, and then "copying and pasting" the results back into the list?
I mean something similar to what happens, for example, with lists and sets: I have a list, because I'm interested in order, but then I build a set based on that list, just because I know it is faster to look for an element in a set (isn't it?). Later, I "kill" the set, when it is no longer useful.

>>> c = [1, 2, 3, 1, 1, 2, "a"]
>>> type(c)
<type 'list'>
>>> d = set(c)
>>> type(d)
<type 'set'>
>>> d
set(['a', 1, 2, 3])
>>> "a" in d
True
>>> del d
>>> d
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'd' is not defined
>>> c
[1, 2, 3, 1, 1, 2, 'a']
>>>

(2) What about lists with differently typed items within them? (3) Can I perform operations over all the elements (scalars) in a given array that meet some given condition? For example, in your previous example, "compute the sine only for those elements which are multiples of pi/4 (or whatever)". > > This probably doesn't fully answer your question, but should give you a > starting point to do a little googling / more reading in the documentation. Yes, thank you! On Sat, Jan 24, 2009 at 22:51, Vasileios Gkinis wrote: > > > Dear Vicent, > > You could perhaps take a closer look into the documentation section of > scipy. I believe that many of your questions will be answered this way. I > would suggest you take a look into the following performance study: > > http://www.scipy.org/PerformancePython > > Thank you, Vas, that's a good example. Now I am starting to understand the power of using NumPy. > [...] > > With time though, complexity and size of the code get larger and larger, and > there one can see the benefits of using the tools included in scipy/numpy. > Reinventing the wheel is not a smart choice when tested and well coded > methods are available. > OK, I get it... On Sat, Jan 24, 2009 at 23:32, Gael Varoquaux wrote: > On Sat, Jan 24, 2009 at 06:31:26PM +0100, Vicent wrote: > > For example, what is the difference between "random" from random > module > > and "random" from numpy.random? Or are they the same? > > Well, if you look at the number of distributions included in numpy.random > and random, this will give you a clue. OK, but if I want just to generate a (pseudo) random number between 0 and 1 (uniform distribution), just one number or scalar (not a vector), does NumPy implement an improved algorithm for that, different from the algorithm within standard Python? [ The reason for this question is that, in the past, I worked with some pseudorandom number generators in C++, and I had some problems with the quality of the "randomness" of those numbers (and I had to use more specialized "random" packages, etc.). ] > In addition to shipping many more > distributions, numpy.random, just like all of numpy and scipy, works with > arrays, rather than numbers, which allows you to vectorize part of the > code (check out > http://en.wikipedia.org/wiki/Vectorization_(computer_science) > and > http://en.wikipedia.org/wiki/Array_programming > You seem to believe that working with large chunks of numbers organized > in arrays is useful only for linear algebra, but on the contrary, > avoiding loops and working on arrays is the basis of a whole category of > very successful languages such as Matlab, or IDL. Many old-school numerical > developers despise these languages, but they have proven to be effective. I admit I had no idea about this topic. Thank you for those links! So, it could be said that NumPy adds array programming capabilities to Python? > Yes, documentation showing the big picture is missing. The problem is > that nobody seems to have time to write it.
Maybe it is because it > doesn't bring in money, or academic credit. We all need to survive. An old problem... > I reckon from your name that you might be speaking French. In which case, > I just happen to have spent time writing a 12-page article trying to > give the big picture on this problem: > http://www.gnulinuxmag.com/index.php/2009/01/23/gnulinux-magazine-hs-n > ?40-janvierfevrier-2009-chez-votre-marchand-de-journaux Merci, Gaël! I don't speak French (well, just a little), but I understand it. Anyway, it seems I can't get an online copy of your article. [My name "Vicent" is in Valencian, which is a language of Spain. And, yes, Valencian-Catalan is quite similar to French, in some aspects.] Thank you all for your kind answers! -- Vicent -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Sun Jan 25 06:30:26 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 25 Jan 2009 12:30:26 +0100 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> Message-ID: <20090125113026.GD29918@phare.normalesup.org> Hi Vicent, Looks like you are all set to learn a lot, given your open attitude. Don't hesitate to ask questions here. Many of us are really lacking time to answer questions and give advice as well as we would like (I used to give much more advice a while ago, when I didn't know all this as well :>), but most often you'll find someone who takes the time to give invaluable comments. On Sun, Jan 25, 2009 at 12:17:33PM +0100, Vicent wrote: > I don't speak French (well, just a little), but I understand it. Anyway, > it seems I can't get an online copy of your article. Indeed, it is not online yet. It will come out online in a couple of months, on unixgarden.com. I really need to do an English version. I just have soooo many on-going projects, both for work and for free software... > [My name "Vicent" is in Valencian, which is a language of Spain. And, yes, > Valencian-Catalan is quite similar to French, in some aspects.] Sorry, I misread your name, and thought it was spelled "Vincent". My mistake. Gaël From cournape at gmail.com Sun Jan 25 06:43:27 2009 From: cournape at gmail.com (David Cournapeau) Date: Sun, 25 Jan 2009 20:43:27 +0900 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> Message-ID: <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> On Sun, Jan 25, 2009 at 8:17 PM, Vicent wrote: > > > On Sat, Jan 24, 2009 at 21:08, David Baddeley > wrote: >> >> Hi Vincent, >> >> if you're new to both python and numerical programming I'd suggest you >> make yourself familiar with basic python first and then move on to the >> numerical stuff - it'll probably be easier that way. > > Thank you for the advice. > > >> >> To answer your question, there are two main ways in which Numpy and Scipy >> help with numeric programming. The first (and simplest) of these is by >> providing lots of pre-rolled algorithms to do useful things (e.g. computing >> bessel functions, fourier transforms, and much more). > > Yes, I realize that. In that aspect, NumPy+SciPy are like any other > Python packages, for me.
Whenever I need something specific, I check whether a package > for it already exists. Depending on your POV, this may be true. But for many scientific usages, an array capability is so fundamental that it has strong consequences on all the dependent code (e.g. little scientific code in python will use a list as its core data structure). It is a fundamental building block, if you want. > I understand this advantage. Sorry if this was already explained in the > online documentation, but I was not able to find it... I think the online documentation is organized for people who are familiar with those concepts - most people doing numerical computations are familiar with the union of R/matlab/idl/labview. I am not sure we have documentation for people not familiar with those concepts - this would certainly be nice. > (1) Is there any point in maintaining a list and then creating a temporary > NumPy array just to perform calculations, and then "copying and pasting" the > results back into the list? > It depends on whether you need a list for later computation: a list generally takes much more memory if you only care about homogeneous items (a numpy array takes only M * N bytes plus some overhead, where N is the number of items and M the size in bytes of one item - 4 for a 32-bit integer). OTOH, if you keep resizing your data, lists may make sense - and lists can be faster than arrays for small sizes. There is no unique rule, but for computation on a lot of data, numpy arrays certainly are a powerful data structure, useful on their own. > (2) What about lists with differently typed items within them? Numpy arrays - and generally arrays - fundamentally rely on the assumption of the same type for every item. A lot of the performance of arrays comes from this assumption (it means you can access any item randomly without the need to traverse any other item first, etc...). > (3) Can I perform operations over all the elements (scalars) in a given > array that meet some given condition? For example, in your previous > example, "compute the sine only for those elements which are multiples of pi/4 > (or whatever)". Of course. For example, getting an array with all the positive numbers is:

b = a[a>0]

And this will be much faster than a list comprehension for relatively large arrays. David From vginer at gmail.com Sun Jan 25 07:01:36 2009 From: vginer at gmail.com (Vicent) Date: Sun, 25 Jan 2009 13:01:36 +0100 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <20090125113026.GD29918@phare.normalesup.org> References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <20090125113026.GD29918@phare.normalesup.org> Message-ID: <50ed08f40901250401q5efc9f09s7f3e6947e8af5c0b@mail.gmail.com> On Sun, Jan 25, 2009 at 12:30, Gael Varoquaux wrote: > Hi Vicent, > > Looks like you are all set to learn a lot, given your open attitude. > Don't hesitate to ask questions here. Many of us are really lacking time > to answer questions and give advice as well as we would like (I used to > give much more advice a while ago, when I didn't know all this as well > :>), but most often you'll find someone who takes the time to give > invaluable comments. Thank you. I try to keep an open mind, also because I think it's necessary for my job. > > > > > Sorry, I misread your name, and thought it was spelled "Vincent". My > mistake. > It's a common mistake, don't worry. -- Vicent -------------- next part -------------- An HTML attachment was scrubbed...
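(Expanding on the masking idiom above, in the direction of Vicent's question (3): a small sketch, where the grid and the condition are arbitrary choices:)

import numpy as np

x = np.arange(0, 2 * np.pi, 0.1)

# A boolean mask selects the elements that satisfy a condition.
mask = x > np.pi
selected = x[mask]        # a copy holding only the matching elements

# Operate on the subset only, leaving the other entries at zero.
y = np.zeros_like(x)
y[mask] = np.sin(x[mask])

# np.where chooses elementwise between two alternatives.
z = np.where(mask, np.sin(x), 0.0)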
URL: From vginer at gmail.com Sun Jan 25 07:26:08 2009 From: vginer at gmail.com (Vicent) Date: Sun, 25 Jan 2009 13:26:08 +0100 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> Message-ID: <50ed08f40901250426g38d1dbact77ae2af861477bbc@mail.gmail.com> On Sun, Jan 25, 2009 at 12:43, David Cournapeau wrote: > Depending on your POV, this may be true. But for many scientific > usages, an array capability is so fundamental that it has strong > consequences on all the dependent code (e.g. little scientific code in > python will use a list as its core data structure). It is a > fundamental building block, if you want. I see... > > I think the online documentation is organized for people who are > familiar with those concepts - most people doing numerical > computations are familiar with the union of R/matlab/idl/labview. I am > not sure we have documentation for people not familiar with those > concepts - this would certainly be nice. I understand that the online documentation is not complete, also as long as the NumPy and SciPy current version numbers are under 1. But, yes, a kind of introduction to the benefits of array programming, or something like that, would be desirable. > > > (1) Is there any point in maintaining a list and then creating a temporary > > NumPy array just to perform calculations, and then "copying and pasting" the > > results back into the list? > > > > It depends on whether you need a list for later computation: a list > generally takes much more memory if you only care about homogeneous > items (a numpy array takes only M * N bytes plus some overhead, where N is > the number of items and M the size in bytes of one item - 4 for a 32-bit > integer). OTOH, if you keep resizing your data, lists may make > sense - and lists can be faster than arrays for small sizes. > > There is no unique rule, but for computation on a lot of data, numpy > arrays certainly are a powerful data structure, useful on their own. > > > (2) What about lists with differently typed items within them? > > Numpy arrays - and generally arrays - fundamentally rely on the > assumption of the same type for every item. A lot of the performance > of arrays comes from this assumption (it means you can access any item > randomly without the need to traverse any other item first, etc...). In my case, I am not expecting to change the type of the items within a list, once they've been entered. And also, I'll have some lists whose elements will all be of the same type. But, also, I am going to have a list of "variables", that can be "float", "int" or "bool" (in the sense of 0-1 or bit valued), and I want to store, for each variable (or value) in the list, which kind of type it is/has. If I do it using lists, I can get the type of a given element in the list by doing something like this:

>>> type(c[1])
<type 'int'>

If I used NumPy arrays, then every value would be stored as "float" (I guess), and then an extra field would be necessary in order to store and get the actual type for each variable. I mean, I would have a "variable" class which would contain "value" and "type" as properties (among others), and then I would have a NumPy array of "variable" objects. Stop! [ I am thinking...]
Anyway, I'll have a "variable" object, because I need to store some information for each variable; it doesn't depend on whether I use lists or arrays to store "variables". So, within each "variable" element in the NumPy array, the "value" property for that "variable" can contain an integer value, or a boolean value, etc. No matter about "different types of elements", because all of them are "wrapped" with the "variable" structure. Anyway, from your answer, I see that the point is "How large are the lists/arrays I am planning to use/need?" Isn't it? Which approach would be better (lists or arrays)? I guess it depends on the size of the set of variables... I am thinking about not many variables, maybe from 10 to 100, at this point of my research. -- Vicent -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Sun Jan 25 08:22:48 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Sun, 25 Jan 2009 15:22:48 +0200 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <20090125100556.GA29918@phare.normalesup.org> References: <20090125100556.GA29918@phare.normalesup.org> Message-ID: <9457e7c80901250522n47df5285w19f14740d538ff63@mail.gmail.com> 2009/1/25 Gael Varoquaux : > I would use traits (see > http://code.enthought.com/projects/traits/documentation.php, and > http://code.enthought.com/projects/traits/docs/html/tutorials/traits_ui_scientific_app.html > for documentation and a tutorial) > > The pro of traits is that it is really easy to use, and enforces good > software design. I can add my voice to Gael's here. Just last week I advised a colleague, who wanted to build a GUI for a filter design package, to consider Traits. After only one day, he had his whole application running, and spent the rest of the week tweaking small features to his liking. Having that kind of power is hard to imagine! Whereas widget toolkits provide fundamental building blocks, Traits provides much more: a well thought-through user-interface framework that evolved through a company's need to rapidly deploy GUIs for scientific applications. I easily put my trust in code that is backed by the collective experience of so many talented programmers! Regards Stéfan From sturla at molden.no Sun Jan 25 10:01:31 2009 From: sturla at molden.no (Sturla Molden) Date: Sun, 25 Jan 2009 16:01:31 +0100 (CET) Subject: [SciPy-user] SciPy and GUI In-Reply-To: References: Message-ID: <6ab8b44c7504dcfb00e2745681ec1a20.squirrel@webmail.uio.no> For now I would use wxPython and wxFormBuilder for this. Here is what I require from a GUI toolkit:

- The GUI should be constructed using a GUI builder.
- The GUI should be able to embed matplotlib.
- Support for OpenGL.
- It should look good on Windows, Linux and MacOSX.
- A liberal license.

Here is an example of how to use wxFormBuilder with Python: http://folk.uio.no/sturlamo/HelloWorld.py http://folk.uio.no/sturlamo/HelloWorld.xrc http://folk.uio.no/sturlamo/HelloWorld.fbp There is more information on how to use XRC files with wxPython here: http://wiki.wxpython.org/index.cgi/XRCTutorial http://wiki.wxpython.org/UsingXmlResources I generally think Qt is better than wxWidgets, but until now the license has deterred me from using it. I am not sure how well QtDesigner works with PyQt, and if matplotlib can be embedded. But when Qt and PyGTL become released under LGPL, I will take a look at it again. If you work on Windows only, there is a second option as well: Use Visual Basic or Borland Delphi.
Wrap your Python code as an ActiveX object using pywin32. Look in Mark Hammond's book for examples on how to do this. Regards, Sturla Molden From sturla at molden.no Sun Jan 25 10:40:53 2009 From: sturla at molden.no (Sturla Molden) Date: Sun, 25 Jan 2009 16:40:53 +0100 (CET) Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> Message-ID: > On Sun, Jan 25, 2009 at 8:17 PM, Vicent wrote: >> (2) What about lists with differently typed items within them? > > Numpy arrays - and generally arrays - fundamentally rely on the > assumption of the same type for every item. A lot of the performance > of arrays comes from this assumption (it means you can access any item > randomly without the need to traverse any other item first, etc...). Just to clear up a common misunderstanding: Python's lists are also implemented as arrays, with appends to the back amortized to O(1). This means that Python allocates some empty space at the end, proportional to the size of the list. Thus, every append does not need to invoke realloc, and the complexity becomes O(1) on average. Python's dict and set are also amortized to O(1). And collections.deque has amortized O(1) appends at both ends. A Python list or deque looks like an array of pointers in C, and thus supports random access. The performance of arrays over linked lists comes mainly from cache coherency, not random access. Most uses of these containers are sequential. It is often claimed that Python lists are slower than linked lists for appends in the middle. This is not true: While the "insert in the middle" is O(1) with a linked list and O(N) with Python lists, reaching the middle item is O(1) with Python lists and O(N) with linked lists. For an append in the middle you need to do both. Using linked lists is a bad habit of many programmers, particularly from the Java community, because introductory CS textbooks explain linked lists without explaining their weaknesses. I'd estimate that >75% of all programmers in this world do not know that list and tree structures can be implemented more efficiently using arrays instead of chained pointers. Sturla Molden From prabhu at aero.iitb.ac.in Sun Jan 25 11:05:47 2009 From: prabhu at aero.iitb.ac.in (Prabhu Ramachandran) Date: Sun, 25 Jan 2009 21:35:47 +0530 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> Message-ID: <497C8DDB.5080804@aero.iitb.ac.in> Sturla Molden wrote: > It is often claimed that Python lists are slower than linked lists for > appends in the middle. This is not true: While the "insert in the middle" > is O(1) with a linked list and O(N) with Python lists, reaching the middle > item is O(1) with Python lists and O(N) with linked lists. For an append > in the middle you need to do both.
A more common need I have had is to remove an existing element located arbitrarily, in which case using linked lists is practical. However, if the order of elements in the array is not of consequence, one can easily devise a simple scheme to remove an element in the middle of the array in O(1) operations. cheers, prabhu From vginer at gmail.com Sun Jan 25 11:25:05 2009 From: vginer at gmail.com (Vicent) Date: Sun, 25 Jan 2009 17:25:05 +0100 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <50ed08f40901250426g38d1dbact77ae2af861477bbc@mail.gmail.com> References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> <50ed08f40901250426g38d1dbact77ae2af861477bbc@mail.gmail.com> Message-ID: <50ed08f40901250825o7d7aabe9j43b69d9f8c1eeeab@mail.gmail.com> On Sun, Jan 25, 2009 at 13:26, Vicent wrote: > > > I understand that the online documentation is not complete, also as long as > the NumPy and SciPy current version numbers are under 1. > > Sorry for the mistake. Now I see that the NumPy current version is 1.2.1. SciPy is at 0.7.0rc2. -- Vicent -------------- next part -------------- An HTML attachment was scrubbed... URL: From vginer at gmail.com Sun Jan 25 11:45:58 2009 From: vginer at gmail.com (Vicent) Date: Sun, 25 Jan 2009 17:45:58 +0100 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <50ed08f40901250426g38d1dbact77ae2af861477bbc@mail.gmail.com> References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> <50ed08f40901250426g38d1dbact77ae2af861477bbc@mail.gmail.com> Message-ID: <50ed08f40901250845ya5c1da5n28a6102b94aa885e@mail.gmail.com> On Sun, Jan 25, 2009 at 13:26, Vicent wrote: > > > If I used NumPy arrays, then every value would be stored as "float" (I > guess), and then an extra field would be necessary in order to store and get > the actual type for each variable. > > I mean, I would have a "variable" class which would contain "value" and > "type" as properties (among others), and then I would have a NumPy array of > "variable" objects. > > > Stop! [ I am thinking...] > > Anyway, I'll have a "variable" object, because I need to store some > information for each variable; it doesn't depend on whether I use lists or > arrays to store "variables". > > So, within each "variable" element in the NumPy array, the "value" property > for that "variable" can contain an integer value, or a boolean value, etc. > No matter about "different types of elements", because all of them are > "wrapped" with the "variable" structure. > > > I've been reading a little about NumPy arrays and "dtypes" or data-types. I understand I can create arrays where each element follows a specific data-type structure. But, if I create a class, can objects (which are instances) of that class be elements of a NumPy array? I mean, can I build a NumPy array whose elements are objects of a class I've defined? Sorry if my expressions are not clear enough... -- Vicent -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at gmail.com Sun Jan 25 13:08:30 2009 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Sun, 25 Jan 2009 12:08:30 -0600 Subject: [SciPy-user] Update web page?
Message-ID: <114880320901251008r30293eafr39581f79e4d5978e@mail.gmail.com> The web page is still announcing the 2008 conferences. Time for an update? Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Sun Jan 25 13:11:33 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 25 Jan 2009 19:11:33 +0100 Subject: [SciPy-user] Update web page? In-Reply-To: <114880320901251008r30293eafr39581f79e4d5978e@mail.gmail.com> References: <114880320901251008r30293eafr39581f79e4d5978e@mail.gmail.com> Message-ID: <20090125181133.GC24704@phare.normalesup.org> On Sun, Jan 25, 2009 at 12:08:30PM -0600, Warren Weckesser wrote: > The web page is still announcing the 2008 conferences. Time for an > update? It's a wiki. Go for it, by all means. Gaël From simpson at math.toronto.edu Sun Jan 25 13:29:16 2009 From: simpson at math.toronto.edu (Gideon Simpson) Date: Sun, 25 Jan 2009 13:29:16 -0500 Subject: [SciPy-user] does(can? should?) scipy still use fftw Message-ID: <296E0705-7F3D-40D9-9ACF-47EEA734191F@math.toronto.edu> Reading the posts here, I'm gathering there have been some changes in how the fft is implemented in scipy. Just to clarify: Can scipy use fftw? If so, is there any advantage, performance or otherwise, to linking scipy to fftw? -gideon From warren.weckesser at gmail.com Sun Jan 25 13:29:56 2009 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Sun, 25 Jan 2009 12:29:56 -0600 Subject: [SciPy-user] Update web page? In-Reply-To: <20090125181133.GC24704@phare.normalesup.org> References: <114880320901251008r30293eafr39581f79e4d5978e@mail.gmail.com> <20090125181133.GC24704@phare.normalesup.org> Message-ID: <114880320901251029s17970025p80d444609a4705d@mail.gmail.com> On Sun, Jan 25, 2009 at 12:11 PM, Gael Varoquaux < gael.varoquaux at normalesup.org> wrote: > On Sun, Jan 25, 2009 at 12:08:30PM -0600, Warren Weckesser wrote: > > The web page is still announcing the 2008 conferences. Time for an > > update? > > It's a wiki. Go for it, by all means. > Done. -------------- next part -------------- An HTML attachment was scrubbed... URL: From wnbell at gmail.com Sun Jan 25 13:52:46 2009 From: wnbell at gmail.com (Nathan Bell) Date: Sun, 25 Jan 2009 13:52:46 -0500 Subject: [SciPy-user] scipy 0.7 changes behavior of sparse.spdiags In-Reply-To: <65237075-B6F3-4EBD-B4F8-153485B57B68@mit.edu> References: <65237075-B6F3-4EBD-B4F8-153485B57B68@mit.edu> Message-ID: On Sat, Jan 24, 2009 at 12:03 PM, Tony S Yu wrote: > Thanks to all the scipy developers for their work on the new scipy release. > > I just upgraded scipy 0.6 to scipy 0.7rc2. I think it's worth > proclaiming very loudly (or at least mentioning in the release notes) > that the behavior of sparse.spdiags has changed since 0.6. See example > below. > Hi Tony, Sorry for the omission. It has been added to the release notes in r5522. -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From gokhansever at gmail.com Sun Jan 25 17:51:17 2009 From: gokhansever at gmail.com (gsever) Date: Sun, 25 Jan 2009 14:51:17 -0800 (PST) Subject: [SciPy-user] Update web page? In-Reply-To: <114880320901251029s17970025p80d444609a4705d@mail.gmail.com> References: <114880320901251008r30293eafr39581f79e4d5978e@mail.gmail.com> <20090125181133.GC24704@phare.normalesup.org> <114880320901251029s17970025p80d444609a4705d@mail.gmail.com> Message-ID: Hello, Speaking of the web site, have you checked the RSS items? Spam is everywhere... I can take care of this if access is granted to me.
On Jan 25, 1:29 pm, Warren Weckesser wrote: > On Sun, Jan 25, 2009 at 12:11 PM, Gael Varoquaux < > > gael.varoqu... at normalesup.org> wrote: > > On Sun, Jan 25, 2009 at 12:08:30PM -0600, Warren Weckesser wrote: > > > The web page is still announcing the 2008 conferences. Time for an > > > update? > > > It's a wiki. Go for it, by all means. > > Done. > > _______________________________________________ > SciPy-user mailing list > SciPy-u... at scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user From eike.welk at gmx.net Sun Jan 25 17:33:26 2009 From: eike.welk at gmx.net (Eike Welk) Date: Sun, 25 Jan 2009 23:33:26 +0100 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <20090125100556.GA29918@phare.normalesup.org> References: <20090125100556.GA29918@phare.normalesup.org> Message-ID: <200901252333.26989.eike.welk@gmx.net> On Sunday 25 January 2009, Gael Varoquaux wrote: > It is, however, > shipped with both major scientific Python distributions (python(x,y) > and ETS), as well as in Ubuntu and Debian, Mandriva, and is > currently being packaged for Fedora. You should really ask the Open-Suse guys to package Traits too (and maybe some other interesting stuff from Enthought). From your introduction, Traits seems very good for quickly putting a user interface on a numerical program. Numpy, Scipy and Matplotlib for Suse are here: http://download.opensuse.org/repositories/science/openSUSE_11.1/ Some details: http://download.opensuse.org/repositories/science/openSUSE_11.1/repodata/repoview/Development.Libraries.Python.group.html The people who work on it seem to be some volunteers: lars at linux-schulserver.de Werner Hoch Felix Richter Kind Regards, Eike. From cournape at gmail.com Sun Jan 25 22:02:49 2009 From: cournape at gmail.com (David Cournapeau) Date: Mon, 26 Jan 2009 12:02:49 +0900 Subject: [SciPy-user] does(can? should?) scipy still use fftw In-Reply-To: <296E0705-7F3D-40D9-9ACF-47EEA734191F@math.toronto.edu> References: <296E0705-7F3D-40D9-9ACF-47EEA734191F@math.toronto.edu> Message-ID: <5b8d13220901251902k5f8a4a45g72afa58107642dbf@mail.gmail.com> On Mon, Jan 26, 2009 at 3:29 AM, Gideon Simpson wrote: > Reading the posts here, I'm gathering there have been some changes in > how the fft is implemented in scipy. Just to clarify: > > Can scipy use fftw? Support for fftw was removed for 0.7. cheers, David From simpson at math.toronto.edu Sun Jan 25 22:25:47 2009 From: simpson at math.toronto.edu (Gideon Simpson) Date: Sun, 25 Jan 2009 22:25:47 -0500 Subject: [SciPy-user] does(can? should?) scipy still use fftw In-Reply-To: <5b8d13220901251902k5f8a4a45g72afa58107642dbf@mail.gmail.com> References: <296E0705-7F3D-40D9-9ACF-47EEA734191F@math.toronto.edu> <5b8d13220901251902k5f8a4a45g72afa58107642dbf@mail.gmail.com> Message-ID: <9FABDFA4-C27D-4E53-88DA-B64E0D777C2E@math.toronto.edu> Ok. Then perhaps one thing to change is the documentation, both on the website and within the package, so that we are not tempted to try and build against it. -gideon On Jan 25, 2009, at 10:02 PM, David Cournapeau wrote: > On Mon, Jan 26, 2009 at 3:29 AM, Gideon Simpson > wrote: >> Reading the posts here, I'm gathering there have been some changes in >> how the fft is implemented in scipy. Just to clarify: >> >> Can scipy use fftw? > > Support for fftw was removed for 0.7.
> > cheers, > > David > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user From david at ar.media.kyoto-u.ac.jp Sun Jan 25 23:08:08 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 26 Jan 2009 13:08:08 +0900 Subject: [SciPy-user] does(can? should?) scipy still use fftw In-Reply-To: <9FABDFA4-C27D-4E53-88DA-B64E0D777C2E@math.toronto.edu> References: <296E0705-7F3D-40D9-9ACF-47EEA734191F@math.toronto.edu> <5b8d13220901251902k5f8a4a45g72afa58107642dbf@mail.gmail.com> <9FABDFA4-C27D-4E53-88DA-B64E0D777C2E@math.toronto.edu> Message-ID: <497D3728.8090400@ar.media.kyoto-u.ac.jp> Gideon Simpson wrote: > Ok. Then perhaps one thing to change is the documentation, both on the > website and within the package, so that we are not tempted to try and > build against it. > I agree the website documentation for installation should be improved - it is a big mess ATM; someone needs to clean this up, but this would require quite some time to put together something decent, David From david at ar.media.kyoto-u.ac.jp Sun Jan 25 23:24:02 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 26 Jan 2009 13:24:02 +0900 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> Message-ID: <497D3AE2.9090604@ar.media.kyoto-u.ac.jp> Sturla Molden wrote: >> On Sun, Jan 25, 2009 at 8:17 PM, Vicent wrote: >> > > >>> (2) What about lists with differently typed items within them? >>> >> Numpy arrays - and generally arrays - fundamentally rely on the >> assumption of the same type for every item. A lot of the performance >> of arrays comes from this assumption (it means you can access any item >> randomly without the need to traverse any other item first, etc...). >> > > Just to clear up a common misunderstanding: > > Python's lists are also implemented as arrays, with appends to the back > amortized to O(1). Hm, I did not know that - indeed, when I was talking about lists, I was thinking about linked lists. > This means that Python allocates some empty space > at the end, proportional to the size of the list. Thus, every append does > not need to invoke realloc, and the complexity becomes O(1) on average. > This is independent of the list implementation, isn't it? I am quite curious to understand how you could get O(1) complexity for a "growable" container: if you don't know in advance the number of items, and you add O(N) items, how come you can get O(1) complexity?
> Maybe >75% programmers do not need to implement their own tree and list :) The only time I implemented my own list was at my first course course of programming, in C, which convinced me for quite some time that programming was awful and consisted in looking for bus errors in that strangely named ultrasparc machine, cheers, David From millman at berkeley.edu Mon Jan 26 00:11:10 2009 From: millman at berkeley.edu (Jarrod Millman) Date: Sun, 25 Jan 2009 21:11:10 -0800 Subject: [SciPy-user] ANN: SciPy 0.7.0rc2 (release candidate) Message-ID: I'm pleased to announce the second release candidate for SciPy 0.7.0. Due to an issue with the Window's build scripts, the first release candidate wasn't announced. SciPy is a package of tools for science and engineering for Python. It includes modules for statistics, optimization, integration, linear algebra, Fourier transforms, signal and image processing, ODE solvers, and more. This release candidate comes almost one year after the 0.6.0 release and contains many new features, numerous bug-fixes, improved test coverage, and better documentation. Please note that SciPy 0.7.0rc2 requires Python 2.4 or greater and NumPy 1.2.0 or greater. For information, please see the release notes: http://sourceforge.net/project/shownotes.php?group_id=27747&release_id=655674 You can download the release from here: http://sourceforge.net/project/showfiles.php?group_id=27747&package_id=19531&release_id=655674 Thank you to everybody who contributed to this release. Enjoy, Jarrod Millman From vginer at gmail.com Mon Jan 26 02:53:47 2009 From: vginer at gmail.com (Vicent) Date: Mon, 26 Jan 2009 08:53:47 +0100 Subject: [SciPy-user] NumPy arrays of Python objects (it was Re: How to start with SciPy and NumPy) Message-ID: <50ed08f40901252353u7fa2b5d0ycefb251c0681fd96@mail.gmail.com> Hello again. I have a doubt, related with all what was talked in the last posts of the previous thread. I've managed to build a NumPy array whose elements (scalars?) are objects from a class. That class was defined by me previously. Each object contains several properties. For example, they have the property "value". For each object, the property "value" can contain many different things, for example, an integer value, a boolean value or a "float". SO, I think it wouldn't be possible to replace that "object"/class with a NumPy data-type or "struct", in case I wanted. My question: is that a problem? I mean, is that NumPy array going to be "slow" to search, and so on, because its elements are not "optimized" NumPy types?? Maybe this question has no sense, but, actually, I would like to know if there is any kind of problem with that kind of "mixed structure": using NumPy arrays of developer-defined (non-NumPy) objects. Thank you in advance! -- Vicent -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Mon Jan 26 03:02:17 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 26 Jan 2009 03:02:17 -0500 Subject: [SciPy-user] NumPy arrays of Python objects (it was Re: How to start with SciPy and NumPy) In-Reply-To: <50ed08f40901252353u7fa2b5d0ycefb251c0681fd96@mail.gmail.com> References: <50ed08f40901252353u7fa2b5d0ycefb251c0681fd96@mail.gmail.com> Message-ID: <9246D5B4-A919-48AC-A9CD-22873B5DC6B2@gmail.com> Vicent, Without a more specific example, it might be quite difficult for us to help you. Would your 'value' property be of the same type for all the objects of your sequence ? 
If yes, then you could define a class where 'value' would be an ndarray. Other properties would then be other arrays, and so forth. But I probably speak out of place. P. On Jan 26, 2009, at 2:53 AM, Vicent wrote: > Hello again. > > I have a doubt, related to what was discussed in the last posts > of the previous thread. > > I've managed to build a NumPy array whose elements (scalars?) are > objects from a class. That class was defined by me previously. > > Each object contains several properties. For example, they have the > property "value". > > For each object, the property "value" can contain many different > things, for example, an integer value, a boolean value or a "float". > So, I think it wouldn't be possible to replace that "object"/class > with a NumPy data-type or "struct", in case I wanted. > > My question: is that a problem? I mean, is that NumPy array going to > be "slow" to search, and so on, because its elements are not > "optimized" NumPy types? Maybe this question makes no sense, but, > actually, I would like to know if there is any kind of problem with > that kind of "mixed structure": using NumPy arrays of developer- > defined (non-NumPy) objects. > > Thank you in advance! > > -- > Vicent > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user From gael.varoquaux at normalesup.org Mon Jan 26 03:09:41 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 26 Jan 2009 09:09:41 +0100 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <497D3AE2.9090604@ar.media.kyoto-u.ac.jp> References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> <497D3AE2.9090604@ar.media.kyoto-u.ac.jp> Message-ID: <20090126080941.GA1894@phare.normalesup.org> On Mon, Jan 26, 2009 at 01:24:02PM +0900, David Cournapeau wrote: > Maybe >75% of programmers do not need to implement their own trees and lists > :) The only time I implemented my own list was in my first programming course, > in C, which convinced me for quite some time that > programming was awful and consisted of looking for bus errors in that > strangely named ultrasparc machine, Sounds familiar to me :) Gaël From vginer at gmail.com Mon Jan 26 03:45:50 2009 From: vginer at gmail.com (Vicent) Date: Mon, 26 Jan 2009 09:45:50 +0100 Subject: [SciPy-user] NumPy arrays of Python objects (it was Re: How to start with SciPy and NumPy) In-Reply-To: <9246D5B4-A919-48AC-A9CD-22873B5DC6B2@gmail.com> References: <50ed08f40901252353u7fa2b5d0ycefb251c0681fd96@mail.gmail.com> <9246D5B4-A919-48AC-A9CD-22873B5DC6B2@gmail.com> Message-ID: <50ed08f40901260045u41b6c860ie34ecdb302eed016@mail.gmail.com> On Mon, Jan 26, 2009 at 09:02, Pierre GM wrote: > Vicent, > Without a more specific example, it might be quite difficult for us to > help you. > Would your 'value' property be of the same type for all the objects of > your sequence? No, that's what I meant. > If yes, then you could define a class where 'value' > would be an ndarray. Other properties would then be other arrays, and > so forth. > But I probably speak out of place. > P. > Ok, this is an example of what I am referring to. The class is called "Element", and the property is called "property1" (and not "value", which can be confusing):

>>> import numpy as N
>>>
>>> class Element :
...     def __init__(self, value) :
...         self.property1 = value
...
>>> a = Element(1.)
>>> b = Element(1)
>>> c = Element(True)
>>> type(a.property1)
<type 'float'>
>>> type(b.property1)
<type 'int'>
>>> type(c.property1)
<type 'bool'>
>>> alltog = N.array([a,b,c])

The "alltog" array has 3 members, or elements, or scalars... each of them being objects from the "Element" class, although each of them "contains" a different type of value in its "property1". [ I know that "property1" is just like a "pointer" (more or less), so I understand that the objects named by "a", "b" and "c" don't "contain" any number, actually. It is like that, isn't it? ] My (multiple) question is: Is that a "bad" (not optimal) implementation, because I am mixing NumPy "optimized" arrays with "simple" objects? Would it be better if each element in the array was a "record" built by using the NumPy "dtype" feature? I think I can't do that, because each value in "property1" can have a different type, as you see. I hope now it's clearer... -- Vicent -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Mon Jan 26 03:34:07 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 26 Jan 2009 17:34:07 +0900 Subject: [SciPy-user] NumPy arrays of Python objects (it was Re: How to start with SciPy and NumPy) In-Reply-To: <50ed08f40901260045u41b6c860ie34ecdb302eed016@mail.gmail.com> References: <50ed08f40901252353u7fa2b5d0ycefb251c0681fd96@mail.gmail.com> <9246D5B4-A919-48AC-A9CD-22873B5DC6B2@gmail.com> <50ed08f40901260045u41b6c860ie34ecdb302eed016@mail.gmail.com> Message-ID: <497D757F.1010802@ar.media.kyoto-u.ac.jp> Vicent wrote: > > Is that a "bad" (not optimal) implementation, because I am mixing > NumPy "optimized" arrays with "simple" objects? The problem is that your question is too general without more context. Discussing the best data representation without the problem you are trying to solve makes little sense, I think. cheers, David From vginer at gmail.com Mon Jan 26 04:05:33 2009 From: vginer at gmail.com (Vicent) Date: Mon, 26 Jan 2009 10:05:33 +0100 Subject: [SciPy-user] NumPy arrays of Python objects (it was Re: How to start with SciPy and NumPy) In-Reply-To: <497D757F.1010802@ar.media.kyoto-u.ac.jp> References: <50ed08f40901252353u7fa2b5d0ycefb251c0681fd96@mail.gmail.com> <9246D5B4-A919-48AC-A9CD-22873B5DC6B2@gmail.com> <50ed08f40901260045u41b6c860ie34ecdb302eed016@mail.gmail.com> <497D757F.1010802@ar.media.kyoto-u.ac.jp> Message-ID: <50ed08f40901260105u58cdf209r70b5eb2cf96b8893@mail.gmail.com> On Mon, Jan 26, 2009 at 09:34, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > Vicent wrote: > > > > Is that a "bad" (not optimal) implementation, because I am mixing > > NumPy "optimized" arrays with "simple" objects? > > The problem is that your question is too general without more context. > Discussing the best data representation without the problem you > are trying to solve makes little sense, I think. > In fact... I think I am going to use it in a sequential way, I mean, I am going to build loops to go from the first element to the last in the array, and perform some operations related to the properties of each element. Also, it is possible that I need to perform some searches, I mean, to look for a concrete value of "property1" within the Elements in the array. I don't know if I should be more concrete... -- vicent -------------- next part -------------- An HTML attachment was scrubbed...
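(One way to put numbers on the "is it slow?" question is to time the same reduction on a native-dtype array and on an object-dtype copy of it; a minimal sketch, with an arbitrary size, and absolute timings will of course vary by machine:)

import timeit

setup = """
import numpy as np
a = np.arange(100000, dtype=np.float64)  # homogeneous, native dtype
b = a.astype(object)                     # same values, object dtype
"""

# The object-dtype sum goes through Python-level objects and is
# typically more than an order of magnitude slower.
for stmt in ("a.sum()", "b.sum()"):
    t = timeit.Timer(stmt, setup=setup).timeit(number=100)
    print "%-8s %.4f s for 100 runs" % (stmt, t)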
URL: From faltet at pytables.org Mon Jan 26 05:22:23 2009 From: faltet at pytables.org (Francesc Alted) Date: Mon, 26 Jan 2009 11:22:23 +0100 Subject: [SciPy-user] NumPy arrays of Python objects (it was Re: How to start with SciPy and NumPy) In-Reply-To: <50ed08f40901252353u7fa2b5d0ycefb251c0681fd96@mail.gmail.com> References: <50ed08f40901252353u7fa2b5d0ycefb251c0681fd96@mail.gmail.com> Message-ID: <200901261122.24357.faltet@pytables.org> Ei, Vicent, On Monday 26 January 2009, Vicent wrote: > Hello again. > > I have a doubt, related to what was discussed in the last posts of > the previous thread. > > I've managed to build a NumPy array whose elements (scalars?) are > objects from a class. That class was defined by me previously. > > Each object contains several properties. For example, they have the > property "value". > > For each object, the property "value" can contain many different > things, for example, an integer value, a boolean value or a "float". > So, I think it wouldn't be possible to replace that "object"/class > with a NumPy data-type or "struct", in case I wanted. > > My question: is that a problem? I mean, is that NumPy array going to > be "slow" to search, and so on, because its elements are not > "optimized" NumPy types? Maybe this question makes no sense, but, > actually, I would like to know if there is any kind of problem with > that kind of "mixed structure": using NumPy arrays of > developer-defined (non-NumPy) objects. Yes. In general, having arrays of 'object' dtype is a problem in NumPy because you won't be able to reach the high performance that NumPy can usually reach by specifying other dtypes like 'float' or 'int'. This is because many of the NumPy accelerations are based on two facts:

1. That every element of the array is of equal size (in order to allow high memory performance on common access patterns).

2. That operations between each of these elements have available hardware that can perform fast operations with them.

On today's architectures, the sort of elements that satisfy those conditions are mainly these types: boolean, integer, float, complex and fixed-length strings. Another kind of array element that can benefit from NumPy's better computational abilities is compound objects that are made of the above ones, which are commonly referred to as 'record types'. However, in order to preserve condition 1, these compound objects cannot vary in size from element to element (so, your example does not fit here). However, such record arrays normally lack property 2 for most operations, so they are normally seen more as data containers than as computational objects "per se". So, you have two options here:

- If you want to stick with collections of classes with attributes that can be general python objects, then try to use python containers for your case. You will find that, in general, they are better suited for doing most of your desired operations.

- If you need extreme computational speed, then you need to change your data schema (and perhaps the way your brain works too) and start to think in terms of homogeneous NumPy array objects as your building blocks.

This is why people wanted you to be more explicit in describing your situation: they tried to see whether NumPy arrays could be used as the basic building blocks for your data schema or not. My advice here is that you try first with regular python containers.
If you are not satisfied with speed or memory consumption, then try to restate your problem in terms of arrays and use NumPy to accelerate them (and to consume far less memory too). Hope that helps, -- Francesc Alted From vginer at gmail.com Mon Jan 26 06:45:02 2009 From: vginer at gmail.com (Vicent) Date: Mon, 26 Jan 2009 12:45:02 +0100 Subject: [SciPy-user] NumPy arrays of Python objects (it was Re: How to start with SciPy and NumPy) In-Reply-To: <200901261122.24357.faltet@pytables.org> References: <50ed08f40901252353u7fa2b5d0ycefb251c0681fd96@mail.gmail.com> <200901261122.24357.faltet@pytables.org> Message-ID: <50ed08f40901260345o658a6072hd4f85202c5b5d5bc@mail.gmail.com> On Mon, Jan 26, 2009 at 11:22, Francesc Alted wrote: > Ei, Vicent, > > Yes. In general, having arrays of 'object' dtype is a problem in NumPy > because you won't be able to reach the high performance that NumPy can > usually reach by specifying other dtypes like 'float' or 'int'. This > is because many of the NumPy accelerations are based on two facts: > > 1. That every element of the array is of equal size (in order to allow > high memory performance on common access patterns). > > 2. That operations between each of these elements have available > hardware that can perform fast operations with them. > > In nowadays architectures, the sort of elements that satisfy those > conditions are mainly these types: > > boolean, integer, float, complex and fixed-length strings > > Another kind of array element that can benefit from NumPy better > computational abilities are compound objects that are made of the above > ones, which are commonly referred as 'record types'. However, in order > to preserve condition 1, these compound objects cannot vary in size > from element to element (so, your example does not fit here). However, > such record arrays normally lacks the property 2 for most operations, > so they are normally seen more as a data containers than a > computational object "per se". > > So, you have two options here: > > - If you want to stick with collections of classes with attributes that > can be general python objects, then try to use python containers for > your case. You will find that, in general, they are better suited for > doing most of your desired operations. > > - If you need extreme computational speed, then you need to change your > data schema (and perhaps the way your brain works too) and start to > think in terms of homegeneous array NumPy objects as your building > blocks. > > This is why people wanted that you were more explicit in describing your > situation: they tried to see whether NumPy arrays could be used as the > basic building blocks for your data schema or not. My advice here is > that you try first with regular python containers. If you are not > satisfied with speed or memory consumption, then try to restate your > problem in terms of arrays and use NumPy to accelerate them (and to > consume far less memory too). > > Hope that helps, > Of course it helps!! :-) Gr?cies, Francesc. That solves my question. I realize of the importance of adapting my mind and my data structures to NumPy arrays, dtypes, "records" and so on. But, it leads me to another question: (1) How can I match/join object-oriented programming with the array+record NumPy philosophy? I mean, as far as I understood, what I thought that should be defined as an object with properties and methods, may be better defined as a "record dtype" + some functions that operate with that kind of records. Right? So... 
Isn't it possible to "embed" the second approach into the first?? Maybe it makes no sense, but I would like to know it. [I answer myself: I think I could keep classes for several "big" and unique or not frequent classes (and that don't require much computation), and arrays + NumPy-like records for massive computations over "grids" or "matrices" of "similar" elements.] (2) Just to be sure: An array can be assigned to a property of an object, can't it? Sorry if I'm being too general again! In fact, I know that some of my colleagues don't work with objects, but just with "structs" or "records" and functions that directly manage those "records". They work with C++ and Delphy, by the way. Thank you in advance for your answers. -- Vicent -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniele at grinta.net Mon Jan 26 06:54:09 2009 From: daniele at grinta.net (Daniele Nicolodi) Date: Mon, 26 Jan 2009 12:54:09 +0100 Subject: [SciPy-user] Plotting simple 3d objects with mayavi Message-ID: <497DA461.2020301@grinta.net> Hello, I wrote some code that simulates gas particles into some small volumes where a test mass is floating, in order to compute gas damping coefficients. To illustrate the geometries and to check my surfaces description in the more complex setups I would like to be able to draw the objects. I think mayavi can be the tool of choice here. However I'm unable to find an easy way to plot orthogonal surfaces. My geometry is described in terms of orthogonal surfaces. What i would like is then a way to draw surfaces given their vertex or a similar description. Can someone please point my to the simpliest way of acomplishing this? Thanks. Cheers. -- Daniele From david at ar.media.kyoto-u.ac.jp Mon Jan 26 06:45:18 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 26 Jan 2009 20:45:18 +0900 Subject: [SciPy-user] NumPy arrays of Python objects (it was Re: How to start with SciPy and NumPy) In-Reply-To: <50ed08f40901260345o658a6072hd4f85202c5b5d5bc@mail.gmail.com> References: <50ed08f40901252353u7fa2b5d0ycefb251c0681fd96@mail.gmail.com> <200901261122.24357.faltet@pytables.org> <50ed08f40901260345o658a6072hd4f85202c5b5d5bc@mail.gmail.com> Message-ID: <497DA24E.30306@ar.media.kyoto-u.ac.jp> Vicent wrote: > > (2) Just to be sure: An array can be assigned to a property of an > object, can't it? A numpy array is a 'full' python object, thus can be used in the same cases as a python object. One thing to keep in mind though is that arrays have some copy semantics which may surprise you: # a is some sort of array b = a In that case, b is a new name for the content in a, and any change in b will reflect in a. So if you have: class A: def __init__(self, data): self.data = data and data are modified outside A instances, the data inside the instances will be changed as well. This is similar to python list, but there are some differences as well. > In fact, I know that some of my colleagues don't work with objects, > but just with "structs" or "records" and functions that directly > manage those "records". They work with C++ and Delphy, by the way. Working with object or not is not generally the most relevant aspect of good design - if you can do the same with a few functions and standard python objects/containers, it is often simpler and better to use them. A good example is simple file handling: if you compare the Java and the python method, the python method is certainly more elegant, and don't rely exclusively on objects as Java does. 
cheers, David From gael.varoquaux at normalesup.org Mon Jan 26 07:55:40 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 26 Jan 2009 13:55:40 +0100 Subject: [SciPy-user] Plotting simple 3d objects with mayavi In-Reply-To: <497DA461.2020301@grinta.net> References: <497DA461.2020301@grinta.net> Message-ID: <20090126125540.GF1894@phare.normalesup.org> On Mon, Jan 26, 2009 at 12:54:09PM +0100, Daniele Nicolodi wrote: > However I'm unable to find an easy way to plot orthogonal surfaces. My > geometry is described in terms of orthogonal surfaces. What i would like > is then a way to draw surfaces given their vertex or a similar description. Hi, I am not sure what you call "orthogonal surfaces". How is your data described? The different functions for plotting meshes in Mayavi would be: http://code.enthought.com/projects/mayavi/docs/development/html/mayavi/auto/mlab_helper_functions.html#enthought.mayavi.mlab.mesh http://code.enthought.com/projects/mayavi/docs/development/html/mayavi/auto/mlab_helper_functions.html#enthought.mayavi.mlab.triangular_mesh http://code.enthought.com/projects/mayavi/docs/development/html/mayavi/auto/mlab_helper_functions.html#enthought.mayavi.mlab.surf They each correspond to a different description of the surface. You might have a different description that needs to be translated in one of these. Cheers, Ga?l From Dharhas.Pothina at twdb.state.tx.us Mon Jan 26 09:30:39 2009 From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina) Date: Mon, 26 Jan 2009 08:30:39 -0600 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <20090125100556.GA29918@phare.normalesup.org> References: <20090125100556.GA29918@phare.normalesup.org> Message-ID: <497D74AF.63BA.009B.0@twdb.state.tx.us> Gael, I almost sent a similar question a few days ago about making a GUI app so I'll tag along here. I'm trying to make a GUI application to QA/QC field data. I need to pull data from a text file or database. Explore it and choose points (ie bad data etc) to delete etc. I have virtually no experience in GUI programming except for some stuff with visual C++ over 10 years ago that I vaguely remember. I've read your tutorial using traits and matplotlib and also a little bit of some of the Chaco examples. But I'm struggling to decide whether to go with traits + matplotlib or with chaco. I've also read some of the older mailing list discussions about chaco and matplotlib but those don't focus so much on GUI applications. On one hand, I am already using matplotlib and the timeseries toolkit extensively in scripts so I'm familiar with them and know that they can make pretty much any type of plot I need. Also matplotlib has a large community. On the other hand, chaco seems to have been designed for this type of interactive application and the plots I need for the GUI app are simpler and are supported by Chaco. Do you (or any others) have any comments about the pros and cons of each for someone new at this stuff. thanks, - dharhas >>> Gael Varoquaux 1/25/2009 4:05 AM >>> On Sun, Jan 25, 2009 at 10:40:39AM +0100, Lorenzo Isella wrote: > I hope this is not too off-topic. Given you Python code, relying on > SciPy for number-crunching, which tools would you use to create a GUI > in order to allow someone else to use it, without his knowing much (or > anything) about scipy and programming?I know Python is great for this, > but I do not know of anything specific. 
I would use traits (see http://code.enthought.com/projects/traits/documentation.php, and http://code.enthought.com/projects/traits/docs/html/tutorials/traits_ui_scientific_app.html for documentation and a tutorial). The pro of traits is that it is really easy to use, and enforces good software design. The con is that it is still not as mainstream as we would like; as a result, it is not installed on all computers. It is, however, shipped with both major scientific Python distributions (python(x,y) and ETS), as well as in Ubuntu and Debian, Mandriva, and it is currently being packaged for Fedora. Gaël _______________________________________________ SciPy-user mailing list SciPy-user at scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user From matthieu.brucher at gmail.com Mon Jan 26 09:40:11 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Mon, 26 Jan 2009 15:40:11 +0100 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <497D74AF.63BA.009B.0@twdb.state.tx.us> References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> Message-ID: > I've read your tutorial using traits and matplotlib and also a little > bit of some of the Chaco examples. But I'm struggling to decide whether > to go with traits + matplotlib or with chaco. I've also read some of the > older mailing list discussions about chaco and matplotlib but those > don't focus so much on GUI applications. Chaco can easily be used with Traits. In fact, Enthought develops both, so it's in their best interest that everything works well together. I haven't used Chaco in the past, so I don't have an opinion. > On one hand, I am already using matplotlib and the timeseries toolkit > extensively in scripts so I'm familiar with them and know that they can > make pretty much any type of plot I need. Also matplotlib has a large > community. > > On the other hand, chaco seems to have been designed for this type of > interactive application and the plots I need for the GUI app are simpler > and are supported by Chaco. > > Do you (or any others) have any comments about the pros and cons of > each for someone new at this stuff. Matthieu -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From sturla at molden.no Mon Jan 26 09:52:37 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 26 Jan 2009 15:52:37 +0100 (CET) Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <497D3AE2.9090604@ar.media.kyoto-u.ac.jp> References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> <497D3AE2.9090604@ar.media.kyoto-u.ac.jp> Message-ID: > Sturla Molden wrote: > This is independent of the list implementation, isn't it ? I am quite > curious to understand how you could get O(1) complexity for a "growable" > container: if you don't know in advance the number of items, and you add > O(N) items, how come you can get O(1) complexity ? Each time the array is re-sized, you add in some extra empty slots. Make sure the number of extra slots is proportional to the size of the array. That is: the worst-case cost of a single realloc is O(N), i.e. allocating a new buffer and copying the data over; but the number of appends until the next realloc is also O(N), i.e. k*N empty slots. So, averaged over appends, the cost of re-sizing with realloc is O(N)/O(N) = O(1).
Between reallocs, the empty slots are used, so those appends are O(1). I.e., appends to the growable array are O(1) on average. This is referred to as "amortized O(1) complexity". The advantage of this over linked lists is that elements will be stored in a cache-coherent manner. But in both cases the interface to the user will be that of a list (growable container). Python lists do this, C++ std::vector does this, etc. It is handy to know of this strategy for NumPy as well; e.g. if you want to write an ndarray subclass that can grow and shrink dynamically. > This may be true for one dimensional array, but generally, I think numpy > array performances come a lot from any item being reachable directly > from its 'coordinates' (this plus using native types instead of python > objects of course). An ndarray's data is a "one-dimensional array" of bytes. If you jump back and forth using arr[i,j], the speed will depend on the size of the array. If it is too big to fit in cache this may be slow; if it is small enough this will be fast. On the other hand, if you iterate over arr[i,j] in an ordered manner, it will be fast because the elements are stored cache coherently. That is, if the array is stored in C order, there is a good chance that arr[i,j+1] will be in cache when you have retrieved arr[i,j]. It is not the coordinates that give you the speed, it is how the data are cached by the processor. Regards, S. Molden From sturla at molden.no Mon Jan 26 09:57:57 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 26 Jan 2009 15:57:57 +0100 (CET) Subject: [SciPy-user] SciPy and GUI In-Reply-To: <497D74AF.63BA.009B.0@twdb.state.tx.us> References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> Message-ID: > On one hand, I am already using matplotlib and the timeseries toolkit > extensively in scripts so I'm familiar with them and know that they can > make pretty much any type of plot I need. Also matplotlib has a large > community. Matplotlib is excellent for plotting, but its disadvantage is lack of speed when the data sets are large. I have seen matplotlib spend 5 minutes to plot a digitized signal, whereas OpenGL could do the same in an eye-blink. But yes, matplotlib creates very nice looking graphics, and its pylab interface is familiar enough to old Matlab users like myself. As I said in a previous post, Matplotlib and wxPython can easily be integrated. There are examples of this on the Matplotlib website. S.M. From cournape at gmail.com Mon Jan 26 10:11:35 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 27 Jan 2009 00:11:35 +0900 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> <497D3AE2.9090604@ar.media.kyoto-u.ac.jp> Message-ID: <5b8d13220901260711r7351fdccoa285b8d588a48a83@mail.gmail.com> On Mon, Jan 26, 2009 at 11:52 PM, Sturla Molden wrote: >> Sturla Molden wrote: > >> This is independent of the list implementation, isn't it ? I am quite >> curious to understand how you could get O(1) complexity for a "growable" >> container: if you don't know in advance the number of items, and you add >> O(N) items, how come you can get O(1) complexity ? > > Each time the array is re-sized, you add in some extra empty slots. Make > sure the number of extra slots is proportional to the size of the array.
So this is the well-known method of over-allocating when you need to grow, right? This is not constant time, and depends on the number of items you are adding, I think (sublinearly, but still). > >> This may be true for one dimensional array, but generally, I think numpy >> array performances come a lot from any item being reachable directly >> from its 'coordinates' (this plus using native types instead of python >> objects of course). > > An ndarray's data is a "one-dimensional array" of bytes. Yes, it is one segment of memory, but there is more to it than that - the interpretation of that data buffer (from the dtype): the fact that each item has the exact same size, and that the array is not "ragged", has a big impact on performance as well, I think. > > If you jump back and forth using arr[i,j] the speed will depend on the > size of the array. If it is too big to fit in cache this may be slow, if it > is small enough this will be fast. Yes, but that's not the only factor: the dependence on the array's size is only a limitation of the hardware. From a number-of-operations POV, the coordinates are the only thing needed: a[i, j, k, ...] is translated directly to an offset into the 1d buffer using the strides info. This is really specific to arrays, I would say. Cache is obviously significant, but this is a consequence of the lack of multiple indirection, itself a consequence of an array being a set of homogeneous items. If the items all have different sizes, you can't compute the number of bytes to jump directly, and you will need an indirection, which breaks the locality of your data - I think that's how array-based lists work: the array is just the addresses of the items, whereas for arrays, the address is the item. cheers, David From gael.varoquaux at normalesup.org Mon Jan 26 10:12:03 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 26 Jan 2009 16:12:03 +0100 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <497D74AF.63BA.009B.0@twdb.state.tx.us> References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> Message-ID: <20090126151203.GG1894@phare.normalesup.org> Quickly, On Mon, Jan 26, 2009 at 08:30:39AM -0600, Dharhas Pothina wrote: > I've read your tutorial using traits and matplotlib and also a little > bit of some of the Chaco examples. But I'm struggling to decide whether > to go with traits + matplotlib or with chaco. I've also read some of the > older mailing list discussions about chaco and matplotlib but those > don't focus so much on GUI applications. > On one hand, I am already using matplotlib and the timeseries toolkit > extensively in scripts so I'm familiar with them and know that they can > make pretty much any type of plot I need. Also matplotlib has a large > community. > On the other hand, chaco seems to have been designed for this type of > interactive application and the plots I need for the GUI app are simpler > and are supported by Chaco. > Do you (or any others) have any comments about the pros and cons of > each for someone new at this stuff. Matplotlib has a huge user base and excellent documentation. It can be inserted in Traits (I have shown it in my tutorial). It is suitable for GUI development with Traits. Many people have done it, including me. On the other hand, matplotlib's model is very much imperative and script-based. This makes it easy to understand, but really is not the right paradigm for interactive applications in an object-oriented language.
Chances are that, unless you are very experienced with the MVC pattern and interactive application design, you will make architectural errors when building an interactive application with Matplotlib. Chaco will constrain you, force you to do things according to its model, which you will hate (we all did at some point), but later on you will be happy that it enforced on you some object-oriented structure and some separation of concerns (think model-view-controller, which can be transcribed in terms of data-plot-interactor in Chaco). In addition, the fact that Chaco plugs into Traits seamlessly gives you a huge amount of benefit for interactivity. The focus switches from registering callbacks all over the place to reactive programming on attribute modification. Now, all this nice and fancy architecture, this "Good" design, and so forth, you may not actually care about, if your application is simple enough. A poorly designed application has difficulties growing, but what if it will never grow? I could draw a scale with increasing interactivity and complexity, and we could argue where to put the line delimiting Matplotlib land and Chaco land. I use both. Another win of Chaco is speed. You might not care about that either. Chaco used to be really poorly documented. Things are improving a lot (http://code.enthought.com/projects/chaco/documentation.php). The developers are responsive, on the enthought-dev mailing list. Quite a few people have made the choice of Chaco, and been very happy with it. I can't decide for you, sorry. If you are going to code something large and long-lived, I suggest you spend a few days coding 'hello world' applications in both, exploring things similar to what you will need to code in your final app, and make the decision afterward. The time spent doing this will be negligible compared to the time spent coding a big app. If you are going to code a very small app, it doesn't really matter. Good luck, Gaël From christopher.paul.taylor at gmail.com Mon Jan 26 10:22:25 2009 From: christopher.paul.taylor at gmail.com (christopher taylor) Date: Mon, 26 Jan 2009 10:22:25 -0500 Subject: [SciPy-user] more build issues - slamch.o relocation R_X86_64_32S error Message-ID: So I've followed the build instructions in the ATLAS package, and the SciPy package.
I've also followed the instructions for building scipy on a centos box and I consistently get the following build error: gcc: build/src.linux-x86_64-2.5/build/src.linux-x86_64-2.5/scipy/lib/lapack/flapackmodule.c /usr/bin/g77 -g -Wall -g -Wall -shared build/temp.linux-x86_64-2.5/build/src.linux-x86_64-2.5/build/src.linux-x86_64-2.5/scipy/lib/lapack/flapackmodule.o build/temp.linux-x86_64-2.5/build/src.linux-x86_64-2.5/fortranobject.o -L~/opt/usr/local/atlas/lib -Lbuild/temp.linux-x86_64-2.5 -llapack -lptf77blas -lptcblas -latlas -lg2c -o build/lib.linux-x86_64-2.5/scipy/lib/lapack/flapack.so /usr/bin/ld: ~/opt/usr/local/atlas/lib/liblapack.a(slamch.o): relocation R_X86_64_32S against `a local symbol' can not be used when making a shared object; recompile with -fPIC ~/opt/usr/local/atlas/lib/liblapack.a: could not read symbols: Bad value collect2: ld returned 1 exit status /usr/bin/ld: ~/opt/usr/local/atlas/lib/liblapack.a(slamch.o): relocation R_X86_64_32S against `a local symbol' can not be used when making a shared object; recompile with -fPIC ~/opt/usr/local/atlas/lib/liblapack.a: could not read symbols: Bad value collect2: ld returned 1 exit status error: Command "/usr/bin/g77 -g -Wall -g -Wall -shared build/temp.linux-x86_64-2.5/build/src.linux-x86_64-2.5/build/src.linux-x86_64-2.5/scipy/lib/lapack/flapackmodule.o build/temp.linux-x86_64-2.5/build/src.linux-x86_64-2.5/fortranobject.o -L~/opt/usr/local/atlas/lib -Lbuild/temp.linux-x86_64-2.5 -llapack -lptf77blas -lptcblas -latlas -lg2c -o build/lib.linux-x86_64-2.5/scipy/lib/lapack/flapack.so" failed with exit status 1 I've aliased gfortran to g77 and that gets me similar results. Any recommendations? this build error is *killing* me. thanks, ct From cournape at gmail.com Mon Jan 26 10:28:40 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 27 Jan 2009 00:28:40 +0900 Subject: [SciPy-user] more build issues - slamch.o relocation R_X86_64_32S error In-Reply-To: References: Message-ID: <5b8d13220901260728p73732d55xdc456fb09c7bf9aa@mail.gmail.com> On Tue, Jan 27, 2009 at 12:22 AM, christopher taylor wrote: > So I've followed the build instructions in the ATLAS package, and the > SciPy package. I've also followed the instructions for building scipy > on a centos box and I consistently get the following build error: You need to build ATLAS and LAPACK with -fPIC. For ATLAS, you do it as follows: ../configure -Fa alg -fPIC For LAPACK, you need to set both OPTS and NOOPT in make.inc. > I've aliased gfortran to g77 and that gets me similar results. any > recommendations? This is not a good idea: g77 and gfortran are not compatible; for all practical purposes, you cannot mix code built by one with code built by the other. You have to make sure either g77 or gfortran is used for everything, from lapack to scipy.
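For instance, the relevant lines in LAPACK's make.inc would look something like this (a sketch only -- I am assuming gfortran here; use g77 everywhere instead if that is what you standardize on):

FORTRAN  = gfortran
LOADER   = gfortran
OPTS     = -O2 -fPIC
DRVOPTS  = $(OPTS)
NOOPT    = -O0 -fPIC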
cheers, David From argriffi at ncsu.edu Mon Jan 26 10:32:48 2009 From: argriffi at ncsu.edu (alex) Date: Mon, 26 Jan 2009 10:32:48 -0500 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <5b8d13220901260711r7351fdccoa285b8d588a48a83@mail.gmail.com> References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> <497D3AE2.9090604@ar.media.kyoto-u.ac.jp> <5b8d13220901260711r7351fdccoa285b8d588a48a83@mail.gmail.com> Message-ID: <497DD7A0.3050204@ncsu.edu> David Cournapeau wrote: > On Mon, Jan 26, 2009 at 11:52 PM, Sturla Molden wrote: > >>> Sturla Molden wrote: >>> >>> This is independent of the list implementation, isn't it ? I am quite >>> curious to understand how you could get O(1) complexity for a "growable" >>> container: if you don't know in advance the number of items, and you add >>> O(N) items, how come you can get O(1) complexity ? >>> >> Each time the array is re-sized, you add in some extra empty slots. Make >> sure the number of extra slots is proportional to the size of the array. >> > > So this is the well known method of over allocating when you need to > grow, right ? This is not constant time, and depends on the number of > items you are adding I think (sublinearly, but still) > > I think you guys are talking about different N. If N is the number of items already in the list, then adding a single item to the list could be O(N) if you use arrays to represent lists and you do not over-allocate when you need to grow. By over-allocating when you need to grow, you can get amortized O(1) for the operation of adding a single element (not N elements) to the list. Python apparently uses this latter method. I guess the discussion started on the topic of the difference between arrays and lists, and that Python's 'list' has some properties of a classical 'array' (fast random access) and some of a classical 'list' (fast amortized append). Alex From faltet at pytables.org Mon Jan 26 10:32:56 2009 From: faltet at pytables.org (Francesc Alted) Date: Mon, 26 Jan 2009 16:32:56 +0100 Subject: [SciPy-user] NumPy arrays of Python objects (it was Re: How to start with SciPy and NumPy) In-Reply-To: <50ed08f40901260345o658a6072hd4f85202c5b5d5bc@mail.gmail.com> References: <50ed08f40901252353u7fa2b5d0ycefb251c0681fd96@mail.gmail.com> <200901261122.24357.faltet@pytables.org> <50ed08f40901260345o658a6072hd4f85202c5b5d5bc@mail.gmail.com> Message-ID: <200901261632.57194.faltet@pytables.org> A Monday 26 January 2009, Vicent escrigu?: > On Mon, Jan 26, 2009 at 11:22, Francesc Alted wrote: > > Ei, Vicent, > > > > Yes. In general, having arrays of 'object' dtype is a problem in > > NumPy because you won't be able to reach the high performance that > > NumPy can usually reach by specifying other dtypes like 'float' or > > 'int'. This is because many of the NumPy accelerations are based > > on two facts: > > > > 1. That every element of the array is of equal size (in order to > > allow high memory performance on common access patterns). > > > > 2. That operations between each of these elements have available > > hardware that can perform fast operations with them. 
> > > > In today's architectures, the sorts of elements that satisfy those > > conditions are mainly these types: > > > > boolean, integer, float, complex and fixed-length strings > > > > Another kind of array element that can benefit from NumPy's better > > computational abilities is a compound object made of the above ones, > > commonly referred to as a 'record type'. However, in order to preserve > > condition 1, these compound objects cannot vary in size from element > > to element (so, your example does not fit here). Also, such record > > arrays normally lack property 2 for most operations, so they are seen > > more as data containers than as computational objects "per se". > > > > So, you have two options here: > > > > - If you want to stick with collections of classes with attributes > > that can be general python objects, then try to use python > > containers for your case. You will find that, in general, they are > > better suited for doing most of your desired operations. > > > > - If you need extreme computational speed, then you need to change > > your data schema (and perhaps the way your brain works too) and > > start to think in terms of homogeneous NumPy array objects as your > > building blocks. > > > > This is why people wanted you to be more explicit in describing > > your situation: they were trying to see whether NumPy arrays could be > > used as the basic building blocks for your data schema or not. My > > advice here is that you try first with regular python containers. If > > you are not satisfied with speed or memory consumption, then try to > > restate your problem in terms of arrays and use NumPy to accelerate > > them (and to consume far less memory too). > > > > Hope that helps, > > Of course it helps!! :-) Gràcies, Francesc. > > That solves my question. I realize the importance of adapting my mind > and my data structures to NumPy arrays, dtypes, "records" and so on. > > But it leads me to another question: > > (1) How can I match/join object-oriented programming with the > array+record NumPy philosophy? > > I mean, as far as I understood, what I thought should be defined as an > object with properties and methods may be better defined as a "record > dtype" + some functions that operate on that kind of record. Right? > > So... isn't it possible to "embed" the second approach into the > first?? Maybe it makes no sense, but I would like to know. > > [I answer myself: I think I could keep classes for the few "big", > unique or infrequent entities (which don't require much computation), > and arrays + NumPy-like records for massive computations over "grids" > or "matrices" of "similar" elements.] Yeah, you are getting the idea. It is common sense to use the general Python machinery for building the skeleton of your application, and when you want to accelerate/improve the parts of the code taking most of the runtime, that is when NumPy/SciPy enter into action. > (2) Just to be sure: An array can be assigned to a property of an > object, can't it? David has already answered this: there is no problem doing that. > Sorry if I'm being too general again! Don't be afraid to ask, as many people here are really willing to help. If we need more concrete details, we will ask you for them. Au!
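PS: For example, a quick sketch of such a mix (the class and the numbers below are invented by me, only to illustrate the idea, not a recipe):

import numpy as np

class Grid(object):
    """Plain Python object for the skeleton; NumPy arrays for the heavy parts."""
    def __init__(self, values):
        self.values = np.asarray(values, dtype=np.float64)  # an array as an attribute
        self.description = "metadata kept as ordinary Python objects"

    def normalize(self):
        # the hot spot is a vectorized NumPy expression, not a Python-level loop
        self.values = (self.values - self.values.mean()) / self.values.std()

g = Grid(np.random.uniform(0., 10., size=(100, 100)))
g.normalize()
print g.values.mean()   # ~ 0.0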
-- Francesc Alted From robfalck at gmail.com Mon Jan 26 10:35:52 2009 From: robfalck at gmail.com (Rob Falck) Date: Mon, 26 Jan 2009 10:35:52 -0500 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <20090126151203.GG1894@phare.normalesup.org> References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> <20090126151203.GG1894@phare.normalesup.org> Message-ID: Having just picked up PyQt after doing a lot of work in wxPython, I'm not sure if I'll bother going back to wx. Qt seems to be more well thought out than wx, and QtDesigner saves me a LOT of time. The wx Demo has a larger set of examples, however. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla at molden.no Mon Jan 26 10:36:47 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 26 Jan 2009 16:36:47 +0100 (CET) Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <5b8d13220901260711r7351fdccoa285b8d588a48a83@mail.gmail.com> References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> <497D3AE2.9090604@ar.media.kyoto-u.ac.jp> <5b8d13220901260711r7351fdccoa285b8d588a48a83@mail.gmail.com> Message-ID: <728822d1af5aaf165bdd54817d17e04b.squirrel@webmail.uio.no> > On Mon, Jan 26, 2009 at 11:52 PM, Sturla Molden wrote: > So this is the well known method of over allocating when you need to > grow, right ? This is not constant time, and depends on the number of > items you are adding I think (sublinearly, but still) Ok, I think you misunderstood: - Adding one element to a Python list of length N is O(1) on average. - Adding one element to a linked list of length N is O(1). - In both bases, growing a list from length 0 to length N takes O(N) time. For Python lists, if you know the size in advance, pre-allocation certainly helps: [None]*N is much faster than [None for n in range(N)]. But both operations are actually of O(N) complexity. For linked lists, pre-allocation is meaningless. > I think that's how > array-based lists work, the array is just the address of the items, > whereas for arrays, the address is the item. That is correct. But it may or may not matter. If the array list has pointers to cache coherent objects, this double indirection has very little consequence for the performance. S.M. From christopher.paul.taylor at gmail.com Mon Jan 26 10:41:43 2009 From: christopher.paul.taylor at gmail.com (christopher taylor) Date: Mon, 26 Jan 2009 10:41:43 -0500 Subject: [SciPy-user] more build issues - slamch.o relocation R_X86_64_32S error In-Reply-To: <5b8d13220901260728p73732d55xdc456fb09c7bf9aa@mail.gmail.com> References: <5b8d13220901260728p73732d55xdc456fb09c7bf9aa@mail.gmail.com> Message-ID: oh, the alias was done to force any calls to gfortran to actually call g77. i've tried this command: ../configure -Fa alg -fPIC --with-netlib-lapack=/home/ctaylor/builds/lapack-3.1.1/lapack_LINUX.a and I modified make.inc with the following options: OPTS = -O2 -fPIC DRVOPTS = $(OPTS) NOOPT = -O0 -fPIC I'm still getting these relocation errors during scipy's build. i guess the follow on question is this, how do i tell ./setup.py to select g77 OR gfortran to execute with? ct On Mon, Jan 26, 2009 at 10:28 AM, David Cournapeau wrote: > On Tue, Jan 27, 2009 at 12:22 AM, christopher taylor > wrote: >> So I've followed the build instructions in the ATLAS package, and the >> SciPy package. 
I've also followed the instructions for building scipy >> on a centos box and I consistently get the following build error: > > You need to build ATLAS and LAPACK with -fPIC. For ATLAS, you do it as follows: > > ../configure -Fa alg -fPIC > > For LAPACK, you need to set both OPTS and NOOPT in make.inc. > >> I've aliased gfortran to g77 and that gets me similar results. any >> recommendations? > > This is not a good idea: g77 and gfortran are not compatible; for all > practical purposes, you cannot mix code built by one with code built by > the other. You have to make sure either g77 or gfortran is used for > everything, from lapack to scipy. > > cheers, > > David > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From cournape at gmail.com Mon Jan 26 10:57:15 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 27 Jan 2009 00:57:15 +0900 Subject: [SciPy-user] more build issues - slamch.o relocation R_X86_64_32S error In-Reply-To: References: <5b8d13220901260728p73732d55xdc456fb09c7bf9aa@mail.gmail.com> Message-ID: <5b8d13220901260757n7b213e4ev20a8f5dec23dfec7@mail.gmail.com> On Tue, Jan 27, 2009 at 12:41 AM, christopher taylor wrote: > oh, the alias was done to force any calls to gfortran to actually call g77. > > i've tried this command: > > ../configure -Fa alg -fPIC > --with-netlib-lapack=/home/ctaylor/builds/lapack-3.1.1/lapack_LINUX.a > > and I modified make.inc with the following options: > > OPTS = -O2 -fPIC > DRVOPTS = $(OPTS) > NOOPT = -O0 -fPIC You got the same error as before? The error message is quite unambiguous: you have somewhere an object file compiled without the -fPIC flag, and from your build log, it is a LAPACK file. Did you clean everything before rebuilding, to be sure to start from scratch? > > i guess the follow on question is this, how do i tell ./setup.py to > select g77 OR gfortran to execute with? For ATLAS, it is -C if compiler_name For LAPACK, you set it in the compiler variables in make.inc For numpy/scipy: python setup.py build --fcompiler=gnu95 will force gfortran even if g77 is found. But the problem posted originally is unlikely to be caused by a g77/gfortran mix, David From sturla at molden.no Mon Jan 26 10:59:22 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 26 Jan 2009 16:59:22 +0100 (CET) Subject: [SciPy-user] SciPy and GUI In-Reply-To: References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> <20090126151203.GG1894@phare.normalesup.org> Message-ID: > Having just picked up PyQt after doing a lot of work in wxPython, I'm not > sure if I'll bother going back to wx. Qt seems to be more well thought > out > than wx, and QtDesigner saves me a LOT of time. That is why I use wxFormBuilder for wxPython as well. GUIs should not be designed by hand-writing source code. I will consider switching to Qt when the LGPL version is released. PyQt is clearly superior to wxPython, and QtDesigner is better than wxFormBuilder. But for now: as the GPL is viral, anything built with Qt gets tainted with the GPL, unless you buy a commercial license. I am not considering the separate commercial PyQt license here; it is the commercial Qt license that costs big bucks.
Here are examples of using Matplotlib in wxPython and PyQt GUIs: http://eli.thegreenplace.net/files/prog_code/wx_mpl_bars.py.txt http://eli.thegreenplace.net/files/prog_code/qt_mpl_bars.py.txt Regards, Sturla Molden From cournape at gmail.com Mon Jan 26 11:05:50 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 27 Jan 2009 01:05:50 +0900 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <497DD7A0.3050204@ncsu.edu> References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> <497D3AE2.9090604@ar.media.kyoto-u.ac.jp> <5b8d13220901260711r7351fdccoa285b8d588a48a83@mail.gmail.com> <497DD7A0.3050204@ncsu.edu> Message-ID: <5b8d13220901260805u7f842df8s3f3dfefb3696c2e@mail.gmail.com> On Tue, Jan 27, 2009 at 12:32 AM, alex wrote: > David Cournapeau wrote: >> On Mon, Jan 26, 2009 at 11:52 PM, Sturla Molden wrote: >> >>>> Sturla Molden wrote: >>>> >>>> This is independent of the list implementation, isn't it ? I am quite >>>> curious to understand how you could get O(1) complexity for a "growable" >>>> container: if you don't know in advance the number of items, and you add >>>> O(N) items, how come you can get O(1) complexity ? >>>> >>> Each time the array is re-sized, you add in some extra empty slots. Make >>> sure the number of extra slots is proportional to the size of the array. >>> >> >> So this is the well known method of over allocating when you need to >> grow, right ? This is not constant time, and depends on the number of >> items you are adding I think (sublinearly, but still) >> >> > I think you guys are talking about different N. If N is the number of > items already in the list, then adding a single item to the list could > be O(N) if you use arrays to represent lists and you do not > over-allocate when you need to grow. By over-allocating when you need > to grow, you can get amortized O(1) for the operation of adding a single > element (not N elements) to the list. I meant that adding N items to a list requires O(log(N)) malloc when using over allocation (double the size at ever allocation). I am not sure to understand how the number of items already in the list would influence the complexity of growing, at least when complexity = counting the number of malloc. David From sturla at molden.no Mon Jan 26 11:16:39 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 26 Jan 2009 17:16:39 +0100 (CET) Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <5b8d13220901260805u7f842df8s3f3dfefb3696c2e@mail.gmail.com> References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> <497D3AE2.9090604@ar.media.kyoto-u.ac.jp> <5b8d13220901260711r7351fdccoa285b8d588a48a83@mail.gmail.com> <497DD7A0.3050204@ncsu.edu> <5b8d13220901260805u7f842df8s3f3dfefb3696c2e@mail.gmail.com> Message-ID: > On Tue, Jan 27, 2009 at 12:32 AM, alex wrote: > I meant that adding N items to a list requires O(log(N)) malloc when > using over allocation (double the size at ever allocation). I am not > sure to understand how the number of items already in the list would > influence the complexity of growing, at least when complexity = > counting the number of malloc. Because the items already in the list has to be copied for every malloc. The O(log(N)) mallocs do not have O(1) complexity. The complexity is not counting the number of mallocs. 
By the way, a Python list does not double in size on each allocation. It has a less greedy growth pattern. S.M. From gary.pajer at gmail.com Mon Jan 26 11:19:48 2009 From: gary.pajer at gmail.com (Gary Pajer) Date: Mon, 26 Jan 2009 11:19:48 -0500 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <497D74AF.63BA.009B.0@twdb.state.tx.us> References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> Message-ID: <88fe22a0901260819v2228dc63qfd7ddff915cf2c3b@mail.gmail.com> On Mon, Jan 26, 2009 at 9:30 AM, Dharhas Pothina < Dharhas.Pothina at twdb.state.tx.us> wrote: > Gael, > > I almost sent a similar question a few days ago about making a GUI app > so I'll tag along here. > > [...] > > On one hand, I am already using matplotlib and the timeseries toolkit > extensively in scripts so I'm familiar with them and know that they can > make pretty much any type of plot I need. Also matplotlib has a large > community. > > On the other hand, chaco seems to have been designed for this type of > interactive application and the plots I need for the GUI app are simpler > and are supported by Chaco. > > Do you (or any others) have any comments about the pros and cons of > each for someone new at this stuff. > > thanks, > > - dharhas I had to make this decision some time ago. I chose chaco, only because I wanted a unified set of features and approach in a GUI application. The downside was that I had to learn how to use chaco when I already knew mpl, and that was at a time when things in Traits and Chaco were changing rapidly. Things now appear to have settled down considerably. The documentation for Chaco is still not what we would like, but it is much better. The place to start is this tutorial: https://svn.enthought.com/svn/enthought/Chaco/trunk/docs/scipy08_tutorial.pdf Don't start with the examples that are available in the svn version of chaco. Those examples use windowing frameworks other than TraitsUI, and they are hard for a beginner to follow. Am I happy with my decision? Well, I'm not sure what would have happened if I chose mpl. My application works perfectly. But I occasionally have to ask questions on this list because the documentation is still a work in progress. Things are better for me since I was directed to the tutorial above. (I *highly* recommend that tutorial.) I still use mpl if my task is to make a plot from scratch, outside of my lab application. -gary -------------- next part -------------- An HTML attachment was scrubbed... URL: From jh at physics.ucf.edu Mon Jan 26 11:27:08 2009 From: jh at physics.ucf.edu (jh at physics.ucf.edu) Date: Mon, 26 Jan 2009 11:27:08 -0500 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: (scipy-user-request@scipy.org) References: Message-ID: Vincent, others, I added some brief text and examples to the Getting Started page in the "What are they useful for?" section that I think address your basic concerns. Can you look them over? Wiki experts: if someone can fix the indentation, please do! 
--jh-- From cournape at gmail.com Mon Jan 26 11:34:21 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 27 Jan 2009 01:34:21 +0900 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: References: <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> <497D3AE2.9090604@ar.media.kyoto-u.ac.jp> <5b8d13220901260711r7351fdccoa285b8d588a48a83@mail.gmail.com> <497DD7A0.3050204@ncsu.edu> <5b8d13220901260805u7f842df8s3f3dfefb3696c2e@mail.gmail.com> Message-ID: <5b8d13220901260834y3203496fo2e32de083abf1c8d@mail.gmail.com> On Tue, Jan 27, 2009 at 1:16 AM, Sturla Molden wrote: >> On Tue, Jan 27, 2009 at 12:32 AM, alex wrote: > >> I meant that adding N items to a list requires O(log(N)) malloc when >> using over allocation (double the size at ever allocation). I am not >> sure to understand how the number of items already in the list would >> influence the complexity of growing, at least when complexity = >> counting the number of malloc. > > Because the items already in the list has to be copied for every malloc. > The O(log(N)) mallocs do not have O(1) complexity. The complexity is not > counting the number of mallocs. Ok - I did not understand what amortized cost meant. > > By the way, a Python list does not double in size on each allocation. It > has a less greedy growth pattern. Yes - but then python does not use malloc directly either anyway :) David From gary.pajer at gmail.com Mon Jan 26 11:36:37 2009 From: gary.pajer at gmail.com (Gary Pajer) Date: Mon, 26 Jan 2009 11:36:37 -0500 Subject: [SciPy-user] SciPy and GUI In-Reply-To: References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> <20090126151203.GG1894@phare.normalesup.org> Message-ID: <88fe22a0901260836s2c2e24f8t87ba57f3fbb406f3@mail.gmail.com> On Mon, Jan 26, 2009 at 10:59 AM, Sturla Molden wrote: > > > Having just picked up PyQt after doing a lot of work in wxPython, I'm not > > sure if I'll bother going back to wx. Qt seems to be more well thought > > out > > than wx, and QtDesigner saves me a LOT of time. > > That is why I use wxFormBuilder for wxPython as well. GUIs should not be > designed by hand-writing source code. I will consider switching to Qt when > the LGPL version is released. PyQt is clearly superior to wxPython, and > QtDesigner is better than wxFormBuilder. TraitsUI is somewhere in between a graphical builder and writing source code. I've never used wxFormBuilder. I started to use QtDesigner, and in fact I was in the middle of figuring out QtDesigner when I discovered Traits and TraitsUI. I didn't learn QtDesigner well enough to comment in any detail. But I previously used Boa, and I can say with certainty that I find creating a GUI with Traits and TraitsUI to be *much easier* than using Boa. And I was never tempted to go back to QtDesigner. On the other hand, I used to use the Matlab GUI maker, and thought it was pretty easy to use. That was many years ago, now. YMMV. I'm a scientist, not a programmer. I'm hooked on Traits now. Aside from the ease of GUI building, there is the whole Traits way of doing things which very much helps me design my programs. In fact, you can see me quoted here: http://code.enthought.com/projects/index.php -gary > > > But for now: as GPL is viral, anything built with Qt gets tainted with > GPL, unless you buy a commercial license. I am not considering the > separate commercial PyQt license here; it is the commercial Qt license > that costs big bucks. 
> > > Here are examples of using Matplotlib in wxPython and PyQt GUIs: > > http://eli.thegreenplace.net/files/prog_code/wx_mpl_bars.py.txt > > http://eli.thegreenplace.net/files/prog_code/qt_mpl_bars.py.txt > > > Regards, > Sturla Molden > > > > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla at molden.no Mon Jan 26 11:40:50 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 26 Jan 2009 17:40:50 +0100 (CET) Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <5b8d13220901260834y3203496fo2e32de083abf1c8d@mail.gmail.com> References: <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> <497D3AE2.9090604@ar.media.kyoto-u.ac.jp> <5b8d13220901260711r7351fdccoa285b8d588a48a83@mail.gmail.com> <497DD7A0.3050204@ncsu.edu> <5b8d13220901260805u7f842df8s3f3dfefb3696c2e@mail.gmail.com> <5b8d13220901260834y3203496fo2e32de083abf1c8d@mail.gmail.com> Message-ID: <54915f3494492ff54ab043ff7be0c00e.squirrel@webmail.uio.no> > On Tue, Jan 27, 2009 at 1:16 AM, Sturla Molden wrote: >> By the way, a Python list does not double in size on each allocation. It >> has a less greedy growth pattern. > > Yes - but then python does not use malloc directly either anyway :) Yes. listobject.c use realloc for resizing, to avoid copying the data if it can be avoided. S.M. From vginer at gmail.com Mon Jan 26 11:56:54 2009 From: vginer at gmail.com (Vicent) Date: Mon, 26 Jan 2009 17:56:54 +0100 Subject: [SciPy-user] NumPy arrays of Python objects (it was Re: How to start with SciPy and NumPy) In-Reply-To: <497DA24E.30306@ar.media.kyoto-u.ac.jp> References: <50ed08f40901252353u7fa2b5d0ycefb251c0681fd96@mail.gmail.com> <200901261122.24357.faltet@pytables.org> <50ed08f40901260345o658a6072hd4f85202c5b5d5bc@mail.gmail.com> <497DA24E.30306@ar.media.kyoto-u.ac.jp> Message-ID: <50ed08f40901260856u638294d6nca9271c92a104d4a@mail.gmail.com> On Mon, Jan 26, 2009 at 12:45, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > Vicent wrote: > > > > (2) Just to be sure: An array can be assigned to a property of an > > object, can't it? > > A numpy array is a 'full' python object, thus can be used in the same > cases as a python object. Sorry I meant "working with classes" vs "working with structures or records". I know that everything in Python is an object, but I was thinking of building my own structures for storing information by using "classes", in a OOP context. There I realize that maybe I have to forget defining "classes" and just use NumPy objects, for those heavy/intensive search and/or computing tasks in my code. [ Again, asking myself...: Do I miss something? I mean, actually, a NumPy array has properties/attributes and methods... So, maybe using objects from NumPy doesn't mean forget object oriented programming. I think I was a bit confused about it... ] Working with object or not is not generally the most relevant aspect of > good design - if you can do the same with a few functions and standard > python objects/containers, it is often simpler and better to use them. That's true... In fact, for me, I think it's a matter of programming style... -- Vicent -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jeremy at jeremysanders.net Mon Jan 26 11:59:44 2009 From: jeremy at jeremysanders.net (Jeremy Sanders) Date: Mon, 26 Jan 2009 16:59:44 +0000 Subject: [SciPy-user] SciPy and GUI References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> <20090126151203.GG1894@phare.normalesup.org> Message-ID: Gael Varoquaux wrote: > On the other hand, matplotlib's model is very much imperative and > script-based. This makes it easy to understand, but really is not the > right paradigm for interactive applications in an object-oriented > language. Chances are that, unless you are very experienced with the MVC > pattern and interactive application design, you will make architectural > errors when building an interactive application with Matplotlib. Chaco > will constrain you, force you to do things according to its model, which > you will hate (we all did at some point), but later on you will be happy > that it enforced on you some object-oriented structure, on some > separation of concerns (think model-view-controller, which can be > transcribed in terms of data-plot-interactor in Chaco). In addition, the > fact that Chaco plugs into Traits seemlessly gives you a huge amount of > benefit for interactivity. The focus switches from registering callbacks > all over the place to reactive programming on attribute modification. Veusz may be an alternative option (disclaimer - I wrote the thing). It is object-based and would naturally fit in a PyQt system as it is written in PyQt. http://home.gna.org/veusz/ You can simply inherit the Veusz SimpleWindow to get a QWidget you can stick in your application. Jeremy From vginer at gmail.com Mon Jan 26 12:00:04 2009 From: vginer at gmail.com (Vicent) Date: Mon, 26 Jan 2009 18:00:04 +0100 Subject: [SciPy-user] NumPy arrays of Python objects (it was Re: How to start with SciPy and NumPy) In-Reply-To: <200901261632.57194.faltet@pytables.org> References: <50ed08f40901252353u7fa2b5d0ycefb251c0681fd96@mail.gmail.com> <200901261122.24357.faltet@pytables.org> <50ed08f40901260345o658a6072hd4f85202c5b5d5bc@mail.gmail.com> <200901261632.57194.faltet@pytables.org> Message-ID: <50ed08f40901260900i57f85f05sa6b069b3e81513f5@mail.gmail.com> On Mon, Jan 26, 2009 at 16:32, Francesc Alted wrote: > A Monday 26 January 2009, Vicent escrigu?: > > [I answer myself: I think I could keep classes for several "big" and > > unique or not frequent classes (and that don't require much > > computation), and arrays + NumPy-like records for massive > > computations over "grids" or "matrices" of "similar" elements.] > > Yeah, you are getting the idea. It is common sense to use general > Python machinery for building the skeleton of your application, and > when you want to accelerate/improve the parts of the code taking most > of the runtime, then it is when NumPy/SciPy can enter in action. I think I got it... :-) > Don't be afraid to ask as many people here is really willing to help. > In case we need more concrete details, we will ask you to do that. > > Thanks, Francesc. Au! -- Vicent -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gael.varoquaux at normalesup.org Mon Jan 26 12:18:06 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 26 Jan 2009 18:18:06 +0100 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: References: Message-ID: <20090126171806.GI1894@phare.normalesup.org> On Mon, Jan 26, 2009 at 11:27:08AM -0500, jh at physics.ucf.edu wrote: > if someone can fix the indentation, please do! Done. Gaël From faltet at pytables.org Mon Jan 26 12:18:58 2009 From: faltet at pytables.org (Francesc Alted) Date: Mon, 26 Jan 2009 18:18:58 +0100 Subject: [SciPy-user] NumPy arrays of Python objects (it was Re: How to start with SciPy and NumPy) In-Reply-To: <50ed08f40901260856u638294d6nca9271c92a104d4a@mail.gmail.com> References: <50ed08f40901252353u7fa2b5d0ycefb251c0681fd96@mail.gmail.com> <497DA24E.30306@ar.media.kyoto-u.ac.jp> <50ed08f40901260856u638294d6nca9271c92a104d4a@mail.gmail.com> Message-ID: <200901261818.58611.faltet@pytables.org> A Monday 26 January 2009, Vicent escrigué: > On Mon, Jan 26, 2009 at 12:45, David Cournapeau < > > david at ar.media.kyoto-u.ac.jp> wrote: > > Vicent wrote: > > > (2) Just to be sure: An array can be assigned to a property of an > > > object, can't it? > > > > A numpy array is a 'full' python object, thus can be used in the > > same cases as a python object. > > Sorry I meant "working with classes" vs "working with structures or > records". > > I know that everything in Python is an object, but I was thinking of > building my own structures for storing information by using > "classes", in a OOP context. > > There I realize that maybe I have to forget defining "classes" and > just use NumPy objects, for those heavy/intensive search and/or > computing tasks in my code. Or just implement a bridge between your "classes" and NumPy objects. There are many possibilities, but IMO you should first try the easiest-to-work-with approaches you can figure out, and then add complexity (or NumPy objects) in case you need them. It is worth noting that, although in many cases working with NumPy objects eases the life of the programmer, this may not be the case for everyone. As always, your mileage may vary. > [ Again, asking myself...: Do I miss something? I mean, actually, a > NumPy array has properties/attributes and methods... So, maybe using > objects from NumPy doesn't mean forget object oriented programming. I > think I was a bit confused about it... ] Yeah. Many programs that use NumPy intensively are perfect examples of OOP. NumPy and OOP are not mutually exclusive in any way. Au! -- Francesc Alted From vginer at gmail.com Mon Jan 26 12:26:19 2009 From: vginer at gmail.com (Vicent) Date: Mon, 26 Jan 2009 18:26:19 +0100 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: References: Message-ID: <50ed08f40901260926h65638374q776ae7247f7dd838@mail.gmail.com> On Mon, Jan 26, 2009 at 17:27, wrote: > Vincent, others, I added some brief text and examples to the Getting > Started page in the "What are they useful for?" section that I think > address your basic concerns. Can you look them over? Wiki experts: > if someone can fix the indentation, please do! > > --jh-- I think it's useful for people who are starting, like me. By the way, the link to Topical Software in that section is wrong. I think it should be "http://scipy.org/Topical_Software". Thanks again. -- vicent -------------- next part -------------- An HTML attachment was scrubbed...
URL: From markperrymiller at gmail.com Mon Jan 26 12:37:23 2009 From: markperrymiller at gmail.com (Mark Miller) Date: Mon, 26 Jan 2009 09:37:23 -0800 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <50ed08f40901260926h65638374q776ae7247f7dd838@mail.gmail.com> References: <50ed08f40901260926h65638374q776ae7247f7dd838@mail.gmail.com> Message-ID: Just a note to Vincent and anyone that might work on documentation: >From my perspective, the single best document that ever gave me a feel for numpy and its capabilities is this: http://www.scipy.org/Numpy_Example_List_With_Doc When I was new to things, being able to take 10 minutes to scroll through a comprehensive list of features/functions really helped a lot. The current reference guide (http://docs.scipy.org/doc/numpy/reference/) is good, but when you don't necessarily know what you're looking for, being able to see *everything* really helped me a lot. Just mentioning it, -Mark On Mon, Jan 26, 2009 at 9:26 AM, Vicent wrote: > > > > > On Mon, Jan 26, 2009 at 17:27, wrote: > >> Vincent, others, I added some brief text and examples to the Getting >> Started page in the "What are they useful for?" section that I think >> address your basic concerns. Can you look them over? Wiki experts: >> if someone can fix the indentation, please do! >> >> --jh-- > > > I think it's useful for people who are starting, like me. > > By the way, the link to Topical Software in that section is wrong. I think > it should be "http://scipy.org/Topical_Software". > > Thanks again. > > -- > vicent > > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vginer at gmail.com Mon Jan 26 13:04:24 2009 From: vginer at gmail.com (Vicent) Date: Mon, 26 Jan 2009 19:04:24 +0100 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: References: <50ed08f40901260926h65638374q776ae7247f7dd838@mail.gmail.com> Message-ID: <50ed08f40901261004v68cd5d43nd67d122fda3bcdbb@mail.gmail.com> On Mon, Jan 26, 2009 at 18:37, Mark Miller wrote: > Just a note to Vincent and anyone that might work on documentation: > > >From my perspective, the single best document that ever gave me a feel for > numpy and its capabilities is this: > > http://www.scipy.org/Numpy_Example_List_With_Doc > > Oh, it's great. I was watching the version without "doc strings" (just because of the same reason you gave), but I think this is still better. Thank you!! -- Vicent -------------- next part -------------- An HTML attachment was scrubbed... URL: From Dharhas.Pothina at twdb.state.tx.us Mon Jan 26 15:06:51 2009 From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina) Date: Mon, 26 Jan 2009 14:06:51 -0600 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <20090126151203.GG1894@phare.normalesup.org> References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> <20090126151203.GG1894@phare.normalesup.org> Message-ID: <497DC37B.63BA.009B.0@twdb.state.tx.us> Thank you all. These comments have been extremely helpful. My initial application is fairly small and uses small datasets, but I'm also looking at this as a learning opportunity for larger applications I hope to write in the future. I think I will try coding this with Chaco. 
If I find the learning curve too daunting or if it doesn't meet my needs, I'll explore some of the other options that have been suggested. - dharhas From timmichelsen at gmx-topmail.de Mon Jan 26 16:00:06 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Mon, 26 Jan 2009 22:00:06 +0100 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <20090125100556.GA29918@phare.normalesup.org> References: <20090125100556.GA29918@phare.normalesup.org> Message-ID: Hello, this post got me interested again in building a GUI for my app. There have been various posts, but this one really brings in some great ideas. > http://code.enthought.com/projects/traits/documentation.php, and > http://code.enthought.com/projects/traits/docs/html/tutorials/traits_ui_scientific_app.html > for documentation and a tutorial) Gael, may I ask you to let sphinx create a PDF version of the Traits docs? I would like to use my offline travelling time to look more into this. It would be nice to have this as a reference with me. Thanks in advance, Timmie From timmichelsen at gmx-topmail.de Mon Jan 26 16:07:44 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Mon, 26 Jan 2009 22:07:44 +0100 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <497D74AF.63BA.009B.0@twdb.state.tx.us> References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> Message-ID: Hello! > I'm trying to make a GUI application to QA/QC field data. I need to > pull data from a text file or database. Explore it and choose points (ie > bad data etc) to delete etc. I have virtually no experience in GUI > programming except for some stuff with visual C++ over 10 years ago that > I vaguely remember. May I ask what kind of data you are working with? I am also using scipy & timeseries for mostly measurement data evaluation. I am working with environmental/climate data.
> > Thanks in advance, > Timmie > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From Dharhas.Pothina at twdb.state.tx.us Mon Jan 26 16:30:35 2009 From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina) Date: Mon, 26 Jan 2009 15:30:35 -0600 Subject: [SciPy-user] SciPy and GUI In-Reply-To: References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us><497D74AF.63BA.009B.0@twdb.state.tx.us> Message-ID: <497DD71B.63BA.009B.0@twdb.state.tx.us> Hi Tim, I'm working with environmental/climate measurement data too. I also work with hydrodynamic models and comparing results from these to measurement data. The measurement data is mainly water quality parameters (Salinity, Temperature, Depth, D.O etc). Our group collects data in various bays and estuaries in Texas and the last couple of years I've been spearheading an effort to make the way data from the field instruments (collected by us and by other agencies for us) is QA/QC'd and archived in a more systematic and reproducible manner. Original procedures involved lots of manual editing and file mangling with excel with no record of what had been done & why. - dharhas >>> Tim Michelsen 1/26/2009 3:07 PM >>> Hello! > I'm trying to make a GUI application to QA/QC field data. I need to > pull data from a text file or database. Explore it and choose points (ie > bad data etc) to delete etc. I have virtually no experience in GUI > programming except for some stuff with visual C++ over 10 years ago that > I vaguely remember. May I ask what kind of data you are working with? I am also using scipy & timeseries for mostly measurement data evaluation. I am working with environmental/climate data. 
Kind regards, Timmie _______________________________________________ SciPy-user mailing list SciPy-user at scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user From simpson at math.toronto.edu Mon Jan 26 16:44:58 2009 From: simpson at math.toronto.edu (Gideon Simpson) Date: Mon, 26 Jan 2009 16:44:58 -0500 Subject: [SciPy-user] test fails in 0.7rc2 Message-ID: Not sure how serious this is, but: ====================================================================== FAIL: test_x_stride (test_fblas.TestCgemv) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/nonsystem/simpson/lib/python2.5/site-packages/ scipy/lib/blas/tests/test_fblas.py", line 345, in test_x_stride assert_array_almost_equal(desired_y,y) File "/usr/local/nonsystem/simpson/lib/python2.5/site-packages/ numpy/testing/utils.py", line 311, in assert_array_almost_equal header='Arrays are not almost equal') File "/usr/local/nonsystem/simpson/lib/python2.5/site-packages/ numpy/testing/utils.py", line 296, in assert_array_compare raise AssertionError(msg) AssertionError: Arrays are not almost equal (mismatch 33.3333333333%) x: array([ 7.42531872 -7.42531872j, 4.58355808 -2.58355808j, -12.38274670+16.38274765j], dtype=complex64) y: array([ 7.42531872 -7.42531872j, 4.58355808 -2.58355808j, -12.38274670+16.38274574j], dtype=complex64) ---------------------------------------------------------------------- -gideon From robert.kern at gmail.com Mon Jan 26 16:53:01 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 26 Jan 2009 15:53:01 -0600 Subject: [SciPy-user] test fails in 0.7rc2 In-Reply-To: References: Message-ID: <3d375d730901261353p6d1ce471k4e286567428793a5@mail.gmail.com> On Mon, Jan 26, 2009 at 15:44, Gideon Simpson wrote: > Not sure how serious this is, but: > > ====================================================================== > FAIL: test_x_stride (test_fblas.TestCgemv) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/usr/local/nonsystem/simpson/lib/python2.5/site-packages/ > scipy/lib/blas/tests/test_fblas.py", line 345, in test_x_stride > assert_array_almost_equal(desired_y,y) > File "/usr/local/nonsystem/simpson/lib/python2.5/site-packages/ > numpy/testing/utils.py", line 311, in assert_array_almost_equal > header='Arrays are not almost equal') > File "/usr/local/nonsystem/simpson/lib/python2.5/site-packages/ > numpy/testing/utils.py", line 296, in assert_array_compare > raise AssertionError(msg) > AssertionError: > Arrays are not almost equal > > (mismatch 33.3333333333%) > x: array([ 7.42531872 -7.42531872j, 4.58355808 -2.58355808j, > -12.38274670+16.38274765j], dtype=complex64) > y: array([ 7.42531872 -7.42531872j, 4.58355808 -2.58355808j, > -12.38274670+16.38274574j], dtype=complex64) Doesn't look particularly serious. Possibly your ATLAS is using aggressive speed optimization at the cost of a couple of decimal points of precision. What platform are you on? What BLAS are you using? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From simpson at math.toronto.edu Mon Jan 26 16:57:11 2009 From: simpson at math.toronto.edu (Gideon Simpson) Date: Mon, 26 Jan 2009 16:57:11 -0500 Subject: [SciPy-user] test fails in 0.7rc2 In-Reply-To: <3d375d730901261353p6d1ce471k4e286567428793a5@mail.gmail.com> References: <3d375d730901261353p6d1ce471k4e286567428793a5@mail.gmail.com> Message-ID: <435F433A-9316-4471-882F-B3D878A61592@math.toronto.edu> I'm running ATLAS 3.8.2 with lapack 3.1.1 built on gcc 4.3.2. ATLAS and lapack are built with the compiler flags: -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -fPIC -m64 -gideon On Jan 26, 2009, at 4:53 PM, Robert Kern wrote: > On Mon, Jan 26, 2009 at 15:44, Gideon Simpson > wrote: >> Not sure how serious this is, but: >> >> = >> ===================================================================== >> FAIL: test_x_stride (test_fblas.TestCgemv) >> ---------------------------------------------------------------------- >> Traceback (most recent call last): >> File "/usr/local/nonsystem/simpson/lib/python2.5/site-packages/ >> scipy/lib/blas/tests/test_fblas.py", line 345, in test_x_stride >> assert_array_almost_equal(desired_y,y) >> File "/usr/local/nonsystem/simpson/lib/python2.5/site-packages/ >> numpy/testing/utils.py", line 311, in assert_array_almost_equal >> header='Arrays are not almost equal') >> File "/usr/local/nonsystem/simpson/lib/python2.5/site-packages/ >> numpy/testing/utils.py", line 296, in assert_array_compare >> raise AssertionError(msg) >> AssertionError: >> Arrays are not almost equal >> >> (mismatch 33.3333333333%) >> x: array([ 7.42531872 -7.42531872j, 4.58355808 -2.58355808j, >> -12.38274670+16.38274765j], dtype=complex64) >> y: array([ 7.42531872 -7.42531872j, 4.58355808 -2.58355808j, >> -12.38274670+16.38274574j], dtype=complex64) > > Doesn't look particularly serious. Possibly your ATLAS is using > aggressive speed optimization at the cost of a couple of decimal > points of precision. What platform are you on? What BLAS are you > using? > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user From timmichelsen at gmx-topmail.de Mon Jan 26 18:18:59 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Tue, 27 Jan 2009 00:18:59 +0100 Subject: [SciPy-user] scipy & climate data [Re: SciPy and GUI] In-Reply-To: <497DD71B.63BA.009B.0@twdb.state.tx.us> References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us><497D74AF.63BA.009B.0@twdb.state.tx.us> <497DD71B.63BA.009B.0@twdb.state.tx.us> Message-ID: > I'm working with environmental/climate measurement data too. I also > work with hydrodynamic models and comparing results from these to > measurement data. > The measurement data is mainly water quality parameters (Salinity, > Temperature, Depth, D.O etc). Our group collects data in various bays > and estuaries in Texas and the last couple of years I've been > spearheading an effort to make the way data from the field > instruments (collected by us and by other agencies for us) is QA/QC'd > and archived in a more systematic and reproducible manner. Original > procedures involved lots of manual editing and file mangling with > excel with no record of what had been done & why. 
Pierre GM sent me a link to his work off list. Your work seems to be in the same area of interest. Link up with him. Please see here: https://code.launchpad.net/~pierregm/scipy/climpy http://bazaar.launchpad.net/~pierregm/scipy/climpy/annotate/head%3A/scikits/climpy/doc/source/examples/examples.rst Regards, Timmie
From ebicici at ku.edu.tr Mon Jan 26 18:25:10 2009 From: ebicici at ku.edu.tr (Ergun Bicici) Date: Tue, 27 Jan 2009 01:25:10 +0200 Subject: [SciPy-user] scipy.sparse.linalg.cg not thread safe? Message-ID: <4ded78d60901261525r21bf518dj19c68e2fcad54245@mail.gmail.com> Dear SciPy Users, Conjugate Gradient iteration of scipy-0.7.0rc1 is giving me problems when called inside a threading.Thread. The info code (numiter) returned is -6. From CG_REVCOM:

!  INFO    (output) integer
!
!          = 0:  Successful exit. Iterated approximate solution returned.
!          > 0:  Convergence to tolerance not achieved. This will be
!                set to the number of iterations performed.
!          < 0:  Illegal input parameter.
!
!          -1: matrix dimension N < 0
!          -3: Maximum number of iterations ITER <= 0.
!          -5: Erroneous NDX1/NDX2 in INIT call.
!          -6: Erroneous RLBL.

When I perform a sequential CG, it works fine. Regards, Ergun Ergun Bicici Koc University
-------------- next part -------------- An HTML attachment was scrubbed... URL:
From robert.kern at gmail.com Mon Jan 26 18:29:15 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 26 Jan 2009 17:29:15 -0600 Subject: [SciPy-user] scipy.sparse.linalg.cg not thread safe? In-Reply-To: <4ded78d60901261525r21bf518dj19c68e2fcad54245@mail.gmail.com> References: <4ded78d60901261525r21bf518dj19c68e2fcad54245@mail.gmail.com> Message-ID: <3d375d730901261529j53d74e24iac7c991a2bf422b9@mail.gmail.com> On Mon, Jan 26, 2009 at 17:25, Ergun Bicici wrote: > > Dear SciPy Users, > > Conjugate Gradient iteration of scipy-0.7.0rc1 is giving me problems when > called inside a threading.Thread. The info code (numiter) returned is -6. It's probably not threadsafe. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
From cycomanic at gmail.com Mon Jan 26 18:45:25 2009 From: cycomanic at gmail.com (Jochen) Date: Tue, 27 Jan 2009 12:45:25 +1300 Subject: [SciPy-user] FFTW python bindings again Message-ID: <1233013525.4180.17.camel@phy.auckland.ac.nz> Hi all, sorry about starting a new thread: stupid gmail does not show my posts to the list. Anyways I noticed a big mistake in how I was allocating the aligned memory, thus it was actually not guaranteed to be 16-byte aligned. I guess it didn't show up because I was mainly testing using complex numbers and malloc just took the next free block, which happened to be aligned because I had just allocated a large chunk of aligned data. Anyways I have created a new version where the issue is fixed. It can again be found at http://pyfftw.berlios.de. Cheers Jochen P.S.: I haven't received any comments on this, is this not of interest to the scipy community?
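The 16-byte-alignment problem Jochen describes can also be handled without PyBuffer_FromReadWriteMemory, by over-allocating a plain byte buffer and slicing at the next alignment boundary. The helper below is a sketch of that common recipe (aligned_empty is a hypothetical name; this is not pyfftw's actual code):

import numpy as np

def aligned_empty(shape, dtype=np.float64, align=16):
    # Return an uninitialized array whose data pointer is align-byte
    # aligned, by over-allocating and slicing at the right offset.
    dtype = np.dtype(dtype)
    nbytes = int(np.prod(shape)) * dtype.itemsize
    buf = np.empty(nbytes + align, dtype=np.uint8)  # slack for alignment
    start = -buf.ctypes.data % align                # distance to next boundary
    return buf[start:start + nbytes].view(dtype).reshape(shape)

a = aligned_empty((1024,), np.complex128)
assert a.ctypes.data % 16 == 0

Because the slice is just a view, nothing is copied and the returned array keeps the over-allocated buffer alive for as long as it exists.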
From robert.kern at gmail.com Mon Jan 26 18:49:44 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 26 Jan 2009 17:49:44 -0600 Subject: [SciPy-user] FFTW python bindings again In-Reply-To: <1233013525.4180.17.camel@phy.auckland.ac.nz> References: <1233013525.4180.17.camel@phy.auckland.ac.nz> Message-ID: <3d375d730901261549y6d10df20i3e4764fd7de10906@mail.gmail.com> On Mon, Jan 26, 2009 at 17:45, Jochen wrote: > Hi all, > sorry about starting a new thread: stupid gmail does not show my posts to the > list. Just reply to your message in Sent Mail. > Anyways I noticed a big mistake in how I was allocating the > aligned memory, thus it was actually not guaranteed to be 16-byte aligned. > I guess it didn't show up because I was mainly testing using complex > numbers and malloc just took the next free block, which happened to be > aligned because I had just allocated a large chunk of aligned data. > Anyways I have created a new version where the issue is fixed. It can > again be found at http://pyfftw.berlios.de. > > Cheers > Jochen > > P.S.: I haven't received any comments on this, is this not of interest > to the scipy community? Many people simply don't comment. Please do keep us informed, though! -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
From gary.pajer at gmail.com Mon Jan 26 21:08:11 2009 From: gary.pajer at gmail.com (Gary Pajer) Date: Mon, 26 Jan 2009 21:08:11 -0500 Subject: [SciPy-user] FFTW python bindings again In-Reply-To: <1233013525.4180.17.camel@phy.auckland.ac.nz> References: <1233013525.4180.17.camel@phy.auckland.ac.nz> Message-ID: <88fe22a0901261808x51059c66i5b69604ba473c7d8@mail.gmail.com> On Mon, Jan 26, 2009 at 6:45 PM, Jochen wrote: > Hi all, > [...] > > > Cheers > Jochen > > P.S.: I haven't received any comments on this, is this not of interest > to the scipy community? I'm interested. In fact I downloaded it. I've only taken a very quick look, but perhaps you can answer a question: is this OS agnostic, or is it Linux only? -gary
-------------- next part -------------- An HTML attachment was scrubbed... URL:
From cycomanic at gmail.com Mon Jan 26 21:30:48 2009 From: cycomanic at gmail.com (Jochen) Date: Tue, 27 Jan 2009 15:30:48 +1300 Subject: [SciPy-user] FFTW python bindings again In-Reply-To: <88fe22a0901261808x51059c66i5b69604ba473c7d8@mail.gmail.com> References: <1233013525.4180.17.camel@phy.auckland.ac.nz> <88fe22a0901261808x51059c66i5b69604ba473c7d8@mail.gmail.com> Message-ID: <1233023448.20296.21.camel@phy.auckland.ac.nz> On Mon, 2009-01-26 at 21:08 -0500, Gary Pajer wrote: > On Mon, Jan 26, 2009 at 6:45 PM, Jochen wrote: > Hi all, > > [...] > > > > Cheers > Jochen > > P.S.: I haven't received any comments on this, is this not of > interest > to the scipy community? > > I'm interested. In fact I downloaded it. I've only taken a very > quick look, but perhaps you can answer a question: is this OS > agnostic, or is it Linux only? > > -gary This should be OS agnostic; however, I have not tested on any other systems (don't have a windows/OSX machine to easily test on). The only thing that could fail is loading the fftw shared library. I do a: lib = ctypes.cdll.LoadLibrary(util.find_library('fftw3')) For this to be successful ctypes needs to find the fftw3 library. The way I understand the ctypes documentation this should also work in Windows or OSX and other unices.
I also assumed that fftw3 uses c-type calling conventions on all platforms. I'm actually checking if ctypes can find the fftw3 libraries in setup.py if not the install will fail. I would actually be grateful if you could check. If you don't want to install anything you can just do a python setup.py build, the setup will raise an exception if ctypes cannot find fftw3. Providing the possibility for specifying the path to fftw3 is actually on my TODO. Thanks for the interest Cheers Jochen > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user From david at ar.media.kyoto-u.ac.jp Mon Jan 26 22:43:10 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 27 Jan 2009 12:43:10 +0900 Subject: [SciPy-user] FFTW python bindings again In-Reply-To: <1233013525.4180.17.camel@phy.auckland.ac.nz> References: <1233013525.4180.17.camel@phy.auckland.ac.nz> Message-ID: <497E82CE.3030405@ar.media.kyoto-u.ac.jp> Jochen wrote: > Hi all, > about starting a new thread, stupid gmail does not show my posts to the > list. Anyways I noticed a big mistake in how I was allocating the > aligned memory thus it was actually not guaranteed to be 16byte aligned. > I guess it didn't show up because I was mainly testing using complex > numbers and malloc just took the next free block, which happened to be > aligned because I had just allocated a large chunk of aligned data. > I believe fftw automatically detects whether your array is aligned or not - problem appear when you create your plan with aligned pointers, but use other pointers later. That's one reason why fftw backend was not that fast in scipy BTW, because the simplest way to ensure this was to copy data into aligned buffers. At least on linux, allocating big buffers with malloc is almost guaranteed not to be aligned, because of its use of mmap above a certain threshold. We discovered this fact a while ago: http://projects.scipy.org/pipermail/scipy-dev/2007-August/007591.html Those are some of the reasons why we decided to drop fftw support: to use it efficiently is not that easy, because we would first need guarantees about aligned allocator (once you take into accout that numpy also uses realloc, just using posix_memalign is not enough). On the other hand, I think it would be very nice to have fftw wrappers outside scipy. For some technical aspects, I answered to you in your other post, David From cycomanic at gmail.com Tue Jan 27 00:00:23 2009 From: cycomanic at gmail.com (Jochen) Date: Tue, 27 Jan 2009 18:00:23 +1300 Subject: [SciPy-user] FFTW python bindings again In-Reply-To: <497E82CE.3030405@ar.media.kyoto-u.ac.jp> References: <1233013525.4180.17.camel@phy.auckland.ac.nz> <497E82CE.3030405@ar.media.kyoto-u.ac.jp> Message-ID: <1233032423.20296.78.camel@phy.auckland.ac.nz> On Tue, 2009-01-27 at 12:43 +0900, David Cournapeau wrote: > Jochen wrote: > > Hi all, > > about starting a new thread, stupid gmail does not show my posts to the > > list. Anyways I noticed a big mistake in how I was allocating the > > aligned memory thus it was actually not guaranteed to be 16byte aligned. > > I guess it didn't show up because I was mainly testing using complex > > numbers and malloc just took the next free block, which happened to be > > aligned because I had just allocated a large chunk of aligned data. 
> > > > I believe fftw automatically detects whether your array is aligned or > not - problem appear when you create your plan with aligned pointers, > but use other pointers later. That's one reason why fftw backend was not > that fast in scipy BTW, because the simplest way to ensure this was to > copy data into aligned buffers. Yes I understand that. I was using a somewhat hackish way of creating the memory aligned array, i.e. I was casting the pointer returned from ctypes to a bytes array and then passed that to ndarray.__new__ as a buffer. I didn't realise was that in the process I was allocating new memory, which when I tested manually was still aligned because I had just allocated aligned array (I was only using small arrays). I now use PyBuffer_FromReadWriteMemory to create a buffer object to pass to ndarray.__new__ in order to create the aligned memory. > At least on linux, allocating big buffers with malloc is almost > guaranteed not to be aligned, because of its use of mmap above a certain > threshold. We discovered this fact a while ago: > > http://projects.scipy.org/pipermail/scipy-dev/2007-August/007591.html > I think I stumbled accross that thread when I was looking for fftw bindings. > Those are some of the reasons why we decided to drop fftw support: to > use it efficiently is not that easy, because we would first need > guarantees about aligned allocator (once you take into accout that numpy > also uses realloc, just using posix_memalign is not enough). > > On the other hand, I think it would be very nice to have fftw wrappers > outside scipy. For some technical aspects, I answered to you in your > other post, > > David Thanks for the comments Cheers Jochen > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user From bryan.cole at teraview.com Tue Jan 27 07:19:36 2009 From: bryan.cole at teraview.com (Bryan Cole) Date: Tue, 27 Jan 2009 12:19:36 +0000 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <497D74AF.63BA.009B.0@twdb.state.tx.us> References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> Message-ID: <1233058138.2461.87.camel@bryan.teraview.local> > > On one hand, I am already using matplotlib and the timeseries toolkit > extensively in scripts so I'm familiar with them and know that they can > make pretty much any type of plot I need. Also matplotlib has a large > community. > > On the other hand, chaco seems to have been designed for this type of > interactive application and the plots I need for the GUI app are simpler > and are supported by Chaco. A few weeks back I added a recipe to the scipy wiki for embedding a matplotlib figure in a Traits app. You can find it at http://www.scipy.org/EmbeddingInTraitsGUI BC From wnbell at gmail.com Tue Jan 27 14:11:49 2009 From: wnbell at gmail.com (Nathan Bell) Date: Tue, 27 Jan 2009 14:11:49 -0500 Subject: [SciPy-user] scipy.sparse.linalg.cg not thread safe? In-Reply-To: <3d375d730901261529j53d74e24iac7c991a2bf422b9@mail.gmail.com> References: <4ded78d60901261525r21bf518dj19c68e2fcad54245@mail.gmail.com> <3d375d730901261529j53d74e24iac7c991a2bf422b9@mail.gmail.com> Message-ID: On Mon, Jan 26, 2009 at 6:29 PM, Robert Kern wrote: > > It's probably not threadsafe. 
> I don't know Fortran, so I can't say: http://projects.scipy.org/scipy/scipy/browser/trunk/scipy/sparse/linalg/isolve/iterative/CGREVCOM.f.src Anyway, here's a pure-SciPy CG implementation (also BSD-licensed): http://code.google.com/p/pyamg/source/browse/trunk/pyamg/krylov/cg.py It should be a drop-in replacement for sparse.linalg.cg() and have comparable speed. The only dependency is on the norm() function in PyAMG, but you can swipe that easily too. In time we should replace all of the Fortran implementations of the iterative methods with pure-Python code. This would be a nice target for SciPy 0.8. -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/
From sturla at molden.no Tue Jan 27 14:26:02 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 27 Jan 2009 20:26:02 +0100 (CET) Subject: [SciPy-user] scipy.sparse.linalg.cg not thread safe? In-Reply-To: References: <4ded78d60901261525r21bf518dj19c68e2fcad54245@mail.gmail.com> <3d375d730901261529j53d74e24iac7c991a2bf422b9@mail.gmail.com> Message-ID: > On Mon, Jan 26, 2009 at 6:29 PM, Robert Kern > wrote: >> It's probably not threadsafe. >> > > I don't know Fortran, so I can't say: > http://projects.scipy.org/scipy/scipy/browser/trunk/scipy/sparse/linalg/isolve/iterative/CGREVCOM.f.src Well I do, and that is not thread safe. The offending line is 110. It makes this routine work like a finite state machine. All local variables are declared static. S.M.
From dominique.orban at gmail.com Tue Jan 27 14:34:45 2009 From: dominique.orban at gmail.com (Dominique Orban) Date: Tue, 27 Jan 2009 14:34:45 -0500 Subject: [SciPy-user] scipy.sparse.linalg.cg not thread safe? In-Reply-To: References: <4ded78d60901261525r21bf518dj19c68e2fcad54245@mail.gmail.com> <3d375d730901261529j53d74e24iac7c991a2bf422b9@mail.gmail.com> Message-ID: <8793ae6e0901271134i21cbd5beo65be4daab2c177a5@mail.gmail.com> On Tue, Jan 27, 2009 at 2:11 PM, Nathan Bell wrote: > On Mon, Jan 26, 2009 at 6:29 PM, Robert Kern wrote: >> >> It's probably not threadsafe. >> > > I don't know Fortran, so I can't say: > http://projects.scipy.org/scipy/scipy/browser/trunk/scipy/sparse/linalg/isolve/iterative/CGREVCOM.f.src > > > Anyway, here's a pure-SciPy CG implementation (also BSD-licensed): > http://code.google.com/p/pyamg/source/browse/trunk/pyamg/krylov/cg.py > > It should be a drop-in replacement for sparse.linalg.cg() and have > comparable speed. The only dependency is on the norm() function in > PyAMG, but you can swipe that easily too. > > In time we should replace all of the Fortran implementations of the > iterative methods with pure-Python code. This would be a nice target > for SciPy 0.8. I've been interested in that and put together a basic initial package at http://github.com/dpo/pykrylov/tree/master. The only prerequisite should be Numpy. For now I've been concentrating on Krylov methods that do not require products with the transpose, and on real linear systems. -- Dominique
From sturla at molden.no Tue Jan 27 15:02:16 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 27 Jan 2009 21:02:16 +0100 (CET) Subject: [SciPy-user] scipy.sparse.linalg.cg not thread safe? In-Reply-To: References: <4ded78d60901261525r21bf518dj19c68e2fcad54245@mail.gmail.com> <3d375d730901261529j53d74e24iac7c991a2bf422b9@mail.gmail.com> Message-ID: <5065c9b98108b4db76a25f10ccd96230.squirrel@webmail.uio.no> > All local variables are declared static. Fortran code written like this is not uncommon.
It is often used because Fortran 66 and 77 did not support dynamic memory management or derived data types. I'd also like to add that this may still be safe for "parallel processing" in a Fortran context. Fortran programmers rarely work with threads directly as C and Python programmers often do. Instead it is common to use multiple processes (forking or MPI), compiler directives (OpenMP), or autovectorizing compilers. The code cited could be "safe" for concurrency in either of these contexts. The issue of what is thread safe and what is not is actually produced from a bad concurrency abstraction used in C (posix threads or Win32 threads). Thread safety is a problem Fortran programmers usually don't have to care about. Parallel processing is not done with threads. Making this subroutine thread-safe is easy: Just put a lock in it. Or better yet: put the lock in the C wrapper that f2py produces. Then, if parallel processing is required, use Fortran the correct way: e.g. insert OpenMP directives into the Fortran code. Don't try to do parallel processing by calling this function from multiple threads concurrently. That is what's causing the havoc. This is Fortran, not C, so don't use it like C. Sturla Molden From robert.kern at gmail.com Tue Jan 27 16:10:15 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 27 Jan 2009 15:10:15 -0600 Subject: [SciPy-user] scipy.sparse.linalg.cg not thread safe? In-Reply-To: References: <4ded78d60901261525r21bf518dj19c68e2fcad54245@mail.gmail.com> <3d375d730901261529j53d74e24iac7c991a2bf422b9@mail.gmail.com> Message-ID: <3d375d730901271310y169ceea3vd6a181cd198bd41a@mail.gmail.com> On Tue, Jan 27, 2009 at 13:11, Nathan Bell wrote: > On Mon, Jan 26, 2009 at 6:29 PM, Robert Kern wrote: >> >> It's probably not threadsafe. > > I don't know Fortran, so I can't say: > http://projects.scipy.org/scipy/scipy/browser/trunk/scipy/sparse/linalg/isolve/iterative/CGREVCOM.f.src > > Anway, here's a pure-SciPy CG implementation (also BSD-licensed): > http://code.google.com/p/pyamg/source/browse/trunk/pyamg/krylov/cg.py > > It should be a drop-in replacement for sparse.linalg.cg() and have > comparable speed. The only dependency is to the norm() function in > PyAMG, but you can swipe that easily too. > > In time we should replace all of the Fortran implementations of the > iterative methods with pure-Python code. This would be a nice target > for SciPy 0.8. +1 -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From sturla at molden.no Tue Jan 27 16:31:25 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 27 Jan 2009 22:31:25 +0100 (CET) Subject: [SciPy-user] scipy.sparse.linalg.cg not thread safe? In-Reply-To: <3d375d730901271310y169ceea3vd6a181cd198bd41a@mail.gmail.com> References: <4ded78d60901261525r21bf518dj19c68e2fcad54245@mail.gmail.com> <3d375d730901261529j53d74e24iac7c991a2bf422b9@mail.gmail.com> <3d375d730901271310y169ceea3vd6a181cd198bd41a@mail.gmail.com> Message-ID: <81478a3b861d22d4d741c058469b0ad4.squirrel@webmail.uio.no> >> In time we should replace all of the Fortran implementations of the >> iterative methods with pure-Python code. This would be a nice target >> for SciPy 0.8. > > +1 > > -- > Robert Kern How would this be performance wise? Some iterative methods are fast in Python, others are not. Why not protect Fortran code unsafe for threads with a global lock in the Python module? 
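Sturla's global-lock suggestion amounts to only a few lines in practice. A sketch, using a hypothetical wrapper name around the cg that scipy actually ships (this is not code that ever went into scipy itself):

import threading
from scipy.sparse.linalg import cg as _fortran_cg

_cg_lock = threading.Lock()  # serializes access to the Fortran routine's static state

def cg_locked(A, b, **kwargs):
    # Hypothetical thread-safe wrapper: only one thread at a time may run
    # the non-reentrant reverse-communication Fortran code.
    _cg_lock.acquire()
    try:
        return _fortran_cg(A, b, **kwargs)
    finally:
        _cg_lock.release()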
It would not be any worse than the GIL, which would affect pure Python code. S. M. From ebicici at ku.edu.tr Tue Jan 27 16:36:16 2009 From: ebicici at ku.edu.tr (Ergun Bicici) Date: Tue, 27 Jan 2009 23:36:16 +0200 Subject: [SciPy-user] scipy.sparse.linalg.cg not thread safe? In-Reply-To: <3d375d730901271310y169ceea3vd6a181cd198bd41a@mail.gmail.com> References: <4ded78d60901261525r21bf518dj19c68e2fcad54245@mail.gmail.com> <3d375d730901261529j53d74e24iac7c991a2bf422b9@mail.gmail.com> <3d375d730901271310y169ceea3vd6a181cd198bd41a@mail.gmail.com> Message-ID: <4ded78d60901271336i7ffde1d0q89d104142db3f042@mail.gmail.com> Sounds good. +1 :) Ergun Bicici Koc University On Tue, Jan 27, 2009 at 11:10 PM, Robert Kern wrote: > On Tue, Jan 27, 2009 at 13:11, Nathan Bell wrote: > > On Mon, Jan 26, 2009 at 6:29 PM, Robert Kern > wrote: > >> > >> It's probably not threadsafe. > > > > I don't know Fortran, so I can't say: > > > http://projects.scipy.org/scipy/scipy/browser/trunk/scipy/sparse/linalg/isolve/iterative/CGREVCOM.f.src > > > > Anway, here's a pure-SciPy CG implementation (also BSD-licensed): > > http://code.google.com/p/pyamg/source/browse/trunk/pyamg/krylov/cg.py > > > > It should be a drop-in replacement for sparse.linalg.cg() and have > > comparable speed. The only dependency is to the norm() function in > > PyAMG, but you can swipe that easily too. > > > > In time we should replace all of the Fortran implementations of the > > iterative methods with pure-Python code. This would be a nice target > > for SciPy 0.8. > > +1 > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wnbell at gmail.com Tue Jan 27 23:11:45 2009 From: wnbell at gmail.com (Nathan Bell) Date: Tue, 27 Jan 2009 23:11:45 -0500 Subject: [SciPy-user] scipy.sparse.linalg.cg not thread safe? In-Reply-To: <81478a3b861d22d4d741c058469b0ad4.squirrel@webmail.uio.no> References: <4ded78d60901261525r21bf518dj19c68e2fcad54245@mail.gmail.com> <3d375d730901261529j53d74e24iac7c991a2bf422b9@mail.gmail.com> <3d375d730901271310y169ceea3vd6a181cd198bd41a@mail.gmail.com> <81478a3b861d22d4d741c058469b0ad4.squirrel@webmail.uio.no> Message-ID: On Tue, Jan 27, 2009 at 4:31 PM, Sturla Molden wrote: > > How would this be performance wise? Some iterative methods are fast in > Python, others are not. Why not protect Fortran code unsafe for threads > with a global lock in the Python module? It would not be any worse than > the GIL, which would affect pure Python code. > I don't have an opinion on the locking issue, but the dominant cost in most iterative methods for linear systems is the cost of the sparse matrix-vector products (for y = A*x for sparse A). A smaller amount of time is spent in level 1 BLAS operations like axpy() and norm(). All of these map efficiently to existing Python + SciPy functionality, so there's little overhead. IMO the advantage of the pure Python approach is also evident. 
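To make that advantage concrete, here is a minimal pure-NumPy conjugate gradient written against a matvec callable, the same interface the reverse-communication code needs. It is an illustrative sketch of the textbook algorithm, not the pyamg implementation Nathan links to:

import numpy as np

def cg_minimal(matvec, b, x0=None, tol=1e-8, maxiter=None):
    # Solve A*x = b for symmetric positive-definite A, given only a
    # callable matvec(v) computing A*v (dense, sparse or matrix-free).
    b = np.asarray(b, dtype=float)
    n = len(b)
    x = np.zeros(n) if x0 is None else np.array(x0, dtype=float)
    r = b - matvec(x)                    # initial residual
    p = r.copy()                         # first search direction
    rs_old = np.dot(r, r)
    bnorm = np.linalg.norm(b) or 1.0     # avoid dividing by zero when b = 0
    if maxiter is None:
        maxiter = 10 * n
    for k in range(maxiter):
        Ap = matvec(p)
        alpha = rs_old / np.dot(p, Ap)   # exact line-search step
        x += alpha * p
        r -= alpha * Ap
        rs_new = np.dot(r, r)
        if np.sqrt(rs_new) <= tol * bnorm:
            return x, 0                  # info = 0: converged, as in sparse.linalg.cg
        p = r + (rs_new / rs_old) * p    # next A-conjugate direction
        rs_old = rs_new
    return x, maxiter                    # info > 0: no convergence in maxiter steps

Usage is x, info = cg_minimal(lambda v: A * v, b) for a sparse SPD matrix A, which can be checked directly against sparse.linalg.cg on small test problems.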
Compare the Python code that interfaces to the Fortran CG implementation to the pure Python + SciPy implementation of the *entire* algorithm: http://projects.scipy.org/scipy/scipy/browser/trunk/scipy/sparse/linalg/isolve/iterative.py#L196 http://code.google.com/p/pyamg/source/browse/trunk/pyamg/krylov/cg.py#71 I'm definitely in favor of parallelizing as much of SciPy as possible. Now that most compilers support OpenMP it would be fairly straightforward to parallelize the C++ code that implements sparse matrix-vector multiplication (among other things). In conjunction, we ought to add the necessary compiler flags to setuptools so that OpenMP-enabled sources are handled correctly. -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/
From sturla at molden.no Wed Jan 28 06:30:48 2009 From: sturla at molden.no (Sturla Molden) Date: Wed, 28 Jan 2009 12:30:48 +0100 Subject: [SciPy-user] scipy.sparse.linalg.cg not thread safe? In-Reply-To: References: <4ded78d60901261525r21bf518dj19c68e2fcad54245@mail.gmail.com> <3d375d730901261529j53d74e24iac7c991a2bf422b9@mail.gmail.com> <3d375d730901271310y169ceea3vd6a181cd198bd41a@mail.gmail.com> <81478a3b861d22d4d741c058469b0ad4.squirrel@webmail.uio.no> Message-ID: <498041E8.50006@molden.no> On 1/28/2009 5:11 AM, Nathan Bell wrote: > I don't have an opinion on the locking issue, but the dominant cost in > most iterative methods for linear systems is the cost of the sparse > matrix-vector products (for y = A*x for sparse A). A smaller amount > of time is spent in level 1 BLAS operations like axpy() and norm(). That is fine then, as long as the heavy lifting is not done in Python. > In conjunction, we > ought to add the necessary compiler flags to setuptools so that > OpenMP-enabled sources are handled correctly. With GCC 4.3 and 4.4 one must use the compile flag -fopenmp and link with -lgomp and -lpthread. On Windows this makes the extension dependent on pthreadGC2.dll. It is only 59 kB so it makes no sense to have this in a DLL. But I cannot find a static version of the library. With f2py and gfortran 4.4 on Windows (mingw binary) I do this: f2py.py --fcompiler=gnu95 --f90flags=-fopenmp --build-dir ./build \ -c foobar.pyf foobar.f95 -lgomp -lpthread -lmsvcr71 I am still not sure what this would look like with setuptools though. Sturla Molden
From scotta_2002 at yahoo.com Wed Jan 28 12:28:57 2009 From: scotta_2002 at yahoo.com (Scott Askey) Date: Wed, 28 Jan 2009 09:28:57 -0800 (PST) Subject: [SciPy-user] Passing DAE mass matrix to integrate.odeint and/or ode Message-ID: <506631.44954.qm@web36506.mail.mud.yahoo.com> I am trying to solve a semi-explicit DAE system (index 1). For a simple pendulum I have 5 equations: the 1st-order system and the explicit constraint.

0 = x1^2 + x2^2
x1' = x3
x2' = x4
x3' = -lam*x1
x4' = -lam*x2 - g

Can I do this with scipy or must I try pydstools?
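Scipy's vode wrapper exposes no mass-matrix argument, so the DAE cannot be handed over as-is. One standard workaround is index reduction: differentiate the constraint twice and solve for lam analytically, which leaves a plain ODE. The sketch below assumes the intended constraint is x1^2 + x2^2 = L^2 (the usual pendulum constraint; as written above it admits only the trivial solution), and note that vode's BDF mode accepts order <= 5, not the ode15s-style 15:

import numpy as np
from scipy import integrate

g, L = 9.81, 1.0

def rhs(t, y):
    # Cartesian pendulum with the multiplier eliminated analytically:
    # differentiating x1^2 + x2^2 = L^2 twice and substituting the
    # accelerations gives lam = (x3^2 + x4^2 - g*x2) / L^2.
    x1, x2, x3, x4 = y
    lam = (x3 ** 2 + x4 ** 2 - g * x2) / L ** 2
    return [x3, x4, -lam * x1, -lam * x2 - g]

r = integrate.ode(rhs).set_integrator('vode', method='bdf', order=5)
r.set_initial_value([L, 0.0, 0.0, 0.0], 0.0)  # consistent initial state
while r.successful() and r.t < 10.0:
    r.integrate(r.t + 0.01)

The constraint is then only enforced through its second derivative, so it drifts over long integrations; a true DAE solver (or a projection step back onto the circle) is needed if that matters.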
The problem is doable with matlab ode15s which is similar to scipy.integrate.ode(f).\ set_integrator('vode', method='bdf', order=15) http://www.scipy.org/NumPy_for_Matlab_Users V/R Scott
From fperez.net at gmail.com Wed Jan 28 13:48:31 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Wed, 28 Jan 2009 10:48:31 -0800 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <1233058138.2461.87.camel@bryan.teraview.local> References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> <1233058138.2461.87.camel@bryan.teraview.local> Message-ID: On Tue, Jan 27, 2009 at 4:19 AM, Bryan Cole wrote: > A few weeks back I added a recipe to the scipy wiki for embedding a > matplotlib figure in a Traits app. You can find it at > > http://www.scipy.org/EmbeddingInTraitsGUI Excellent! Just a suggestion: self-contained recipes like this should always be added as entries to the scipy cookbook rather than as self-contained pages. It will make it easier to find it later and keeps the top-level of the site organized: http://www.scipy.org/Cookbook Cheers, f ps - the cookbook page is timing out right now, after 4 tries. I don't know what's wrong with the server...
From rob.clewley at gmail.com Wed Jan 28 14:22:40 2009 From: rob.clewley at gmail.com (Rob Clewley) Date: Wed, 28 Jan 2009 14:22:40 -0500 Subject: [SciPy-user] Passing DAE mass matrix to integrate.odeint and/or ode In-Reply-To: <506631.44954.qm@web36506.mail.mud.yahoo.com> References: <506631.44954.qm@web36506.mail.mud.yahoo.com> Message-ID: Scott, On Wed, Jan 28, 2009 at 12:28 PM, Scott Askey wrote: > I am trying to solve a semi-explicit DAE system (index 1). > For a simple pendulum I have 5 equations: the 1st-order system and the explicit constraint. I don't believe it is possible in scipy alone, at least not with the API that is currently exposed from the underlying library integrators (even if in principle they can support it). Sorry! > Can I do this with scipy or must I try pydstools? > The problem is doable with matlab ode15s which is similar to scipy.integrate.ode(f).\ > set_integrator('vode', method='bdf', order=15) I am more than willing to assist you in getting your problem working with PyDSTool if you are interested in pursuing that. -Rob
From peter.skomoroch at gmail.com Wed Jan 28 15:39:28 2009 From: peter.skomoroch at gmail.com (Peter Skomoroch) Date: Wed, 28 Jan 2009 15:39:28 -0500 Subject: [SciPy-user] Computational Economics with SciPy Message-ID: Just stumbled across a new book by John Stachurski using scipy which will ship later this month: Economic Dynamics: Theory and Computation John Stachurski MIT Press, 2009 http://www.amazon.com/Economic-Dynamics-Computation-John-Stachurski/dp/0262012774 http://johnstachurski.net/book/book.html There are some nice tutorials using scipy here as well: http://johnstachurski.net/lectures/index.html *Economic Dynamics: Theory and Computation* is a graduate level introduction > to deterministic and stochastic dynamics, dynamic programming and > computational methods with economic applications.
> Topics
>
> - Programming techniques
> - Basic analysis (real analysis, metric spaces, fixed points)
> - Deterministic dynamic systems
> - Finite state Markov chains
> - Finite state dynamic programming
> - Continuous state stochastic dynamics
> - Continuous state dynamic programming
>
-Pete -- Peter N. Skomoroch peter.skomoroch at gmail.com http://www.datawrangling.com http://del.icio.us/pskomoroch
-------------- next part -------------- An HTML attachment was scrubbed...
URL: From bryan at cole.uklinux.net Wed Jan 28 17:27:10 2009 From: bryan at cole.uklinux.net (Bryan Cole) Date: Wed, 28 Jan 2009 22:27:10 +0000 Subject: [SciPy-user] SciPy and GUI In-Reply-To: References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> <1233058138.2461.87.camel@bryan.teraview.local> Message-ID: <1233181629.24579.8.camel@pc2.cole.uklinux.net> > > Excellent! Just a suggestion: self-contained recipes like this should > always be added as entries to the scipy cookbook rather than as > self-contained pages. It is! It's linked under the Matplotlib cookbook. Since other recipes concerning embedding mpl in GUIs are linked there, it seemed the right place for it to go. BC > It will make it easier to find it later and > keeps the top-level of the site organized: From dwf at cs.toronto.edu Wed Jan 28 18:06:01 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Wed, 28 Jan 2009 18:06:01 -0500 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <1233181629.24579.8.camel@pc2.cole.uklinux.net> References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> <1233058138.2461.87.camel@bryan.teraview.local> <1233181629.24579.8.camel@pc2.cole.uklinux.net> Message-ID: <0E6E0BC7-08BD-4567-8EA0-F5364B507F08@cs.toronto.edu> On 28-Jan-09, at 5:27 PM, Bryan Cole wrote: > It is! It's linked under the Matplotlib cookbook. Since other recipes > concerning embedding mpl in GUIs are linked there, it seemed the right > place for it to go. I think what Fernando meant was to create pages as /Cookbook/ MyCookbookRecipe, rather than /MyCookbookRecipe, to make things even easier to find in a flat list of wiki pages (and make it clear from just the URL that it's part of 'the cookbook'). Cheers, DWF From bryan at cole.uklinux.net Thu Jan 29 02:28:15 2009 From: bryan at cole.uklinux.net (Bryan Cole) Date: Thu, 29 Jan 2009 07:28:15 +0000 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <0E6E0BC7-08BD-4567-8EA0-F5364B507F08@cs.toronto.edu> References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> <1233058138.2461.87.camel@bryan.teraview.local> <1233181629.24579.8.camel@pc2.cole.uklinux.net> <0E6E0BC7-08BD-4567-8EA0-F5364B507F08@cs.toronto.edu> Message-ID: <1233214091.32050.8.camel@pc2.cole.uklinux.net> > I think what Fernando meant was to create pages as /Cookbook/ > MyCookbookRecipe, rather than /MyCookbookRecipe, to make things even > easier to find in a flat list of wiki pages (and make it clear from > just the URL that it's part of 'the cookbook'). Ah, I see what you mean now. I would rename the page as suggested, but I can't see how to do this ("rename page" is greyed out for me). Could someone with more wiki-permissions than I rename it? 
cheers BC > > Cheers, > > DWF
From gael.varoquaux at normalesup.org Thu Jan 29 03:48:38 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 29 Jan 2009 09:48:38 +0100 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <1233214091.32050.8.camel@pc2.cole.uklinux.net> References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> <1233058138.2461.87.camel@bryan.teraview.local> <1233181629.24579.8.camel@pc2.cole.uklinux.net> <0E6E0BC7-08BD-4567-8EA0-F5364B507F08@cs.toronto.edu> <1233214091.32050.8.camel@pc2.cole.uklinux.net> Message-ID: <20090129084838.GG5567@phare.normalesup.org> On Thu, Jan 29, 2009 at 07:28:15AM +0000, Bryan Cole wrote: > > I think what Fernando meant was to create pages as /Cookbook/ > > MyCookbookRecipe, rather than /MyCookbookRecipe, to make things even > > easier to find in a flat list of wiki pages (and make it clear from > > just the URL that it's part of 'the cookbook'). > Ah, I see what you mean now. > I would rename the page as suggested, but I can't see how to do this > ("rename page" is greyed out for me). Could someone with more > wiki-permissions than I rename it? Done. That poor wiki seems overloaded (for instance the cookbook page returns a server error quite often). I don't know anything about wiki technology, so I can't really help, but it is a pity. Gaël
From nicolas.chopin at bristol.ac.uk Thu Jan 29 04:48:01 2009 From: nicolas.chopin at bristol.ac.uk (Nicolas Chopin) Date: Thu, 29 Jan 2009 10:48:01 +0100 Subject: [SciPy-user] accuracy of stats.gamma.pdf Message-ID: <49817B51.3080200@bristol.ac.uk> Dear list, when I compute: stats.gamma.pdf(5.,2.,5.) I get: array([0.]) whereas the same command in R outputs: > dgamma(5.,2.,5.) [1] 1.735993e-09 Is this a bug, and if so, should I report it somewhere? Or is it just that scipy's implementation of the gamma pdf is a bit less accurate than R's? I need to compute log-pdf's, so I need relative accuracy, not absolute accuracy; but I can implement my own log-pdf routine, of course. Thank you in advance for your wise replies. Nicolas Chopin
-------------- next part -------------- An HTML attachment was scrubbed... URL:
From josef.pktd at gmail.com Thu Jan 29 07:38:43 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 29 Jan 2009 07:38:43 -0500 Subject: [SciPy-user] accuracy of stats.gamma.pdf In-Reply-To: <49817B51.3080200@bristol.ac.uk> References: <49817B51.3080200@bristol.ac.uk> Message-ID: <1cd32cbb0901290438h1d2f71c6ube78f10452ef7e02@mail.gmail.com> On Thu, Jan 29, 2009 at 4:48 AM, Nicolas Chopin wrote: > Dear list, > when I compute: > > stats.gamma.pdf(5.,2.,5.) > > I get: > array([0.]) > > whereas the same command in R outputs: >> dgamma(5.,2.,5.) > [1] 1.735993e-09 > > Is this a bug, and if so, should I report it somewhere? > Or is it just that scipy's implementation of the gamma pdf is a bit > less accurate than R's? > > I need to compute log-pdf's, so I need relative accuracy, not absolute > accuracy; > but I can implement my own log-pdf routine, of course. > > Thank you in advance for your wise replies. > > > > Nicolas Chopin > According to Johnson, Kotz, Balakrishnan, gamma.pdf(5.,2.,5.) is zero; it is at the lower boundary. But for using the log-pdf you might still be better off writing the log(pdf) directly, because you can use the expression for the log directly instead of calculating first exp and then log. I'll check it more later today.
Josef From pav at iki.fi Thu Jan 29 07:52:37 2009 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 29 Jan 2009 12:52:37 +0000 (UTC) Subject: [SciPy-user] accuracy of stats.gamma.pdf References: <49817B51.3080200@bristol.ac.uk> Message-ID: Thu, 29 Jan 2009 10:48:01 +0100, Nicolas Chopin wrote: > Dear list, > when I compute: > > stats.gamma.pdf(5.,2.,5.) > > I get: > array([0.]) > > whereas the same command in R outputs: > > dgamma(5.,2.,5.) > [1] 1.735993e-09 Reading help(dgamma) in R, and help(scipy.stats.gamma.pdf) reveals the following: In R, the third parameter to dgamma is the rate parameter. In Scipy, the third parameter is the location parameter, ie. scipy.stats.gamma.pdf(x, a, mu) == dgamma(x - mu, a) It appears that scipy.stats.gamma doesn't have a scale parameter. -- Pauli Virtanen From pav at iki.fi Thu Jan 29 07:58:55 2009 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 29 Jan 2009 12:58:55 +0000 (UTC) Subject: [SciPy-user] accuracy of stats.gamma.pdf References: <49817B51.3080200@bristol.ac.uk> Message-ID: Thu, 29 Jan 2009 12:52:37 +0000, Pauli Virtanen wrote: [clip] > It appears that scipy.stats.gamma doesn't have a scale parameter. Oops, obviously it has a scale parameter: In Scipy: >>> scipy.stats.gamma.pdf(5, 2, 0, 1.0/5) 1.7359929831205026e-09 In R: > dgamma(5,2,5) [1] 1.735993e-09 So, no bugs present, just different order of arguments. From nicolas.chopin at bristol.ac.uk Thu Jan 29 08:40:19 2009 From: nicolas.chopin at bristol.ac.uk (Nicolas CHOPIN) Date: Thu, 29 Jan 2009 13:40:19 +0000 (UTC) Subject: [SciPy-user] accuracy of stats.gamma.pdf References: <49817B51.3080200@bristol.ac.uk> Message-ID: Pauli Virtanen iki.fi> writes: > > Thu, 29 Jan 2009 12:52:37 +0000, Pauli Virtanen wrote: > [clip] > > It appears that scipy.stats.gamma doesn't have a scale parameter. > > Oops, obviously it has a scale parameter: > > In Scipy: > >>> scipy.stats.gamma.pdf(5, 2, 0, 1.0/5) > 1.7359929831205026e-09 > > In R: > > dgamma(5,2,5) > [1] 1.735993e-09 > > So, no bugs present, just different order of arguments. > oops, many thanks, I managed to misunderstand both R and scipy.stats syntaxes, sorry... A poor excuse is that in my field Gamma(a,b) distributions refers to Gamma with shape a, and scale=1/b, and nobody uses a location parameter. Thanks again From josef.pktd at gmail.com Thu Jan 29 09:51:40 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 29 Jan 2009 09:51:40 -0500 Subject: [SciPy-user] accuracy of stats.gamma.pdf In-Reply-To: References: <49817B51.3080200@bristol.ac.uk> Message-ID: <1cd32cbb0901290651g4098d304n64b011617d1e416a@mail.gmail.com> On Thu, Jan 29, 2009 at 8:40 AM, Nicolas CHOPIN wrote: > Pauli Virtanen iki.fi> writes: > >> >> Thu, 29 Jan 2009 12:52:37 +0000, Pauli Virtanen wrote: >> [clip] >> > It appears that scipy.stats.gamma doesn't have a scale parameter. >> >> Oops, obviously it has a scale parameter: >> >> In Scipy: >> >>> scipy.stats.gamma.pdf(5, 2, 0, 1.0/5) >> 1.7359929831205026e-09 >> >> In R: >> > dgamma(5,2,5) >> [1] 1.735993e-09 >> >> So, no bugs present, just different order of arguments. >> > > > oops, many thanks, I managed to misunderstand both R and scipy.stats syntaxes, > sorry... > A poor excuse is that in my field Gamma(a,b) distributions refers to Gamma with > shape a, and scale=1/b, and nobody uses a location parameter. > Thanks again > I'm glad this is cleared up, I appreciate any report on differences with R, since not all corner cases are properly tested. 
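Josef's earlier point about writing the log(pdf) directly can be made concrete. Below is a sketch of a hand-rolled gamma log-density (gamma_logpdf is a hypothetical helper name, mirroring scipy.stats' (shape, loc, scale) convention; it is not part of scipy.stats):

import numpy as np
from scipy import special

def gamma_logpdf(x, a, loc=0.0, scale=1.0):
    # Direct log-density: (a-1)*log(y) - y - gammaln(a) - log(scale),
    # with y = (x - loc)/scale; no exp()/log() round trip to underflow.
    y = (np.asarray(x, dtype=float) - loc) / scale
    safe_y = np.where(y > 0, y, 1.0)  # guard log() off the support
    return np.where(y > 0,
                    (a - 1.0) * np.log(safe_y) - y
                    - special.gammaln(a) - np.log(scale),
                    -np.inf)

For the example in this thread, gamma_logpdf(5., 2., 0, 1./5) gives about -20.17, and exp(-20.17) recovers the 1.736e-09 value R reports, with no underflow risk further out in the tail.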
Location and scale are keyword arguments for any continuous distribution and are handled generically, (which currently has the disadvantage that fit cannot estimate the distribution parameters while keeping the location fixed). That's my way of checking a distribution without looking at the source: >>> stats.gamma.numargs 1 >>> stats.gamma.shapes 'a' >>> print stats.gamma.extradoc Gamma distribution For a = integer, this is the Erlang distribution, and for a=1 it is the exponential distribution. gamma.pdf(x,a) = x**(a-1)*exp(-x)/gamma(a) for x >= 0, a > 0. >>> stats.gamma.pdf(5.,2.,loc=0,scale=5) 0.07357588823428847 >>> stats.gamma.pdf(5.,2.,loc=5) 0.0 >>> stats.gamma.pdf(5.,2.,loc=0,scale=1/5.) 1.7359929831205026e-009 Josef From ludovic.drouineau at ifremer.fr Fri Jan 30 07:22:03 2009 From: ludovic.drouineau at ifremer.fr (Ludovic DROUINEAU) Date: Fri, 30 Jan 2009 13:22:03 +0100 Subject: [SciPy-user] Problem reading NetCDF File Message-ID: <4982F0EB.1000102@ifremer.fr> Hi all, When I try to open a NetCDF file, I have the following error: File "C:\Python25\lib\site-packages\scipy\io\netcdf.py", line 194, in _read_values count = n * bytes[nc_type-1] IndexError: list index out of range My code is: from scipy.io import netcdf nc = netcdf.netcdf_file ('test.nc', 'r') Here is the header of the netcdf file: netcdf test { dimensions: time = UNLIMITED ; // (33635 currently) variables: double measureTS(time) ; measureTS:element_name = "measure time" ; measureTS:cardinalitymin = 1 ; measureTS:cardinalitymax = 1 ; measureTS:comment = "time of measure as determined by the GPS" ; measureTS:long_name = "measure timestamp" ; measureTS:units = "day since 1899-12-30T00:00:00 UTC" ; measureTS:shortunits = "days" ; measureTS:positive = "up" ; measureTS:C_format = "%14.7f" ; measureTS:axis = "T" ; measureTS:measuretimedata = "measureTS" ; measureTS:valid_max = 100000. ; measureTS:valid_min = 30000. ; measureTS:precision = 12 ; measureTS:scale = 7 ; measureTS:_FillValue = 0. ; measureTS:missing_value = 0. ; measureTS:scale_factor = 1. ; measureTS:add_offset = 0. ; measureTS:element_version = "1.0" ; measureTS:valid_range = "30000.000000,100000.000000" ; double lat(time) ; lat:element_name = "latitude" ; lat:cardinalitymin = 1 ; lat:cardinalitymax = 1 ; lat:comment = "latitude of the fix for the reference geodetic system" ; lat:long_name = "latitude" ; lat:units = "degree_north" ; lat:shortunits = "?" ; lat:positive = "up" ; lat:C_format = "%11.7f" ; lat:axis = "Y" ; lat:coordinates = "measureTS" ; lat:measuretimedata = "measureTS" ; lat:valid_max = 90. ; lat:valid_min = -90. ; lat:precision = 9 ; lat:scale = 7 ; lat:_FillValue = -100. ; lat:missing_value = -100. ; lat:scale_factor = 1. ; lat:add_offset = 0. ; lat:element_version = "1.0" ; lat:valid_range = "-90.000000,90.000000" ; double long(time) ; long:element_name = "longitude" ; long:cardinalitymin = 1 ; long:cardinalitymax = 1 ; long:comment = "longitude of the fix for the reference geodetic system" ; long:long_name = "longitude" ; long:units = "degree_east" ; long:shortunits = "?" ; long:positive = "up" ; long:C_format = "%12.7f" ; long:axis = "X" ; long:coordinates = "measureTS" ; long:measuretimedata = "measureTS" ; long:valid_max = 180. ; long:valid_min = -180. ; long:precision = 10 ; long:scale = 7 ; long:_FillValue = -200. ; long:missing_value = -200. ; long:scale_factor = 1. ; long:add_offset = 0. 
; long:element_version = "1.0" ; long:valid_range = "-180.000000,180.000000" ; double alt(time) ; alt:element_name = "altitude" ; alt:cardinalitymin = 0 ; alt:cardinalitymax = 1 ; alt:comment = "altitude of the fix above the reference ellipso??d" ; alt:long_name = "altitude" ; alt:units = "m" ; alt:shortunits = "m" ; alt:positive = "up" ; alt:C_format = "%9.3f" ; alt:axis = "Z" ; alt:coordinates = "measureTS lat long" ; alt:measuretimedata = "measureTS" ; alt:valid_max = 30000000. ; alt:valid_min = -1000000. ; alt:precision = 7 ; alt:scale = 3 ; alt:_FillValue = -10000000. ; alt:missing_value = -10000000. ; alt:scale_factor = 1. ; alt:add_offset = 0. ; alt:element_version = "1.0" ; alt:valid_range = "-1000000.000000,30000000.000000" ; byte prec(time) ; prec:element_name = "horizontal position precision code" ; prec:cardinalitymin = 0 ; prec:cardinalitymax = 1 ; prec:comment = "precision of the position as determined by the GPS or the acquisition server" ; prec:long_name = "precision" ; prec:units = "dimensionless" ; prec:shortunits = "dimensionless" ; prec:positive = "up" ; prec:coordinates = "measureTS lat long" ; prec:measuretimedata = "measureTS" ; prec:valid_max = 9. ; prec:valid_min = 0. ; prec:precision = 1 ; prec:scale = 0 ; prec:_FillValue = -1b ; prec:missing_value = -1. ; prec:scale_factor = 1. ; prec:add_offset = 0. ; prec:element_version = "1.0" ; prec:valid_range = "0.000000,9.000000" ; byte mode(time) ; mode:element_name = "GPS mode" ; mode:cardinalitymin = 0 ; mode:cardinalitymax = 1 ; mode:comment = "mode used by the GPS to compute the fix in NMEA norm" ; mode:long_name = "GPS mode" ; mode:units = "dimensionless" ; mode:shortunits = "dimensionless" ; mode:positive = "up" ; mode:coordinates = "measureTS lat long" ; mode:measuretimedata = "measureTS" ; mode:valid_max = 7. ; mode:valid_min = 0. ; mode:precision = 1 ; mode:scale = 0 ; mode:_FillValue = -1b ; mode:missing_value = -1. ; mode:scale_factor = 1. ; mode:add_offset = 0. ; mode:element_version = "1.1" ; mode:valid_range = "0.000000,7.000000" ; float gndcourse(time) ; gndcourse:element_name = "course" ; gndcourse:cardinalitymin = 0 ; gndcourse:cardinalitymax = 1 ; gndcourse:comment = "heading of the speed vector of the GPS antenna relative to the reference geodetic system (i.e. ground)" ; gndcourse:long_name = "ground course" ; gndcourse:units = "degree" ; gndcourse:shortunits = "?" ; gndcourse:positive = "up" ; gndcourse:C_format = "%5.2f" ; gndcourse:coordinates = "measureTS lat long" ; gndcourse:measuretimedata = "measureTS" ; gndcourse:valid_max = 360. ; gndcourse:valid_min = 0. ; gndcourse:precision = 4 ; gndcourse:scale = 2 ; gndcourse:_FillValue = -1.f ; gndcourse:missing_value = -1. ; gndcourse:scale_factor = 1. ; gndcourse:add_offset = 0. ; gndcourse:element_version = "1.0" ; gndcourse:valid_range = "0.000000,360.000000" ; float gndspeed(time) ; gndspeed:element_name = "speed (ground)" ; gndspeed:cardinalitymin = 0 ; gndspeed:cardinalitymax = 1 ; gndspeed:comment = "module of speed of GPS antenna relative to the reference geodetic system (i.e. ground)" ; gndspeed:long_name = "ground speed" ; gndspeed:units = "knot" ; gndspeed:shortunits = "kn" ; gndspeed:positive = "up" ; gndspeed:C_format = "%5.2f" ; gndspeed:coordinates = "measureTS lat long" ; gndspeed:measuretimedata = "measureTS" ; gndspeed:valid_max = 200. ; gndspeed:valid_min = 0. ; gndspeed:precision = 4 ; gndspeed:scale = 2 ; gndspeed:_FillValue = -1.f ; gndspeed:missing_value = -1. ; gndspeed:scale_factor = 1. ; gndspeed:add_offset = 0. 
; gndspeed:element_version = "1.0" ; gndspeed:valid_range = "0.000000,200.000000" ; double time(time) ; time:long_name = "acquisition time" ; time:units = "days since 1899-12-30 00:00:00 UTC" ; time:calendar = "gregorian" ; time:axis = "T" ; time:_FillValue = 0. ; // global attributes: :history = "TECHSAS v.2.35 - 2006-09-23T02:56:27 UTC 2007-02-25T22:39:21Z" ; :source = "Acquisition of AQUA1" ; :conventions = "CF-1.0." ; :creationtime = "2007-02-25T22:39:21Z" ; :device_deviceid = "PP_AQUA1" ; :device_devicename = "AQUA1" ; :device_position = "passerelle" ; :device_installdate = "2001-09-10T10:30:00Z" ; :device_latestcalibrationdate = "2001-09-10T10:30:00Z" ; :device_workingparameters = "WGS84" ; :device_sourcetype = "gps gyr" ; :firstframetime = "2007-02-25T22:39:21Z" ; :lastframetime = "2007-02-26T07:59:56Z" ; :device_X = 24.6 ; :device_Y = 0.6 ; :device_Z = -31. ; :frame_name = "position" ; :frame_major = "1" ; :frame_minor = "1" ; :frame_sourcetype = "gps" ; :frame_period = 1. ; :title = "Techsas 2.321" ; :institution = "Ifremer" ; :reference = "http://www.ifremer.fr" ; } Thank you in advance for your replies. -- Ludovic DROUINEAU NSE/ILE Ifremer Centre de Brest BP 70 - 29280 Plouzané tél. 33 (0)2 98 22 40 94 email Ludovic.Drouineau at ifremer.fr
From scott.sinclair.za at gmail.com Fri Jan 30 09:07:00 2009 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Fri, 30 Jan 2009 16:07:00 +0200 Subject: [SciPy-user] Problem reading NetCDF File In-Reply-To: <4982F0EB.1000102@ifremer.fr> References: <4982F0EB.1000102@ifremer.fr> Message-ID: <6a17e9ee0901300607l5345ca65oe927f32e48462592@mail.gmail.com> > 2009/1/30 Ludovic DROUINEAU : > Hi all, > > When I try to open a NetCDF file, I have the following error: > File "C:\Python25\lib\site-packages\scipy\io\netcdf.py", line 194, in > _read_values > count = n * bytes[nc_type-1] > IndexError: list index out of range > > My code is: > from scipy.io import netcdf > > nc = netcdf.netcdf_file ('test.nc', 'r') I'm not sure if anyone is actively maintaining scipy.io.netcdf (you'll find out if there is a response to your query). In case there isn't, you might have better luck with one of the following: http://code.google.com/p/netcdf4-python/ http://matplotlib.sourceforge.net/basemap/doc/html/api/basemap_api.html#mpl_toolkits.basemap.NetCDFFile http://www.pyngl.ucar.edu/Nio.shtml http://pypi.python.org/pypi/pupynere/1.0 Cheers, Scott
From rmay31 at gmail.com Fri Jan 30 10:12:24 2009 From: rmay31 at gmail.com (Ryan May) Date: Fri, 30 Jan 2009 09:12:24 -0600 Subject: [SciPy-user] Problem reading NetCDF File In-Reply-To: <6a17e9ee0901300607l5345ca65oe927f32e48462592@mail.gmail.com> References: <4982F0EB.1000102@ifremer.fr> <6a17e9ee0901300607l5345ca65oe927f32e48462592@mail.gmail.com> Message-ID: <498318D8.9060008@gmail.com> Scott Sinclair wrote: >> 2009/1/30 Ludovic DROUINEAU : >> Hi all, >> >> When I try to open a NetCDF file, I have the following error: >> File "C:\Python25\lib\site-packages\scipy\io\netcdf.py", line 194, in >> _read_values >> count = n * bytes[nc_type-1] >> IndexError: list index out of range >> >> My code is: >> from scipy.io import netcdf >> >> nc = netcdf.netcdf_file ('test.nc', 'r') > > I'm not sure if anyone is actively maintaining scipy.io.netcdf (you'll > find out if there is a response to your query). In case there isn't, > you might have better luck with one of the following: Well, scipy.io.netcdf is a (now outdated) version of pupynere. Pupynere itself is maintained, it's just that the version in scipy is out of date.
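Of the alternatives Scott lists, netcdf4-python is the most direct replacement for this use case. A sketch of reading Ludovic's file with it, assuming the library is installed (the variable names are taken from the header he posted):

from netCDF4 import Dataset

nc = Dataset('test.nc', 'r')
print nc.variables.keys()        # measureTS, lat, long, alt, prec, ...
lat = nc.variables['lat'][:]     # whole variable as an array
t = nc.variables['time'][:]      # may come back masked where _FillValue is set
nc.close()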
From rjchacko at gmail.com  Fri Jan 30 10:32:30 2009
From: rjchacko at gmail.com (Ranjit Chacko)
Date: Fri, 30 Jan 2009 10:32:30 -0500
Subject: [SciPy-user] getting started with scipy
In-Reply-To:
References:
Message-ID: <692111C8-95EB-4872-A4BC-FF92EB71C9E6@gmail.com>

Hi,

I'm interested in starting to use scipy for my simulations, and I've
got a question about how Traits/Chaco handles the graphics. Are the
plot objects handled by a separate thread? Right now I'm using a Java
package someone else wrote that has one thread for the simulation code
and another for updating the visualizations.

Thanks,

Ranjit

From christopher.paul.taylor at gmail.com  Fri Jan 30 10:37:08 2009
From: christopher.paul.taylor at gmail.com (christopher taylor)
Date: Fri, 30 Jan 2009 10:37:08 -0500
Subject: [SciPy-user] sparse matrix eigenvector/value solver
Message-ID:

I've been reading through scipy.sparse.* for some way to solve for
eigenvectors and eigenvalues, without much success. I found the
PySparse library, which seems to have the kind of solver I'm looking
for. Any recommendations? Is this something I can do within scipy, or
should I look into PySparse?

As an aside, the sparse matrices I'm working with are *huge*.

ct

From gael.varoquaux at normalesup.org  Fri Jan 30 10:48:24 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Fri, 30 Jan 2009 16:48:24 +0100
Subject: [SciPy-user] getting started with scipy
In-Reply-To: <692111C8-95EB-4872-A4BC-FF92EB71C9E6@gmail.com>
References: <692111C8-95EB-4872-A4BC-FF92EB71C9E6@gmail.com>
Message-ID: <20090130154824.GC23594@phare.normalesup.org>

On Fri, Jan 30, 2009 at 10:32:30AM -0500, Ranjit Chacko wrote:
> I'm interested in starting to use scipy for my simulations, and I've
> got a question about how Traits/Chaco handles the graphics. Are the
> plot objects handled by a separate thread? Right now I'm using a Java
> package someone else wrote that has one thread for the simulation code
> and another for updating the visualizations.

Not by default. Threads are not something to be taken lightly. You can,
if you want, run the processing and the display in separate threads,
but if you go down that road you had better be careful about race
conditions.

Gaël
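As a bare-bones illustration of the two-thread layout Ranjit asks
about, here is a sketch using only the standard library (no
Traits/Chaco; the names and sleep intervals are illustrative, and the
race-condition caveat Gaël raises is exactly what the lock is for;
most GUI toolkits additionally require all drawing to happen in the
GUI thread):

import threading
import time

state = {'step': 0}        # shared state, always accessed under the lock
lock = threading.Lock()

def simulate():
    # stand-in for the real simulation loop
    for i in range(100):
        time.sleep(0.01)
        lock.acquire()
        try:
            state['step'] = i
        finally:
            lock.release()

def poll():
    # the display side reads a consistent snapshot under the same lock
    lock.acquire()
    try:
        return state['step']
    finally:
        lock.release()

worker = threading.Thread(target=simulate)
worker.start()
while worker.isAlive():
    print poll()           # in a GUI this would live in a timer callback
    time.sleep(0.1)
worker.join()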
From josef.pktd at gmail.com  Fri Jan 30 11:05:20 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 30 Jan 2009 11:05:20 -0500
Subject: [SciPy-user] sparse matrix eigenvector/value solver
Message-ID: <1cd32cbb0901300805g7e67fd5drdeec677b7a5bfcaa@mail.gmail.com>

On Fri, Jan 30, 2009 at 10:37 AM, christopher taylor wrote:
> I've been reading through scipy.sparse.* for some way to solve for
> eigenvectors and eigenvalues, without much success.
>
> As an aside, the sparse matrices I'm working with are *huge*.

Did you look at scipy\sparse\linalg\eigen\arpack\tests\test_speigs.py ?

I think this got added to scipy fairly recently, but I didn't see any
reference to it in the docs.

Josef

From christopher.paul.taylor at gmail.com  Fri Jan 30 11:18:24 2009
From: christopher.paul.taylor at gmail.com (christopher taylor)
Date: Fri, 30 Jan 2009 11:18:24 -0500
Subject: [SciPy-user] sparse matrix eigenvector/value solver
In-Reply-To: <1cd32cbb0901300805g7e67fd5drdeec677b7a5bfcaa@mail.gmail.com>
References: <1cd32cbb0901300805g7e67fd5drdeec677b7a5bfcaa@mail.gmail.com>
Message-ID:

With a little grepping, I found it!

Look in scipy/sparse/linalg/eigen/arpack/speigs.py

There's a function, ARPACK_eigs, that returns eigenvalues and
eigenvectors.

Cheers!

ct

On Fri, Jan 30, 2009 at 11:05 AM, wrote:
> Did you look at scipy\sparse\linalg\eigen\arpack\tests\test_speigs.py ?

From josef.pktd at gmail.com  Fri Jan 30 11:24:54 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 30 Jan 2009 11:24:54 -0500
Subject: [SciPy-user] sparse matrix eigenvector/value solver
In-Reply-To:
References: <1cd32cbb0901300805g7e67fd5drdeec677b7a5bfcaa@mail.gmail.com>
Message-ID: <1cd32cbb0901300824g62003577g87d93e05e0785a28@mail.gmail.com>

On Fri, Jan 30, 2009 at 11:18 AM, christopher taylor wrote:
> Look in scipy/sparse/linalg/eigen/arpack/speigs.py
>
> There's a function, ARPACK_eigs, that returns eigenvalues and
> eigenvectors.

arpack is not imported in

scipy\sparse\linalg\eigen\__init__.py

and is not included in the docs; the links and automodule entries are
missing. Currently it has to be imported directly:

import scipy.sparse.linalg.eigen.arpack

Importing only scipy.sparse.linalg.eigen does not expose/load arpack.

Josef
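For the archives, a rough sketch of calling it. ARPACK_eigs is
undocumented at the time of writing, and the calling convention used
here (a matvec callable, the problem size, and the number of
eigenvalues wanted) is read off speigs.py, so treat it as an assumption
and check the source for your scipy version:

import numpy as np
from scipy import sparse
from scipy.sparse.linalg.eigen.arpack import speigs

# Build a toy sparse matrix; ARPACK itself only ever sees y = A*x.
n = 1000
A = sparse.lil_matrix((n, n))
A.setdiag(np.arange(1, n + 1, dtype=float))
A = A.tocsr()                    # CSR has a fast matrix-vector product

matvec = lambda x: A * x         # the only thing the solver needs
vals, vecs = speigs.ARPACK_eigs(matvec, n, 4)   # ask for 4 eigenpairs
print vals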
From christopher.paul.taylor at gmail.com  Fri Jan 30 11:26:32 2009
From: christopher.paul.taylor at gmail.com (christopher taylor)
Date: Fri, 30 Jan 2009 11:26:32 -0500
Subject: [SciPy-user] sparse matrix eigenvector/value solver
In-Reply-To: <1cd32cbb0901300824g62003577g87d93e05e0785a28@mail.gmail.com>
References: <1cd32cbb0901300805g7e67fd5drdeec677b7a5bfcaa@mail.gmail.com>
	<1cd32cbb0901300824g62003577g87d93e05e0785a28@mail.gmail.com>
Message-ID:

thanks! i'm in the process of testing it out.

ct

From oliphant at enthought.com  Fri Jan 30 12:30:58 2009
From: oliphant at enthought.com (Travis E. Oliphant)
Date: Fri, 30 Jan 2009 11:30:58 -0600
Subject: [SciPy-user] accuracy of stats.gamma.pdf
In-Reply-To: <1cd32cbb0901290651g4098d304n64b011617d1e416a@mail.gmail.com>
References: <49817B51.3080200@bristol.ac.uk>
	<1cd32cbb0901290651g4098d304n64b011617d1e416a@mail.gmail.com>
Message-ID: <49833952.9020600@enthought.com>

josef.pktd at gmail.com wrote:
> On Thu, Jan 29, 2009 at 8:40 AM, Nicolas CHOPIN wrote:
>> Pauli Virtanen writes:
>>> Thu, 29 Jan 2009 12:52:37 +0000, Pauli Virtanen wrote:
>>> [clip]
>>>> It appears that scipy.stats.gamma doesn't have a scale parameter.
>>>
>>> Oops, obviously it has a scale parameter:
>>>
>>> In Scipy:
>>>
>>> >>> scipy.stats.gamma.pdf(5, 2, 0, 1.0/5)
>>> 1.7359929831205026e-09
>>>
>>> In R:
>>>
>>> > dgamma(5,2,5)
>>> [1] 1.735993e-09
>>>
>>> So, no bugs present, just different order of arguments.
>>
>> oops, many thanks, I managed to misunderstand both R and scipy.stats
>> syntaxes, sorry... A poor excuse is that in my field Gamma(a,b) refers
>> to a Gamma distribution with shape a and scale 1/b, and nobody uses a
>> location parameter.
>> Thanks again
>
> I'm glad this is cleared up. I appreciate any report on differences
> from R, since not all corner cases are properly tested.
>
> Location and scale are keyword arguments for any continuous
> distribution and are handled generically (which currently has the
> disadvantage that fit cannot estimate the distribution parameters
> while keeping the location fixed).

That's a good point!  It should be possible to fix any of the
parameters and estimate the others from the data.  If you know more you
should use it, because your estimates of what remains unknown can only
improve (and sometimes markedly so).

If somebody fixes this, I would welcome the change.

-Travis
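To spell the argument-order point out for anyone skimming the archive:
in scipy.stats the third and fourth positional arguments of pdf are loc
and scale, and R's rate is 1/scale, so the two quoted calls agree:

>>> from scipy import stats
>>> stats.gamma.pdf(5, 2, 0, 1.0/5)             # pdf(x, shape, loc, scale)
1.7359929831205026e-09
>>> stats.gamma.pdf(5, 2, loc=0, scale=1.0/5)   # same call, with keywords
1.7359929831205026e-09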
From josef.pktd at gmail.com  Fri Jan 30 12:57:32 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 30 Jan 2009 12:57:32 -0500
Subject: [SciPy-user] accuracy of stats.gamma.pdf
In-Reply-To: <49833952.9020600@enthought.com>
References: <49817B51.3080200@bristol.ac.uk>
	<1cd32cbb0901290651g4098d304n64b011617d1e416a@mail.gmail.com>
	<49833952.9020600@enthought.com>
Message-ID: <1cd32cbb0901300957w6e38f74ehac6cef85ef0ab25f@mail.gmail.com>

On Fri, Jan 30, 2009 at 12:30 PM, Travis E. Oliphant wrote:
> That's a good point!  It should be possible to fix any of the
> parameters and estimate the others from the data.  If you know more you
> should use it, because your estimates of what remains unknown can only
> improve (and sometimes markedly so).
>
> If somebody fixes this, I would welcome the change.

It's Ticket #832. Especially not being able to fix the support
(location) of the distribution is a problem. We should be able to get
this in before the next release, together with some planned
enhancements to the distributions. (But I don't have time right now.)

Josef
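Until that ticket is resolved, one workaround is to run the
maximum-likelihood fit by hand over only the free parameters. A sketch
with the location pinned at zero; the sample, starting values, and
bounds penalty are all illustrative:

import numpy as np
from scipy import stats, optimize

data = stats.gamma.rvs(2.0, loc=0, scale=0.2, size=1000)  # toy sample

def negloglike(params):
    shape, scale = params
    if shape <= 0 or scale <= 0:
        return 1e100        # crude barrier to keep the optimizer in bounds
    # loc is held fixed at 0 instead of being estimated
    return -np.sum(np.log(stats.gamma.pdf(data, shape, loc=0, scale=scale)))

shape_hat, scale_hat = optimize.fmin(negloglike, [1.0, 1.0])
print shape_hat, scale_hat  # should land near 2.0 and 0.2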
From schugschug at gmail.com  Sat Jan 31 20:06:20 2009
From: schugschug at gmail.com (Eric Schug)
Date: Sat, 31 Jan 2009 20:06:20 -0500
Subject: [SciPy-user] Automating Matlab
Message-ID: <4984F58C.5070605@gmail.com>

Is there strong interest in automating Matlab-to-numpy conversion?

I have a working version of a Matlab-to-Python translator. It allows
translation of Matlab scripts into numpy constructs, supporting most of
the Matlab language. The parser is nearly complete. Most of the
remaining work involves making the translation robust, such as:

* making sure that copies on assignment are done when needed
* correct indexing: a(:) becomes a.flatten(1) when on the right-hand
  side of an assignment and a[:] when on the left-hand side (see the
  sketch at the end of this thread)

I've seen a few projects attempt this, but for one reason or another
they have stopped.

From robert.kern at gmail.com  Sat Jan 31 20:34:57 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Sat, 31 Jan 2009 19:34:57 -0600
Subject: [SciPy-user] Automating Matlab
In-Reply-To: <4984F58C.5070605@gmail.com>
References: <4984F58C.5070605@gmail.com>
Message-ID: <3d375d730901311734o388adf56y9f3241032ed409c2@mail.gmail.com>

On Sat, Jan 31, 2009 at 19:06, Eric Schug wrote:
> Is there strong interest in automating Matlab-to-numpy conversion?

Yes! Please post your code somewhere!

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From dwf at cs.toronto.edu  Sat Jan 31 20:49:32 2009
From: dwf at cs.toronto.edu (David Warde-Farley)
Date: Sat, 31 Jan 2009 20:49:32 -0500
Subject: [SciPy-user] Automating Matlab
In-Reply-To: <4984F58C.5070605@gmail.com>
References: <4984F58C.5070605@gmail.com>
Message-ID: <5BC40EFF-8964-45CB-9DA3-D4FA87EE4B2E@cs.toronto.edu>

On 31-Jan-09, at 8:06 PM, Eric Schug wrote:
> Is there strong interest in automating Matlab-to-numpy conversion?

I think there is a strong interest in this. One of the main obstacles
to changing environments is inertia and familiarity. My advisor
repeatedly expresses his wish to give Python another try, and having an
easy way to show him how his existing scripts translate would be
awesome.

Of course there are caveats, corner cases where such translations will
fail, but a fairly foolproof method of converting simple scripts would
be just fantastic.

I imagine if you've gotten further along than previous attempts you'll
receive a lot of street cred on this list, and probably a lot of
patches to make things work better. :)

David
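To make the a(:) rule from Eric's list concrete, here is a small sketch
(illustrative data; Matlab flattens in column-major order, which is why
the right-hand-side form needs a Fortran-order flatten, and a.flatten(1)
is taken here to be the older spelling of a.flatten('F'), an assumption
worth checking against your numpy version):

import numpy as np

a = np.array([[1, 2], [3, 4]])

# Matlab rhs:  b = a(:)   -> the column vector [1; 3; 2; 4]
b = a.flatten('F')       # Fortran (column-major) order
print b                  # [1 3 2 4]

# Matlab lhs:  a(:) = 0   -> assign to every element in place
a[:] = 0                 # a.flat = 0 also works for any shape
print a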