From jjl at pobox.com Thu Nov 1 11:19:12 2001
From: jjl at pobox.com (John J. Lee)
Date: Thu Nov 1 11:19:12 2001
Subject: [Numpy-discussion] RE: Numeric2
In-Reply-To: 
Message-ID: 

On Tue, 30 Oct 2001, Perry Greenfield wrote:
[...]
> > What is the current status of Numeric2?
> >
> We are in the process of putting it up on sourceforge now.
[...]

What does it do??

John


From hungjunglu at yahoo.com Fri Nov 2 10:24:10 2001
From: hungjunglu at yahoo.com (Hung Jung Lu)
Date: Fri Nov 2 10:24:10 2001
Subject: [Numpy-discussion] Assembly optimized numerical packages?
Message-ID: <20011102182318.66182.qmail@web12606.mail.yahoo.com>

Hi,

This is a tangential topic. Can someone give me pointers where to find
freeware/shareware/commercial packages for linear algebra and probability
calculations (e.g.: Cholesky decomposition, eigenvalues & eigenvectors in
diagonalization, interpolation, normal distribution, beta distribution,
inverse cumulative normal function, etc.), and such that they use assembly
level optimization (I need high speed, but on mundane Pentium 3 or
Pentium 4 machines) and can be used on the Windows platform and from
Microsoft's Visual C++?

I know mtxvec from www.dewresearch.com does something along these lines,
but it seems like they are aiming at specific dev platforms (CBuilder and
Delphi).

thanks!

Hung Jung

__________________________________________________
Do You Yahoo!?
Find a job, post your resume.
http://careers.yahoo.com


From chrishbarker at home.net Fri Nov 2 11:40:07 2001
From: chrishbarker at home.net (Chris Barker)
Date: Fri Nov 2 11:40:07 2001
Subject: [Numpy-discussion] Assembly optimized numerical packages?
References: <20011102182318.66182.qmail@web12606.mail.yahoo.com>
Message-ID: <3BE2FADF.23D659E9@home.net>

Hung Jung Lu wrote:
> Can someone give me pointers where to find
> freeware/shareware/commercial packages for linear
> algebra and probability calculations (e.g.: Cholesky
> decomposition, eigenvalues & eigenvectors in
> diagonalization,

This sounds like what you are looking for is LAPACK with a good BLAS.
Do a web search, and you'll find lots of pointers.

> interpolation, normal distribution,
> beta distribution, inverse cumulative normal function,
> etc.)

I'm lost here. Perhaps someone else will have some pointers.

-Chris

--
Christopher Barker, Ph.D.                            ---      ---      ---
ChrisHBarker at home.net                             ---@@   -----@@  -----@@
http://members.home.net/barkerlohmann            ------@@@ ------@@@ ------@@@
Oil Spill Modeling                               ------  @ ------  @ ------  @
Water Resources Engineering                      -------   ---------  --------
Coastal and Fluvial Hydrodynamics                --------------------------------------
------------------------------------------------------------------------


From jsaenz at wm.lc.ehu.es Mon Nov 5 00:56:09 2001
From: jsaenz at wm.lc.ehu.es (Jon Saenz)
Date: Mon Nov 5 00:56:09 2001
Subject: [Numpy-discussion] Assembly optimized numerical packages?
In-Reply-To: <20011102182318.66182.qmail@web12606.mail.yahoo.com>
Message-ID: 

On Fri, 2 Nov 2001, Hung Jung Lu wrote:

> Can someone give me pointers where to find
> freeware/shareware/commercial packages for linear
> algebra and probability calculations (e.g.: Cholesky
> decomposition, eigenvalues & eigenvectors in
> diagonalization, interpolation, normal distribution,
> beta distribution, inverse cumulative normal function,
> etc.), and such that they use assembly level
> optimization (I need high speed, but on mundane Pentium
> 3 or Pentium 4 machines) and can be used on the Windows
> platform and from Microsoft's Visual C++?
For statistical distribution functions, you can check DCDFLIB.C:
http://odin.mdacc.tmc.edu/anonftp/page_2.html

It is C, not assembler.

Jon Saenz.                    | Tfno: +34 946012445
Depto. Fisica Aplicada II     | Fax:  +34 944648500
Facultad de Ciencias. \\ Universidad del Pais Vasco \\
Apdo. 644 \\ 48080 - Bilbao \\ SPAIN


From R.M.Everson at exeter.ac.uk Tue Nov 6 13:05:05 2001
From: R.M.Everson at exeter.ac.uk (R.M.Everson)
Date: Tue Nov 6 13:05:05 2001
Subject: [Numpy-discussion] Sparse matrices
Message-ID: 

Hello,

Does anyone have a working sparse matrix module for Numeric 20.2.0 and
Python 2.1 (or similar)? I'm trying to get the version in the SciPy
CVS tree to work - so far without success.

I don't want anything particularly fancy -- not even sparse matrix
inversion. Addition and multiplication would be fine.

Thanks for any ideas/pointers/software etc!

Cheers,

Richard.

--
Department of Computer Science, Exeter University    Voice: +44 1392 264065
R.M.Everson at exeter.ac.uk                         Secretary: +44 1392 264061
http://www.dcs.ex.ac.uk/people/reverson                Fax: +44 1392 264067


From vanandel at atd.ucar.edu Tue Nov 6 13:15:04 2001
From: vanandel at atd.ucar.edu (Joe Van Andel)
Date: Tue Nov 6 13:15:04 2001
Subject: [Numpy-discussion] MA - math operations do not preserve fill_value
Message-ID: <3BE852CA.A18F9E5C@atd.ucar.edu>

Using Python 2.1 and Numeric 20.2.1 on Redhat Linux 7.1.

Shouldn't masked arrays preserve the fill value of their operands, if
both operands have the same fill value? Otherwise, if I want to preserve
the fill_value, I have to write expressions like:

d = masked_values((a+b), a.fill_value())

Here's a demonstration of the problem:

>>> a = masked_values((1.0,2.0,3.0,4.0,-999.0), -999)
>>> b = masked_values((-999.0,1.0,2.0,3.0,4.0), -999)
>>> a
array(data = [   1.,   2.,   3.,   4.,-999.,],
      mask = [0,0,0,0,1,],
      fill_value=-999)
>>> b
array(data = [-999.,   1.,   2.,   3.,   4.,],
      mask = [1,0,0,0,0,],
      fill_value=-999)
>>> c=a+b
>>> c
array(data = [  1.00000002e+20,  3.00000000e+00,  5.00000000e+00,
                7.00000000e+00,  1.00000002e+20,],
      mask = [1,0,0,0,1,],
      fill_value=[ 1.00000002e+20,])
>>> d=masked_values((a+b),a.fill_value())
>>> d
array(data = [-999.,   3.,   5.,   7.,-999.,],
      mask = [1,0,0,0,1,],
      fill_value=-999)

--
Joe VanAndel
National Center for Atmospheric Research
http://www.atd.ucar.edu/~vanandel/
Internet: vanandel at ucar.edu


From roitblat at hawaii.edu Tue Nov 6 17:05:03 2001
From: roitblat at hawaii.edu (Herbert L. Roitblat)
Date: Tue Nov 6 17:05:03 2001
Subject: [Numpy-discussion] Sparse matrices
References: 
Message-ID: <055701c16727$b57fed90$8fd6afcf@pixi.com>

Travis Oliphant has one.

H.

----- Original Message -----
From: "R.M.Everson" 
To: 
Sent: Tuesday, November 06, 2001 11:03 AM
Subject: [Numpy-discussion] Sparse matrices

> Hello,
>
> Does anyone have a working sparse matrix module for Numeric 20.2.0 and
> Python 2.1 (or similar)? I'm trying to get the version in the SciPy
> CVS tree to work - so far without success.
>
> I don't want anything particularly fancy -- not even sparse matrix
> inversion. Addition and multiplication would be fine.
>
> Thanks for any ideas/pointers/software etc!
>
> Cheers,
>
> Richard.
>
> --
> Department of Computer Science, Exeter University    Voice: +44 1392 264065
> R.M.Everson at exeter.ac.uk                         Secretary: +44 1392 264061
> http://www.dcs.ex.ac.uk/people/reverson                Fax: +44 1392 264067
>
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion
>


From jochen at jochen-kuepper.de Tue Nov 6 18:57:02 2001
From: jochen at jochen-kuepper.de (Jochen Küpper)
Date: Tue Nov 6 18:57:02 2001
Subject: [Numpy-discussion] Sparse matrices
In-Reply-To: <055701c16727$b57fed90$8fd6afcf@pixi.com>
References: <055701c16727$b57fed90$8fd6afcf@pixi.com>
Message-ID: 

On Tue, 6 Nov 2001 15:01:18 -1000 Herbert L Roitblat wrote:

Herbert> Travis Oliphant has one.

Isn't that the one in SciPy?

Herbert> ----- Original Message -----
Herbert> From: "R.M.Everson" 
Herbert> To: 
Herbert> Sent: Tuesday, November 06, 2001 11:03 AM
Herbert> Subject: [Numpy-discussion] Sparse matrices

>> Does anyone have a working sparse matrix module for Numeric 20.2.0
>> and Python 2.1 (or similar)? I'm trying to get the version in the
>> SciPy CVS tree to work - so far without success.

Herbert, this inverse citing really is counterproductive on mailing lists.

Greetings,
Jochen

--
Einigkeit und Recht und Freiheit          http://www.Jochen-Kuepper.de
Liberté, Égalité, Fraternité                     GnuPG key: 44BCCD8E
Sex, drugs and rock-n-roll


From nwagner at mecha.uni-stuttgart.de Sun Nov 11 07:32:03 2001
From: nwagner at mecha.uni-stuttgart.de (Nils Wagner)
Date: Sun Nov 11 07:32:03 2001
Subject: [Numpy-discussion] RandomArray - random
Message-ID: <3BEEA88E.742E9225@mecha.uni-stuttgart.de>

Hi,

I tried to produce a random matrix, say Q (2ndof \times nsamp+1), with
Numpy 20.2 and Python 2.1.1 (#1, Sep 24 2001, 05:28:47) [GCC 2.95.3
20010315 (SuSE)] on linux2.

Traceback (most recent call last):
  File "modal.py", line 192, in ?
    Q = 2.0*random((2*ndof,nsamp+1))-ones((2*ndof,nsamp+1))
TypeError: random() takes exactly 1 argument (2 given)

Does it require a new syntax to obtain a matrix consisting of uniformly
distributed random numbers in the range +/- 1?

Nils


From paul at pfdubois.com Sun Nov 11 09:14:02 2001
From: paul at pfdubois.com (Paul F. Dubois)
Date: Sun Nov 11 09:14:02 2001
Subject: [Numpy-discussion] RandomArray - random
In-Reply-To: <3BEEA88E.742E9225@mecha.uni-stuttgart.de>
Message-ID: <000001c16ad3$f3e688a0$3d01a8c0@plstn1.sfba.home.com>

Your reference to random is not fully qualified, so I suppose you could
be picking up some other random. But I just tried
RandomArray.random((2,3)) and it worked fine.

BTW, you could just do 2.0*random((n,m))-1.0.

-----Original Message-----
From: numpy-discussion-admin at lists.sourceforge.net
[mailto:numpy-discussion-admin at lists.sourceforge.net] On Behalf Of Nils Wagner
Sent: Sunday, November 11, 2001 8:34 AM
To: numpy-discussion at lists.sourceforge.net
Subject: [Numpy-discussion] RandomArray - random

Hi,

I tried to produce a random matrix, say Q (2ndof \times nsamp+1), with
Numpy 20.2 and Python 2.1.1 (#1, Sep 24 2001, 05:28:47) [GCC 2.95.3
20010315 (SuSE)] on linux2.

Traceback (most recent call last):
  File "modal.py", line 192, in ?
    Q = 2.0*random((2*ndof,nsamp+1))-ones((2*ndof,nsamp+1))
TypeError: random() takes exactly 1 argument (2 given)

Does it require a new syntax to obtain a matrix consisting of uniformly
distributed random numbers in the range +/- 1?

Nils

_______________________________________________
Numpy-discussion mailing list
Numpy-discussion at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/numpy-discussion


From nwagner at mecha.uni-stuttgart.de Mon Nov 12 04:01:03 2001
From: nwagner at mecha.uni-stuttgart.de (Nils Wagner)
Date: Mon Nov 12 04:01:03 2001
Subject: [Numpy-discussion] RandomArray - random
References: <000001c16ad3$f3e688a0$3d01a8c0@plstn1.sfba.home.com>
Message-ID: <3BEFC88E.F87F363E@mecha.uni-stuttgart.de>

"Paul F. Dubois" wrote:
>
> Your reference to random is not fully qualified, so I suppose you could
> be picking up some other random. But I just tried
> RandomArray.random((2,3)) and it worked fine.
>
> BTW, you could just do 2.0*random((n,m))-1.0.

It seems to be a conflict with VPython, formerly Visual Python.
http://cil.andrew.cmu.edu/projects/visual/index.html

Python 2.1.1 (#1, Sep 24 2001, 05:28:47)
[GCC 2.95.3 20010315 (SuSE)] on linux2
Type "copyright", "credits" or "license" for more information.
>>> from Numeric import *
>>> from RandomArray import *
>>> random((2,3))
array([[ 0.68769461,  0.33015978,  0.07285815],
       [ 0.20514929,  0.81925279,  0.50694615]])
>>> from visual import *
Visual-2001-09-24
>>> random((2,3))
Traceback (most recent call last):
  File "", line 1, in ?
TypeError: random() takes exactly 1 argument (2 given)
>>>

Nils

> -----Original Message-----
> From: numpy-discussion-admin at lists.sourceforge.net
> [mailto:numpy-discussion-admin at lists.sourceforge.net] On Behalf Of Nils
> Wagner
> Sent: Sunday, November 11, 2001 8:34 AM
> To: numpy-discussion at lists.sourceforge.net
> Subject: [Numpy-discussion] RandomArray - random
>
> Hi,
>
> I tried to produce a random matrix, say Q (2ndof \times nsamp+1), with
> Numpy 20.2 and Python 2.1.1 (#1, Sep 24 2001, 05:28:47) [GCC 2.95.3
> 20010315 (SuSE)] on linux2.
>
> Traceback (most recent call last):
>   File "modal.py", line 192, in ?
>     Q = 2.0*random((2*ndof,nsamp+1))-ones((2*ndof,nsamp+1))
> TypeError: random() takes exactly 1 argument (2 given)
>
> Does it require a new syntax to obtain a matrix consisting of uniformly
> distributed random numbers in the range +/- 1?
>
> Nils
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion


From neelk at cswcasa.com Mon Nov 12 09:24:02 2001
From: neelk at cswcasa.com (Krishnaswami, Neel)
Date: Mon Nov 12 09:24:02 2001
Subject: [Numpy-discussion] Building Numeric with Intel KML and mingw32
Message-ID: 

Hello,

I'm trying to rebuild Numeric with the Intel Kernel Math Libraries.
I've gotten Numeric building normally with the default BLAS libraries,
but I'm not sure what I need to put into the libraries_dir_list and
libraries_list variables in the setup.py file.

I have the directories mkl\ia32\bin (contains the DLLs), mkl\ia32\lib
(contains the lib*.a files), and mkl\include (contains the *.h files).
Can anyone tell me what goes where?
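(For concreteness, a sketch of the kind of fragment in question -- the
directory names follow the listing above, but the library names are
guesses that would need checking against the MKL documentation:)

# hypothetical setup.py fragment for linking Numeric against MKL
library_dirs_list = ['C:\\mkl\\ia32\\lib']   # where the lib*.a files live
libraries_list = ['mkl_lapack', 'mkl_p3']    # assumed names -- verify
include_dirs = ['C:\\mkl\\include']          # the *.h files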
--
Neel Krishnaswami
neelk at cswcasa.com


From nwagner at mecha.uni-stuttgart.de Tue Nov 13 02:22:01 2001
From: nwagner at mecha.uni-stuttgart.de (Nils Wagner)
Date: Tue Nov 13 02:22:01 2001
Subject: [Numpy-discussion] Total least squares problem
Message-ID: <3BF102C6.8C651D9E@mecha.uni-stuttgart.de>

Hi,

How do I solve a total least squares problem in Numpy? A small example
would be appreciated.

The TLS problem assumes an overdetermined set of linear equations
AX = B, where both the data matrix A and the observation matrix B
are inaccurate.

Nils

Reference: R.D. Fierro, G.H. Golub, P.C. Hansen, D.P. O'Leary,
Regularization by truncated total least squares, SIAM J. Sci. Comput.
18(4), 1997, pp. 1223-1241.


From barnard at stat.harvard.edu Tue Nov 13 06:42:03 2001
From: barnard at stat.harvard.edu (barnard at stat.harvard.edu)
Date: Tue Nov 13 06:42:03 2001
Subject: [Numpy-discussion] Small Bug in multiarray.c
Message-ID: <15345.13522.866400.686203@aragorn.stat.harvard.edu>

When attempting to compile the CVS version of Numpy using MSVC 6 under
Windows 2000, I found a small error in multiarray.c: the doc string for
arange contains newlines. The offending code begins on line 1168.
Simply removing the newlines from the string fixes the error.

John

********************************
* John Barnard, Ph.D.
* Senior Research Statistician
* deCODE genetics
* 1000 Winter Str., Suite 3100
* Waltham, MA 02451
* Phone (Direct) : (781) 290-5771 Ext. 27
* Phone (General) : (781) 466-8833
* Fax : (781) 466-8686
* Email: j.barnard at decode.com
********************************


From oliphant at ee.byu.edu Tue Nov 13 11:25:03 2001
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Tue Nov 13 11:25:03 2001
Subject: [Numpy-discussion] Total least squares problem
In-Reply-To: <3BF102C6.8C651D9E@mecha.uni-stuttgart.de>
Message-ID: 

> How do I solve a total least squares problem in Numpy?
> A small example would be appreciated.
>
> The TLS problem assumes an overdetermined set of linear equations
> AX = B, where both the data matrix A and the observation
> matrix B are inaccurate.

X, resids, rank, s = LinearAlgebra.linear_least_squares(A,B)

-Travis


From R.M.Everson at exeter.ac.uk Tue Nov 13 13:53:01 2001
From: R.M.Everson at exeter.ac.uk (R.M.Everson)
Date: Tue Nov 13 13:53:01 2001
Subject: [Numpy-discussion] BLAS and innerproduct
Message-ID: 

Hello,

So far as I can tell, Numeric.dot(), which uses innerproduct() from
multiarraymodule.c, doesn't call the BLAS, even if Numeric was compiled
against a native BLAS. This means (at least on my machine) that

X = ones((150, 16384), 'd')
C = dot(X, transpose(X))

is about 15 times as slow as the comparable operations in Matlab (v6),
which does, I think, use the native BLAS.

I guess that multiarray.c is not particularly optimised to use the BLAS
because of the difficulties of coping with all sorts of types (float32,
int64 etc), and with non-contiguous arrays.

The inner product is so basic to most of the work I use Numeric for that
a speed-up here would make a big difference. I'm thinking of patching
multiarray.c to use the BLAS when it can, but before I start, are there
good reasons for doing something different?

Any advice gratefully received!

Cheers,

Richard.
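(A rough way to quantify the gap described here, using the array sizes
from the example above -- absolute times will of course depend on the
machine and the BLAS:)

# timing sketch for the dot() call discussed above
import time
from Numeric import ones, dot, transpose

X = ones((150, 16384), 'd')
t0 = time.time()
C = dot(X, transpose(X))
print 'dot took %.2f seconds, result shape %s' % (time.time() - t0, C.shape)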
--
Department of Computer Science, Exeter University    Voice: +44 1392 264065
R.M.Everson at exeter.ac.uk                         Secretary: +44 1392 264061
http://www.dcs.ex.ac.uk/people/reverson                Fax: +44 1392 264067


From nwagner at mecha.uni-stuttgart.de Wed Nov 14 04:44:03 2001
From: nwagner at mecha.uni-stuttgart.de (Nils Wagner)
Date: Wed Nov 14 04:44:03 2001
Subject: [Numpy-discussion] Total least squares problem
References: 
Message-ID: <3BF27591.EC1BF4EA@mecha.uni-stuttgart.de>

Travis Oliphant wrote:
>
> > How do I solve a total least squares problem in Numpy?
> > A small example would be appreciated.
> >
> > The TLS problem assumes an overdetermined set of linear equations
> > AX = B, where both the data matrix A and the observation
> > matrix B are inaccurate.
>
> X, resids, rank, s = LinearAlgebra.linear_least_squares(A,B)
>
> -Travis

Travis,

There is a difference between classical least squares (Numpy) and TLS
(total least squares). I am attaching a small example for illustration.

Nils

-------------- next part --------------
from Numeric import *
from LinearAlgebra import *

A = zeros((6,3),Float)
b = zeros((6,1),Float)
#
# Example by Van Huffel
# http://www.netlib.org/vanhuffel/dtls-doc
#
A[0,0] = 0.80010002
A[0,1] = 0.39985167
A[0,2] = 0.60005390
A[1,0] = 0.29996484
A[1,1] = 0.69990689
A[1,2] = 0.39997269
A[2,0] = 0.49994235
A[2,1] = 0.60003167
A[2,2] = 0.20012361
A[3,0] = 0.90013643
A[3,1] = 0.20016919
A[3,2] = 0.79995025
A[4,0] = 0.39998539
A[4,1] = 0.80006338
A[4,2] = 0.49985474
A[5,0] = 0.20002274
A[5,1] = 0.90007114
A[5,2] = 0.70009777

b[0] = 0.89999446
b[1] = 0.82997570
b[2] = 0.79011189
b[3] = 0.85002662
b[4] = 0.99016399
b[5] = 0.10299439

print 'Solution of an overdetermined system of linear equations A x = b'
print
print 'A'
print
print A
#
print 'b'
print
print b
#
x, resids, rank, s = linear_least_squares(A,b)
print
print 'Least squares solution (Numpy)'
print
print x
print
print 'Computed rank',rank
print
print 'Sum of the squared residuals', resids
print
print 'Singular values of A in descending order'
print
print s
#
xtls = zeros((3,1),Float)
#
# total least squares solution given by Van Huffel
# http://www.netlib.org/vanhuffel/dtls-doc
#
xtls[0] = 0.500254
xtls[1] = 0.800251
xtls[2] = 0.299492

print
print 'Total least squares solution'
print
print xtls
print
print 'Residuals of LS (Numpy)'
print
print matrixmultiply(A,x)-b
print
print 'Residuals of TLS'
print
print matrixmultiply(A,xtls)-b
print
#
# Least squares in Numpy  A^\top A x = A^\top b
#
Atb = matrixmultiply(transpose(A),b)
AtA = matrixmultiply(transpose(A),A)
xls = solve_linear_equations(AtA,Atb)
print
print 'Least squares solution via normal equation'
print
print xls


From hinsen at cnrs-orleans.fr Wed Nov 14 05:30:07 2001
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Wed Nov 14 05:30:07 2001
Subject: [Numpy-discussion] Total least squares problem
In-Reply-To: <3BF27591.EC1BF4EA@mecha.uni-stuttgart.de>
References: <3BF27591.EC1BF4EA@mecha.uni-stuttgart.de>
Message-ID: 

Nils Wagner writes:

> There is a difference between classical least squares (Numpy)
> and TLS (total least squares).

Algorithmically speaking it is even a very different problem. I'd say
the only reasonable (i.e. efficient) solution for NumPy is to implement
the TLS algorithm in a C subroutine calling LAPACK routines for SVD etc.

Konrad.
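(Functionally, though, a pure-Python sketch is possible today with the
SVD already exposed by LinearAlgebra -- not efficient, but it shows the
algorithm. This assumes the generic case: the smallest singular value of
the augmented matrix [A b] is simple and the last component of its right
singular vector is nonzero; see Golub & Van Loan for the degenerate
cases.)

# minimal total least squares sketch via the SVD of the augmented matrix
from Numeric import concatenate
from LinearAlgebra import singular_value_decomposition

def tls(A, b):
    n = A.shape[1]
    C = concatenate((A, b), 1)      # m x (n+1) augmented matrix [A b]
    u, s, vt = singular_value_decomposition(C)
    v = vt[n]                       # right singular vector of smallest s
    return -v[:n] / v[n]            # classical TLS solution

Applied to the Van Huffel data in the attachment above, this should
reproduce (up to rounding) the xtls values quoted there.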
--
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen at cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------


From nwagner at mecha.uni-stuttgart.de Wed Nov 14 06:13:07 2001
From: nwagner at mecha.uni-stuttgart.de (Nils Wagner)
Date: Wed Nov 14 06:13:07 2001
Subject: [Numpy-discussion] Total least squares problem
References: <3BF27591.EC1BF4EA@mecha.uni-stuttgart.de>
Message-ID: <3BF28365.53373B65@mecha.uni-stuttgart.de>

Konrad Hinsen wrote:
>
> Nils Wagner writes:
>
> > There is a difference between classical least squares (Numpy)
> > and TLS (total least squares).
>
> Algorithmically speaking it is even a very different problem. I'd say
> the only reasonable (i.e. efficient) solution for NumPy is to implement
> the TLS algorithm in a C subroutine calling LAPACK routines for SVD etc.
>
> Konrad.
> --

There are two Fortran implementations of the TLS algorithm already
available via http://www.netlib.org/vanhuffel/. Moreover, there is a
tool called f2py that generates Python C/API modules for wrapping
Fortran 77/90/95 codes to Python. Unfortunately I am not very familiar
with this tool; therefore I need some advice for this.

Thanks in advance

Nils

> -------------------------------------------------------------------------------
> Konrad Hinsen                            | E-Mail: hinsen at cnrs-orleans.fr
> Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
> Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
> 45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
> France                                   | Nederlands/Francais
> -------------------------------------------------------------------------------


From nwagner at mecha.uni-stuttgart.de Thu Nov 15 01:14:01 2001
From: nwagner at mecha.uni-stuttgart.de (Nils Wagner)
Date: Thu Nov 15 01:14:01 2001
Subject: [Numpy-discussion] Numpy, BLAS, LAPACK, f2py
Message-ID: <3BF395DB.DA1C34A4@mecha.uni-stuttgart.de>

Hi,

I have installed f2py on my system for wrapping existing FORTRAN 77
codes to Python. Then I have gone through the following steps, an
example for using a TLS (total least squares) routine from
http://www.netlib.org/vanhuffel/:

1) Get dtls.f with dependencies
2) Run

   f2py dtls.f -m foo -h foo.pyf only: dtls
        \       \        \             \________ just wrap the dtls function
         \       \        \______ create signature file
          \       \____ python module name
           \_____ Fortran 77 code

3) Edit foo.pyf to your specific needs (optional)
4) Run

   f2py foo.pyf
   \_____________ this will create the Python C/API module foomodule.c

5) Run

   make -f Makefile-foo
   \_____________ this will build the module

6) In python:

Python 2.1.1 (#1, Sep 24 2001, 05:28:47)
[GCC 2.95.3 20010315 (SuSE)] on linux2
Type "copyright", "credits" or "license" for more information.
>>> import foo
Traceback (most recent call last):
  File "", line 1, in ?
ImportError: ./foomodule.so: undefined symbol: dcopy_
>>>

Any suggestions to solve this problem?
Nils

There are prebuilt libraries of LAPACK and BLAS in /usr/lib:

-rw-r--r--  1 root root  657706 Sep 24 01:00 libblas.a
lrwxrwxrwx  1 root root      12 Okt 22 19:27 libblas.so -> libblas.so.2
lrwxrwxrwx  1 root root      16 Okt 22 19:27 libblas.so.2 -> libblas.so.2.2.0
-rwxr-xr-x  1 root root  559600 Sep 24 01:01 libblas.so.2.2.0
-rw-r--r--  1 root root 5763150 Sep 24 01:00 liblapack.a
lrwxrwxrwx  1 root root      14 Okt 22 19:27 liblapack.so -> liblapack.so.3
lrwxrwxrwx  1 root root      18 Okt 22 19:27 liblapack.so.3 -> liblapack.so.3.0.0
-rwxr-xr-x  1 root root 4826626 Sep 24 01:01 liblapack.so.3.0.0


From gvermeul at labs.polycnrs-gre.fr Thu Nov 15 01:28:02 2001
From: gvermeul at labs.polycnrs-gre.fr (Gerard Vermeulen)
Date: Thu Nov 15 01:28:02 2001
Subject: [Numpy-discussion] Numpy, BLAS, LAPACK, f2py
In-Reply-To: <3BF395DB.DA1C34A4@mecha.uni-stuttgart.de>
References: <3BF395DB.DA1C34A4@mecha.uni-stuttgart.de>
Message-ID: <01111510271301.11576@taco.polycnrs-gre.fr>

Hi,

Try to link in the blas library (there is a dcopy_ in my blas library,
but better check the README first).

best regards -- Gerard

On Thursday 15 November 2001 11:15, Nils Wagner wrote:
> Hi,
>
> I have installed f2py on my system for wrapping existing FORTRAN 77
> codes to Python. Then I have gone through the following steps, an
> example for using a TLS (total least squares) routine from
> http://www.netlib.org/vanhuffel/:
>
> 1) Get dtls.f with dependencies
> 2) Run
>
>    f2py dtls.f -m foo -h foo.pyf only: dtls
>         \       \        \             \________ just wrap the dtls function
>          \       \        \______ create signature file
>           \       \____ python module name
>            \_____ Fortran 77 code
>
> 3) Edit foo.pyf to your specific needs (optional)
> 4) Run
>
>    f2py foo.pyf
>    \_____________ this will create the Python C/API module foomodule.c
>
> 5) Run
>
>    make -f Makefile-foo
>    \_____________ this will build the module
>
> 6) In python:
>
> Python 2.1.1 (#1, Sep 24 2001, 05:28:47)
> [GCC 2.95.3 20010315 (SuSE)] on linux2
> Type "copyright", "credits" or "license" for more information.
> >>> import foo
> Traceback (most recent call last):
>   File "", line 1, in ?
> ImportError: ./foomodule.so: undefined symbol: dcopy_
>
> Any suggestions to solve this problem?
>
> Nils
>
> There are prebuilt libraries of LAPACK and BLAS in /usr/lib:
>
> -rw-r--r--  1 root root  657706 Sep 24 01:00 libblas.a
> lrwxrwxrwx  1 root root      12 Okt 22 19:27 libblas.so -> libblas.so.2
> lrwxrwxrwx  1 root root      16 Okt 22 19:27 libblas.so.2 -> libblas.so.2.2.0
> -rwxr-xr-x  1 root root  559600 Sep 24 01:01 libblas.so.2.2.0
> -rw-r--r--  1 root root 5763150 Sep 24 01:00 liblapack.a
> lrwxrwxrwx  1 root root      14 Okt 22 19:27 liblapack.so -> liblapack.so.3
> lrwxrwxrwx  1 root root      18 Okt 22 19:27 liblapack.so.3 -> liblapack.so.3.0.0
> -rwxr-xr-x  1 root root 4826626 Sep 24 01:01 liblapack.so.3.0.0
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion


From perry at stsci.edu Fri Nov 16 14:33:02 2001
From: perry at stsci.edu (Perry Greenfield)
Date: Fri Nov 16 14:33:02 2001
Subject: [Numpy-discussion] Re-implementation of Python Numerical arrays (Numeric) available for download
Message-ID: 

We have been working on a reimplementation of Numeric, the numeric array
manipulation extension module for Python. The reimplementation is
virtually a complete rewrite, and because it is not completely backwards
compatible with Numeric, we have dubbed it numarray to prevent confusion.
While we think this version is not yet mature enough for most to use in
everyday projects, we are interested in feedback on the user interface
and the open issues (see the documents on the web page shown below). We
also welcome those who would like to contribute to this effort by helping
with the development or adding libraries.

An early beta version is available on sourceforge as the package
Numarray (http://sourceforge.net/projects/numpy/)

Information on the goals, changes in user interface, open issues, and
design can be found at http://aten.stsci.edu/numarray


From pete at shinners.org Fri Nov 16 15:12:02 2001
From: pete at shinners.org (Pete Shinners)
Date: Fri Nov 16 15:12:02 2001
Subject: [Numpy-discussion] Re-implementation of Python Numerical arrays (Numeric) available for download
References: 
Message-ID: <3BF59D10.2070107@shinners.org>

Perry Greenfield wrote:
> An early beta version is available on sourceforge as the
> package Numarray (http://sourceforge.net/projects/numpy/)
>
> Information on the goals, changes in user interface, open issues,
> and design can be found at http://aten.stsci.edu/numarray

you ask a few questions on the information website, here are some of my
answers for things i "care" about. note that my main use of numpy is as
a pixel buffer for images. some of the changes like avoiding type
promotion sound really good to me :]

5) should the implementation be bulletproof for private vars?
i don't think you should worry about this. as long as the interface is
well defined, i wouldn't worry about protecting users from themselves.
i think it will be the rare numarray user who will be in a situation
where they need to modify the internal C data.

7) necessary to add other types?
yes. i really want unsigned int16 and unsigned int32. all my operations
are on pixel data, and things can just get messy when i need to treat
packed color values as signed integers.

8) negative and out-of-range indices?
i'd prefer them to be kept as similar to python as can be. the current
implementation in Numeric is nice for me.

one other thing i'd like there to be a little focus on is adding my own
new ufunc operators. for image manipulation i'd like new ufunc operators
that clamp the results to legal values. i'd be happy to do this myself,
but i don't believe it's possible with the current Numeric.

the last thing i really really want is for this to be rolled into the
standard python distribution. that is perhaps the most important aspect
for me. i do not like requiring the extra dependency for generic numeric
arrays. :]


From oliphant.travis at ieee.org Fri Nov 16 18:42:02 2001
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Fri Nov 16 18:42:02 2001
Subject: [Numpy-discussion] Re-implementation of Python Numerical arrays (Numeric) available for download
In-Reply-To: 
References: 
Message-ID: 

>
> While we think this version is not yet mature enough for
> most to use in everyday projects, we are interested in
> feedback on the user interface and the open issues (see
> the documents on the web page shown below). We also welcome
> those who would like to contribute to this effort by helping
> with the development or adding libraries.
>

What I've seen looks great. You've all done some good work here.

Of course, I do have some feedback. I haven't looked at everything;
these points have just caught my eye.

Complex Types:
==============

1) I don't like the idea of complex types being a separate subclass of
ndarray. This makes them "different."
Unless this "difference" can be completely hidden (which I doubt), I would prefer complex types to be on the same level as other numeric types. 2) Also, in your C-API, you have a different pointer to the imaginary data. I much prefer the way it is done currently to have complex numbers represented as an 8-byte, or 16-byte chunk of contiguous memory. Index Arrays: =========== 1) For what it's worth, my initial reaction to your indexing scheme is negative. I would prefer that if a = [[1,2,3,4], [5,6,7,8], [9,10,11,12], [13,14,15,16]] then a[[1,3],[0,3]] returns the sub-matrix: [[ 4, 6], [ 12, 14] i.e. the cross-product of [1,3] x [0,3] This is the way MATLAB works. I'm not sure what IDL does. If I understand your scheme, right now, then I would have to append an extra dimension to my indexing arrays to get this behavior, right? 2) I would like to be able to index the array in a flattenned sense as well (is that possible?) in other words, it would be nice if a[flat(9,10,11)] or something got me the elements 9,10,11 in a one-dimensional interpretation of the array. 3) Why can't you combine slice notation and indexing? Just interpret the slice as index array that would be created from using tha range operator on the same start, stop, and step objects. Is this the plan? That's all for now. I don't mean to be critical, I'm really impressed with what works so far. These are just some concerns I have right now. -Travis Oliphant From europax at home.com Sat Nov 17 08:06:02 2001 From: europax at home.com (Rob) Date: Sat Nov 17 08:06:02 2001 Subject: [Numpy-discussion] Numeric Python EM Project may need mirror Message-ID: <3BF68A67.C4963807@home.com> Hi all, I just got an email from @home yesterday, saying that all customers should back up their web pages, email, etc etc. I know they are in bankruptcy, but this email sounded ominous. I'm wondering if there is some kindly soul who would want to mirror this site. I'd really love to have this site on Starship Python, but haven't had any responses to emails to them. I'm continuously working on more code for the site so I'd hate to see it go down, even if temporarily. Sincerely, Rob. -- The Numeric Python EM Project www.members.home.net/europax From greenfield at home.com Sat Nov 17 14:58:02 2001 From: greenfield at home.com (Perry Greenfield) Date: Sat Nov 17 14:58:02 2001 Subject: FW: [Numpy-discussion] Re-implementation of Python Numerical arrays (Numeric) available for download Message-ID: -----Original Message----- > > What I've seen looks great. You've all done some good work here. > Thanks, you were origin of some of the ideas used. > Of course, I do have some feedback. I haven't looked at > everything, these > points have just caught my eye. > > Complex Types: > ============== > > 1) I don't like the idea of complex types being a separate subclass of > ndarray. This makes them "different." Unless this "difference" can be > completely hidden (which I doubt), I would prefer complex types > to be on the > same level as other numeric types. > I think that we also don't like that, and after doing the original, somewhat incomplete, implementation using the subclassed approach, I began to feel that implementing it in C (albeit using a different approach for the code generation) was probably easier and more elegant than what was done here. So you are very likely to see it integrated as a regular numeric type, with a more C-based implementation. > 2) Also, in your C-API, you have a different pointer to the > imaginary data. 
> I much prefer the way it is done currently, to have complex numbers
> represented as an 8-byte or 16-byte chunk of contiguous memory.

Any reason not to allow both? (The pointer to the real can be
interpreted as either a pointer to 8-byte or 16-byte quantities.) It is
true that figuring out the imaginary pointer from the real is trivial,
so I suppose it really isn't necessary.

> Index Arrays:
> =============
>
> 1) For what it's worth, my initial reaction to your indexing scheme is
> negative. I would prefer that if
>
> a = [[ 1,  2,  3,  4],
>      [ 5,  6,  7,  8],
>      [ 9, 10, 11, 12],
>      [13, 14, 15, 16]]
>
> then
>
> a[[1,3],[0,3]] returns the sub-matrix:
>
> [[ 4,  6],
>  [12, 14]]
>
> i.e. the cross-product of [1,3] x [0,3]. This is the way MATLAB
> works. I'm not sure what IDL does.

I'm afraid I don't understand the example. Could you elaborate a bit
more how this is supposed to work? (Or is it possible there is an error?
I would understand it if the result were [[5, 8],[13,16]], corresponding
to the index pairs [[(1,0),(1,3)],[(3,0),(3,3)]].)

> If I understand your scheme, right now, then I would have to append an
> extra dimension to my indexing arrays to get this behavior, right?
>
> 2) I would like to be able to index the array in a flattened sense as
> well (is that possible?) In other words, it would be nice if
> a[flat(9,10,11)] or something got me the elements 9,10,11 in a
> one-dimensional interpretation of the array.

Why not:

ravel(a)[[9,10,11]] ?

> 3) Why can't you combine slice notation and indexing? Just interpret the
> slice as the index array that would be created from using the range
> operator on the same start, stop, and step objects. Is this the plan?

I think that allowing slicing could be possible. But things were getting
pretty complex as they were, and we wanted to see if there was agreement
on how it was being done so far. It could be extended to handle slices,
if there was a well defined interpretation. (I think there may be at
least two possible interpretations considered.) As for the above, sure,
but of course the slice would have to be shape consistent with the other
index arrays (under the current scheme).

> That's all for now. I don't mean to be critical, I'm really impressed
> with what works so far. These are just some concerns I have right now.
>
> -Travis Oliphant

Thanks Travis, we're looking for constructive feedback, positive or
negative.

Perry


From greenfield at home.com Sat Nov 17 16:28:02 2001
From: greenfield at home.com (Perry Greenfield)
Date: Sat Nov 17 16:28:02 2001
Subject: [Numpy-discussion] Re-implementation of Python Numerical arrays (Numeric) available for download
In-Reply-To: 
Message-ID: 

> > I think that we also don't like that, and after doing the original,
> > somewhat incomplete, implementation using the subclassed approach,
> > I began to feel that implementing it in C (albeit using a different
> > approach for the code generation) was probably easier and more
> > elegant than what was done here. So you are very likely to see
> > it integrated as a regular numeric type, with a more C-based
> > implementation.
>
> Sounds good. Is development going to take place on the CVS tree? If so,
> I could help out by committing changes directly.
>
> > > 2) Also, in your C-API, you have a different pointer to the
> > > imaginary data.
> > > I much prefer the way it is done currently, to have complex numbers
> > > represented as an 8-byte or 16-byte chunk of contiguous memory.
> >
> > Any reason not to allow both?
> > (The pointer to the real can be interpreted as either a pointer to
> > 8-byte or 16-byte quantities.) It is true that figuring out the
> > imaginary pointer from the real is trivial, so I suppose it really
> > isn't necessary.
>
> I guess the way you've structured the ndarray, it is possible. I figured
> some operations might be faster, but perhaps not if you have two pointers
> running at the same time, anyway.

Well, the C implementation I was thinking of would only use one pointer.
The API could supply both if some algorithms would find it useful to
access the imaginary data alone. But as mentioned, I don't think it is
important to include, so we could easily get rid of it (and probably
should).

> > > Index Arrays:
> > > =============
> > >
> > > 1) For what it's worth, my initial reaction to your indexing scheme is
> > > negative. I would prefer that if
> > >
> > > a = [[ 1,  2,  3,  4],
> > >      [ 5,  6,  7,  8],
> > >      [ 9, 10, 11, 12],
> > >      [13, 14, 15, 16]]
> > >
> > > then
> > >
> > > a[[1,3],[0,3]] returns the sub-matrix:
> > >
> > > [[ 4,  6],
> > >  [12, 14]]
> > >
> > > i.e. the cross-product of [1,3] x [0,3]. This is the way MATLAB
> > > works. I'm not sure what IDL does.
> >
> > I'm afraid I don't understand the example. Could you elaborate
> > a bit more how this is supposed to work? (Or is it possible
> > there is an error? I would understand it if the result were
> > [[5, 8],[13,16]], corresponding to the index pairs
> > [[(1,0),(1,3)],[(3,0),(3,3)]].)

> The idea is to consider indexing with arrays of integers to be a
> generalization of slice index notation. Simply interpret the slice as an
> array of integers that would be formed by using the range operator.
>
> For example, I would like to see
>
> a[1:5,1:3] be the same thing as a[[1,2,3,4],[1,2]]
>
> a[1:5,1:3] selects the 2-d subarray consisting of rows 1 to 4 and
> columns 1 to 2 (inclusive, starting with the first row being row 0).
> In other words, the indices used to select the elements of a are
> ordered pairs taken from the cross-product of the index sets:
>
> [1,2,3,4] x [1,2] = [(1,1), (1,2), (2,1), (2,2), (3,1), (3,2), (4,1), (4,2)]
>
> and these selected elements are structured as a 2-d array of shape (4,2).
>
> Does this make more sense? Indexing would be a natural extension of this
> behavior but allowing sets that can't necessarily be formed from the
> range function.

I understand this (but is the example in the first message consistent
with this?). This is certainly a reasonable interpretation. But if this
is the way multiple index arrays are interpreted, how does one easily
specify scattered points in a multidimensional array? The only other
alternative I can think of is to use some of the dimensions of a
multidimensional index array as indices for each of the dimensions. For
example, if one wanted to index random points in a 2d array, then
supplying an nx2 array would provide a list of n such points. But I see
this as a more limiting way to do this (and there are often benefits to
being able to keep the indices for different dimensions in separate
arrays).

But I think doing what you would like to do is straightforward even with
the existing implementation. For example, if x is a 2d array we could
easily develop a function such that:

x[outer_index_product([1,3,4],[1,5])] # with a better function name!

The function outer_index_product would return a tuple of two index
arrays each with a shape of 3x2.
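(A rough pure-Python sketch of such a helper -- the name is the
hypothetical one used above. Unlike the zero-stride trick described
next, this version simply materializes the index arrays:)

# hypothetical helper: index arrays selecting the cross product of i and j
from Numeric import array, ones, multiply

def outer_index_product(i, j):
    i, j = array(i), array(j)
    ii = multiply.outer(i, ones(len(j), 'i'))   # row indices, shape (len(i), len(j))
    jj = multiply.outer(ones(len(i), 'i'), j)   # column indices, same shape
    return ii, jj

Under the numarray indexing described here,
x[outer_index_product([1,3,4],[1,5])] would then select the 3x2
cross-product submatrix.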
These arrays would not take up more space than the original arrays even
though they appear to have a much larger size (the one dimension is
replicated by use of a 0 stride size, so the data buffer is the same as
the original). Would this be acceptable?

In the end, all these indexing behaviors can be provided by different
functions. So it isn't really a question of which one to have and which
not to have. The question is: what is supported by the indexing
notation? For us, the behavior we have implemented is far more useful
for our applications than the one you propose. But perhaps we are in the
minority, so I'd be very interested in hearing which indexing
interpretation is most useful to the general community.

> > Why not:
> >
> > ravel(a)[[9,10,11]] ?
>
> sure, that would work, especially if ravel doesn't make a copy of the
> data (which I presume it does not).

Correct.

Perry


From greenfield at home.com Sat Nov 17 17:23:06 2001
From: greenfield at home.com (Perry Greenfield)
Date: Sat Nov 17 17:23:06 2001
Subject: [Numpy-discussion] Re-implementation of Python Numerical arrays (Numeric) available for download
In-Reply-To: 
Message-ID: 

From: Pete Shinners

> 7) necessary to add other types?
> yes. i really want unsigned int16 and unsigned int32. all my operations
> are on pixel data, and things can just get messy when i need to treat
> packed color values as signed integers.

Unsigned int16 is already supported. UInt32 could be done, but raises
some interesting issues with regard to combining with Int32. I don't
believe the current implementation prevents you from carrying around
unsigned data in Int32 arrays. If you are using them as packed color
values, do you ever do any arithmetic operations on them other than to
pack and unpack them?

> one other thing i'd like there to be a little focus on is adding my own
> new ufunc operators. for image manipulation i'd like new ufunc operators
> that clamp the results to legal values. i'd be happy to do this myself,
> but i don't believe it's possible with the current Numeric.

It will be possible for users to add their own ufuncs. We will
eventually document how to do so (and it should be fairly simple to do
once we give a few example templates).

Perry


From alessandro.mirone at wanadoo.fr Sun Nov 18 07:42:01 2001
From: alessandro.mirone at wanadoo.fr (Alessandro Mirone)
Date: Sun Nov 18 07:42:01 2001
Subject: [Numpy-discussion] Heigenvalues is broken
Message-ID: <3BF7E462.A473B686@wanadoo.fr>

Is it a problem of lapack 3.0 or of LinearAlgebra.py?
(Eigenvalues should be (0,2))

>>> a=array([[1,0],[0,1]])
>>> b=array([[0,1],[-1,0]])
>>> M=a+b*complex(0,1.0)
>>> Heigenvalues(M)
array([-2.30277564,  1.30277564])
>>> print M
[[ 1.+0.j  0.+1.j]
 [ 0.-1.j  1.+0.j]]
>>>


From oliphant.travis at ieee.org Sun Nov 18 19:01:01 2001
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Sun Nov 18 19:01:01 2001
Subject: [Numpy-discussion] Heigenvalues is broken
In-Reply-To: <3BF7E462.A473B686@wanadoo.fr>
References: <3BF7E462.A473B686@wanadoo.fr>
Message-ID: 

On Sunday 18 November 2001 09:40 am, Alessandro Mirone wrote:
> Is it a problem of lapack 3.0 or of
> LinearAlgebra.py?
> (Eigenvalues should be (0,2))
>
> >>> a=array([[1,0],[0,1]])
> >>> b=array([[0,1],[-1,0]])
> >>> M=a+b*complex(0,1.0)
> >>> Heigenvalues(M)

I suspect it is your lapack. On an Athlon running Mandrake Linux with
the lapack-3.0-9 package, I get:
>>> a=array([[1,0],[0,1]])
>>> b=array([[0,1],[-1,0]])
>>> M=a+b*complex(0,1.0)
>>> Heigenvalues(M)
array([ 0.,  2.])

-Travis


From nwagner at mecha.uni-stuttgart.de Sun Nov 18 23:58:01 2001
From: nwagner at mecha.uni-stuttgart.de (Nils Wagner)
Date: Sun Nov 18 23:58:01 2001
Subject: [Numpy-discussion] Heigenvalues is broken
References: <3BF7E462.A473B686@wanadoo.fr>
Message-ID: <3BF8C9FA.97B3AEB1@mecha.uni-stuttgart.de>

Alessandro Mirone wrote:
>
> Is it a problem of lapack 3.0 or of
> LinearAlgebra.py?
> (Eigenvalues should be (0,2))
>
> >>> a=array([[1,0],[0,1]])
> >>> b=array([[0,1],[-1,0]])
> >>> M=a+b*complex(0,1.0)
> >>> Heigenvalues(M)
> array([-2.30277564,  1.30277564])
> >>> print M
> [[ 1.+0.j  0.+1.j]
>  [ 0.-1.j  1.+0.j]]
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion

On an Athlon running SuSE Linux 7.3 with the lapack-3.0-0 package, I get:

[-2.30277564  1.30277564]

Nils


From Peter.Verveer at embl-heidelberg.de Mon Nov 19 02:45:02 2001
From: Peter.Verveer at embl-heidelberg.de (Peter Verveer)
Date: Mon Nov 19 02:45:02 2001
Subject: [Numpy-discussion] Re-implementation of Python Numerical arrays (Numeric) available for download
In-Reply-To: <3BF59D10.2070107@shinners.org>
References: <3BF59D10.2070107@shinners.org>
Message-ID: 

On Saturday 17 November 2001 00:11 am, you wrote:

> note that my main use of numpy is as a pixel buffer for images. some of
> the changes like avoiding type promotion sound really good to me :]

I have exactly the same application, so I agree with this.

> 7) necessary to add other types?
> yes. i really want unsigned int16 and unsigned int32. all my operations
> are on pixel data, and things can just get messy when i need to treat
> packed color values as signed integers.

Yes please! One of the things that irritates me most about the original
Numeric is that some types are lacking. I think the whole range of data
types should be supported, even if some may be seldom used by most
people.

> one other thing i'd like there to be a little focus on is adding my own
> new ufunc operators. for image manipulation i'd like new ufunc operators
> that clamp the results to legal values. i'd be happy to do this myself,
> but i don't believe it's possible with the current Numeric.

I write functions in C that directly access the numeric data. I don't
use the ufunc API. One reason that I do that is that I want my library
of routines to be useful independent of Numeric, so I only have a tiny
glue between my C routines and Numeric. I hope that it will still be
possible to do this in the new version.

> the last thing i really really want is for this to be rolled into the
> standard python distribution. that is perhaps the most important aspect
> for me. i do not like requiring the extra dependency for generic numeric
> arrays. :]

I second that!

Cheers, Peter

--
Dr. Peter J. Verveer
Bastiaens Group
Cell Biology and Cell Biophysics Programme
EMBL
Meyerhofstrasse 1
D-69117 Heidelberg
Germany
Tel. : +49 6221 387245
Fax : +49 6221 387242
Email: Peter.Verveer at embl-heidelberg.de


From tpitts at accentopto.com Mon Nov 19 05:58:03 2001
From: tpitts at accentopto.com (Todd Alan Pitts, Ph.D.)
Date: Mon Nov 19 05:58:03 2001
Subject: [Numpy-discussion] Re-implementation of Python Numerical arrays (Numeric) available for download
In-Reply-To: ; from oliphant.travis@ieee.org on Fri, Nov 16, 2001 at 07:43:41PM -0700
References: 
Message-ID: <20011119065758.B11653@fermi.accentopto.com>

Thanks for all of your work. Things seem to be shaping up nicely. I just
wanted to second some of the concerns below:

> Complex Types:
> ==============
>
> 1) I don't like the idea of complex types being a separate subclass of
> ndarray. This makes them "different." Unless this "difference" can be
> completely hidden (which I doubt), I would prefer complex types to be
> on the same level as other numeric types.
>
> 2) Also, in your C-API, you have a different pointer to the imaginary
> data. I much prefer the way it is done currently, to have complex
> numbers represented as an 8-byte or 16-byte chunk of contiguous memory.

The second comment above is really critical for accessing the utility
available in a very large number of numerical libraries. In my view this
would "break" the utility of numpy severely -- recopying arrays both on
the way out and the way in would be extremely cumbersome.

-Todd Alan Pitts


From jh at oobleck.astro.cornell.edu Mon Nov 19 08:47:02 2001
From: jh at oobleck.astro.cornell.edu (Joe Harrington)
Date: Mon Nov 19 08:47:02 2001
Subject: [Numpy-discussion] Re: Numpy-discussion digest, Vol 1 #345 - 4 msgs
In-Reply-To: (numpy-discussion-request@lists.sourceforge.net)
References: 
Message-ID: <200111191646.fAJGkCL28182@oobleck.astro.cornell.edu>

Just to fill in the blanks, here's what IDL does:

IDL> a = [[1,2,3,4], $
IDL>      [5,6,7,8], $
IDL>      [9,10,11,12], $
IDL>      [13,14,15,16]]
IDL> print,a
       1       2       3       4
       5       6       7       8
       9      10      11      12
      13      14      15      16
IDL> print, a[[1,3],[0,3]]
       2      16

--jh--


From jsw at cdc.noaa.gov Mon Nov 19 11:37:05 2001
From: jsw at cdc.noaa.gov (Jeff Whitaker)
Date: Mon Nov 19 11:37:05 2001
Subject: [Numpy-discussion] Heigenvalues is broken
In-Reply-To: 
Message-ID: 

On Sun, 18 Nov 2001, Travis Oliphant wrote:

> On Sunday 18 November 2001 09:40 am, Alessandro Mirone wrote:
> > Is it a problem of lapack 3.0 or of
> > LinearAlgebra.py?
> > (Eigenvalues should be (0,2))
> >
> > >>> a=array([[1,0],[0,1]])
> > >>> b=array([[0,1],[-1,0]])
> > >>> M=a+b*complex(0,1.0)
> > >>> Heigenvalues(M)
>
> I suspect it is your lapack. On an Athlon running Mandrake Linux with
> the lapack-3.0-9 package, I get:
>
> >>> a=array([[1,0],[0,1]])
> >>> b=array([[0,1],[-1,0]])
> >>> M=a+b*complex(0,1.0)
> >>> Heigenvalues(M)
> array([ 0.,  2.])

This is definitely a hardware/compiler dependent feature. I get the
"right" answer on Solaris (with the Forte compiler) but the same "wrong"
answer as Alessandro on MacOS X/gcc.
I've tried fiddling with compiler options on my OS X box, to no avail.

-Jeff

--
Jeffrey S. Whitaker         Phone : (303)497-6313
Meteorologist               FAX   : (303)497-6449
NOAA/OAR/CDC R/CDC1         Email : jsw at cdc.noaa.gov
325 Broadway                Web   : www.cdc.noaa.gov/~jsw
Boulder, CO, USA 80303-3328 Office: Skaggs Research Cntr 1D-124


From ransom at physics.mcgill.ca Mon Nov 19 11:47:02 2001
From: ransom at physics.mcgill.ca (Scott Ransom)
Date: Mon Nov 19 11:47:02 2001
Subject: [Numpy-discussion] Heigenvalues is broken
In-Reply-To: 
References: 
Message-ID: 

On November 19, 2001 02:36 pm, Jeff Whitaker wrote:
>
> This is definitely a hardware/compiler dependent feature. I get the
> "right" answer on Solaris (with the Forte compiler) but the same "wrong"
> answer as Alessandro on MacOS X/gcc. I've tried fiddling with compiler
> options on my OS X box, to no avail.

But seemingly it is even stranger than this. Here are my results from
Debian unstable using Lapack 3.0 on an Athlon system:

Python 2.1.1 (#1, Nov 11 2001, 18:19:24)
[GCC 2.95.4 20011006 (Debian prerelease)] on linux2
Type "copyright", "credits" or "license" for more information.
>>> from LinearAlgebra import *
>>> a=array([[1,0],[0,1]])
>>> b=array([[0,1],[-1,0]])
>>> M=a+b*complex(0,1.0)
>>> Heigenvalues(M)
array([ 0.,  2.])

Scott

> On Sun, 18 Nov 2001, Travis Oliphant wrote:
> > On Sunday 18 November 2001 09:40 am, Alessandro Mirone wrote:
> > > Is it a problem of lapack 3.0 or of
> > > LinearAlgebra.py?
> > > (Eigenvalues should be (0,2))
> > >
> > > >>> a=array([[1,0],[0,1]])
> > > >>> b=array([[0,1],[-1,0]])
> > > >>> M=a+b*complex(0,1.0)
> > > >>> Heigenvalues(M)
> >
> > I suspect it is your lapack. On an Athlon running Mandrake Linux with
> > the lapack-3.0-9 package, I get:
> >
> > >>> a=array([[1,0],[0,1]])
> > >>> b=array([[0,1],[-1,0]])
> > >>> M=a+b*complex(0,1.0)
> > >>> Heigenvalues(M)
> >
> > array([ 0.,  2.])
>
> This is definitely a hardware/compiler dependent feature. I get the
> "right" answer on Solaris (with the Forte compiler) but the same "wrong"
> answer as Alessandro on MacOS X/gcc.
>
> -Jeff
>
> --
> Jeffrey S. Whitaker         Phone : (303)497-6313
> Meteorologist               FAX   : (303)497-6449
> NOAA/OAR/CDC R/CDC1         Email : jsw at cdc.noaa.gov
> 325 Broadway                Web   : www.cdc.noaa.gov/~jsw
> Boulder, CO, USA 80303-3328 Office : Skaggs Research Cntr 1D-124
>
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion

--
Scott M. Ransom                   Address: McGill Univ. Physics Dept.
Phone: (514) 398-6492                      3600 University St., Rm 338
email: ransom at physics.mcgill.ca         Montreal, QC Canada H3A 2T8
GPG Fingerprint: 06A9 9553 78BE 16DB 407B FFCA 9BFA B6FF FFD3 2989


From Barrett at stsci.edu Mon Nov 19 14:12:02 2001
From: Barrett at stsci.edu (Paul Barrett)
Date: Mon Nov 19 14:12:02 2001
Subject: [Numpy-discussion] Re-implementation of Python Numerical arrays (Numeric) available for download
References: 
Message-ID: <3BF98336.9010500@STScI.Edu>

Perry Greenfield wrote:
>
> An early beta version is available on sourceforge as the
> package Numarray (http://sourceforge.net/projects/numpy/)
>
> Information on the goals, changes in user interface, open issues,
> and design can be found at http://aten.stsci.edu/numarray

6) Should array properties be accessible as public attributes instead
of through accessor methods?

We don't currently allow public array attributes to make the Python
code simpler and faster (otherwise we will be forced to use __setattr__
and such). This results in incompatibility with previous code that uses
such attributes.

I prefer the use of public attributes over accessor methods.

--
Paul Barrett, PhD          Space Telescope Science Institute
Phone: 410-338-4475        ESS/Science Software Group
FAX: 410-338-4767          Baltimore, MD 21218


From perry at stsci.edu Tue Nov 20 12:29:13 2001
From: perry at stsci.edu (Perry Greenfield)
Date: Tue Nov 20 12:29:13 2001
Subject: [Numpy-discussion] Re: Re-implementation of Python Numerical arrays (Numeric) available for download
In-Reply-To: 
Message-ID: 

> 6) Should array properties be accessible as public attributes
> instead of through accessor methods?
> >
> > We don't currently allow public array attributes to make
> > the Python code simpler and faster (otherwise we will
> > be forced to use __setattr__ and such). This results in
> > incompatibility with previous code that uses such attributes.
>
> I prefer the use of public attributes over accessor methods.
>
> --
> Paul Barrett, PhD          Space Telescope Science Institute

The issue of efficiency may not be a problem with Python 2.2 or later,
since it provides new mechanisms that avoid the need to use __setattr__
to solve this problem (e.g. __slots__, property, __get__, and __set__).
So it becomes more of an issue of which style people prefer, rather than
simplicity and speed of the code.

Perry


From chrishbarker at home.net Tue Nov 20 15:23:12 2001
From: chrishbarker at home.net (Chris Barker)
Date: Tue Nov 20 15:23:12 2001
Subject: [Numpy-discussion] Re: Re-implementation of Python Numerical arrays (Numeric) available for download
References: 
Message-ID: <3BFAEA19.3153B495@home.net>

Perry Greenfield wrote:

> > One major comment that isn't directly addressed on the web page is the
> > ease of writing new functions, I suppose Ufuncs, although I don't
> > usually care if they work on anything other than Arrays. I hope the new
> > system will make it easier to write new ones.

> Absolutely. We will provide examples of how to write new ufuncs. It should
> be very simple in one sense (requiring few lines of code) if our code
> generator machinery is used (but context is important here, so this
> is why examples or a template is extremely important). But it isn't
> particularly hard to do without the code generator. And such ufuncs
> will handle *all* the generality of arrays including slices, non-aligned
> arrays, byteswapped arrays, and type conversion. I'd like to provide
> examples of writing ufuncs within a few weeks (along with examples
> of other kinds of functions using the C-API as well).

This sounds great! The code generating machinery sounds very promising,
and examples are, of course, key. I found digging through the NumPy
source to figure out how to do things very treacherous. Making writing
Ufuncs easy will encourage a lot more C Ufuncs to be written, which
should help performance.

> > Also, I can't help wondering if this could leverage more existing code.
> > The blitz++ package being used by Eric Jones in the SciPy.compiler
> > project looks very promising. It's probably too late, but I'm wondering
> > what the reasons are for re-inventing such a general purpose wheel.

> I'm not sure which "wheel" you are talking about :-)

The wheel I'm talking about is multi-dimensional array objects...

> We certainly
> aren't trying to replicate what Eric Jones has done with the
> SciPy.compiler approach (which is very interesting in its own right).

I know, I just think using an existing set of C++ classes for
multiple-type multidimensional arrays would make sense, although I
imagine it is too late now!

> If the issue is why we are redoing Numeric:

Actually, I think I had a pretty good idea why you were working on this.

> 1) it has to be rewritten to be acceptable to Guido before it can be
> part of the Standard Library.
> 2) to add new types (e.g. unsigned) and representations (e.g., non-aligned,
> byteswapped, odd strides, etc). Using memory mapped data requires some
> of these.
> 3) to make it more memory efficient with large arrays.
> 4) to make it more generally extensible I'm particularly excited about 1) and 4) > > As a whole I have found that I would like the transition from Python to > > Compiled languages to be smoother. The standard answer to Python > > performance is to profile, and then re-write the computationally intensive > > portions in C. This would be a whole lot easier if Python used datatypes > > that are easy to use from C/C++ as well as Python. I hope NumPy2 can > > move in this direction. > > > What do you see as missing in numarray in that sense? Aside from UInt32 > I'm not aware of any missing type that is available on all platforms. > There is the issue of Float128 and such. Adding these is not hard. > The real issue is how to deal with the platforms that don't support them. I used poor wording. When I wrote "datatypes", I meant data types in a much higher order sense. Perhaps structures or classes would be a better term. What I mean is that it should be easy to use and manipulate the same multidimensional arrays from both Python and C/C++. In the current Numeric, most folks generate a contiguous array, and then just use the array->data pointer to get what is essentially a C array. That's fine if you are using it in a traditional C way, with fixed dimension, one datatype, etc. What I'm imagining is having an object in C or C++ that could be easily used as a multidimensional array. I'm thinking C++ would probably be necessary, and probably templates as well, which is why blitz++ looked promising. Of course, blitz++ only compiles with a few up-to-date compilers, so you'd never get it into the standard library that way! This could also lead the way to being able to compile NumPy code.... > I think it is pretty easy to install since it uses distutils. I agree, but from the newsgroup, it is clear that a lot of folks are very reluctant to use something that is not part of the standard library. > > > We estimate > > > that numarray is probably another order of magnitude worse, > > > i.e., that 20K element arrays are at half the asymptotic > > > speed. How much should this be improved? > > A lot. I use arrays smaller than that most of the time! > > > What is good enough? As fast as current Numeric? As fast as current Numeric would be "good enough" for me. It would be a shame to go backwards in performance! > (IDL does much > better than that for example). My personal benchmark is MATLAB, which I imagine is similar to IDL in performance. > 10 element arrays will never be > close to C speed in any array based language embedded in an > interpreted environment. Well, sure, I'm not expecting that > 100, maybe, but will be very hard. > 1000 should be possible with some work. I suppose MATLAB has it easier, as all arrays are doubles, and (until recently anyway) all variables were arrays, and all arrays were 2-d. NumPy is a lot more flexible than that. Is it the type and size checking that takes the time? > Another approach is to try to cast many of the functions as being > able to broadcast over repeated small arrays. After all, if one > is only doing a computation on one small array, it seems unlikely > that the overhead of Python will be objectionable. Only if you > have many such arrays to repeat calculations on, should it be > a problem (or am I wrong about that). You are probably right about that. > If these repeated calculations > can be "assembled" into a higher dimensionality array (which > I understand isn't always possible) and operated on in that sense, > the efficiency issue can be dealt with.
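A minimal sketch of the "assembling" idea in the quoted paragraph, assuming the repeated calculation is a small 2-D coordinate transform (all names and shapes here are hypothetical):

from math import cos, sin
from Numeric import array, reshape, arange, matrixmultiply

# 1000 separate 2-element problems assembled into one (1000, 2) array;
# a single Numeric call then replaces a Python loop over 1000 small arrays.
pts = reshape(arange(2000.0), (1000, 2))      # 1000 points, 2 coordinates each
theta = 0.1
rot = array([[cos(theta), -sin(theta)],
             [sin(theta),  cos(theta)]])
transformed = matrixmultiply(pts, rot)        # all 1000 points at once
shifted = transformed + array([10.0, -5.0])   # broadcast over the leading axis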
I do that when possible, but it's not always possible. > But I guess this can only > be seen with specific existing examples and programs. I would > be interested in seeing the kinds of applications you have now > to gauge what the most effective solution would be. One of the things I do a lot with are coordinates of points and polygons. Sets of points I can handle easily as an NX2 array, but polygons don't work so well, as each polygon has a different number of points, so I use a list of arrays, which I have to loop over. Each polygon can have from about 10 to thousands of points (mostly 10-20, however). One way I have dealt with this is to store a polygon set as a large array of all the points, and another array with the indexes of the start and end of each polygon. That way I can transform the coordinates of all the polygons in one operation. It works OK, but sometimes it is more useful to have them in a sequence. > As mentioned, > we tend to deal with large data sets and so I don't think we have > a lot of such examples ourselves. I know large datasets were one of your driving factors, but I really don't want to make performance on smaller datasets secondary. I hope I'll get a chance to play with it soon.... -Chris -- Christopher Barker, Ph.D. ChrisHBarker at home.net --- --- --- http://members.home.net/barkerlohmann ---@@ -----@@ -----@@ ------@@@ ------@@@ ------@@@ Oil Spill Modeling ------ @ ------ @ ------ @ Water Resources Engineering ------- --------- -------- Coastal and Fluvial Hydrodynamics -------------------------------------- ------------------------------------------------------------------------
From nwagner at mecha.uni-stuttgart.de Thu Nov 22 02:43:06 2001 From: nwagner at mecha.uni-stuttgart.de (Nils Wagner) Date: Thu Nov 22 02:43:06 2001 Subject: [Numpy-discussion] Numpy for FORTRAN users Message-ID: <3BFCE508.E6C365DF@mecha.uni-stuttgart.de> Hi, Currently users must be aware of the fact that multi-dimensional arrays are stored differently in Python and Fortran. Is there any progress so that users do not need to worry about this rather confusing and technical detail? Nils
From martin.wiechert at gmx.de Thu Nov 22 05:23:02 2001 From: martin.wiechert at gmx.de (Martin Wiechert) Date: Thu Nov 22 05:23:02 2001 Subject: [Numpy-discussion] Numpy2 and GSL Message-ID: Hi! Just an uneducated question. Are there any plans to wrap GSL for Numpy2? I did not actually try it (It's not Python ;-)), but it looks clean and powerful. Regards, Martin.
From hinsen at cnrs-orleans.fr Thu Nov 22 06:29:02 2001 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Thu Nov 22 06:29:02 2001 Subject: [Numpy-discussion] Numpy2 and GSL In-Reply-To: References: Message-ID: Martin Wiechert writes: > Are there any plans to wrap GSL for Numpy2? > I did not actually try it (It's not Python ;-)), > but it looks clean and powerful. I have heard that several projects decided not to use it for legal reasons; GSL is GPL, not LGPL. Personally I don't see the problem for Python/NumPy, but then I am not a lawyer... And I haven't used GSL either, but it looks good from the description. Konrad.
-- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From edcjones at erols.com Thu Nov 22 17:30:10 2001 From: edcjones at erols.com (Edward C. Jones) Date: Thu Nov 22 17:30:10 2001 Subject: [Numpy-discussion] Numeric & changes in Python division Message-ID: <3BFDA742.5080109@erols.com> # Python 2.2b1, Numeric 20.2.0 from __future__ import division import Numeric arr = Numeric.ones((2,2), 'f') arr = arr/2.0 #Traceback (most recent call last): # File "bug.py", line 6, in ? #arr = arr/2.0 #TypeError: unsupported operand type(s) for / From paul at pfdubois.com Thu Nov 22 18:51:01 2001 From: paul at pfdubois.com (Paul F. Dubois) Date: Thu Nov 22 18:51:01 2001 Subject: [Numpy-discussion] Numeric & changes in Python division In-Reply-To: <3BFDA742.5080109@erols.com> Message-ID: <000201c173c9$606902c0$3d01a8c0@plstn1.sfba.home.com> You know what the doctor said: if it hurts when you do that, don't do that. Seriously, I have not the slightest idea what you're doing here. My project won't get to 2.2 until well into the new year. Especially if stuff like this has to be fixed. I haven't even read most of the 2.2 changes. I understand this is also an issue with CXX. Barry Scott runs CXX now since I am no longer in a job where I use C++. When he will get to this I don't know. I need to demote myself on the CXX website. You haven't seen any recent changes to Numpy, or comments from me on numarray, because I have a release to get out at my job. -----Original Message----- From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-discussion-admin at lists.sourceforge.net] On Behalf Of Edward C. Jones Sent: Thursday, November 22, 2001 5:33 PM To: numpy-discussion at lists.sourceforge.net Subject: [Numpy-discussion] Numeric & changes in Python division # Python 2.2b1, Numeric 20.2.0 from __future__ import division import Numeric arr = Numeric.ones((2,2), 'f') arr = arr/2.0 #Traceback (most recent call last): # File "bug.py", line 6, in ? #arr = arr/2.0 #TypeError: unsupported operand type(s) for / _______________________________________________ Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/numpy-discussion From siopis at umich.edu Fri Nov 23 20:59:01 2001 From: siopis at umich.edu (Christos Siopis ) Date: Fri Nov 23 20:59:01 2001 Subject: [Numpy-discussion] Meta: too many numerical libraries doing the same thing? In-Reply-To: Message-ID: [ This message got longer than i had initially thought, but these thoughts have been bugging me for so long that i cannot resist the temptation to push the send button! Apologies in advance to those not interested... ] On Mon, 26 Nov 2001, Martin Wiechert wrote: > Hi! > > Just an uneducated question. > Are there any plans to wrap GSL for Numpy2? > I did not actually try it (It's not Python ;-)), > but it looks clean and powerful. > > Regards, > Martin. I actually think that this question has come up before in this list, perhaps more than once. And i think it brings up a bigger issue, which is: to what extent is it useful for the numerical community to have multiple numerical libraries, and to what extent does this constitute a waste of resources? 
Numpy (Python), PDL (Perl), GSL (C), and a rather large number of other libraries usually have to re-implement the same old numerical algorithms, but offered under a different interface each time. However, there is such a big body of numerical algorithms out there that it's a daunting effort to integrate them into every language's numerical library (anyone want to implement LAPACK's functionality in Numpy?) The compromise that is usually made is to wrap one library around another. While this may be "better than nothing", it is usually not a pleasant situation as it leads to inconsistencies in the interface, inconsistencies in the error handling, difficulties in the installation, problems with licensing,... Since i have been a beneficiary rather than a contributor to the numerical open-source community, i feel somewhat hesitant to file this "complaint", but i really do think that there are relatively few people out there who are both willing and capable of building quality open-source numerical software, while there are too many algorithms to implement, so the community should be vigilant to minimize waste of resources! Don't take me wrong, i am not saying that Numpy, PDL, GSL & co. should be somehow "merged" --obviously, one needs different wrappers to call numerical routines from Python, Perl, C, C++ or Java. But there should be a way so that the actual *implementation* of the numerical algorithms is only done once and for all. So what i envision, in some sense, is a super-library of "all"/as many as possible numerical algorithms, which will present appropriate (but consistent) APIs for different programming languages, so that no matter what language i use, i can expect consistent interface, consistent numerical behavior, consistent error handling etc. Furthermore, different levels of access should allow the application developer to access low-level or high-level routines as needed (and could object orientation be efficiently built as a higher-level wrapper?) This way, the programmer won't have to worry whether the secant root finder that s/he is using handles exceptions well or how NaNs are treated. Perhaps most importantly, people would feel compelled to go into the pain of "translating" existing libraries such as LAPACK into this common framework, because they will know that this will benefit the entire community and won't go away when the next scripting language du jour eclipses their current favorite. Over time, this may lead to a truly precious resource for the numerical community. Now, i do realize that this may sound like a "holy grail" of numerical computing, that it is something which is very difficult, if not impossible to accomplish. It certainly does not seem like a project that the next ambitious programmer or lab group would want to embark into on a rainy day. Rather, it would require a number of important requirements and architectural decisions to be made first, and trade-offs considered. This would perhaps be best coordinated by the numerical community at large, perhaps under the auspices of some organization. But this would be time well-spent, for it would form the foundations on which a truly universal numerical library could be built. Experience gained from all the numerical projects to this day would obviously be invaluable in such an endeavor. 
I suspect that this list may not be the best place to discuss such a topic, but i think that some of the most active people in the field lurk here, and i would love to hear their thoughts and understand why i am wrong :) If there is a more appropriate forum to discuss such issues, i would be glad to be pointed to it --in which case, please disregard this posting! *************************************************************** / Christos Siopis | Tel : 734-764-3440 \ / Postdoctoral Research Fellow | \ / Department of Astronomy | FAX : 734-763-6317 \ / University of Michigan | \ / Ann Arbor, MI 48109-1090 | E-mail : siopis at umich.edu \ / U.S.A. _____________________| \ / / http://www.astro.lsa.umich.edu/People/siopis.html \ ***************************************************************
From jh at oobleck.astro.cornell.edu Sat Nov 24 19:14:02 2001 From: jh at oobleck.astro.cornell.edu (Joe Harrington) Date: Sat Nov 24 19:14:02 2001 Subject: [Numpy-discussion] Re: Meta: too many numerical libraries doing the same thing? In-Reply-To: (numpy-discussion-request@lists.sourceforge.net) References: Message-ID: <200111250313.fAP3DUL21168@oobleck.astro.cornell.edu> Yes, this issue has been raised here before. It was the main conclusion of Paul Barrett's and my BOF session at ADASS 5 years ago (see our report at http://oobleck.astro.cornell.edu/jh/ast/papers/idae96.ps). The main problems are that we scientists are too individualistic to get organized around a single library, too pushed by job pressures to commit much concentrated time to it ourselves, and too poor to pay the architects, coders, doc writers, testers, etc. to write it for us. Socially, we *want* to reinvent the wheel, because we want to be riding on our own wheels. Once we are riding reasonably well for our own needs, our interest and commitment vanishes. We're off to write the next paper. Following that conference, I took a poll on this list looking for help to implement the library. About half a dozen people responded that they could put in up to 10 hours a week, which in my experience isn't enough, once things get hard and attrition sets in. Nonetheless, Paul and I proposed to the NASA Astrophysics Data Analysis Program to hire some people to write it, but we were turned down. We proposed the idea to the head of the High Energy Astrophysics group at NASA Goddard, and he agreed -- as long as what we were really doing was writing software for his group's special needs. The frustrating thing is how many hundreds of astronomy projects hire people to do their 10% of this problem, and how unwilling they are to pool resources to do the general problem. A few of the volunteers in my query to this list have gone on to do SciPy, to their credit, but I don't see them moving in the direction we outlined. Still, they have the capacity to do it right in Python and compiled code written explicitly for Python. They won't solve the general problem, but they may solve the first problem, namely getting a data analysis environment that is OSS and as good as IDL et al. in terms of end-to-end functionality, completeness, and documentation. I like the notion that the present list is for designing and building the underlying language capabilities into Python, and for getting them standardized, tested, and included in the main Python distribution. It is also a good place for debating the merits of different implementations of particular functionality.
That leaves the job of building coherent end-user data analysis packages (which necessarily have to pick one routine to be called "fft", one device-independent graphics subsystem, etc.) to application groups like SciPy. There can be more than one of these, if that's necessary, but they should all use the same underlying numerical language capability. I hope that the application groups from several array-based OSS languages will someday get together and collaborate on an ueberlibrary of numerical and graphics routines (the latter being the real sticking point) that are easily wrapped by most languages. That seems backwards, but I think the social reality is that that's the way it is going to be, if it ever happens at all. --jh--
From paul at pfdubois.com Sat Nov 24 19:59:01 2001 From: paul at pfdubois.com (Paul F. Dubois) Date: Sat Nov 24 19:59:01 2001 Subject: [Numpy-discussion] Re: Meta: too many numerical libraries doing the same thing? In-Reply-To: <200111250313.fAP3DUL21168@oobleck.astro.cornell.edu> Message-ID: <000101c17565$12af2760$3d01a8c0@plstn1.sfba.home.com> There is more to this issue than meets the eye, both technically and historically. For numerical algorithms to be available independent of language, they would have to be packaged as components such as COM objects. While there is research in this field, nobody knows whether it can be done in a way that is efficient enough. For a given language like C, C++, Eiffel or Fortran used as the speed-demon base for wrapping up in Python, there are some difficult technical issues. Reusable numerical software needs context to operate and there is no decent way to supply the context in a non-object-oriented language. Geoff Furnish wrote a good paper about the issue for C++ showing the way to truly reusable libraries in that language, and recent improvements in Eiffel make it easier to do there now. In C or Fortran you simply can't do it. (Note that Eiffel or C++ versions of some NAG routines typically have methods with one or two arguments while the C or Fortran ones have 15 or more; a routine is not reusable if you have to understand that many arguments to try it. There are also important issues with regard to error handling and memory). The second issue is the algorithmic issue: most scientists do NOT know the right algorithms to use, and the ones they do use are often inferior. The good algorithms are for the most part in commercial libraries, and the numerical analysis literature, where they were written by numerical analysts. Often the coding from both sources is unavailable for free use, in the wrong language, and/or wretched. The commercial libraries also exist because some companies have requirements for fiduciary responsibility; in effect, they need a guarantor of the software to show that they have not carelessly depended on software of unknown quality. In short, computer scientists are not going to be able to write such a library without an army of numerical analysts familiar with the literature, and the numerical analysts aren't going to write it unless they are OO-experienced, which almost all of them aren't, so far. Most people when they discuss mathematical software think of leaves on the call tree. In fact the most useful mathematical software, in the sense that it incorporates the most expertise, is middleware such as ODE solvers, integrators, root finders, etc. The algorithm itself will have many controls, optional outputs, etc. This requires a library-wide design motif.
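The "context" point can be made concrete with a toy sketch (entirely hypothetical code, not from NAG or any real library): an object-oriented solver carries its controls as state, so each call stays small, where a Fortran-style interface would thread every control, workspace and flag through every call.

class RootFinder:
    def __init__(self, tol=1.0e-10, max_iter=100):
        self.tol = tol                # the "context" lives on the object...
        self.max_iter = max_iter
    def solve(self, f, a, b):         # ...so the call needs 3 arguments, not 15
        # plain bisection; assumes f(a) and f(b) bracket a root
        for i in range(self.max_iter):
            m = 0.5 * (a + b)
            if abs(f(m)) < self.tol or (b - a) < self.tol:
                return m
            if f(a) * f(m) <= 0:
                b = m
            else:
                a = m
        raise RuntimeError("no convergence")

finder = RootFinder(tol=1.0e-12)
root = finder.solve(lambda x: x*x - 2.0, 0.0, 2.0)   # ~1.41421356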
I thus feel there are perfectly good reasons not to expect such a library soon. The Python community could do a good OO-design using what is available (such as LAPACK) but we haven't -- all the contributions are functional.
From hinsen at cnrs-orleans.fr Sun Nov 25 04:45:02 2001 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Sun Nov 25 04:45:02 2001 Subject: [Numpy-discussion] Meta: too many numerical libraries doing the same thing? Message-ID: <200111251244.fAPCiIj01855@localhost.localdomain> "Christos Siopis " writes: > Don't take me wrong, i am not saying that Numpy, PDL, GSL & co. should be > somehow "merged" --obviously, one needs different wrappers to call > numerical routines from Python, Perl, C, C++ or Java. But there should be > a way so that the actual *implementation* of the numerical algorithms is > only done once and for all. I agree that sounds nice in theory. But even if it were technically feasible (which I doubt) given the language differences, it would be a development project that is simply too big for scientists to handle as a side job, even if they were willing (which again I doubt). My impression is that the organizational aspects of software development are often neglected. Some people are good programmers but can't work well in teams. Others can work in teams, but are not good coordinators. A big project requires at least one, if not several, people who are good scientists and programmers, have coordinator skills, and a job description that permits them to take up the task. Plus a larger number of people who are good scientists and programmers and can work in teams. Finally, all of these have to agree on languages, design principles, etc. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais -------------------------------------------------------------------------------
From tim.hochberg at ieee.org Sun Nov 25 10:50:02 2001 From: tim.hochberg at ieee.org (Tim Hochberg) Date: Sun Nov 25 10:50:02 2001 Subject: [Numpy-discussion] Re-implementation of Python Numerical arrays (Numeric) available for download References: <3BF98336.9010500@STScI.Edu> Message-ID: <01fd01c175e1$e6ae7990$87740918@cx781526b> From: "Paul Barrett" > Perry Greenfield wrote: > > > > An early beta version is available on sourceforge as the > > package Numarray (http://sourceforge.net/projects/numpy/) > > > > Information on the goals, changes in user interface, open issues, > > and design can be found at http://aten.stsci.edu/numarray > > > 6) Should array properties be accessible as public attributes > instead of through accessor methods? > > We don't currently allow public array attributes to make > the Python code simpler and faster (otherwise we will > be forced to use __setattr__ and such). This results in > incompatibility with previous code that uses such attributes. > > > I prefer the use of public attributes over accessor methods. As do I. As of Python 2.2, __getattr__/__setattr__ should not be required anyway: new style classes allow this to be done in a more pleasant way. (I'm still too fuzzy on the details to describe it coherently here though).
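A minimal sketch of the Python 2.2 mechanism being alluded to (hypothetical names): property exposes an array attribute with plain attribute syntax while still running accessor code, and without a global __setattr__ hook.

class Array(object):                  # new-style class required for property
    def __init__(self, shape):
        self._shape = tuple(shape)
    def _get_shape(self):
        return self._shape
    def _set_shape(self, value):
        self._shape = tuple(value)    # validation could go here
    shape = property(_get_shape, _set_shape)

a = Array((3, 4))
print a.shape        # attribute syntax, accessor semantics
a.shape = (4, 3)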
-tim
From nwagner at mecha.uni-stuttgart.de Mon Nov 26 01:55:03 2001 From: nwagner at mecha.uni-stuttgart.de (Nils Wagner) Date: Mon Nov 26 01:55:03 2001 Subject: [Numpy-discussion] Sort , Complex array Message-ID: <3C021F19.E0D869CB@mecha.uni-stuttgart.de> Hi, How can I sort an array of complex eigenvalues with respect to the imaginary part (in ascending order) in Numpy? All eigenvalues appear in complex conjugate pairs. Nils
From hinsen at cnrs-orleans.fr Mon Nov 26 02:46:02 2001 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Mon Nov 26 02:46:02 2001 Subject: [Numpy-discussion] Sort , Complex array In-Reply-To: <3C021F19.E0D869CB@mecha.uni-stuttgart.de> References: <3C021F19.E0D869CB@mecha.uni-stuttgart.de> Message-ID: Nils Wagner writes: > How can I sort an array of complex eigenvalues with respect to the > imaginary part > (in ascending order) in Numpy? > All eigenvalues appear in complex conjugate pairs. indices = argsort(eigenvalues.imag) eigenvalues = take(eigenvalues, indices) Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais -------------------------------------------------------------------------------
From gvermeul at labs.polycnrs-gre.fr Mon Nov 26 02:48:02 2001 From: gvermeul at labs.polycnrs-gre.fr (Gerard Vermeulen) Date: Mon Nov 26 02:48:02 2001 Subject: [Numpy-discussion] Sort , Complex array In-Reply-To: <3C021F19.E0D869CB@mecha.uni-stuttgart.de> References: <3C021F19.E0D869CB@mecha.uni-stuttgart.de> Message-ID: <01112611475600.19933@taco.polycnrs-gre.fr> On Monday 26 November 2001 11:53, Nils Wagner wrote: > Hi, > > How can I sort an array of complex eigenvalues with respect to the > imaginary part > (in ascending order) in Numpy? > All eigenvalues appear in complex conjugate pairs. > > Nils > I have solved that like this: >>> from Numeric import * >>> a = array([3+3j, 1+1j, 2+2j]) >>> b = a.imag >>> print take(a, argsort(b)) [ 1.+1.j 2.+2.j 3.+3.j] >>> Best regards -- Gerard
From nwagner at mecha.uni-stuttgart.de Mon Nov 26 07:03:06 2001 From: nwagner at mecha.uni-stuttgart.de (Nils Wagner) Date: Mon Nov 26 07:03:06 2001 Subject: [Numpy-discussion] Augmented matrix Message-ID: <3C026834.E56CE70@mecha.uni-stuttgart.de> Hi, How can I build an augmented matrix [A,b] in Numpy, where A is an m * n matrix (m>n) and b is an m*1 vector? Nils
From hinsen at cnrs-orleans.fr Mon Nov 26 08:34:02 2001 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Mon Nov 26 08:34:02 2001 Subject: [Numpy-discussion] Augmented matrix In-Reply-To: <3C026834.E56CE70@mecha.uni-stuttgart.de> References: <3C026834.E56CE70@mecha.uni-stuttgart.de> Message-ID: Nils Wagner writes: > How can I build an augmented matrix [A,b] in Numpy, > where A is an m * n matrix (m>n) and b is an m*1 vector? AB = concatenate((A, b[:, NewAxis]), -1) (assuming b is of rank 1) Konrad.
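An illustrative session for the recipe above, with arbitrarily chosen shapes (m=4, n=2):

>>> from Numeric import *
>>> A = reshape(arange(8.0), (4, 2))     # m=4, n=2
>>> b = array([1., 2., 3., 4.])          # rank-1, length m
>>> AB = concatenate((A, b[:, NewAxis]), -1)
>>> AB.shape
(4, 3)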
-- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais -------------------------------------------------------------------------------
From chrishbarker at home.net Mon Nov 26 10:30:02 2001 From: chrishbarker at home.net (Chris Barker) Date: Mon Nov 26 10:30:02 2001 Subject: [Numpy-discussion] Meta: too many numerical libraries doing the same thing? References: <200111251244.fAPCiIj01855@localhost.localdomain> Message-ID: <3C028E87.82C57211@home.net> Another factor that complicates things is open source philosophy and the licenses that go with it. The GSL project looks very promising, and the ultimate goals of that project appear to be to create a coherent and complete numerical library. This kind of thing NEEDS to be open source, and the GSL folks have chosen a license (GPL) that guarantees that it remains that way. That is a good thing. The license also makes it impossible to use the library in closed source projects, which is a deal killer for a lot of people, but it is also an important attribute for many folks that don't think there should be closed source projects at all. I believe that that will greatly stifle the potential of the project, but it fits with the philosophy of its creators. Personally I think the LGPL would have guaranteed the future openness of the source, and allowed a much greater user (and therefore contributor) base. BTW, IANAL either, but my reading of the GPL and Python's "GPL compatible" license is that GSL could be used with Python, but the result would have to be released under the GPL. That means it could not be embedded in a closed source project. As a rule, Python itself and most of the libraries I have seen for it (Numeric, wxPython, etc.) are released under licences that allow proprietary use, so we probably don't want to make Numeric or SciPy GPL. Too bad. On another note, it looks like the blitz++ library might be a good basis for a general Numerical library (and NumPy 3) as well. It does come with a flexible license. Any thoughts? -Chris -- Christopher Barker, Ph.D. ChrisHBarker at home.net --- --- --- http://members.home.net/barkerlohmann ---@@ -----@@ -----@@ ------@@@ ------@@@ ------@@@ Oil Spill Modeling ------ @ ------ @ ------ @ Water Resources Engineering ------- --------- -------- Coastal and Fluvial Hydrodynamics -------------------------------------- ------------------------------------------------------------------------
From hinsen at cnrs-orleans.fr Mon Nov 26 11:40:03 2001 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Mon Nov 26 11:40:03 2001 Subject: [Numpy-discussion] Meta: too many numerical libraries doing the same thing? References: <200111251244.fAPCiIj01855@localhost.localdomain> <3C028E87.82C57211@home.net> Message-ID: <200111261938.fAQJcmd01426@localhost.localdomain> Chris Barker writes: > On another note, it looks like the blitz++ library might be a good basis > for a general Numerical library (and NumPy 3) as well. It does come > with a flexible license. Any thoughts? I think the major question is whether we are willing to move to C++. And if we want to keep up any pretensions for Numeric becoming part of the Python core, this translates into whether Guido will accept C++ code in the Python core.
From a more pragmatic point of view, I wonder what the implications for efficiency would be. C++ compilers used to be very different in their optimization abilities, is that still the case? Even more pragmatically, is blitz++ reasonably efficient with g++? Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais -------------------------------------------------------------------------------
From chrishbarker at home.net Mon Nov 26 12:43:02 2001 From: chrishbarker at home.net (Chris Barker) Date: Mon Nov 26 12:43:02 2001 Subject: [Numpy-discussion] Meta: too many numerical libraries doing the same thing? References: <200111251244.fAPCiIj01855@localhost.localdomain> <3C028E87.82C57211@home.net> <200111261938.fAQJcmd01426@localhost.localdomain> Message-ID: <3C02ADB3.E314B8FB@home.net> Konrad Hinsen wrote: > Chris Barker writes: > > On another note, it looks like the blitz++ library might be a good basis > > for a general Numerical library (and NumPy 3) as well. It does come > > with a flexible license. Any thoughts? > I think the major question is whether we are willing to move to C++. > And if we want to keep up any pretensions for Numeric becoming part of > the Python core, this translates into whether Guido will accept C++ > code in the Python core. Actually, it's worse than that. Blitz++ makes heavy use of templates, and thus only works with compilers that support that well. The current Python core can compile under a very wide variety of compilers. I doubt that Guido would want to change that. Personally, I'm torn. I would very much like to see NumPy arrays become part of the core Python, but don't want to have to compromise what it could be to do that. Another idea is to extend the SciPy project to become a complete Python distribution, that would clearly include Numeric. One download, and you have all you need. > From a more pragmatic point of view, I wonder what the implications > for efficiency would be. C++ compilers used to be very different in their > optimization abilities, is that still the case? Even more > pragmatically, is blitz++ reasonably efficient with g++? I know g++ is supported (and I think it is their primary development platform). From the web site: Is there a way to soup up C++ so that we can keep the advanced language features but ditch the poor performance? This is the goal of the Blitz++ project: to develop techniques which will enable C++ to rival -- and in some cases even exceed -- the speed of Fortran for numerical computing, while preserving an object-oriented interface. The Blitz++ Numerical Library is being constructed as a testbed for these techniques. Recent benchmarks show C++ encroaching steadily on Fortran's high-performance monopoly, and for some benchmarks, C++ is even faster than Fortran! These results are being obtained not through better optimizing compilers, preprocessors, or language extensions, but through the use of template techniques. By using templates cleverly, optimizations such as loop fusion, unrolling, tiling, and algorithm specialization can be performed automatically at compile time. see: http://www.oonumerics.org/blitz/whatis.html for more info. I haven't messed with it myself, but from the web page, it seems the answer is yes, C++ can produce high performance code.
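The loop-fusion point is easy to see from the Python side (sizes hypothetical): every binary operation in Numeric allocates a full temporary, which is exactly what Blitz++-style expression templates fuse away at compile time. Numeric's optional ufunc output argument at least lets the storage be reused by hand:

from Numeric import zeros, add, Float

n = 100000
y = zeros(n, Float) + 1.0
z = zeros(n, Float) + 2.0
w = zeros(n, Float) + 3.0

x = y + z + w        # allocates a (y+z) temporary, then the result array

out = zeros(n, Float)
add(y, z, out)       # out = y + z, no extra temporary
add(out, w, out)     # accumulate in place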
-- Christopher Barker, Ph.D. ChrisHBarker at home.net --- --- --- http://members.home.net/barkerlohmann ---@@ -----@@ -----@@ ------@@@ ------@@@ ------@@@ Oil Spill Modeling ------ @ ------ @ ------ @ Water Resources Engineering ------- --------- -------- Coastal and Fluvial Hydrodynamics -------------------------------------- ------------------------------------------------------------------------
From hinsen at cnrs-orleans.fr Mon Nov 26 12:52:02 2001 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Mon Nov 26 12:52:02 2001 Subject: [Numpy-discussion] Meta: too many numerical libraries doing the same thing? In-Reply-To: <000301c176b7$26da0680$3d01a8c0@plstn1.sfba.home.com> (paul@pfdubois.com) References: <000301c176b7$26da0680$3d01a8c0@plstn1.sfba.home.com> Message-ID: <200111262050.fAQKoxB01580@localhost.localdomain> > We had some meetings to discuss using blitz and the truth is that as > wrapped by Python there is not much to gain. The efficiency of blitz > comes up when you do an array expression in C++. Then x = y + z + w + a > + b gets compiled into one loop with no temporary objects created. But That could still be of interest to extension module writers. And it seems conceivable to write some limited Python-C compiler for numerical expressions that generates extension modules, although this is more than a weekend project. Still, I agree that what most people care about is the speed of NumPy operations. Some lazy evaluation scheme might be more promising to eliminate the creation of intermediate objects, but that isn't exactly trivial either... Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais -------------------------------------------------------------------------------
From perry at stsci.edu Mon Nov 26 12:59:03 2001 From: perry at stsci.edu (Perry Greenfield) Date: Mon Nov 26 12:59:03 2001 Subject: [Numpy-discussion] Re: Re-implementation of Python Numerical arrays (Numeric) available for download In-Reply-To: Message-ID: > From: Chris Barker > To: Perry Greenfield , > numpy-discussion at lists.sourceforge.net > Subject: [Numpy-discussion] Re: Re-implementation of Python > Numerical arrays (Numeric) available > for download > > I used poor wording. When I wrote "datatypes", I meant data types in a > much higher order sense. Perhaps structures or classes would be a better > term. What I mean is that it should be easy to use and manipulate the > same multidimensional arrays from both Python and C/C++. In the current > Numeric, most folks generate a contiguous array, and then just use the > array->data pointer to get what is essentially a C array. That's fine if > you are using it in a traditional C way, with fixed dimension, one > datatype, etc. What I'm imagining is having an object in C or C++ that > could be easily used as a multidimensional array. I'm thinking C++ would > probably be necessary, and probably templates as well, which is why blitz++ > looked promising. Of course, blitz++ only compiles with a few up-to-date > compilers, so you'd never get it into the standard library that way! > Yes, that was an important issue (C++ and the Python Standard Library). And yes, it is not terribly convenient to access multi-dimensional arrays in C (of varying sizes).
We don't solve that problem in the way a C++ library could. But I suppose that some might say that C++ libraries may introduce their own, new problems. But coming up with the one solution to all scientific computing appears well beyond our grasp at the moment. If someone does see that solution, let us know! > I agree, but from the newsgroup, it is clear that a lot of folks are > very reluctant to use something that is not part of the standard > library. > We agree that getting into the standard library is important. > > > > We estimate > > > > that numarray is probably another order of magnitude worse, > > > > i.e., that 20K element arrays are at half the asymptotic > > > > speed. How much should this be improved? > > > > > > A lot. I use arrays smaller than that most of the time! > > > > > What is good enough? As fast as current Numeric? > > As fast as current Numeric would be "good enough" for me. It would be a > shame to go backwards in performance! > > > (IDL does much > > better than that for example). > > My personal benchmark is MATLAB, which I imagine is similar to IDL in > performance. > We'll see if we can match current performance (or at least present usable alternative approaches that are faster). > > 10 element arrays will never be > > close to C speed in any array based language embedded in an > > interpreted environment. > > Well, sure, I'm not expecting that > Good :-) > > 100, maybe, but will be very hard. > > 1000 should be possible with some work. > > I suppose MATLAB has it easier, as all arrays are doubles, and (until > recently anyway) all variables were arrays, and all arrays were 2-d. > NumPy is a lot more flexible than that. Is it the type and size checking > that takes the time? > Probably, but we haven't started serious benchmarking yet so I wouldn't put much stock in what I say now. > One of the things I do a lot with are coordinates of points and > polygons. Sets of points I can handle easily as an NX2 array, but > polygons don't work so well, as each polygon has a different number of > points, so I use a list of arrays, which I have to loop over. Each > polygon can have from about 10 to thousands of points (mostly 10-20, > however). One way I have dealt with this is to store a polygon set as a > large array of all the points, and another array with the indexes of the > start and end of each polygon. That way I can transform the coordinates > of all the polygons in one operation. It works OK, but sometimes it is > more useful to have them in a sequence. > This is a good example of an ensemble of variable sized arrays. > > As mentioned, > > we tend to deal with large data sets and so I don't think we have > > a lot of such examples ourselves. > > I know large datasets were one of your driving factors, but I really > don't want to make performance on smaller datasets secondary. > > -- > Christopher Barker, That's why we are asking, and it seems so far that there are enough of those that do care about small arrays to spend the effort to significantly improve the performance. Perry
From chrishbarker at home.net Mon Nov 26 13:03:02 2001 From: chrishbarker at home.net (Chris Barker) Date: Mon Nov 26 13:03:02 2001 Subject: [Numpy-discussion] Meta: too many numerical libraries doing the same thing? References: <000301c176b7$26da0680$3d01a8c0@plstn1.sfba.home.com> Message-ID: <3C02B298.E1F0E661@home.net> "Paul F. Dubois" wrote: > We had some meetings to discuss using blitz and the truth is that as > wrapped by Python there is not much to gain.
The efficiency of blitz > comes up when you do an array expression in C++. Then x = y + z + w + a > + b gets compiled into one loop with no temporary objects created. But > this trick is possible because you can bind the assignment. In python > you cannot bind the assignment so you cannot do a lazy evaluation of the > operations, unless you are willing to go with some sort of function call > like x = evaluate(y + z + w). Immediate evaluation means creating > temporaries, and performance is dead. > > The only gain then would be when you passed a Python-wrapped blitz array > back to C++ and did a bunch of operations there. Personally, I think this could be a big gain. At the moment, if you don't get the performance you need with NumPy, you have to write some of your code in C, and using the Numeric and Python C API is a whole lot of work, particularly if you want your function to work on non-contiguous arrays and/or arrays of any type. I don't know much C++, and I have no idea if Blitz++ fits this bill, but it seemed to me that using an object oriented framework that could take care of reference counting, and allow you to work with generic arrays, and index them naturally, etc, would be a great improvement, even if the performance was the same as the current C API. Perhaps NumPy2 has accomplished that, it sounds like it is a step in the right direction, at least. In a sentence: the most important reason for using a C++ object oriented multi-dimensional array package would be ease of use, not speed. It's nice to hear Blitz++ was considered, it was probably rejected for good reason, but it just looked very promising to me. -Chris -- Christopher Barker, Ph.D. ChrisHBarker at home.net --- --- --- http://members.home.net/barkerlohmann ---@@ -----@@ -----@@ ------@@@ ------@@@ ------@@@ Oil Spill Modeling ------ @ ------ @ ------ @ Water Resources Engineering ------- --------- -------- Coastal and Fluvial Hydrodynamics -------------------------------------- ------------------------------------------------------------------------
From oliphant at ee.byu.edu Mon Nov 26 13:24:11 2001 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Nov 26 13:24:11 2001 Subject: [Numpy-discussion] Meta: too many numerical libraries doing the same thing? In-Reply-To: <3C02B298.E1F0E661@home.net> Message-ID: > In a sentence: the most important reason for using a C++ object oriented > multi-dimensional array package would be ease of use, not speed. > > It's nice to hear Blitz++ was considered, it was probably rejected for > good reason, but it just looked very promising to me. I believe that Eric's "compiler" module included in SciPy uses Blitz++ to optimize Numeric expressions. You have others who also share your admiration of Blitz++ -Travis
From chrishbarker at home.net Mon Nov 26 15:31:05 2001 From: chrishbarker at home.net (Chris Barker) Date: Mon Nov 26 15:31:05 2001 Subject: [Numpy-discussion] Meta: too many numerical libraries doing the same thing? References: Message-ID: <3C02D510.E7454CCA@home.net> Travis Oliphant wrote: > I believe that Eric's "compiler" module included in SciPy uses Blitz++ to > optimize Numeric expressions. You have others who also share your > admiration of Blitz++ Yes, it does. That's where I heard about it. That also brings up a good point. Paul mentioned that using something like Blitz++ would only help performance if you could pass it an entire expression, like: x = a+b+c+d.
That is exactly what Eric's compiler module does, and it would sure be easier if NumPy already used Blitz++! In fact, I suppose Eric's compiler is a start towards a tool that could compile an entire NumPy function or module. I'd love to be able to just do that (with some tweaking perhaps) rather than having to code it all by hand. My fantasies continue... -Chris -- Christopher Barker, Ph.D. ChrisHBarker at home.net --- --- --- http://members.home.net/barkerlohmann ---@@ -----@@ -----@@ ------@@@ ------@@@ ------@@@ Oil Spill Modeling ------ @ ------ @ ------ @ Water Resources Engineering ------- --------- -------- Coastal and Fluvial Hydrodynamics -------------------------------------- ------------------------------------------------------------------------
From jochen at jochen-kuepper.de Mon Nov 26 16:34:01 2001 From: jochen at jochen-kuepper.de (Jochen =?iso-8859-1?q?K=FCpper?=) Date: Mon Nov 26 16:34:01 2001 Subject: [Numpy-discussion] Re: Numpy2 and GSL In-Reply-To: References: Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Mon, 26 Nov 2001 08:21:40 +0100 Martin Wiechert wrote: Martin> Are there any plans to wrap GSL for Numpy2? Martin> I did not actually try it (It's not Python ;-)), Martin> but it looks clean and powerful. There is actually a project to wrap gsl for python: http://pygsl.sourceforge.net/ It only provides wrappers for the special functions, but more is to come. (Hopefully Achim will put the cvs on sf soon.) Yes, I agree, PyGSL should be fully integrated with Numpy2, but it should probably also remain a separate project -- as Numpy should stay a base layer for all kinds of numerical stuff and hopefully make it into core python at some point (my personal wish, no more, AFAICT!). I think when PyGSL will fully go to SF (or anything similar) more people would start contributing and we should have a fine general numerical algorithms library for python soon! Greetings, Jochen - -- Einigkeit und Recht und Freiheit http://www.Jochen-Kuepper.de Liberté, Égalité, Fraternité GnuPG key: 44BCCD8E Sex, drugs and rock-n-roll -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.6 (GNU/Linux) Comment: Processed by Mailcrypt and GnuPG iD8DBQE8At88iJ/aUUS8zY4RAikdAJ9184yaCSH+GtkDz2mLVlrSh7mjEQCdGSqA 2uhmBKRCFBb9eeq3gmmn9/Q= =64gm -----END PGP SIGNATURE-----
From europax at home.com Mon Nov 26 17:36:16 2001 From: europax at home.com (Rob) Date: Mon Nov 26 17:36:16 2001 Subject: [Numpy-discussion] Meta: too many numerical libraries doing the same thing? References: <200111251244.fAPCiIj01855@localhost.localdomain> <3C028E87.82C57211@home.net> <200111261938.fAQJcmd01426@localhost.localdomain> <3C02ADB3.E314B8FB@home.net> Message-ID: <3C02ED76.F02F17D8@home.com> I'm currently testing the SciPy Blitz++ features with FDTD. Should have some comparisons soon. Right now my statements are compiling, but not giving the right answers :( I think they might have it fixed soon. Rob. Chris Barker wrote: > > Konrad Hinsen wrote: > > Chris Barker writes: > > > On another note, it looks like the blitz++ library might be a good basis > > > for a general Numerical library (and NumPy 3) as well. It does come > > > with a flexible license. Any thoughts? > > > I think the major question is whether we are willing to move to C++. > > And if we want to keep up any pretensions for Numeric becoming part of > > the Python core, this translates into whether Guido will accept C++ > > code in the Python core. > > Actually, it's worse than that.
Blitz++ makes heavy use of templates, > and thus only works with compilers that support that well. The current > Python core can compile under a very wide variety of compilers. I doubt > that Guido would want to change that. > > Personally, I'm torn. I would very much like to see NumPy arrays become > part of the core Python, but don't want to have to compromise what it > could be to do that. Another idea is to extend the SciPy project to > become a complete Python distribution, that would clearly include > Numeric. One download, and you have all you need. > > > From a more pragmatic point of view, I wonder what the implications > > for efficiency would be. C++ compilers used to be very different in their > > optimization abilities, is that still the case? Even more > > pragmatically, is blitz++ reasonably efficient with g++? > > I know g++ is supported (and I think it is their primary development > platform). From the web site: > > Is there a way to soup up C++ so that we can keep the advanced language > features but ditch the poor performance? This is the goal of the > Blitz++ project: to develop techniques which will enable C++ to rival -- > and in some cases even exceed -- the speed of Fortran for numerical > computing, while preserving an object-oriented interface. The Blitz++ > Numerical Library is being constructed as a testbed for these > techniques. > > Recent benchmarks show C++ encroaching steadily on Fortran's > high-performance monopoly, and for some benchmarks, C++ is even faster > than Fortran! These results are being obtained not through better > optimizing compilers, preprocessors, or language extensions, but through > the > use of template techniques. By using templates cleverly, optimizations > such as loop fusion, unrolling, tiling, and algorithm specialization can > be > performed automatically at compile time. > > see: http://www.oonumerics.org/blitz/whatis.html for more info. > > I haven't messed with it myself, but from the web page, it seems the > answer is yes, C++ can produce high performance code. > > -- > Christopher Barker, > Ph.D. > ChrisHBarker at home.net --- --- --- > http://members.home.net/barkerlohmann ---@@ -----@@ -----@@ > ------@@@ ------@@@ ------@@@ > Oil Spill Modeling ------ @ ------ @ ------ @ > Water Resources Engineering ------- --------- -------- > Coastal and Fluvial Hydrodynamics -------------------------------------- > ------------------------------------------------------------------------ > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion -- The Numeric Python EM Project www.members.home.net/europax
From Achim.Gaedke at uni-koeln.de Tue Nov 27 00:20:02 2001 From: Achim.Gaedke at uni-koeln.de (Achim Gaedke) Date: Tue Nov 27 00:20:02 2001 Subject: [Numpy-discussion] Re: Numpy2 and GSL References: Message-ID: <3C034BFA.FBB64E94@uni-koeln.de> Ok, there is a clear need for the facility of easy contribution. Please be patient until Friday, December 7th. Then I have time to let it happen. It is right that the official site for this project is at pygsl.sourceforge.net (Brian Gough, can you change the link on the gsl homepage, thanks :-) ) But I will show some discussion points that must be clear before a cvs release: - Is the file and directory structure fully expandable, can several persons work in parallel? - Should classes be created with excellent working objects or should it be a 1:1 wrapper?
- should there be one interface dynamic library or more than one? - Is there another way except that of the GPL (personally preferred, but other opinions should be discussed before the contribution of source) Some questions of minor weight: - Is the tuple return value for (value,error) ok in the sf module? - Test cases are needed These questions are the reason why I do not simply "copy" my code into cvs. Jochen Küpper wrote: > > It only provides wrappers for the special functions, but more is to > come. (Hopefully Achim will put the cvs on sf soon.) > > Yes, I agree, PyGSL should be fully integrated with Numpy2, but it > should probably also remain a separate project -- as Numpy should stay > a base layer for all kinds of numerical stuff and hopefully make it > into core python at some point (my personal wish, no more, AFAICT!). > > I think when PyGSL will fully go to SF (or anything similar) more > people would start contributing and we should have a fine general > numerical algorithms library for python soon! > I agree with Jochen and I'd like to move to the core of Python too. But this is far away and I hate monolithic distributions. If there is the need to discuss separately about PyGSL we can do that here or at the gsl-discuss list mailto:gsl-discuss at sources.redhat.com . But there is also the possibility of a mailing list at pygsl.sourceforge.net . Please let me know.
From neelk at cswcasa.com Tue Nov 27 05:52:05 2001 From: neelk at cswcasa.com (Krishnaswami, Neel) Date: Tue Nov 27 05:52:05 2001 Subject: [Numpy-discussion] Re: Re-implementation of Python Numerical arrays (Numeric) available for download Message-ID: Perry Greenfield [mailto:perry at stsci.edu] wrote: > > > > I know large datasets were one of your driving factors, but I really > > don't want to make performance on smaller datasets secondary. > > That's why we are asking, and it seems so far that there are enough > of those that do care about small arrays to spend the effort to > significantly improve the performance. Well, here's my application. I do data mining work, and one of the techniques I want to use Numpy for is to implement robust regression algorithms like least-trimmed-squares. Now for a k-variable regression, the best-of-breed algorithm for this involves taking hundreds of thousands of k-element samples and calculating the fitting hyperplane through them. Small matrix performance is thus something this program lives or dies by, and right now it seems like 'dies' is the right measure -- it is about 10x slower than the Gauss program that does the same thing. :( When I profiled it, it seemed like Numpy was spending almost all of its time in _castCopyAndTranspose. Switching to the Intel MKL LAPACK had no performance effect, but changing _castCopyAndTranspose into a C function was a 20% speed increase. If Numpy2 is even slower on small matrices I'd have to give up using it, and that's a shame: it's a *much* nicer environment than Gauss is. -- Neel Krishnaswami neelk at cswcasa.com
From hungjunglu at yahoo.com Tue Nov 27 08:28:06 2001 From: hungjunglu at yahoo.com (Hung Jung Lu) Date: Tue Nov 27 08:28:06 2001 Subject: [Numpy-discussion] Hardware for Monte Carlo simulation In-Reply-To: Message-ID: <20011127162705.40865.qmail@web12604.mail.yahoo.com> Hi, Thanks to Jon Saenz and Chris Barker for helping out with fast linear algebra and statistical distribution routines. Again, I have a tangential question.
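A sketch of the kind of harness that surfaces a hotspot like the _castCopyAndTranspose report above; the workload and names below are hypothetical, only the standard profile/pstats modules and the Numeric-era calls are assumed:

import profile, pstats
import RandomArray, LinearAlgebra

def run(trials=2000, k=5):
    # many small k-variable solves -- the regime the report is about
    for i in range(trials):
        A = RandomArray.random((k, k))
        b = RandomArray.random((k,))
        LinearAlgebra.solve_linear_equations(A, b)

profile.run('run()', 'small_mats.prof')
stats = pstats.Stats('small_mats.prof')
stats.sort_stats('cumulative').print_stats(10)   # where the time actually goes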
I am hitting the physical limit of the CPU (meaning things have been optimized down to assembly level), in order to achieve even higher performance, the only way to go is hardware. Is there any recommendation for fast machines at the price range of a few thousand dollars? (I cannot afford supercomputers or connection machines.) My purpose is to run Monte Carlo simulation. This means that a lot of scenarios can be run in parallel fashion. Of course I can just use regular cheap Pentium boxes... but they are kind of bulky, and I don't need any of the video, audio, USB features (I think 10 machines at 1GHz each would be the size of calculation power I need, or equivalently, a single machine at an equivalent 10GHz. Heck, if there are some specialized racks/boxes, I can wire the motherboards myself.) I am wondering what you people do for heavy number crunching? Are there any cheap yet specialized machines? What about machines with dual processor? I would imagine a lot of people in the number crunching world run into my situation, and since the number crunching machines don't require much beyond a motherboard and a small hard-drive, maybe there are already some cheap solutions out there. thanks! Hung Jung __________________________________________________ Do You Yahoo!? Yahoo! GeoCities - quick and easy web site hosting, just $8.95/month. http://geocities.yahoo.com/ps/info1 From rossini at blindglobe.net Tue Nov 27 09:44:02 2001 From: rossini at blindglobe.net (A.J. Rossini) Date: Tue Nov 27 09:44:02 2001 Subject: [Numpy-discussion] Hardware for Monte Carlo simulation In-Reply-To: <20011127162705.40865.qmail@web12604.mail.yahoo.com> References: <20011127162705.40865.qmail@web12604.mail.yahoo.com> Message-ID: <87vgfwdsao.fsf@jeeves.blindglobe.net> >>>>> "HJL" == Hung Jung Lu writes: HJL> Again, I have a tangential question. I am hitting the HJL> physical limit of the CPU (meaning things have been optimized HJL> down to assembly level), in order to achieve even higher HJL> performance, the only way to go is hardware. HJL> Is there any recommendation for fast machines at the price HJL> range of a few thousand dollars? (I cannot afford HJL> supercomputers or connection machines.) My purpose is to run HJL> Monte Carlo simulation. This means that a lot of scenarios HJL> can be run in parallel fashion. Of course I can just use HJL> regular cheap Pentium boxes... but they are kind of bulky, HJL> and I don't need any of the video, audio, USB features (I HJL> think 10 machines at 1GHz each would be the size of HJL> calculation power I need, or equivalently, a single machine HJL> at an equivalent 10GHz. Heck, if there are some specialized HJL> racks/boxes, I can wire the motherboards myself.) I am HJL> wondering what you people do for heavy number crunching? Are HJL> there any cheap yet specialized machines? What about machines HJL> with dual processor? I would imagine a lot of people in the HJL> number crunching world run into my situation, and since the HJL> number crunching machines don't require much beyond a HJL> motherboard and a small hard-drive, maybe there are already HJL> some cheap solutions out there. The usual way is to build some "blackboxes", i.e. mobo/cpu/memory/NIC, diskless or nearly diskless (you don't want to maintain machines :-). Connect them using 100bT or faster networks (though 100bT should be fine). Do such things exist? Sort of -- they tend to be more expensive than building them yourself, but if you've got a reliable local supplier, they can build them fairly cheaply for you. 
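A sketch of why this workload splits so cleanly across cheap boxes (the harness is hypothetical): each machine runs an independent chunk with its own seed, and the partial results are simply averaged afterwards, so nothing needs to communicate mid-run.

import RandomArray
from Numeric import sum

def chunk(n, seed):
    RandomArray.seed(seed, seed + 1)   # a distinct stream per machine
    x = RandomArray.random((n,))
    return sum(x * x) / n              # toy payoff: estimate E[x^2] = 1/3

# with 10 boxes, give each a different seed and average the partials
partials = [chunk(100000, s) for s in range(1, 11)]
estimate = sum(partials) / len(partials)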
I'd go with single or dual Athlons, myself :-). If power and maintenance are an issue, duals, and if not, maybe singles. We use MOSIX (www.mosix.org) for transparent load balancing between Linux machines, and it could be used on the machines I described (using a floppy or CD to boot). The next question is whether some form of parallel RNG will help. The answer is "maybe". I worked with a student who evaluated coupled chains, and we couldn't do too much better. And then, after that, there is the question of how you want to post-process the results. If you want to automate the whole thing (and it isn't clear that it would be worth it, but...), you could use PyPVM to front-end the sub-processes distributed on the network, load-balanced at the system level by MOSIX. Now for the problems -- MOSIX seems to have difficulties with Python. Severe difficulties. I don't know if this still holds true for recent MOSIX releases. (note that I use R (www.r-project.org) for most of my simulation work these days, but am looking at Python for stat analyses, of which MCMC tools are of interest). best, -tony -- A.J. Rossini Rsrch. Asst. Prof. of Biostatistics U. of Washington Biostatistics rossini at u.washington.edu FHCRC/SCHARP/HIV Vaccine Trials Net rossini at scharp.org -------------- http://software.biostat.washington.edu/ -------------- FHCRC: M-W: 206-667-7025 (fax=4812)|Voicemail is pretty sketchy/use Email UW: T-Th: 206-543-1044 (fax=3286)|Change last 4 digits of phone to FAX Rosen: (Mullins' Lab) Fridays, and I'm unreachable except by email. From chrishbarker at home.net Tue Nov 27 10:28:01 2001 From: chrishbarker at home.net (Chris Barker) Date: Tue Nov 27 10:28:01 2001 Subject: [Numpy-discussion] Hardware for Monte Carlo simulation References: <20011127162705.40865.qmail@web12604.mail.yahoo.com> Message-ID: <3C03DF8D.3725E2A2@home.net> Hung Jung Lu wrote: > Is there any recommendation for fast machines in the > price range of a few thousand dollars? (I cannot > afford supercomputers or connection machines.) My > purpose is to run Monte Carlo simulation. This means > that a lot of scenarios can be run in parallel. > Of course I can just use regular cheap > Pentium boxes... but they are kind of bulky, and I > don't need any of the video, audio, USB features (I I've been looking into setting up a system to do similar work, and it looks to me like the best bang for the buck right now is dual Athlon systems. If space is an important consideration, you can get dual Athlon 1U rack-mount systems for less than $2000. I'm pretty sure the only dual Athlon board currently available (Tyan K7 Thunder) has on-board video, Ethernet and SCSI, which means it costs a little more than it could, but these systems are still a pretty good deal if you get one without a hard drive (or a very cheap one). I just did a quick web search, and EPoX is supposed to be coming out with a dual board as well, so there may be cheaper options soon. -Chris -- Christopher Barker, Ph.D.
ChrisHBarker at home.net --- --- --- http://members.home.net/barkerlohmann ---@@ -----@@ -----@@ ------@@@ ------@@@ ------@@@ Oil Spill Modeling ------ @ ------ @ ------ @ Water Resources Engineering ------- --------- -------- Coastal and Fluvial Hydrodynamics -------------------------------------- ------------------------------------------------------------------------ From wsryu at fas.harvard.edu Tue Nov 27 15:52:04 2001 From: wsryu at fas.harvard.edu (William Ryu) Date: Tue Nov 27 15:52:04 2001 Subject: [Numpy-discussion] Hardware for Monte Carlo simulation In-Reply-To: <3C03DF8D.3725E2A2@home.net> References: <20011127162705.40865.qmail@web12604.mail.yahoo.com> Message-ID: <5.1.0.14.2.20011127184457.00aa3850@pop.fas.harvard.edu> At 10:46 AM 11/27/2001 -0800, Chris Barker wrote: >Hung Jung Lu wrote: > > Is there any recommendation for fast machines in the > > price range of a few thousand dollars? (I cannot > > afford supercomputers or connection machines.) My > > purpose is to run Monte Carlo simulation. This means > > that a lot of scenarios can be run in parallel. > > Of course I can just use regular cheap > > Pentium boxes... but they are kind of bulky, and I > > don't need any of the video, audio, USB features (I > >I've been looking into setting up a system to do similar work, and it >looks to me like the best bang for the buck right now is dual Athlon >systems. If space is an important consideration, you can get dual Athlon >1U rack-mount systems for less than $2000. I'm pretty sure the only dual >Athlon board currently available (Tyan K7 Thunder) has on-board video, >Ethernet and SCSI, which means it costs a little more than it could, but >these systems are still a pretty good deal if you get one without a hard >drive (or a very cheap one). I just did a quick web search, and EPoX is >supposed to be coming out with a dual board as well, so there may be >cheaper options soon. > >-Chris There is a cheaper dual-CPU Tyan board which uses the same motherboard chipset. It's the Tyan Tiger-MP S2460, which doesn't have SCSI, onboard video, or Ethernet, but is half the price (around $200). -willryu From eric at enthought.com Tue Nov 27 16:16:02 2001 From: eric at enthought.com (eric) Date: Tue Nov 27 16:16:02 2001 Subject: [Numpy-discussion] Meta: too many numerical libraries doing the same Message-ID: <051001c17799$8bfa68b0$777ba8c0@ericlaptop> Hey group, Blitz++ is very cool, but I'm not sure it would make a very good underpinning for reimplementing Numeric. There are 2 (well maybe 3) main points. 1. The first issue deals with how you declare arrays in Blitz++: Array<float,3> A(N,N,N); The big deal here is that the dimensionality of Array is a template parameter, not a constructor parameter. In other words, 2D arrays are effectively a different type than 3D arrays. Numeric, on the other hand, represents arrays of all dimensions with a single class/type. For Python, this makes the most sense. I think you could finagle some way of getting blitz to work, but I'm not sure it would be the desired elegant solution. I've also tinkered with building a simple C++ templated (non-blitz) implementation of Numeric for kicks, but kept coming back to using the dreaded void* to store the data arrays. I still haven't completely given up on a templated solution, but it wasn't as obvious as I thought it would be. 2. Compiling Blitz++ is slooooow. scipy.compiler spits out 200-300 line extension modules at the most.
Depending on how complicated the expressions are, it can take 0.5-1.5 minutes to compile a single extension function on an 850 MHz PIII. I can't imagine how long it would take to compile Numeric arrays for 1 through 11 dimensions (the most Blitz supports, as I remember) for all the different data types with 100s of extension functions. The cost wouldn't be linear because you do pay a one-time hit for some of the template instantiation. Also, I've heard gcc 3.0 might be better. Still, it'd be a painful development process. 3. Portability. This comes at two levels. The first is that Blitz++ has heavy-duty requirements of the compiler. gcc works fine, which is a huge plus, but a lot of other compilers don't. MSVC is the most notable of these because it is so heavily used on Windows. The second level is the portability of C++ extension modules in general. I've run into this on Windows, but I think it is an issue pretty much everywhere. For example, MSVC- and GCC-compiled C extension libraries can call each other on Windows because they are binary compatible. C++ classes are _not_ binary compatible. This has come up for me with wxPython. The standard version that Robin Dunn distributes is compiled with MSVC. If you build a small extension with gcc that makes wxPython calls, it'll link just fine, but seg-faults during execution. Does anyone know if the same sorta thing is true on the Unices? If it is, and Numeric was written in C++, then you'd have to compile extension modules that use Numeric arrays with the same compiler that was used to compile Numeric. This can lead to all sorts of hassles, and it has made me lean back towards C as the preferred language for something as fundamental as Numeric. (Note that I do like C++ for modules that don't really define an API called by other modules). Ok, so maybe there's a 4th point. Paul D. pointed out that blitz isn't much of a win unless you have lazy evaluation (which scipy.compiler already provides). I also think improved speed _isn't_ the biggest goal of a reimplementation (although it can't be sacrificed either). I'm more excited about a code base that more people can comprehend. Perry G. et al's mixed Python/C implementation with the code generators is a very good idea and a step in this direction. I hope the speed issues for small arrays can be solved. I also hope the memory-mapped aspect doesn't complicate the code base much. see ya, eric From hinsen at cnrs-orleans.fr Wed Nov 28 00:09:03 2001 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Wed Nov 28 00:09:03 2001 Subject: [Numpy-discussion] Meta: too many numerical libraries doing the same Message-ID: <200111280808.fAS889g08217@localhost.localdomain> "eric" writes: > The standard version that Robin Dunn distributes is compiled with MSVC. If > you build a small > extension with gcc that makes wxPython calls, it'll link just fine, but > seg-faults during execution. > Does anyone know if the same sorta thing is true on the Unices? If it is, > and Numeric was written in C++ then you'd have to compile extension modules > that use Numeric arrays with the same compiler that was used to compile > Numeric. This can lead to all sorts of hassles, and it has made me lean If you rely on dynamic linking for cross-module calls, you'd have the same problem with Unix, as different compilers use different name-mangling schemes. One way around this would be to limit cross-module calls to C functions compiled with "C" linking.
Better yet, don't rely on dynamic linking at all and export a module's C API via a Python CObject, as described in the extension manual, and declare all symbols as static (except for the module initialization function of course). In my experience that is the only method that works on all platforms, with all compilers. Of course this also assumes that interfaces are at the C level. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From sag at hydrosphere.com Wed Nov 28 09:02:05 2001 From: sag at hydrosphere.com (Sue Giller) Date: Wed Nov 28 09:02:05 2001 Subject: [Numpy-discussion] Using Reduce with Multi-dimensional Masked array Message-ID: <20011128170140921.AAA253@mail.climatedata.com@SUEW2000> I posted the following inquiry to python-list at python.org earlier this week, but got no responses, so I thought I'd try a more focused group. I assume the MA module falls under the NumPy area. I am using 2 (and more) dimensional masked arrays with some numeric data, and using the reduce functionality on the arrays. I use the masking because some of the values in the arrays are 'missing' and should not be included in the results of the reduction. For example, assume a 2 x 5 array, with masked values for the 4th entry for both of the 2nd dimension cells. If I want to sum along the 2nd dimension, I would expect to get a 'missing' value for the 4th entry because both of the entries for the sum are 'missing'. Instead, I get 0, which might be a valid number in my data space, and the returned 1-dimensional array has no mask associated with it. Is this expected behavior for masked arrays, or a bug, or am I misusing the mask concept? Does anyone know how to get the reduction to produce a masked value? Example Code: >>> import MA >>> a = MA.array([[1,2,3,-99,5],[10,20,30,-99,50]]) >>> a [[ 1, 2, 3,-99, 5,] [ 10, 20, 30,-99, 50,]] >>> m = MA.masked_values(a, -99) >>> m array(data = [[ 1, 2, 3,-99, 5,] [ 10, 20, 30,-99, 50,]], mask = [[0,0,0,1,0,] [0,0,0,1,0,]], fill_value=-99) >>> r = MA.sum(m) >>> r array([11,22,33, 0,55,]) >>> t = MA.getmask(r) >>> print t None From paul at pfdubois.com Wed Nov 28 20:31:03 2001 From: paul at pfdubois.com (Paul F. Dubois) Date: Wed Nov 28 20:31:03 2001 Subject: [Numpy-discussion] Using Reduce with Multi-dimensional Masked array In-Reply-To: <20011128170140921.AAA253@mail.climatedata.com@SUEW2000> Message-ID: <000201c1788e$60359ce0$3d01a8c0@plstn1.sfba.home.com> [dubois at ldorritt ~]$ pydoc MA.sum Python Library Documentation: function sum in MA sum(a, axis=0, fill_value=0) Sum of elements along a certain axis using fill_value for missing. If you use add.reduce, you'll get what you want. >>> print m [[1 ,2 ,3 ,-- ,5 ,] [10 ,20 ,30 ,-- ,50 ,]] >>> MA.sum(m) array([11,22,33, 0,55,]) >>> MA.add.reduce(m) array(data = [ 11, 22, 33,-99, 55,], mask = [0,0,0,1,0,], fill_value=-99) In other words, sum(m, axis, fill_value) = add.reduce(filled(m, fill_value), axis) Surprising in your case. Still, both uses are quite common, so I probably was thinking to myself that since add.reduce already does one of the jobs, I might as well make sum do the other one.
One could have just as well argued that one was a synonym for the other and so it is revolting to have them be different. Well, MA users, is this something I should change, or not? -----Original Message----- From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-discussion-admin at lists.sourceforge.net] On Behalf Of Sue Giller Sent: Wednesday, November 28, 2001 9:03 AM To: numpy-discussion at lists.sourceforge.net Subject: [Numpy-discussion] Using Reduce with Multi-dimensional Masked array I posted the following inquiry to python-list at python.org earlier this week, but got no responses, so I thought I'd try a more focused group. I assume the MA module falls under the NumPy area. I am using 2 (and more) dimensional masked arrays with some numeric data, and using the reduce functionality on the arrays. I use the masking because some of the values in the arrays are 'missing' and should not be included in the results of the reduction. For example, assume a 2 x 5 array, with masked values for the 4th entry for both of the 2nd dimension cells. If I want to sum along the 2nd dimension, I would expect to get a 'missing' value for the 4th entry because both of the entries for the sum are 'missing'. Instead, I get 0, which might be a valid number in my data space, and the returned 1-dimensional array has no mask associated with it. Is this expected behavior for masked arrays, or a bug, or am I misusing the mask concept? Does anyone know how to get the reduction to produce a masked value? Example Code: >>> import MA >>> a = MA.array([[1,2,3,-99,5],[10,20,30,-99,50]]) >>> a [[ 1, 2, 3,-99, 5,] [ 10, 20, 30,-99, 50,]] >>> m = MA.masked_values(a, -99) >>> m array(data = [[ 1, 2, 3,-99, 5,] [ 10, 20, 30,-99, 50,]], mask = [[0,0,0,1,0,] [0,0,0,1,0,]], fill_value=-99) >>> r = MA.sum(m) >>> r array([11,22,33, 0,55,]) >>> t = MA.getmask(r) >>> print t None _______________________________________________ Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/numpy-discussion From giulio.bottazzi at libero.it Thu Nov 29 02:10:03 2001 From: giulio.bottazzi at libero.it (Giulio Bottazzi) Date: Thu Nov 29 02:10:03 2001 Subject: [Numpy-discussion] Using Reduce with Multi-dimensional Masked array References: <000201c1788e$60359ce0$3d01a8c0@plstn1.sfba.home.com> Message-ID: <3C05FDA2.AD9C5DCC@libero.it> My answer is yes: the difference between the two behaviors could be confusing for the user. If I can dare to express a "general rule", I would say that the masks in MA arrays should not disappear if not EXPLICITLY required to do so! Of course you can interpret a provided value for the fill_value parameter in the sum function as such a request... but if a value is not provided, then I would say that the correct approach would be to keep the mask on (after all, what's special about the value 0? For instance, if you have to take a logarithm in the next step of the calculation, it is a rather bad choice!) Giulio. "Paul F. Dubois" wrote: > > [dubois at ldorritt ~]$ pydoc MA.sum > Python Library Documentation: function sum in MA > > sum(a, axis=0, fill_value=0) > Sum of elements along a certain axis using fill_value for missing. > > If you use add.reduce, you'll get what you want.
> >>> print m > [[1 ,2 ,3 ,-- ,5 ,] > [10 ,20 ,30 ,-- ,50 ,]] > >>> MA.sum(m) > array([11,22,33, 0,55,]) > >>> MA.add.reduce(m) > array(data = > [ 11, 22, 33,-99, 55,], > mask = > [0,0,0,1,0,], > fill_value=-99) > > In other words, > sum(m, axis, fill_value) = add.reduce(filled(m, fill_value), axis) > > Surprising in your case. Still, both uses are quite common, so I > probably was thinking to myself that since add.reduce already does one > of the jobs, I might as well make sum do the other one. One could have > just as well argued that one was a synonym for the other and so it is > revolting to have them be different. > > Well, MA users, is this something I should change, or not? > > -----Original Message----- > From: numpy-discussion-admin at lists.sourceforge.net > [mailto:numpy-discussion-admin at lists.sourceforge.net] On Behalf Of Sue > Giller > Sent: Wednesday, November 28, 2001 9:03 AM > To: numpy-discussion at lists.sourceforge.net > Subject: [Numpy-discussion] Using Reduce with Multi-dimensional Masked > array > > I posted the following inquiry to python-list at python.org earlier this > week, but got no responses, so I thought I'd try a more focused > group. I assume the MA module falls under the NumPy area. > > I am using 2 (and more) > dimensional masked arrays with some > numeric data, and using the reduce functionality on the arrays. I > use the masking because some of the values in the arrays are > 'missing' and should not be included in the results of the reduction. > > For example, assume a 2 x 5 array, with masked values for the 4th > entry for both of the 2nd dimension cells. If I want to sum along the > 2nd dimension, I would expect to get a 'missing' value for the 4th > entry because both of the entries for the sum are 'missing'. Instead, > I get 0, which might be a valid number in my data space, and the > returned 1-dimensional array has no mask associated with it. > > Is this expected behavior for masked arrays, or a bug, or am I > misusing the mask concept? Does anyone know how to get the > reduction to produce a masked value? > > Example Code: > >>> import MA > >>> a = MA.array([[1,2,3,-99,5],[10,20,30,-99,50]]) > >>> a > [[ 1, 2, 3,-99, 5,] > [ 10, 20, 30,-99, 50,]] > >>> m = MA.masked_values(a, -99) > >>> m > array(data = > [[ 1, 2, 3,-99, 5,] > [ 10, 20, 30,-99, 50,]], > mask = > [[0,0,0,1,0,] > [0,0,0,1,0,]], > fill_value=-99) > > >>> r = MA.sum(m) > >>> r > array([11,22,33, 0,55,]) > >>> t = MA.getmask(r) > >>> print t > None > > _______________________________________________ > Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From sag at hydrosphere.com Thu Nov 29 09:49:02 2001 From: sag at hydrosphere.com (Sue Giller) Date: Thu Nov 29 09:49:02 2001 Subject: [Numpy-discussion] Re: Using Reduce with Multi-dimensional Masked array In-Reply-To: <3C05FDA2.AD9C5DCC@libero.it> Message-ID: <20011129174809062.AAA210@mail.climatedata.com@SUEW2000> Thanks for the pointer. The sum operation I gave is merely an example - I could also be doing other manipulations such as min, max, average, etc. I see that the MA.<op>.reduce functions will do what I want, but to do an average, I will need to do two steps since the MA.average function will have the original 'unexpected' behavior that I don't want.
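(The two steps would be something along these lines -- a rough, untested sketch on the same 2 x 5 example, using float data to avoid integer-division truncation; where a column is entirely masked, the total is already masked, so the result stays masked there too:)

import MA
m = MA.masked_values([[1.,2.,3.,-99.,5.],[10.,20.,30.,-99.,50.]], -99)
total = MA.add.reduce(m)   # stays masked where every addend was masked
avg = total / m.count(0)   # count(0) = number of valid entries per column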
That raises the question of how to determine a count of valid values in a masked array. Can I assume that I can do 'math' on the mask array itself, for example to sum along a given axis and have the masked cells add up? In my original example, I would expect a sum along the second axis to return [0,0,0,2,0]. Can I rely on this? I would suggest that a .count operator would be very useful in working with masked arrays (count valid and count masked). >>> m = MA.masked_values(a, -99) >>> m array(data = [[ 1, 2, 3,-99, 5,] [ 10, 20, 30,-99, 50,]], mask = [[0,0,0,1,0,] [0,0,0,1,0,]], fill_value=-99) To add an opinion on the question from Paul about 'expected' behavior, I was working off the documentation for Numerical Python, and there were no caveats in there about MA.<op> working one way, and MA.<op>.reduce working another. The answer is always in the documentation, especially for users like me who don't have time or knowledge to go reading thru all the code modules to try and figure out what is happening. From a purely user standpoint, I would expect a masked array to retain its mask-edness at all times, unless I explicitly tell it not to. In that case, I would still expect it to replace the 'masked' cells with the original masked value, and not just arbitrarily assign some other value, such as 0. Thanks again for the prompt reply. From reggie at merfinllc.com Thu Nov 29 10:36:01 2001 From: reggie at merfinllc.com (Reggie Dugard) Date: Thu Nov 29 10:36:01 2001 Subject: [Numpy-discussion] Re: Using Reduce with Multi-dimensional Masked array In-Reply-To: <20011129174809062.AAA210@mail.climatedata.com@SUEW2000> Message-ID: > That raises the question of how to determine a count of valid values > in a masked array. Can I assume that I can do 'math' on the mask > array itself, for example to sum along a given axis and have the > masked cells add up? > > In my original example, I would expect a sum along the second axis > to return [0,0,0,2,0]. Can I rely on this? I would suggest that a > .count operator would be very useful in working with masked arrays > (count valid and count masked). Actually masked arrays already have a count method that does what you want: Python 2.2b2 (#26, Nov 16 2001, 11:44:11) [MSC 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> from pydoc import help >>> import MA >>> x = MA.arange(10) >>> help(x.count) Help on method count in module MA.MA: count(self, axis=None) method of MA.MA.MaskedArray instance Count of the non-masked elements in a, or along a certain axis. >>> x.count() 10 >>> From paul at pfdubois.com Thu Nov 29 12:54:02 2001 From: paul at pfdubois.com (Paul F. Dubois) Date: Thu Nov 29 12:54:02 2001 Subject: [Numpy-discussion] Re: Using Reduce with Multi-dimensional Masked array In-Reply-To: <20011129174809062.AAA210@mail.climatedata.com@SUEW2000> Message-ID: <000201c17917$ac5efec0$3d01a8c0@plstn1.sfba.home.com> You have misread my reply. It is not true that MA.op works one way and MA.op.reduce is different. sum and add.reduce are different, and the documentation for sum DOES say the right thing for sum. The function sum is a special case in that its native meaning was the same as add.reduce and so the function is redundant. I believe you are in error wrt average; average works the way you want. Function count can tell you the number of non-masked values either in the whole array or axis-wise if you give an axis argument. Function size gives you the total number, so #invalid is size(x)-count(x).
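For instance, with the 2 x 5 example from this thread (a rough, untested sketch -- the same size-minus-count idea applied along an axis):

import MA
m = MA.masked_values([[1,2,3,-99,5],[10,20,30,-99,50]], -99)
print m.count(0)               # valid entries per column:  [2 2 2 0 2]
print m.shape[0] - m.count(0)  # masked entries per column: [0 0 0 2 0]

The second line is exactly the [0,0,0,2,0] asked about above.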
maximum and minimum (don't use max and min, they are built-ins that don't know about Numeric) have two forms. When called with one argument they return the overall max or min of the whole array, returning masked only if all entries are masked. For two arguments, you get element-wise extrema, and the mask is on where any one of the arguments was masked. >>> print x [[1 ,-- ,3 ,] [11 ,-- ,-- ,]] >>> print average(x) [6.0 ,-- ,3.0 ,] >>> y array( [[ 6, 7, 8,] [ 9,10,11,]]) >>> print maximum(x,y) [[6 ,-- ,8 ,] [11 ,-- ,-- ,]] >>> y[0,0]=masked >>> print maximum(x,y) [[-- ,-- ,8 ,] [11 ,-- ,-- ,]] -----Original Message----- From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-discussion-admin at lists.sourceforge.net] On Behalf Of Sue Giller Sent: Thursday, November 29, 2001 9:50 AM To: numpy-discussion at lists.sourceforge.net Subject: [Numpy-discussion] Re: Using Reduce with Multi-dimensional Masked array Thanks for the pointer. The sum operation I gave is merely an example - I could also be doing other manipulations such as min, max, average, etc. I see that the MA.<op>.reduce functions will do what I want, but to do an average, I will need to do two steps since the MA.average function will have the original 'unexpected' behavior that I don't want. That raises the question of how to determine a count of valid values in a masked array. Can I assume that I can do 'math' on the mask array itself, for example to sum along a given axis and have the masked cells add up? In my original example, I would expect a sum along the second axis to return [0,0,0,2,0]. Can I rely on this? I would suggest that a .count operator would be very useful in working with masked arrays (count valid and count masked). >>> m = MA.masked_values(a, -99) >>> m array(data = [[ 1, 2, 3,-99, 5,] [ 10, 20, 30,-99, 50,]], mask = [[0,0,0,1,0,] [0,0,0,1,0,]], fill_value=-99) To add an opinion on the question from Paul about 'expected' behavior, I was working off the documentation for Numerical Python, and there were no caveats in there about MA.<op> working one way, and MA.<op>.reduce working another. The answer is always in the documentation, especially for users like me who don't have time or knowledge to go reading thru all the code modules to try and figure out what is happening. From a purely user standpoint, I would expect a masked array to retain its mask-edness at all times, unless I explicitly tell it not to. In that case, I would still expect it to replace the 'masked' cells with the original masked value, and not just arbitrarily assign some other value, such as 0. Thanks again for the prompt reply. _______________________________________________ Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/numpy-discussion From sag at hydrosphere.com Thu Nov 29 15:21:04 2001 From: sag at hydrosphere.com (Sue Giller) Date: Thu Nov 29 15:21:04 2001 Subject: [Numpy-discussion] Re: Using Reduce with Multi-dimensional Masked array In-Reply-To: <000201c17917$ac5efec0$3d01a8c0@plstn1.sfba.home.com> References: <20011129174809062.AAA210@mail.climatedata.com@SUEW2000> Message-ID: <20011129232011546.AAA269@mail.climatedata.com@SUEW2000> Paul, Well, you're right. I did misunderstand your reply, as well as what the various functions were supposed to do. I was mis-using sum, minimum, and maximum as tho they were MA.<op>.reduce, and my test case didn't point out the difference. I should always have been doing the .reduce versions. I apologize for this!
I found a section on page 45 of the Numerical Python text (PDF form, July 13, 2001) that defines sum as 'The sum function is a synonym for the reduce method of the add ufunc. It returns the sum of all the elements in the sequence given along the specified axis (first axis by default).' This is where I would expect to see a caveat about it not retaining any mask-edness. I was misusing MA.minimum and MA.maximum as tho they were the .reduce versions. My bad. MA.average does produce a masked array, but it has changed the 'missing value' to fill_value=[ 1.00000002e+020,]. I do find this a bit odd, since the other reductions didn't change the fill value. Anyway, I can now get the stats I want in a format I want, and I understand better the various functions for arrays/masked arrays. Thanks for the comments/input. sue From romberg at fsl.noaa.gov Fri Nov 30 11:30:04 2001 From: romberg at fsl.noaa.gov (Mike Romberg) Date: Fri Nov 30 11:30:04 2001 Subject: [Numpy-discussion] equal() and complex Message-ID: <15367.56879.54329.654575@smaug.fsl.noaa.gov> I'm wondering if there is some good reason why equal(), not_equal(), nonzero() and the like do not work with numeric arrays of type complex. I can see why operators like less() and less_equal() do not work. But the pure equality ones seem like they should work. Or am I missing something :). Thanks, Mike Romberg (romberg at fsl.noaa.gov) From hinsen at cnrs-orleans.fr Fri Nov 30 12:17:04 2001 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Fri Nov 30 12:17:04 2001 Subject: [Numpy-discussion] equal() and complex References: <15367.56879.54329.654575@smaug.fsl.noaa.gov> Message-ID: <200111302016.fAUKG9X01351@localhost.localdomain> Mike Romberg writes: > I'm wondering if there is some good reason why equal(), not_equal(), > nonzero() and the like do not work with numeric arrays of type > complex. I can see why operators like less() and less_equal() do not > work. But the pure equality ones seem like they should work. Or am I > missing something :). Before Python 2.1, comparison couldn't be implemented for equality only. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From europax at home.com Fri Nov 30 17:35:03 2001 From: europax at home.com (Rob) Date: Fri Nov 30 17:35:03 2001 Subject: [Numpy-discussion] Numeric Python EM Project has moved Message-ID: <3C083356.31E66685@home.com> It's now at www.pythonemproject.com. I can be reached at rob at pythonemproject.com. All this has come about since @home is possibly suspending operation at midnight tonight :( Rob. Looks like I need to change my sig too :) -- The Numeric Python EM Project www.members.home.net/europax
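(One workaround for the equal()-and-complex question above, until equality comparisons are implemented for complex arrays, is to compare the real and imaginary parts separately -- a rough, untested sketch:)

from Numeric import equal, logical_and

def complex_equal(a, b):
    # elementwise a == b for two complex arrays, via the .real/.imag views
    return logical_and(equal(a.real, b.real), equal(a.imag, b.imag))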
From jochen at jochen-kuepper.de Tue Nov 6 18:57:02 2001 From: jochen at jochen-kuepper.de (Jochen Küpper) Date: Tue Nov 6 18:57:02 2001 Subject: [Numpy-discussion] Sparse matrices In-Reply-To: <055701c16727$b57fed90$8fd6afcf@pixi.com> References: <055701c16727$b57fed90$8fd6afcf@pixi.com> Message-ID: On Tue, 6 Nov 2001 15:01:18 -1000 Herbert L Roitblat wrote: Herbert> Travis Oliphant has one. Isn't that the one in SciPy? Herbert> ----- Original Message ----- Herbert> From: "R.M.Everson" Herbert> To: Herbert> Sent: Tuesday, November 06, 2001 11:03 AM Herbert> Subject: [Numpy-discussion] Sparse matrices >> Does anyone have a working sparse matrix module for Numeric 20.2.0 >> and Python 2.1 (or similar). I'm trying to get the version in the >> SciPy CVS tree to work - so far without success. Herbert, this inverse citing really is counterproductive on mailing lists. Greetings, Jochen -- Einigkeit und Recht und Freiheit http://www.Jochen-Kuepper.de Liberté, Égalité, Fraternité GnuPG key: 44BCCD8E Sex, drugs and rock-n-roll From nwagner at mecha.uni-stuttgart.de Sun Nov 11 07:32:03 2001 From: nwagner at mecha.uni-stuttgart.de (Nils Wagner) Date: Sun Nov 11 07:32:03 2001 Subject: [Numpy-discussion] RandomArray - random Message-ID: <3BEEA88E.742E9225@mecha.uni-stuttgart.de> Hi, I tried to produce a random matrix, say Q (2*ndof x nsamp+1), with Numpy 20.2 and Python 2.1.1 (#1, Sep 24 2001, 05:28:47) [GCC 2.95.3 20010315 (SuSE)] on linux2 Type "copyright", "credits" or "license" for more information. Traceback (most recent call last): File "modal.py", line 192, in ? Q = 2.0*random((2*ndof,nsamp+1))-ones((2*ndof,nsamp+1)) TypeError: random() takes exactly 1 argument (2 given) Does it require a new syntax to obtain a matrix consisting of uniformly distributed random numbers in the range +/- 1 ? Nils From paul at pfdubois.com Sun Nov 11 09:14:02 2001 From: paul at pfdubois.com (Paul F. Dubois) Date: Sun Nov 11 09:14:02 2001 Subject: [Numpy-discussion] RandomArray - random In-Reply-To: <3BEEA88E.742E9225@mecha.uni-stuttgart.de> Message-ID: <000001c16ad3$f3e688a0$3d01a8c0@plstn1.sfba.home.com> Your reference to random is not fully qualified so I suppose you could be picking up some other random. But I just tried RandomArray.random((2,3)) and it worked fine. BTW you could just do 2.0*random((n,m))-1.0. -----Original Message----- From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-discussion-admin at lists.sourceforge.net] On Behalf Of Nils Wagner Sent: Sunday, November 11, 2001 8:34 AM To: numpy-discussion at lists.sourceforge.net Subject: [Numpy-discussion] RandomArray - random Hi, I tried to produce a random matrix, say Q (2*ndof x nsamp+1), with Numpy 20.2 and Python 2.1.1 (#1, Sep 24 2001, 05:28:47) [GCC 2.95.3 20010315 (SuSE)] on linux2 Type "copyright", "credits" or "license" for more information. Traceback (most recent call last): File "modal.py", line 192, in ?
Q = 2.0*random((2*ndof,nsamp+1))-ones((2*ndof,nsamp+1)) TypeError: random() takes exactly 1 argument (2 given) Does it require a new syntax to obtain a matrix consisting of uniformly distributed random numbers in the range +/- 1 ? Nils _______________________________________________ Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/numpy-discussion From nwagner at mecha.uni-stuttgart.de Mon Nov 12 04:01:03 2001 From: nwagner at mecha.uni-stuttgart.de (Nils Wagner) Date: Mon Nov 12 04:01:03 2001 Subject: [Numpy-discussion] RandomArray - random References: <000001c16ad3$f3e688a0$3d01a8c0@plstn1.sfba.home.com> Message-ID: <3BEFC88E.F87F363E@mecha.uni-stuttgart.de> "Paul F. Dubois" schrieb: > > Your reference to random is not fully qualified so I suppose you could > be picking up some other random. But I just tried > RandomArray.random((2,3)) and it worked fine. > > BTW you could just do 2.0*random((n,m))-1.0. > It seems to be a conflict with VPython, formerly Visual Python. http://cil.andrew.cmu.edu/projects/visual/index.html Python 2.1.1 (#1, Sep 24 2001, 05:28:47) [GCC 2.95.3 20010315 (SuSE)] on linux2 Type "copyright", "credits" or "license" for more information. >>> from Numeric import * >>> from RandomArray import * >>> random((2,3)) array([[ 0.68769461, 0.33015978, 0.07285815], [ 0.20514929, 0.81925279, 0.50694615]]) >>> from visual import * Visual-2001-09-24 >>> random((2,3)) Traceback (most recent call last): File "", line 1, in ? TypeError: random() takes exactly 1 argument (2 given) >>> Nils > -----Original Message----- > From: numpy-discussion-admin at lists.sourceforge.net > [mailto:numpy-discussion-admin at lists.sourceforge.net] On Behalf Of Nils > Wagner > Sent: Sunday, November 11, 2001 8:34 AM > To: numpy-discussion at lists.sourceforge.net > Subject: [Numpy-discussion] RandomArray - random > > Hi, > > I tried to produce a random matrix, say Q (2*ndof x nsamp+1), with > Numpy 20.2 and Python 2.1.1 (#1, Sep 24 2001, 05:28:47) [GCC 2.95.3 > 20010315 (SuSE)] on linux2 Type "copyright", "credits" or "license" for > more information. > > Traceback (most recent call last): > File "modal.py", line 192, in ? > Q = 2.0*random((2*ndof,nsamp+1))-ones((2*ndof,nsamp+1)) > TypeError: random() takes exactly 1 argument (2 given) > > Does it require a new syntax to obtain a matrix consisting of uniformly > distributed random numbers in the range +/- 1 ? > > Nils > > _______________________________________________ > Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From neelk at cswcasa.com Mon Nov 12 09:24:02 2001 From: neelk at cswcasa.com (Krishnaswami, Neel) Date: Mon Nov 12 09:24:02 2001 Subject: [Numpy-discussion] Building Numeric with Intel MKL and mingw32 Message-ID: Hello, I'm trying to rebuild Numeric with the Intel Math Kernel Library (MKL). I've gotten Numeric building normally with the default BLAS libraries, but I'm not sure what I need to put into the libraries_dir_list and libraries_list variables in the setup.py file. I have the directories mkl\ia32\bin (contains the DLLs), mkl\ia32\lib (contains the lib*.a files), and mkl\include (contains the *.h files). Can anyone tell me what goes where?
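My best guess so far is something along these lines in setup.py, but the MKL library names below are only placeholders -- I don't know which of the lib*.a stems are the right ones to link:

# untested guess; the install path and library names are placeholders
libraries_dir_list = ['C:\\mkl\\ia32\\lib']  # where the lib*.a files live
libraries_list = ['mkl_lapack', 'mkl']       # substitute the actual lib*.a stems

with the mkl\ia32\bin directory on PATH at run time so the DLLs are found.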
-- Neel Krishnaswami neelk at cswcasa.com From nwagner at mecha.uni-stuttgart.de Tue Nov 13 02:22:01 2001 From: nwagner at mecha.uni-stuttgart.de (Nils Wagner) Date: Tue Nov 13 02:22:01 2001 Subject: [Numpy-discussion] Total least squares problem Message-ID: <3BF102C6.8C651D9E@mecha.uni-stuttgart.de> Hi, How do I solve a Total Least Squares problem in Numpy ? A small example would be appreciated. The TLS problem assumes an overdetermined set of linear equations AX = B, where both the data matrix A as well as the observation matrix B are inaccurate. Nils Reference: R.D. Fierro, G.H. Golub, P.C. Hansen, D.P. O'Leary, Regularization by truncated total least squares, SIAM J. Sci. Comput. Vol. 18(4), 1997, pp. 1223-1241 From barnard at stat.harvard.edu Tue Nov 13 06:42:03 2001 From: barnard at stat.harvard.edu (barnard at stat.harvard.edu) Date: Tue Nov 13 06:42:03 2001 Subject: [Numpy-discussion] Small Bug in multiarray.c Message-ID: <15345.13522.866400.686203@aragorn.stat.harvard.edu> When attempting to compile the CVS version of Numpy using MSVC 6 under Windows 2000 I found a small error in multiarray.c: the doc string for arange contains newlines. The offending code begins on line #1168. Simply removing the newlines from the string fixes the error. John ******************************** * John Barnard, Ph.D. * Senior Research Statistician * deCODE genetics * 1000 Winter Str., Suite 3100 * Waltham, MA 02451 * Phone (Direct) : (781) 290-5771 Ext. 27 * Phone (General) : (781) 466-8833 * Fax : (781) 466-8686 * Email: j.barnard at decode.com ******************************** From oliphant at ee.byu.edu Tue Nov 13 11:25:03 2001 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Tue Nov 13 11:25:03 2001 Subject: [Numpy-discussion] Total least squares problem In-Reply-To: <3BF102C6.8C651D9E@mecha.uni-stuttgart.de> Message-ID: > > How do I solve a Total Least Squares problem in Numpy ? > A small example would be appreciated. > > The TLS problem assumes an overdetermined set of linear equations > AX = B, where both the data matrix A as well as the observation > matrix B are inaccurate. X, resids, rank, s = LinearAlgebra.linear_least_squares(A,B) -Travis From R.M.Everson at exeter.ac.uk Tue Nov 13 13:53:01 2001 From: R.M.Everson at exeter.ac.uk (R.M.Everson) Date: Tue Nov 13 13:53:01 2001 Subject: [Numpy-discussion] BLAS and innerproduct Message-ID: Hello, So far as I can tell Numeric.dot(), which uses innerproduct() from multiarraymodule.c, doesn't call the BLAS, even if Numeric was compiled against a native BLAS. This means (at least on my machine) that X = ones((150, 16384), 'd') C = dot(X, transpose(X)) is about 15 times as slow as the comparable operations in Matlab (v6), which does, I think, use the native BLAS. I guess that multiarray.c is not particularly optimised to use the BLAS because of the difficulties of coping with all sorts of types (float32, int64 etc), and with non-contiguous arrays. The innerproduct is so basic to most of the work I use Numeric for that a speed-up here would make a big difference. I'm thinking of patching multiarray.c to use the BLAS when it can, but before I start, are there good reasons for doing something different? Any advice gratefully received! Cheers, Richard.
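(For concreteness, a rough sketch of the comparison -- untested as written, but trivial to rerun:)

import time
from Numeric import ones, dot, transpose

X = ones((150, 16384), 'd')
t0 = time.time()
C = dot(X, transpose(X))
print 'dot took %.1f seconds' % (time.time() - t0)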
-- Department of Computer Science, Exeter University Voice: +44 1392 264065 R.M.Everson at exeter.ac.uk Secretary: +44 1392 264061 http://www.dcs.ex.ac.uk/people/reverson Fax: +44 1392 264067 From nwagner at mecha.uni-stuttgart.de Wed Nov 14 04:44:03 2001 From: nwagner at mecha.uni-stuttgart.de (Nils Wagner) Date: Wed Nov 14 04:44:03 2001 Subject: [Numpy-discussion] Total least squares problem References: Message-ID: <3BF27591.EC1BF4EA@mecha.uni-stuttgart.de> Travis Oliphant schrieb: > > > > > How do I solve a Total Least Squares problem in Numpy ? > > A small example would be appreciated. > > > > The TLS problem assumes an overdetermined set of linear equations > > AX = B, where both the data matrix A as well as the observation > > matrix B are inaccurate: > > X, resids, rank, s = LinearAlgebra.linear_least_squares(A,B) > > -Travis Travis, There is a difference between classical least squares (Numpy) and TLS (total least squares). I am attaching a small example for illustration. Nils -------------- next part -------------- from Numeric import * from LinearAlgebra import * A = zeros((6,3),Float) b = zeros((6,1),Float) # # Example by Van Huffel # http://www.netlib.org/vanhuffel/dtls-doc # A[0,0] = 0.80010002 A[0,1] = 0.39985167 A[0,2] = 0.60005390 A[1,0] = 0.29996484 A[1,1] = 0.69990689 A[1,2] = 0.39997269 A[2,0] = 0.49994235 A[2,1] = 0.60003167 A[2,2] = 0.20012361 A[3,0] = 0.90013643 A[3,1] = 0.20016919 A[3,2] = 0.79995025 A[4,0] = 0.39998539 A[4,1] = 0.80006338 A[4,2] = 0.49985474 A[5,0] = 0.20002274 A[5,1] = 0.90007114 A[5,2] = 0.70009777 b[0] = 0.89999446 b[1] = 0.82997570 b[2] = 0.79011189 b[3] = 0.85002662 b[4] = 0.99016399 b[5] = 0.10299439 print 'Solution of an overdetermined system of linear equations A x = b' print print 'A' print print A # print 'b' print print b # x, resids, rank, s = linear_least_squares(A,b) print print 'Least squares solution (Numpy)' print print x print print 'Computed rank',rank print print 'Sum of the squared residuals', resids print print 'Singular values of A in descending order' print print s # xtls = zeros((3,1),Float) # # total least squares solution given by Van Huffel # http://www.netlib.org/vanhuffel/dtls-doc # xtls[0] = 0.500254 xtls[1] = 0.800251 xtls[2] = 0.299492 print print 'Total least squares solution' print print xtls print print 'Residuals of LS (Numpy)' print print matrixmultiply(A,x)-b print print 'Residuals of TLS' print print matrixmultiply(A,xtls)-b print # # Least squares in Numpy A^\top A x = A^\top b # Atb = matrixmultiply(transpose(A),b) AtA = matrixmultiply(transpose(A),A) xls = solve_linear_equations(AtA,Atb) print print 'Least squares solution via normal equation' print print xls From hinsen at cnrs-orleans.fr Wed Nov 14 05:30:07 2001 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Wed Nov 14 05:30:07 2001 Subject: [Numpy-discussion] Total least squares problem In-Reply-To: <3BF27591.EC1BF4EA@mecha.uni-stuttgart.de> References: <3BF27591.EC1BF4EA@mecha.uni-stuttgart.de> Message-ID: Nils Wagner writes: > There is a difference between classical least squares (Numpy) > and TLS (total least squares). Algorithmically speaking it is even a very different problem. I'd say the only reasonable (i.e. efficient) solution for NumPy is to implement the TLS algorithm in a C subroutine calling LAPACK routines for SVD etc. Konrad. 
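(In the meantime, for the basic single-right-hand-side case, a pure-Python version built on LinearAlgebra's SVD is short. A rough, untested sketch of the classical SVD-based TLS algorithm, assuming b is an m x 1 column as in the attached example; it ignores the degenerate case where the bottom-right element of V vanishes:)

from Numeric import concatenate, transpose
from LinearAlgebra import singular_value_decomposition

def tls(A, b):
    # classical TLS: take the SVD of the augmented matrix [A b] and
    # rescale the last right singular vector so its final entry is -1
    n = A.shape[1]
    u, s, vt = singular_value_decomposition(concatenate((A, b), 1))
    v = transpose(vt)
    return -v[:n, n] / v[n, n]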
-- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From nwagner at mecha.uni-stuttgart.de Wed Nov 14 06:13:07 2001 From: nwagner at mecha.uni-stuttgart.de (Nils Wagner) Date: Wed Nov 14 06:13:07 2001 Subject: [Numpy-discussion] Total least squares problem References: <3BF27591.EC1BF4EA@mecha.uni-stuttgart.de> Message-ID: <3BF28365.53373B65@mecha.uni-stuttgart.de> Konrad Hinsen schrieb: > > Nils Wagner writes: > > > There is a difference between classical least squares (Numpy) > > and TLS (total least squares). > > Algorithmically speaking it is even a very different problem. I'd say > the only reasonable (i.e. efficient) solution for NumPy is to > implement the TLS algorithm in a C subroutine calling LAPACK routines > for SVD etc. > > Konrad. > -- There are two Fortran implementations of the TLS algorithm already available via http://www.netlib.org/vanhuffel/ . Moreover there is a tool called f2py that generates Python C/API modules for wrapping Fortran 77/90/95 codes to Python. Unfortunately I am not very familiar with this tool. Therefore I need some advice on this. Thanks in advance Nils > ------------------------------------------------------------------------------- > Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr > Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 > Rue Charles Sadron | Fax: +33-2.38.63.15.17 > 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ > France | Nederlands/Francais > ------------------------------------------------------------------------------- From nwagner at mecha.uni-stuttgart.de Thu Nov 15 01:14:01 2001 From: nwagner at mecha.uni-stuttgart.de (Nils Wagner) Date: Thu Nov 15 01:14:01 2001 Subject: [Numpy-discussion] Numpy, BLAS, LAPACK, f2py Message-ID: <3BF395DB.DA1C34A4@mecha.uni-stuttgart.de> Hi, I have installed f2py on my system for wrapping existing FORTRAN 77 codes to Python. Then I have gone through the following steps: 1) An example for using a TLS (total least squares) routine: http://www.netlib.org/vanhuffel/ 2) Get dtls.f with dependencies 3) Run f2py dtls.f -m foo -h foo.pyf only: dtls \ \ \ \________ just wrap the dtls function \ \ \______ create signature file \ \____ python module name \_____ Fortran 77 code 4) Edit foo.pyf to your specific needs (optional) 5) Run f2py foo.pyf \_____________ this will create Python C/API module foomodule.c 6) Run make -f Makefile-foo \_____________ this will build the module 7) In python: Python 2.1.1 (#1, Sep 24 2001, 05:28:47) [GCC 2.95.3 20010315 (SuSE)] on linux2 Type "copyright", "credits" or "license" for more information. >>> import foo Traceback (most recent call last): File "", line 1, in ? ImportError: ./foomodule.so: undefined symbol: dcopy_ >>> Any suggestions to solve this problem ?
Nils There are prebuilt libraries of LAPACK and BLAS in /usr/lib -rw-r--r-- 1 root root 657706 Sep 24 01:00 libblas.a lrwxrwxrwx 1 root root 12 Okt 22 19:27 libblas.so -> libblas.so.2 lrwxrwxrwx 1 root root 16 Okt 22 19:27 libblas.so.2 -> libblas.so.2.2.0 -rwxr-xr-x 1 root root 559600 Sep 24 01:01 libblas.so.2.2.0 -rw-r--r-- 1 root root 5763150 Sep 24 01:00 liblapack.a lrwxrwxrwx 1 root root 14 Okt 22 19:27 liblapack.so -> liblapack.so.3 lrwxrwxrwx 1 root root 18 Okt 22 19:27 liblapack.so.3 -> liblapack.so.3.0.0 -rwxr-xr-x 1 root root 4826626 Sep 24 01:01 liblapack.so.3.0.0 From gvermeul at labs.polycnrs-gre.fr Thu Nov 15 01:28:02 2001 From: gvermeul at labs.polycnrs-gre.fr (Gerard Vermeulen) Date: Thu Nov 15 01:28:02 2001 Subject: [Numpy-discussion] Numpy, BLAS, LAPACK, f2py In-Reply-To: <3BF395DB.DA1C34A4@mecha.uni-stuttgart.de> References: <3BF395DB.DA1C34A4@mecha.uni-stuttgart.de> Message-ID: <01111510271301.11576@taco.polycnrs-gre.fr> Hi, Try to link in the BLAS library (there is a dcopy_ in my BLAS library, but better check the README first). best regards -- Gerard On Thursday 15 November 2001 11:15, Nils Wagner wrote: > Hi, > > I have installed f2py on my system for wrapping existing FORTRAN 77 > codes to Python. > Then I have gone through the following steps: > > 1) An example for using a TLS (total least squares) routine: > http://www.netlib.org/vanhuffel/ > > 2) Get dtls.f with dependencies > 3) Run > f2py dtls.f -m foo -h foo.pyf only: dtls > \ \ \ \________ just wrap the dtls function > \ \ \______ create signature file > \ \____ python module name > \_____ Fortran 77 code > 4) Edit foo.pyf to your specific needs (optional) > 5) Run > f2py foo.pyf > \_____________ this will create Python C/API module foomodule.c > 6) Run > make -f Makefile-foo > \_____________ this will build the module > 7) In python: > > Python 2.1.1 (#1, Sep 24 2001, 05:28:47) > [GCC 2.95.3 20010315 (SuSE)] on linux2 > Type "copyright", "credits" or "license" for more information. > > >>> import foo > > Traceback (most recent call last): > File "", line 1, in ? > ImportError: ./foomodule.so: undefined symbol: dcopy_ > > > Any suggestions to solve this problem ? > > Nils > > There are prebuilt libraries of LAPACK and BLAS in /usr/lib > > -rw-r--r-- 1 root root 657706 Sep 24 01:00 libblas.a > lrwxrwxrwx 1 root root 12 Okt 22 19:27 libblas.so -> > libblas.so.2 > lrwxrwxrwx 1 root root 16 Okt 22 19:27 libblas.so.2 -> > libblas.so.2.2.0 > -rwxr-xr-x 1 root root 559600 Sep 24 01:01 libblas.so.2.2.0 > -rw-r--r-- 1 root root 5763150 Sep 24 01:00 liblapack.a > lrwxrwxrwx 1 root root 14 Okt 22 19:27 liblapack.so -> > liblapack.so.3 > lrwxrwxrwx 1 root root 18 Okt 22 19:27 liblapack.so.3 > -> liblapack.so.3.0.0 > -rwxr-xr-x 1 root root 4826626 Sep 24 01:01 > liblapack.so.3.0.0 > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From perry at stsci.edu Fri Nov 16 14:33:02 2001 From: perry at stsci.edu (Perry Greenfield) Date: Fri Nov 16 14:33:02 2001 Subject: [Numpy-discussion] Re-implementation of Python Numerical arrays (Numeric) available for download Message-ID: We have been working on a reimplementation of Numeric, the numeric array manipulation extension module for Python. The reimplementation is virtually a complete rewrite, and because it is not completely backwards compatible with Numeric, we have dubbed it numarray to prevent confusion.
While we think this version is not yet mature enough for most to use in everyday projects, we are interested in feedback on the user interface and the open issues (see the documents on the web page shown below). We also welcome those who would like to contribute to this effort by helping with the development or adding libraries. An early beta version is available on sourceforge as the package Numarray (http://sourceforge.net/projects/numpy/) Information on the goals, changes in user interface, open issues, and design can be found at http://aten.stsci.edu/numarray From pete at shinners.org Fri Nov 16 15:12:02 2001 From: pete at shinners.org (Pete Shinners) Date: Fri Nov 16 15:12:02 2001 Subject: [Numpy-discussion] Re-implementation of Python Numerical arrays (Numeric) available for download References: Message-ID: <3BF59D10.2070107@shinners.org> Perry Greenfield wrote: > An early beta version is available on sourceforge as the > package Numarray (http://sourceforge.net/projects/numpy/) > > Information on the goals, changes in user interface, open issues, > and design can be found at http://aten.stsci.edu/numarray you ask a few questions on the information website, here are some of my answers for things i "care" about. note that my main use of numpy is as a pixel buffer for images. some of the changes like avoiding type promotion sound really good to me :] 5) should the implementation be bulletproof for private vars? i don't think you should worry about this. as long as the interface is well defined, i wouldn't worry about protecting users from themselves. i think it will be the rare numarray user who will be in a situation where they need to modify the internal C data. 7) necessary to add other types? yes. i really want unsigned int16 and unsigned int32. all my operations are on pixel data, and things can just get messy when i need to treat packed color values as signed integers. 8) negative and out-of-range indices? i'd prefer them to be kept as similar to python as can be. the current implementation in Numeric is nice for me. one other thing i'd like there to be a little focus on is adding my own new ufunc operators. for image manipulation i'd like new ufunc operators that clamp the results to legal values. i'd be happy to do this myself, but i don't believe it's possible with the current Numeric. the last thing i really really want is for this to be rolled into the standard python distribution. that is perhaps the most important aspect for me. i do not like requiring the extra dependency for generic numeric arrays. :] From oliphant.travis at ieee.org Fri Nov 16 18:42:02 2001 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Fri Nov 16 18:42:02 2001 Subject: [Numpy-discussion] Re-implementation of Python Numerical arrays (Numeric) available for download In-Reply-To: References: Message-ID: > > While we think this version is not yet mature enough for > most to use in everyday projects, we are interested in > feedback on the user interface and the open issues (see > the documents on the web page shown below). We also welcome > those who would like to contribute to this effort by helping > with the development or adding libraries. > What I've seen looks great. You've all done some good work here. Of course, I do have some feedback. I haven't looked at everything, these points have just caught my eye. Complex Types: ============== 1) I don't like the idea of complex types being a separate subclass of ndarray. This makes them "different."
Unless this "difference" can be completely hidden (which I doubt), I would prefer complex types to be on the same level as other numeric types. 2) Also, in your C-API, you have a different pointer to the imaginary data. I much prefer the way it is done currently to have complex numbers represented as an 8-byte, or 16-byte chunk of contiguous memory. Index Arrays: =========== 1) For what it's worth, my initial reaction to your indexing scheme is negative. I would prefer that if a = [[1,2,3,4], [5,6,7,8], [9,10,11,12], [13,14,15,16]] then a[[1,3],[0,3]] returns the sub-matrix: [[ 4, 6], [ 12, 14] i.e. the cross-product of [1,3] x [0,3] This is the way MATLAB works. I'm not sure what IDL does. If I understand your scheme, right now, then I would have to append an extra dimension to my indexing arrays to get this behavior, right? 2) I would like to be able to index the array in a flattenned sense as well (is that possible?) in other words, it would be nice if a[flat(9,10,11)] or something got me the elements 9,10,11 in a one-dimensional interpretation of the array. 3) Why can't you combine slice notation and indexing? Just interpret the slice as index array that would be created from using tha range operator on the same start, stop, and step objects. Is this the plan? That's all for now. I don't mean to be critical, I'm really impressed with what works so far. These are just some concerns I have right now. -Travis Oliphant From europax at home.com Sat Nov 17 08:06:02 2001 From: europax at home.com (Rob) Date: Sat Nov 17 08:06:02 2001 Subject: [Numpy-discussion] Numeric Python EM Project may need mirror Message-ID: <3BF68A67.C4963807@home.com> Hi all, I just got an email from @home yesterday, saying that all customers should back up their web pages, email, etc etc. I know they are in bankruptcy, but this email sounded ominous. I'm wondering if there is some kindly soul who would want to mirror this site. I'd really love to have this site on Starship Python, but haven't had any responses to emails to them. I'm continuously working on more code for the site so I'd hate to see it go down, even if temporarily. Sincerely, Rob. -- The Numeric Python EM Project www.members.home.net/europax From greenfield at home.com Sat Nov 17 14:58:02 2001 From: greenfield at home.com (Perry Greenfield) Date: Sat Nov 17 14:58:02 2001 Subject: FW: [Numpy-discussion] Re-implementation of Python Numerical arrays (Numeric) available for download Message-ID: -----Original Message----- > > What I've seen looks great. You've all done some good work here. > Thanks, you were origin of some of the ideas used. > Of course, I do have some feedback. I haven't looked at > everything, these > points have just caught my eye. > > Complex Types: > ============== > > 1) I don't like the idea of complex types being a separate subclass of > ndarray. This makes them "different." Unless this "difference" can be > completely hidden (which I doubt), I would prefer complex types > to be on the > same level as other numeric types. > I think that we also don't like that, and after doing the original, somewhat incomplete, implementation using the subclassed approach, I began to feel that implementing it in C (albeit using a different approach for the code generation) was probably easier and more elegant than what was done here. So you are very likely to see it integrated as a regular numeric type, with a more C-based implementation. > 2) Also, in your C-API, you have a different pointer to the > imaginary data. 
> I much prefer the way it is done currently to have complex numbers > represented as an 8-byte, or 16-byte chunk of contiguous memory. > Any reason not to allow both? (The pointer to the real can be interpreted as either a pointer to 8-byte or 16-byte quantities). It is true that figuring out the imaginary pointer from the real is trivial so I suppose it really isn't necessary. > > Index Arrays: > =========== > > 1) For what it's worth, my initial reaction to your indexing scheme is > negative. I would prefer that if > > a = [[1,2,3,4], > [5,6,7,8], > [9,10,11,12], > [13,14,15,16]] > > then > > a[[1,3],[0,3]] returns the sub-matrix: > > [[ 4, 6], > [ 12, 14] > > i.e. the cross-product of [1,3] x [0,3] This is the way MATLAB > works. I'm > not sure what IDL does. > I'm afraid I don't understand the example. Could you elaborate a bit more how this is supposed to work? (Or is it possible there is an error? I would understand it if the result were [[5, 8],[13,16]] corresponding to the index pairs [[(1,0),(1,3)],[(3,0),(3,3)]]) > If I understand your scheme, right now, then I would have to > append an extra > dimension to my indexing arrays to get this behavior, right? > > 2) I would like to be able to index the array in a flattenned > sense as well > (is that possible?) in other words, it would be nice if > a[flat(9,10,11)] or > something got me the elements 9,10,11 in a one-dimensional > interpretation of > the array. > Why not: ravel(a)[[9,10,11]] ? > 3) Why can't you combine slice notation and indexing? Just interpret the > slice as index array that would be created from using tha range > operator on > the same start, stop, and step objects. Is this the plan? > I think that allowing slicing could be possible. But things were getting pretty complex as they were, and we wanted to see if there was agreement on how it was being done so far. It could be extended to handle slices, if there was a well defined interpretation. (I think there may be at least two possible interpretations considered). As for the above, sure, but of course the slice would have to be shape consistent with the other index arrays (under the current scheme). > That's all for now. I don't mean to be critical, I'm really > impressed with > what works so far. These are just some concerns I have right now. > > -Travis Oliphant > Thanks Travis, we're looking for constructive feedback, positive or negative. Perry From greenfield at home.com Sat Nov 17 16:28:02 2001 From: greenfield at home.com (Perry Greenfield) Date: Sat Nov 17 16:28:02 2001 Subject: [Numpy-discussion] Re-implementation of Python Numerical arrays (Numeric) available for download In-Reply-To: Message-ID: > > I think that we also don't like that, and after doing the original, > > somewhat incomplete, implementation using the subarray approach, > > I began to feel that implementing it in C (albiet using a different > > approach for the code generation) was probably easier and more > > elegant than what was done here. So you are very likely to see > > it integrated as a regular numeric type, with a more C-based > > implementation. > > Sounds good. Is development going to take place on the CVS > tree. If so, I > could help out by comitting changes directly. > > > > > > 2) Also, in your C-API, you have a different pointer to the > > > imaginary data. > > > I much prefer the way it is done currently to have complex numbers > > > represented as an 8-byte, or 16-byte chunk of contiguous memory. > > > > Any reason not to allow both? 
(The pointer to the real can be > interpreted > > as either a pointer to 8-byte or 16-byte quantities). It is true > > that figuring out the imaginary pointer from the real is trivial > > so I suppose it really isn't necessary. > > I guess the way you've structured the ndarray, it is possible. I figured > some operations might be faster, but perhaps not if you have two pointers > running at the same time, anyway. > Well, the C implementation I was thinking of would only use one pointer. The API could supply both if some algorithms would find it useful to just access the imaginary data alone. But as mentioned, I don't think it is important to include, so we could easily get rid of it (and probably should) > > > > > Index Arrays: > > > =========== > > > > > > 1) For what it's worth, my initial reaction to your indexing > scheme is > > > negative. I would prefer that if > > > > > > a = [[1,2,3,4], > > > [5,6,7,8], > > > [9,10,11,12], > > > [13,14,15,16]] > > > > > > then > > > > > > a[[1,3],[0,3]] returns the sub-matrix: > > > > > > [[ 4, 6], > > > [ 12, 14] > > > > > > i.e. the cross-product of [1,3] x [0,3] This is the way MATLAB > > > works. I'm > > > not sure what IDL does. > > > > I'm afraid I don't understand the example. Could you elaborate > > a bit more how this is supposed to work? (Or is it possible > > there is an error? I would understand it if the result were > > [[5, 8],[13,16]] corresponding to the index pairs > > [[(1,0),(1,3)],[(3,0),(3,3)]]) > > > > The idea is to consider indexing with arrays of integers to be a > generalization of slice index notation. Simply interpret the > slice as an > array of integers that would be formed by using the range operator. > > For example, I would like to see > > a[1:5,1:3] be the same thing as a[[1,2,3,4],[1,2]] > > a[1:5,1:3] selects the 2-d subarray consisting of rows 1 to 4 and > columns 1 > to 2 (inclusive starting with the first row being row 0). In > other words, > the indices used to select the elements of a are ordered-pairs > taken from the > cross-product of the index set: > > [1,2,3,4] x [1,2] = [(1,1), (1,2), (2,1), (2,2), (3,1), (3,2), > (4,1), (4,2)] > and these selected elements are structured as a 2-d array of shape (4,2) > > Does this make more sense? Indexing would be a natural extension of this > behavior but allowing sets that can't be necessarily formed from > the range > function. > I understand this (but is the example in the first message consistent with this?). This is certainly a reasonable interpretation. But if this is the way multiple index arrays are interpreted, how does one easily specify scattered points in a multidimensional array? The only other alternative I can think of is to use some of the dimensions of a multidimensional index array as indices for each of the dimensions. For example, if one wanted to index random points in a 2d array, then supplying an nx2 array would provide a list of n such points. But I see this as a more limiting way to do this (and there are often benefits to being able to keep the indices for different dimensions in separate arrays). But I think doing what you would like to do is straightforward even with the existing implementation. For example, if x is a 2d array we could easily develop a function such that: x[outer_index_product([1,3,4],[1,5])] # with a better function name! The function outer_index_product would return a tuple of two index arrays each with a shape of 3x2.
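To make the two interpretations concrete, here is a minimal sketch of the cross-product selection in classic Numeric terms (cross_index is a hypothetical helper written only for this example, since plain Numeric subscripts do not accept index arrays, and outer_index_product above is likewise just a placeholder name):

from Numeric import array, take

def cross_index(x, rows, cols):
    # pick the given rows, then the given columns of the result:
    # the cross-product reading of (rows, cols)
    return take(take(x, rows, 0), cols, 1)

a = array([[ 1,  2,  3,  4],
           [ 5,  6,  7,  8],
           [ 9, 10, 11, 12],
           [13, 14, 15, 16]])
print cross_index(a, [1, 3], [0, 3])
# [[ 5  8]
#  [13 16]]

Under the pairwise interpretation, the same subscript a[[1,3],[0,3]] would instead yield the one-dimensional array [5, 16], the elements at (1,0) and (3,3); outer_index_product would bridge the two schemes by expanding its arguments into the pair of broadcast index arrays just described.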
These arrays would not take up more space than the original arrays even though they appear to have a much larger size (the one dimension is replicated by use of a 0 stride size so the data buffer is the same as the original). Would this be acceptable? In the end, all these indexing behaviors can be provided by different functions. So it isn't really a question of which one to have and which not to have. The question is what is supported by the indexing notation? For us, the behavior we have implemented is far more useful for our applications than the one you propose. But perhaps we are in the minority, so I'd be very interested in hearing which indexing interpretation is most useful to the general community. > > Why not: > > > > ravel(a)[[9,10,11]] ? > > sure, that would work, especially if ravel doesn't make a copy of > the data > (which I presume it does not). > Correct. Perry From greenfield at home.com Sat Nov 17 17:23:06 2001 From: greenfield at home.com (Perry Greenfield) Date: Sat Nov 17 17:23:06 2001 Subject: [Numpy-discussion] Re-implementation of Python Numerical arrays (Numeric) available for download In-Reply-To: Message-ID: From: Pete Shinners > 7) necessary to add other types? > yes. i really want unsigned int16 and unsigned int32. all my operations > are on pixel data, and things can just get messy when i need to treat > packed color values as signed integers. > Unsigned int16 is already supported. UInt32 could be done, but raises some interesting issues with regard to combining with Int32. I don't believe the current implementation prevents you from carrying around unsigned data in Int32 arrays. If you are using them as packed color values, do you ever do any arithmetic operations on them other than to pack and unpack them? > one other thing i'd like there to be a little focus on is adding my own > new ufunc operators. for image manipulation i'd like new ufunc operators > that clamp the results to legal values. i'd be happy to do this myself, > but i don't believe it's possible with the current Numeric. > It will be possible for users to add their own ufuncs. We will eventually document how to do so (and it should be fairly simple to do once we give a few example templates). Perry > From alessandro.mirone at wanadoo.fr Sun Nov 18 07:42:01 2001 From: alessandro.mirone at wanadoo.fr (Alessandro Mirone) Date: Sun Nov 18 07:42:01 2001 Subject: [Numpy-discussion] Heigenvalues is broken Message-ID: <3BF7E462.A473B686@wanadoo.fr> Is it a problem of lapack3.0 of of LinearAlgebra.py? ..................... ==> (Eigenvalues should be (0,2)) >>> a=array([[1,0],[0,1]]) >>> b=array([[0,1],[-1,0]]) >>> M=a+b*complex(0,1.0) >>> Heigenvalues(M) array([-2.30277564, 1.30277564]) >>> print M [[ 1.+0.j 0.+1.j] [ 0.-1.j 1.+0.j]] >>> From oliphant.travis at ieee.org Sun Nov 18 19:01:01 2001 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Sun Nov 18 19:01:01 2001 Subject: [Numpy-discussion] Heigenvalues is broken In-Reply-To: <3BF7E462.A473B686@wanadoo.fr> References: <3BF7E462.A473B686@wanadoo.fr> Message-ID: On Sunday 18 November 2001 09:40 am, Alessandro Mirone wrote: > Is it a problem of lapack3.0 of of > LinearAlgebra.py? > ..................... ==> (Eigenvalues should be (0,2)) > > >>> a=array([[1,0],[0,1]]) > >>> b=array([[0,1],[-1,0]]) > >>> M=a+b*complex(0,1.0) > >>> Heigenvalues(M) I suspect it is your lapack. On an Athlon running Mandrake Linux with the lapack-3.0-9 package, I get. 
>>> a=array([[1,0],[0,1]]) >>> b=array([[0,1],[-1,0]]) >>> M=a+b*complex(0,1.0) >>> Heigenvalues(M) array([ 0., 2.]) -Travis From nwagner at mecha.uni-stuttgart.de Sun Nov 18 23:58:01 2001 From: nwagner at mecha.uni-stuttgart.de (Nils Wagner) Date: Sun Nov 18 23:58:01 2001 Subject: [Numpy-discussion] Heigenvalues is broken References: <3BF7E462.A473B686@wanadoo.fr> Message-ID: <3BF8C9FA.97B3AEB1@mecha.uni-stuttgart.de> Alessandro Mirone schrieb: > > Is it a problem of lapack3.0 of of > LinearAlgebra.py? > ..................... ==> (Eigenvalues should be (0,2)) > > >>> a=array([[1,0],[0,1]]) > >>> b=array([[0,1],[-1,0]]) > >>> M=a+b*complex(0,1.0) > >>> Heigenvalues(M) > array([-2.30277564, 1.30277564]) > >>> print M > [[ 1.+0.j 0.+1.j] > [ 0.-1.j 1.+0.j]] > >>> > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion On an Athlon running SuSE Linux 7.3 with the lapack-3.0-0 package, I get. [-2.30277564 1.30277564] Nils From Peter.Verveer at embl-heidelberg.de Mon Nov 19 02:45:02 2001 From: Peter.Verveer at embl-heidelberg.de (Peter Verveer) Date: Mon Nov 19 02:45:02 2001 Subject: [Numpy-discussion] Re-implementation of Python Numerical arrays (Numeric) available for download In-Reply-To: <3BF59D10.2070107@shinners.org> References: <3BF59D10.2070107@shinners.org> Message-ID: On Saturday 17 November 2001 00:11 am, you wrote: > note that my main use of numpy is as a pixel buffer for images. some of > the changes like avoiding type promotion sounds really good to me :] I have exactly the same application so I agree with this. > 7) necessary to add other types? > yes. i really want unsigned int16 and unsigned int32. all my operations > are on pixel data, and things can just get messy when i need to treat > packed color values as signed integers. Yes please! One of the things that irritates me most on the original Numeric is that some types are lacking. I think the whole range of data types should be supported, even if some may be seldom used by most people. > one other thing i'd like there to be a little focus on is adding my own > new ufunc operators. for image manipulation i'd like new ufunc operators > that clamp the results to legal values. i'd be happy to do this myself, > but i don't believe it's possible with the current Numeric. I write functions in C that directly access the numeric data. I don't use the ufunc api. One reason that I do that is that I want my libary of routines to be useful independent of Numeric, so I only have a tiny glue between my C routines and Numeric. I hope that it will be still possible to do this in the new version. > the last thing i really really want is for this to be rolled into the > standard python distribution. that is perhaps the most important aspect > for me. i do not like requiring the extra dependency for generic numeric > arrays. :] I second that! Cheers, Peter -- Dr. Peter J. Verveer Bastiaens Group Cell Biology and Cell Biophysics Programme EMBL Meyerhofstrasse 1 D-69117 Heidelberg Germany Tel. : +49 6221 387245 Fax : +49 6221 387242 Email: Peter.Verveer at embl-heidelberg.de From tpitts at accentopto.com Mon Nov 19 05:58:03 2001 From: tpitts at accentopto.com (Todd Alan Pitts, Ph.D.) 
Date: Mon Nov 19 05:58:03 2001 Subject: [Numpy-discussion] Re-implementation of Python Numerical arrays (Numeric) available for download In-Reply-To: ; from oliphant.travis@ieee.org on Fri, Nov 16, 2001 at 07:43:41PM -0700 References: Message-ID: <20011119065758.B11653@fermi.accentopto.com> Thanks for all of your work. Things seem to be shaping up nicely. I just wanted to second some of the concerns below: > Complex Types: > ============== > > 1) I don't like the idea of complex types being a separate subclass of > ndarray. This makes them "different." Unless this "difference" can be > completely hidden (which I doubt), I would prefer complex types to be on the > same level as other numeric types. > > 2) Also, in your C-API, you have a different pointer to the imaginary data. > I much prefer the way it is done currently to have complex numbers > represented as an 8-byte, or 16-byte chunk of contiguous memory. > The second comment above is really critical for accessing utility available in a very large number of numerical libraries. In my view this would "break" the utility of numpy severely -- recopying arrays both on the way out and the way in would be extremely cumbersome. -Todd Alan Pitts From jh at oobleck.astro.cornell.edu Mon Nov 19 08:47:02 2001 From: jh at oobleck.astro.cornell.edu (Joe Harrington) Date: Mon Nov 19 08:47:02 2001 Subject: [Numpy-discussion] Re: Numpy-discussion digest, Vol 1 #345 - 4 msgs In-Reply-To: (numpy-discussion-request@lists.sourceforge.net) References: Message-ID: <200111191646.fAJGkCL28182@oobleck.astro.cornell.edu> Just to fill in the blanks, here's what IDL does: IDL> a = [[1,2,3,4], $ IDL> [5,6,7,8], $ IDL> [9,10,11,12], $ IDL> [13,14,15,16]] IDL> print,a 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 IDL> print, a[[1,3],[0,3]] 2 16 --jh-- From jsw at cdc.noaa.gov Mon Nov 19 11:37:05 2001 From: jsw at cdc.noaa.gov (Jeff Whitaker) Date: Mon Nov 19 11:37:05 2001 Subject: [Numpy-discussion] Heigenvalues is broken In-Reply-To: Message-ID: On Sun, 18 Nov 2001, Travis Oliphant wrote: > On Sunday 18 November 2001 09:40 am, Alessandro Mirone wrote: > > Is it a problem of lapack3.0 of of > > LinearAlgebra.py? > > ..................... ==> (Eigenvalues should be (0,2)) > > > > >>> a=array([[1,0],[0,1]]) > > >>> b=array([[0,1],[-1,0]]) > > >>> M=a+b*complex(0,1.0) > > >>> Heigenvalues(M) > > I suspect it is your lapack. On an Athlon running Mandrake Linux with the > lapack-3.0-9 package, I get. > > >>> a=array([[1,0],[0,1]]) > >>> b=array([[0,1],[-1,0]]) > >>> M=a+b*complex(0,1.0) > >>> Heigenvalues(M) > array([ 0., 2.]) This is definitely a hardware/compiler dependant feature. I get the "right" answer on Solaris (with the forte compiler) but the same "wrong" answer as Alessandro on MacOS X/gcc. I've tried fiddling with compiler options on my OS X box, to no avail. -Jeff -- Jeffrey S. Whitaker Phone : (303)497-6313 Meteorologist FAX : (303)497-6449 NOAA/OAR/CDC R/CDC1 Email : jsw at cdc.noaa.gov 325 Broadway Web : www.cdc.noaa.gov/~jsw Boulder, CO, USA 80303-3328 Office : Skaggs Research Cntr 1D-124 From ransom at physics.mcgill.ca Mon Nov 19 11:47:02 2001 From: ransom at physics.mcgill.ca (Scott Ransom) Date: Mon Nov 19 11:47:02 2001 Subject: [Numpy-discussion] Heigenvalues is broken In-Reply-To: References: Message-ID: On November 19, 2001 02:36 pm, Jeff Whitaker wrote: > > This is definitely a hardware/compiler dependant feature. I get the > "right" answer on Solaris (with the forte compiler) but the same "wrong" > answer as Alessandro on MacOS X/gcc. 
I've tried fiddling with compiler > options on my OS X box, to no avail. But seemingly it is even stranger than this. Here are my results from Debian unstable using Lapack 3.0 on an Athlon system: Python 2.1.1 (#1, Nov 11 2001, 18:19:24) [GCC 2.95.4 20011006 (Debian prerelease)] on linux2 Type "copyright", "credits" or "license" for more information. >>> from LinearAlgebra import * >>> a=array([[1,0],[0,1]]) >>> b=array([[0,1],[-1,0]]) >>> M=a+b*complex(0,1.0) >>> Heigenvalues(M) array([ 0., 2.]) Scott > On Sun, 18 Nov 2001, Travis Oliphant wrote: > > On Sunday 18 November 2001 09:40 am, Alessandro Mirone wrote: > > > Is it a problem of lapack3.0 of of > > > LinearAlgebra.py? > > > ..................... ==> (Eigenvalues should be (0,2)) > > > > > > >>> a=array([[1,0],[0,1]]) > > > >>> b=array([[0,1],[-1,0]]) > > > >>> M=a+b*complex(0,1.0) > > > >>> Heigenvalues(M) > > > > I suspect it is your lapack. On an Athlon running Mandrake Linux with > > the lapack-3.0-9 package, I get. > > > > >>> a=array([[1,0],[0,1]]) > > >>> b=array([[0,1],[-1,0]]) > > >>> M=a+b*complex(0,1.0) > > >>> Heigenvalues(M) > > > > array([ 0., 2.]) > > This is definitely a hardware/compiler dependant feature. I get the > "right" answer on Solaris (with the forte compiler) but the same "wrong" > answer as Alessandro on MacOS X/gcc. I've tried fiddling with compiler > options on my OS X box, to no avail. > > -Jeff > > -- > Jeffrey S. Whitaker Phone : (303)497-6313 > Meteorologist FAX : (303)497-6449 > NOAA/OAR/CDC R/CDC1 Email : jsw at cdc.noaa.gov > 325 Broadway Web : www.cdc.noaa.gov/~jsw > Boulder, CO, USA 80303-3328 Office : Skaggs Research Cntr 1D-124 > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion -- Scott M. Ransom Address: McGill Univ. Physics Dept. Phone: (514) 398-6492 3600 University St., Rm 338 email: ransom at physics.mcgill.ca Montreal, QC Canada H3A 2T8 GPG Fingerprint: 06A9 9553 78BE 16DB 407B FFCA 9BFA B6FF FFD3 2989 From Barrett at stsci.edu Mon Nov 19 14:12:02 2001 From: Barrett at stsci.edu (Paul Barrett) Date: Mon Nov 19 14:12:02 2001 Subject: [Numpy-discussion] Re-implementation of Python Numerical arrays (Numeric) available for download References: Message-ID: <3BF98336.9010500@STScI.Edu> Perry Greenfield wrote: > > An early beta version is available on sourceforge as the > package Numarray (http://sourceforge.net/projects/numpy/) > > Information on the goals, changes in user interface, open issues, > and design can be found at http://aten.stsci.edu/numarray 6) Should array properties be accessible as public attributes instead of through accessor methods? We don't currently allow public array attributes to make the Python code simpler and faster (otherwise we will be forced to use __setattr__ and such). This results in incompatibilty with previous code that uses such attributes. I prefer the use of public attributes over accessor methods. -- Paul Barrett, PhD Space Telescope Science Institute Phone: 410-338-4475 ESS/Science Software Group FAX: 410-338-4767 Baltimore, MD 21218 From perry at stsci.edu Tue Nov 20 12:29:13 2001 From: perry at stsci.edu (Perry Greenfield) Date: Tue Nov 20 12:29:13 2001 Subject: [Numpy-discussion] Re: Re-implementation of Python Numerical arrays (Numeric) available for download In-Reply-To: Message-ID: > 6) Should array properties be accessible as public attributes > instead of through accessor methods? 
> > We don't currently allow public array attributes to make > the Python code simpler and faster (otherwise we will > be forced to use __setattr__ and such). This results in > incompatibility with previous code that uses such attributes. > > > I prefer the use of public attributes over accessor methods. > > > -- > Paul Barrett, PhD Space Telescope Science Institute The issue of efficiency may not be a problem with Python 2.2 or later since it provides new mechanisms that avoid the need to use __setattr__ to solve this problem (e.g. __slots__, property, __get__, and __set__). So it becomes more of an issue of which style people prefer rather than simplicity and speed of the code. Perry From chrishbarker at home.net Tue Nov 20 15:23:12 2001 From: chrishbarker at home.net (Chris Barker) Date: Tue Nov 20 15:23:12 2001 Subject: [Numpy-discussion] Re: Re-implementation of Python Numerical arrays (Numeric) available for download References: Message-ID: <3BFAEA19.3153B495@home.net> Perry Greenfield wrote: > > One major comment that isn't directly addressed on the web page is the > > ease of writing new functions, I suppose Ufuncs, although I don't > > usually care if they work on anything other than Arrays. I hope the new > > system will make it easier to write new ones. > Absolutely. We will provide examples of how to write new ufuncs. It should > be very simple in one sense (requiring few lines of code) if our code > generator machinery is used (but context is important here so this > is why examples or a template is extremely important). But it isn't > particularly hard to do without the code generator. And such ufuncs > will handle *all* the generality of arrays including slices, non-aligned > arrays, byteswapped arrays, and type conversion. I'd like to provide > examples of writing ufuncs within a few weeks (along with examples > of other kinds of functions using the C-API as well). This sounds great! The code generating machinery sounds very promising, and examples are, of course, key. I found digging through the NumPy source to figure out how to do things very treacherous. Making writing Ufuncs easy will encourage a lot more C Ufuncs to be written, which should help performance. > > Also, I can't help wondering if this could leverage more existing code. > > The blitz++ package being used by Eric Jones in the SciPy.compiler > > project looks very promising. It's probably too late, but I'm wondering > > what the reasons are for re-inventing such a general purpose wheel. > > > I'm not sure which "wheel" you are talking about :-) The wheel I'm talking about is multi-dimensional array objects... > We certainly > aren't trying to replicate what Eric Jones has done with the > SciPy.compiler approach (which is very interesting in its own right). I know, I just think using an existing set of C++ classes for multiple typed multidimensional arrays would make sense, although I imagine it is too late now! > If the issue is why we are redoing Numeric: Actually, I think I had a pretty good idea why you were working on this. > 1) it has to be rewritten to be acceptable to Guido before it can be > part of the Standard Library. > 2) to add new types (e.g. unsigned) and representations (e.g., non-aligned, > byteswapped, odd strides, etc). Using memory mapped data requires some > of these. > 3) to make it more memory efficient with large arrays.
> 4) to make it more generally extensible I'm particularly excited about 1) and 4) > > As a whole I have found that I would like the transition from Python to > > Compiled languages to be smoother. The standard answer to Python > > performance is to profile, and then re-write the computationally intensive > > portions in C. This would be a whole lot easier if Python used datatypes > > that are easy to use from C/C++ as well as Python. I hope NumPy2 can > > move in this direction. > > > What do you see as missing in numarray in that sense? Aside from UInt32 > I'm not aware of any missing type that is available on all platforms. > There is the issue of Float128 and such. Adding these is not hard. > The real issue is how to deal with the platforms that don't support them. I used poor wording. When I wrote "datatypes", I meant data types in a much higher order sense. Perhaps structures or classes would be a better term. What I mean is that it should be easy to use and manipulate the same multidimensional arrays from both Python and C/C++. In the current Numeric, most folks generate a contiguous array, and then just use the array->data pointer to get what is essentially a C array. That's fine if you are using it in a traditional C way, with fixed dimensions, one datatype, etc. What I'm imagining is having an object in C or C++ that could be easily used as a multidimensional array. I'm thinking C++ would probably be necessary, and probably templates as well, which is why blitz++ looked promising. Of course, blitz++ only compiles with a few up-to-date compilers, so you'd never get it into the standard library that way! This could also lead the way to being able to compile NumPy code.... > I think it is pretty easy to install since it uses distutils. I agree, but from the newsgroup, it is clear that a lot of folks are very reluctant to use something that is not part of the standard library. > > > We estimate > > > that numarray is probably another order of magnitude worse, > > > i.e., that 20K element arrays are at half the asymptotic > > > speed. How much should this be improved? > > > > A lot. I use arrays smaller than that most of the time! > > > What is good enough? As fast as current Numeric? As fast as current Numeric would be "good enough" for me. It would be a shame to go backwards in performance! > (IDL does much > better than that for example). My personal benchmark is MATLAB, which I imagine is similar to IDL in performance. > 10 element arrays will never be > close to C speed in any array based language embedded in an > interpreted environment. Well, sure, I'm not expecting that > 100, maybe, but will be very hard. > 1000 should be possible with some work. I suppose MATLAB has it easier, as all arrays are doubles, and (until recently anyway) all variables were arrays, and all arrays were 2-d. NumPy is a lot more flexible than that. Is it the type and size checking that takes the time? > Another approach is to try to cast many of the functions as being > able to broadcast over repeated small arrays. After all, if one > is only doing a computation on one small array, it seems unlikely > that the overhead of Python will be objectionable. Only if you > have many such arrays to repeat calculations on, should it be > a problem (or am I wrong about that). You are probably right about that. > If these repeated calculations > can be "assembled" into a higher dimensionality array (which > I understand isn't always possible) and operated on in that sense, > the efficiency issue can be dealt with.
I do that when possible, but it's not always possible. > But I guess this can only > be seen with specific existing examples and programs. I would > be interested in seeing the kinds of applications you have now > to gauge what the most effective solution would be. One of the things I do a lot with are coordinates of points and polygons. Sets of points I can handle easily as an NX2 array, but polygons don't work so well, as each polygon has a different number of points, so I use a list of arrays, which I have to loop over. Each polygon can have from about 10 to thousands of points (mostly 10-20, however). One way I have dealt with this is to store a polygon set as a large array of all the points, and another array with the indexes of the start and end of each polygon. That way I can transform the coordinates of all the polygons in one operation. It works OK, but sometimes it is more useful to have them in a sequence. > As mentioned, > we tend to deal with large data sets and so I don't think we have > a lot of such examples ourselves. I know large datasets were one of your driving factors, but I really don't want to make performance on smaller datasets secondary. I hope I'll get a chance to play with it soon.... -Chris -- Christopher Barker, Ph.D. ChrisHBarker at home.net --- --- --- http://members.home.net/barkerlohmann ---@@ -----@@ -----@@ ------@@@ ------@@@ ------@@@ Oil Spill Modeling ------ @ ------ @ ------ @ Water Resources Engineering ------- --------- -------- Coastal and Fluvial Hydrodynamics -------------------------------------- ------------------------------------------------------------------------ From nwagner at mecha.uni-stuttgart.de Thu Nov 22 02:43:06 2001 From: nwagner at mecha.uni-stuttgart.de (Nils Wagner) Date: Thu Nov 22 02:43:06 2001 Subject: [Numpy-discussion] Numpy for FORTRAN users Message-ID: <3BFCE508.E6C365DF@mecha.uni-stuttgart.de> Hi, Currently users must be aware of the fact that multi-dimensional arrays are stored differently in Python and Fortran. Is there any progress so that users will not need to worry about this rather confusing and technical detail? Nils From martin.wiechert at gmx.de Thu Nov 22 05:23:02 2001 From: martin.wiechert at gmx.de (Martin Wiechert) Date: Thu Nov 22 05:23:02 2001 Subject: [Numpy-discussion] Numpy2 and GSL Message-ID: Hi! Just an uneducated question. Are there any plans to wrap GSL for Numpy2? I did not actually try it (It's not Python ;-)), but it looks clean and powerful. Regards, Martin. From hinsen at cnrs-orleans.fr Thu Nov 22 06:29:02 2001 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Thu Nov 22 06:29:02 2001 Subject: [Numpy-discussion] Numpy2 and GSL In-Reply-To: References: Message-ID: Martin Wiechert writes: > Are there any plans to wrap GSL for Numpy2? > I did not actually try it (It's not Python ;-)), > but it looks clean and powerful. I have heard that several projects decided not to use it for legal reasons; GSL is GPL, not LGPL. Personally I don't see the problem for Python/NumPy, but then I am not a lawyer... And I haven't used GSL either, but it looks good from the description. Konrad.
-- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From edcjones at erols.com Thu Nov 22 17:30:10 2001 From: edcjones at erols.com (Edward C. Jones) Date: Thu Nov 22 17:30:10 2001 Subject: [Numpy-discussion] Numeric & changes in Python division Message-ID: <3BFDA742.5080109@erols.com> # Python 2.2b1, Numeric 20.2.0 from __future__ import division import Numeric arr = Numeric.ones((2,2), 'f') arr = arr/2.0 #Traceback (most recent call last): # File "bug.py", line 6, in ? #arr = arr/2.0 #TypeError: unsupported operand type(s) for / From paul at pfdubois.com Thu Nov 22 18:51:01 2001 From: paul at pfdubois.com (Paul F. Dubois) Date: Thu Nov 22 18:51:01 2001 Subject: [Numpy-discussion] Numeric & changes in Python division In-Reply-To: <3BFDA742.5080109@erols.com> Message-ID: <000201c173c9$606902c0$3d01a8c0@plstn1.sfba.home.com> You know what the doctor said: if it hurts when you do that, don't do that. Seriously, I have not the slightest idea what you're doing here. My project won't get to 2.2 until well into the new year. Especially if stuff like this has to be fixed. I haven't even read most of the 2.2 changes. I understand this is also an issue with CXX. Barry Scott runs CXX now since I am no longer in a job where I use C++. When he will get to this I don't know. I need to demote myself on the CXX website. You haven't seen any recent changes to Numpy, or comments from me on numarray, because I have a release to get out at my job. -----Original Message----- From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-discussion-admin at lists.sourceforge.net] On Behalf Of Edward C. Jones Sent: Thursday, November 22, 2001 5:33 PM To: numpy-discussion at lists.sourceforge.net Subject: [Numpy-discussion] Numeric & changes in Python division # Python 2.2b1, Numeric 20.2.0 from __future__ import division import Numeric arr = Numeric.ones((2,2), 'f') arr = arr/2.0 #Traceback (most recent call last): # File "bug.py", line 6, in ? #arr = arr/2.0 #TypeError: unsupported operand type(s) for / _______________________________________________ Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/numpy-discussion From siopis at umich.edu Fri Nov 23 20:59:01 2001 From: siopis at umich.edu (Christos Siopis ) Date: Fri Nov 23 20:59:01 2001 Subject: [Numpy-discussion] Meta: too many numerical libraries doing the same thing? In-Reply-To: Message-ID: [ This message got longer than i had initially thought, but these thoughts have been bugging me for so long that i cannot resist the temptation to push the send button! Apologies in advance to those not interested... ] On Mon, 26 Nov 2001, Martin Wiechert wrote: > Hi! > > Just an uneducated question. > Are there any plans to wrap GSL for Numpy2? > I did not actually try it (It's not Python ;-)), > but it looks clean and powerful. > > Regards, > Martin. I actually think that this question has come up before in this list, perhaps more than once. And i think it brings up a bigger issue, which is: to what extent is it useful for the numerical community to have multiple numerical libraries, and to what extent does this constitute a waste of resources? 
Numpy (Python), PDL (Perl), GSL (C), and a rather large number of other libraries usually have to re-implement the same old numerical algorithms, but offered under a different interface each time. However, there is such a big body of numerical algorithms out there that it's a daunting effort to integrate them into every language's numerical library (anyone want to implement LAPACK's functionality in Numpy?) The compromise that is usually made is to wrap one library around another. While this may be "better than nothing", it is usually not a pleasant situation as it leads to inconsistencies in the interface, inconsistencies in the error handling, difficulties in the installation, problems with licensing,... Since i have been a beneficiary rather than a contributor to the numerical open-source community, i feel somewhat hesitant to file this "complaint", but i really do think that there are relatively few people out there who are both willing and capable of building quality open-source numerical software, while there are too many algorithms to implement, so the community should be vigilant to minimize waste of resources! Don't take me wrong, i am not saying that Numpy, PDL, GSL & co. should be somehow "merged" --obviously, one needs different wrappers to call numerical routines from Python, Perl, C, C++ or Java. But there should be a way so that the actual *implementation* of the numerical algorithms is only done once and for all. So what i envision, in some sense, is a super-library of "all"/as many as possible numerical algorithms, which will present appropriate (but consistent) APIs for different programming languages, so that no matter what language i use, i can expect consistent interface, consistent numerical behavior, consistent error handling etc. Furthermore, different levels of access should allow the application developer to access low-level or high-level routines as needed (and could object orientation be efficiently built as a higher-level wrapper?) This way, the programmer won't have to worry whether the secant root finder that s/he is using handles exceptions well or how NaNs are treated. Perhaps most importantly, people would feel compelled to go into the pain of "translating" existing libraries such as LAPACK into this common framework, because they will know that this will benefit the entire community and won't go away when the next scripting language du jour eclipses their current favorite. Over time, this may lead to a truly precious resource for the numerical community. Now, i do realize that this may sound like a "holy grail" of numerical computing, that it is something which is very difficult, if not impossible to accomplish. It certainly does not seem like a project that the next ambitious programmer or lab group would want to embark into on a rainy day. Rather, it would require a number of important requirements and architectural decisions to be made first, and trade-offs considered. This would perhaps be best coordinated by the numerical community at large, perhaps under the auspices of some organization. But this would be time well-spent, for it would form the foundations on which a truly universal numerical library could be built. Experience gained from all the numerical projects to this day would obviously be invaluable in such an endeavor. 
I suspect that this list may not be the best place to discuss such a topic, but i think that some of the most active people in the field lurk here, and i would love to hear their thoughts and understand why i am wrong :) If there is a more appropriate forum to discuss such issues, i would be glad to be pointed to it --in which case, please disregard this posting! *************************************************************** / Christos Siopis | Tel : 734-764-3440 \ / Postdoctoral Research Fellow | \ / Department of Astronomy | FAX : 734-763-6317 \ / University of Michigan | \ / Ann Arbor, MI 48109-1090 | E-mail : siopis at umich.edu \ / U.S.A. _____________________| \ / / http://www.astro.lsa.umich.edu/People/siopis.html \ *************************************************************** From jh at oobleck.astro.cornell.edu Sat Nov 24 19:14:02 2001 From: jh at oobleck.astro.cornell.edu (Joe Harrington) Date: Sat Nov 24 19:14:02 2001 Subject: [Numpy-discussion] Re: Meta: too many numerical libraries doing the same thing? In-Reply-To: (numpy-discussion-request@lists.sourceforge.net) References: Message-ID: <200111250313.fAP3DUL21168@oobleck.astro.cornell.edu> Yes, this issue has been raised here before. It was the main conclusion of Paul Barrett's and my BOF session at ADASS a 5 years ago (see our report at http://oobleck.astro.cornell.edu/jh/ast/papers/idae96.ps). The main problems are that we scientists are too individualistic to get organized around a single library, too pushed by job pressures to commit much concentrated time to it ourselves, and too poor to pay the architects, coders, doc writers, testers, etc. to write it for us. Socially, we *want* to reinvent the wheel, because we want to be riding on our own wheels. Once we are riding reasonably well for our own needs, our interest and commitment vanishes. We're off to write the next paper. Following that conference, I took a poll on this list looking for help to implement the library. About half a dozen people responded that they could put in up to 10 hours a week, which in my experience isn't enough, once things get hard and attrition sets in. Nonetheless, Paul and I proposed to the NASA Astrophysics Data Analysis Program to hire some people to write it, but we were turned down. We proposed the idea to the head of the High Energy Astrophysics group at NASA Goddard, and he agreed -- as long as what we were really doing was writing software for his group's special needs. The frustrating thing is how many hundreds of astronomy projects hire people to do their 10% of this problem, and how unwilling they are to pool resources to do the general problem. A few of the volunteers in my query to this list have gone on to do SciPy, to their credit, but I don't see them moving in the direction we outlined. Still, they have the capacity to do it right in Python and compiled code written explicitly for Python. They won't solve the general problem, but they may solve the first problem, namely getting a data analysis environment that is OSS and as good as IDL et al. in terms of end-to-end functionality, completeness, and documentation. I like the notion that the present list is for designing and building the underlying language capabilities into Python, and for getting them standardized, tested, and included in the main Python distribution. It is also a good place for debating the merits of different implementations of particular functionality. 
That leaves the job of building coherent end-user data analysis packages (which necessarily have to pick one routine to be called "fft", one device-independent graphics subsystem, etc.) to application groups like SciPy. There can be more than one of these, if that's necessary, but they should all use the same underlying numerical language capability. I hope that the application groups from several array-based OSS languages will someday get together and collaborate on an ueberlibrary of numerical and graphics routines (the latter being the real sticking point) that are easily wrapped by most languages. That seems backwards, but I think the social reality is that that's the way it is going to be, if it ever happens at all. --jh-- From paul at pfdubois.com Sat Nov 24 19:59:01 2001 From: paul at pfdubois.com (Paul F. Dubois) Date: Sat Nov 24 19:59:01 2001 Subject: [Numpy-discussion] Re: Meta: too many numerical libraries doing the same thing? In-Reply-To: <200111250313.fAP3DUL21168@oobleck.astro.cornell.edu> Message-ID: <000101c17565$12af2760$3d01a8c0@plstn1.sfba.home.com> There is more to this issue than meets the eye, both technically and historically. For numerical algorithms to be available independent of language, they would have to be packaged as components such as COM objects. While there is research in this field, nobody knows whether it can be done in a way that is efficient enough. For a given language like C, C++, Eiffel or Fortran used as the speed-demon base for wrapping up in Python, there are some difficult technical issues. Reusable numerical software needs context to operate and there is no decent way to supply the context in a non-object-oriented language. Geoff Furnish wrote a good paper about the issue for C++ showing the way to truly reusable libraries in that language, and recent improvements in Eiffel make it easier to do there now. In C or Fortran you simply can't do it. (Note that Eiffel or C++ versions of some NAG routines typically have methods with one or two arguments while the C or Fortran ones have 15 or more; a routine is not reusable if you have to understand that many arguments to try it. There are also important issues with regard to error handling and memory.) The second issue is the algorithmic issue: most scientists do NOT know the right algorithms to use, and the ones they do use are often inferior. The good algorithms are for the most part in commercial libraries, and the numerical analysis literature, where they were written by numerical analysts. Often the coding from both sources is unavailable for free use, in the wrong language, and/or wretched. The commercial libraries also exist because some companies have requirements for fiduciary responsibility; in effect, they need a guarantor of the software to show that they have not carelessly depended on software of unknown quality. In short, computer scientists are not going to be able to write such a library without an army of numerical analysts familiar with the literature, and the numerical analysts aren't going to write it unless they are OO-experienced, which almost all of them aren't, so far. Most people when they discuss mathematical software think of leaves on the call tree. In fact the most useful mathematical software, in the sense that it incorporates the most expertise, is middleware such as ODE solvers, integrators, root finders, etc. The algorithm itself will have many controls, optional outputs, etc. This requires a library-wide design motif.
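As a toy illustration of that point (hypothetical names, not any actual library's API): a context-carrying solver object needs only one or two arguments per call, where an equivalent C or Fortran routine would take every control and error flag on each invocation.

class RootFinder:
    # the "context": controls travel with the object instead of
    # appearing as extra arguments in every call
    def __init__(self, tolerance=1.0e-10, max_iterations=100):
        self.tolerance = tolerance
        self.max_iterations = max_iterations
    def solve(self, f, x0, x1):
        # secant iteration; failure raises an exception rather than
        # returning an error code the caller may forget to check
        for i in range(self.max_iterations):
            f0, f1 = f(x0), f(x1)
            if abs(f1) < self.tolerance:
                return x1
            x0, x1 = x1, x1 - f1 * (x1 - x0) / (f1 - f0)
        raise ArithmeticError("no convergence")

print RootFinder().solve(lambda x: x * x - 2.0, 1.0, 2.0)   # 1.41421356..., i.e. sqrt(2)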
I thus feel there are perfectly good reasons not to expect such a library soon. The Python community could do a good OO-design using what is available (such as LAPACK) but we haven't -- all the contributions are functional. From hinsen at cnrs-orleans.fr Sun Nov 25 04:45:02 2001 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Sun Nov 25 04:45:02 2001 Subject: [Numpy-discussion] Meta: too many numerical libraries doing the same thing? Message-ID: <200111251244.fAPCiIj01855@localhost.localdomain> "Christos Siopis " writes: > Don't take me wrong, i am not saying that Numpy, PDL, GSL & co. should be > somehow "merged" --obviously, one needs different wrappers to call > numerical routines from Python, Perl, C, C++ or Java. But there should be > a way so that the actual *implementation* of the numerical algorithms is > only done once and for all. I agree that sounds nice in theory. But even if it were technically feasible (which I doubt) given the language differences, it would be a development project that is simply too big for scientists to handle as a side job, even if they were willing (which again I doubt). My impression is that the organizational aspects of software development are often neglected. Some people are good programmers but can't work well in teams. Others can work in teams, but are not good coordinators. A big project requires at least one, if not several, people who are good scientists and programmers, have coordinator skills, and a job description that permits them to take up the task. Plus a larger number of people who are good scientists and programmers and can work in teams. Finally, all of these have to agree on languages, design principles, etc. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From tim.hochberg at ieee.org Sun Nov 25 10:50:02 2001 From: tim.hochberg at ieee.org (Tim Hochberg) Date: Sun Nov 25 10:50:02 2001 Subject: [Numpy-discussion] Re-implementation of Python Numerical arrays (Numeric) available for download References: <3BF98336.9010500@STScI.Edu> Message-ID: <01fd01c175e1$e6ae7990$87740918@cx781526b> From: "Paul Barrett" > Perry Greenfield wrote: > > > > > An early beta version is available on sourceforge as the > > package Numarray (http://sourceforge.net/projects/numpy/) > > > > Information on the goals, changes in user interface, open issues, > > and design can be found at http://aten.stsci.edu/numarray > > > 6) Should array properties be accessible as public attributes > instead of through accessor methods? > > We don't currently allow public array attributes to make > the Python code simpler and faster (otherwise we will > be forced to use __setattr__ and such). This results in > incompatibility with previous code that uses such attributes. > > > I prefer the use of public attributes over accessor methods. As do I. As of Python 2.2, __getattr__/__setattr__ should not be required anyway: new style classes allow this to be done in a more pleasant way. (I'm still too fuzzy on the details to describe it coherently here though.)
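The 2.2 details Tim alludes to look roughly like this (a minimal sketch using the new property builtin on a new-style class; Demo is a stand-in name, not numarray's actual class):

class Demo(object):
    def __init__(self, shape):
        self._shape = tuple(shape)
    def _get_shape(self):
        return self._shape
    def _set_shape(self, value):
        # validation runs on every assignment, but only for this
        # attribute -- no general __setattr__ hook is needed
        self._shape = tuple(value)
    shape = property(_get_shape, _set_shape)

d = Demo((2, 3))
d.shape = [3, 2]
print d.shape        # (3, 2)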
-tim From nwagner at mecha.uni-stuttgart.de Mon Nov 26 01:55:03 2001 From: nwagner at mecha.uni-stuttgart.de (Nils Wagner) Date: Mon Nov 26 01:55:03 2001 Subject: [Numpy-discussion] Sort , Complex array Message-ID: <3C021F19.E0D869CB@mecha.uni-stuttgart.de> Hi, How can I sort an array of complex eigenvalues with respect to the imaginary part (in ascending order) in Numpy ? All eigenvalues appear in complex conjugate pairs. Nils From hinsen at cnrs-orleans.fr Mon Nov 26 02:46:02 2001 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Mon Nov 26 02:46:02 2001 Subject: [Numpy-discussion] Sort , Complex array In-Reply-To: <3C021F19.E0D869CB@mecha.uni-stuttgart.de> References: <3C021F19.E0D869CB@mecha.uni-stuttgart.de> Message-ID: Nils Wagner writes: > How can I sort an array of complex eigenvalues with respect to the > imaginary part > (in ascending order) in Numpy ? > All eigenvalues appear in complex conjugate pairs. indices = argsort(eigenvalues.imag) eigenvalues = take(eigenvalues, indices) Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From gvermeul at labs.polycnrs-gre.fr Mon Nov 26 02:48:02 2001 From: gvermeul at labs.polycnrs-gre.fr (Gerard Vermeulen) Date: Mon Nov 26 02:48:02 2001 Subject: [Numpy-discussion] Sort , Complex array In-Reply-To: <3C021F19.E0D869CB@mecha.uni-stuttgart.de> References: <3C021F19.E0D869CB@mecha.uni-stuttgart.de> Message-ID: <01112611475600.19933@taco.polycnrs-gre.fr> On Monday 26 November 2001 11:53, Nils Wagner wrote: > Hi, > > How can I sort an array of complex eigenvalues with respect to the > imaginary part > (in ascending order) in Numpy ? > All eigenvalues appear in complex conjugate pairs. > > Nils > I have solved that like this: >>> from Numeric import * >>> a = array([3+3j, 1+1j, 2+2j]) >>> b = a.imag >>> print take(a, argsort(b)) [ 1.+1.j 2.+2.j 3.+3.j] >>> Best regards -- Gerard From nwagner at mecha.uni-stuttgart.de Mon Nov 26 07:03:06 2001 From: nwagner at mecha.uni-stuttgart.de (Nils Wagner) Date: Mon Nov 26 07:03:06 2001 Subject: [Numpy-discussion] Augmented matrix Message-ID: <3C026834.E56CE70@mecha.uni-stuttgart.de> Hi, How can I build an augmented matrix [A,b] in Numpy, where A is an m * n matrix (m>n) and b is an m*1 vector? Nils From hinsen at cnrs-orleans.fr Mon Nov 26 08:34:02 2001 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Mon Nov 26 08:34:02 2001 Subject: [Numpy-discussion] Augmented matrix In-Reply-To: <3C026834.E56CE70@mecha.uni-stuttgart.de> References: <3C026834.E56CE70@mecha.uni-stuttgart.de> Message-ID: Nils Wagner writes: > How can I build an augmented matrix [A,b] in Numpy, > where A is an m * n matrix (m>n) and b is an m*1 vector AB = concatenate((A, b[:, NewAxis]), -1) (assuming b is of rank 1) Konrad.
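For concreteness, a small worked example of that recipe (the shapes are chosen arbitrarily; any m x n A and length-m b would do):

>>> from Numeric import ones, zeros, concatenate, NewAxis, Float
>>> A = ones((4, 2), Float)
>>> b = zeros(4, Float)
>>> AB = concatenate((A, b[:, NewAxis]), -1)
>>> AB.shape
(4, 3)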
-- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From chrishbarker at home.net Mon Nov 26 10:30:02 2001 From: chrishbarker at home.net (Chris Barker) Date: Mon Nov 26 10:30:02 2001 Subject: [Numpy-discussion] Meta: too many numerical libraries doing the same thing? References: <200111251244.fAPCiIj01855@localhost.localdomain> Message-ID: <3C028E87.82C57211@home.net> Another factor that complicates things is open source philosophy and the licenses that go with it. The GSL project looks very promising, and the ultimate goals of that project appear to be to create a coherent and complete numerical library. This kind of thing NEEDS to be open source, and the GSL folks have chosen a license (GPL) that guarantees that it remains that way. That is a good thing. The license also make it impossible to use the library in closed source projects, which is a deal killer for a lot of people, but it is also an important attribute for many folks that don't think there should be closed source projects at all. I believe that that will greatly stifle the potential of the project, but it fits with the philosophy iof it's creators. Personally I think the LGPL would have guaranteed the future openness of the source, and allowed a much greater user (and therefor contributer) base. BTW, IANAL either, but my reading of the GPL and Python's "GPL compatable" license, is that GSL could be used with Python, but the result would have to be released under the GPL. That means it could not be imbedded in a closed source project. As a rule, Python itself and most of the libraries I have seen for it (Numeric, wxPython, etc.) are released under licences that allow propriatary use, so we probably don't want to make Numeric, or SciPy GPL. too bad. On another note, it looks like the blitz++ library might be a good basis for a general Numerical library (and NumPy 3) as well. It does come with a flexible license. Any thoughts? -Chris -- Christopher Barker, Ph.D. ChrisHBarker at home.net --- --- --- http://members.home.net/barkerlohmann ---@@ -----@@ -----@@ ------@@@ ------@@@ ------@@@ Oil Spill Modeling ------ @ ------ @ ------ @ Water Resources Engineering ------- --------- -------- Coastal and Fluvial Hydrodynamics -------------------------------------- ------------------------------------------------------------------------ From hinsen at cnrs-orleans.fr Mon Nov 26 11:40:03 2001 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Mon Nov 26 11:40:03 2001 Subject: [Numpy-discussion] Meta: too many numerical libraries doing the same thing? References: <200111251244.fAPCiIj01855@localhost.localdomain> <3C028E87.82C57211@home.net> Message-ID: <200111261938.fAQJcmd01426@localhost.localdomain> Chris Barker writes: > On another note, it looks like the blitz++ library might be a good basis > for a general Numerical library (and NumPy 3) as well. It does come > with a flexible license. Any thoughts? I think the major question is whether we are willing to move to C++. And if we want to keep up any pretentions for Numeric becoming part of the Python core, this translates into whether Guido will accept C++ code in the Python core. 
>From a more pragmatic point of view, I wonder what the implications for efficiency would be. C++ used to be very different in their optimization abilities, is that still the case? Even more pragmatically, is blitz++ reasonably efficient with g++? Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From chrishbarker at home.net Mon Nov 26 12:43:02 2001 From: chrishbarker at home.net (Chris Barker) Date: Mon Nov 26 12:43:02 2001 Subject: [Numpy-discussion] Meta: too many numerical libraries doing the same thing? References: <200111251244.fAPCiIj01855@localhost.localdomain> <3C028E87.82C57211@home.net> <200111261938.fAQJcmd01426@localhost.localdomain> Message-ID: <3C02ADB3.E314B8FB@home.net> Konrad Hinsen wrote: > Chris Barker writes: > > On another note, it looks like the blitz++ library might be a good basis > > for a general Numerical library (and NumPy 3) as well. It does come > > with a flexible license. Any thoughts? > I think the major question is whether we are willing to move to C++. > And if we want to keep up any pretentions for Numeric becoming part of > the Python core, this translates into whether Guido will accept C++ > code in the Python core. Actually, It's worse than that. Blitz++ makes heavy use of templates, and thus only works with compilers that support that well. The current Python core can compile under a very wide variety of compilers. I doubt that Guido would want to change that. Personally, I'm torn. I would very much like to see NumPy arrays become part of the core Python, but don't want to have to compromise what it could be to do that. Another idea is to extend the SciPy project to become a complete Python distribution, that would clearly include Numeric. One download, and you have all you need. > >From a more pragmatic point of view, I wonder what the implications > for efficiency would be. C++ used to be very different in their > optimization abilities, is that still the case? Even more > pragmatically, is blitz++ reasonably efficient with g++? I know g++ is supported (and I think it is their primary development platform). From the web site: Is there a way to soup up C++ so that we can keep the advanced language features but ditch the poor performance? This is the goal of the Blitz++ project: to develop techniques which will enable C++ to rival -- and in some cases even exceed -- the speed of Fortran for numerical computing, while preserving an object-oriented interface. The Blitz++ Numerical Library is being constructed as a testbed for these techniques. Recent benchmarks show C++ encroaching steadily on Fortran's high-performance monopoly, and for some benchmarks, C++ is even faster than Fortran! These results are being obtained not through better optimizing compilers, preprocessors, or language extensions, but through the use of template techniques. By using templates cleverly, optimizations such as loop fusion, unrolling, tiling, and algorithm specialization can be performed automatically at compile time. see: http://www.oonumerics.org/blitz/whatis.html for more info. I havn't messed with it myself, but from the web page, it seems the answer is yes, C++ can produce high performance code. 
-- Christopher Barker, Ph.D. ChrisHBarker at home.net --- --- --- http://members.home.net/barkerlohmann ---@@ -----@@ -----@@ ------@@@ ------@@@ ------@@@ Oil Spill Modeling ------ @ ------ @ ------ @ Water Resources Engineering ------- --------- -------- Coastal and Fluvial Hydrodynamics -------------------------------------- ------------------------------------------------------------------------ From hinsen at cnrs-orleans.fr Mon Nov 26 12:52:02 2001 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Mon Nov 26 12:52:02 2001 Subject: [Numpy-discussion] Meta: too many numerical libraries doing the same thing? In-Reply-To: <000301c176b7$26da0680$3d01a8c0@plstn1.sfba.home.com> (paul@pfdubois.com) References: <000301c176b7$26da0680$3d01a8c0@plstn1.sfba.home.com> Message-ID: <200111262050.fAQKoxB01580@localhost.localdomain> > We had some meetings to discuss using blitz and the truth is that as > wrapped by Python there is not much to gain. The efficiency of blitz > comes up when you do an array expression in C++. Then x = y + z + w + a > + b gets compiled into one loop with no temporary objects created. But That could still be of interest to extension module writers. And it seems conceivable to write some limited Python-C compiler for numerical expressions that generates extension modules, although this is more than a weekend project. Still, I agree that what most people care about is the speed of NumPy operations. Some lazy evaluation scheme might be more promising to eliminate the creation of intermediate objects, but that isn't exactly trivial either... Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From perry at stsci.edu Mon Nov 26 12:59:03 2001 From: perry at stsci.edu (Perry Greenfield) Date: Mon Nov 26 12:59:03 2001 Subject: [Numpy-discussion] Re: Re-implementation of Python Numerical arrays (Numeric) available for download In-Reply-To: Message-ID: > From: Chris Barker > To: Perry Greenfield , > numpy-discussion at lists.sourceforge.net > Subject: [Numpy-discussion] Re: Re-implementation of Python > Numerical arrays (Numeric) available > for download > > I used Poor wording. When I wrote "datatypes", I meant data types in a > much higher order sense. Perhaps structures or classes would be a better > term. What I mean is that is should be easy to use an manipulate the > same multidimensional arrays from both Python and C/C++. In the current > Numeric, most folks generate a contiguous array, and then just use the > array->data pointer to get what is essentially a C array. That's fine if > you are using it in a traditional C way, with fixed dimension, one > datatype, etc. What I'm imagining is having an object in C or C++ that > could be easily used as a multidimentional array. I'm thinking C++ would > probably neccesary, and probably templates as well, which is why blitz++ > looked promising. Of course, blitz++ only compiles with a few up-to-date > compilers, so you'd never get it into the standard library that way! > Yes, that was an important issue (C++ and the Python Standard Library). And yes, it is not terribly convenient to access multi-dimensional arrays in C (of varying sizes). 
We don't solve that problem in the way a C++ library could. But I suppose that some might say that C++ libraries may introduce their own, new problems. But coming up with the one solution to all scientific computing appears well beyond our grasp at the moment. If someone does see that solution, let us know! > I agree, but from the newsgroup, it is clear that a lot of folks are > very reluctant to use something that is not part of the standard > library. > We agree that getting into the standard library is important. > > > > We estimate > > > > that numarray is probably another order of magnitude worse, > > > > i.e., that 20K element arrays are at half the asymptotic > > > > speed. How much should this be improved? > > > > > > A lot. I use arrays smaller than that most of the time! > > > > > What is good enough? As fast as current Numeric? > > As fast as current Numeric would be "good enough" for me. It would be a > shame to go backwards in performance! > > > (IDL does much > > better than that for example). > > My personal benchmark is MATLAB, which I imagine is similar to IDL in > performance. > We'll see if we can match current performance (or at least present usable alternative approaches that are faster). > > 10 element arrays will never be > > close to C speed in any array based language embedded in an > > interpreted environment. > > Well, sure, I'm not expecting that > Good :-) > > 100, maybe, but will be very hard. > > 1000 should be possible with some work. > > I suppose MATLAB has it easier, as all arrays are doubles, and, (untill > recently anyway), all variable where arrays, and all arrays were 2-d. > NumPy is a lot more flexible that that. Is is the type and size checking > that takes the time? > Probably, but we haven't started serious benchmarking yet so I wouldn't put much stock in what I say now. > One of the things I do a lot with are coordinates of points and > polygons. Sets if points I can handle easily as an NX2 array, but > polygons don't work so well, as each polgon has a different number of > points, so I use a list of arrays, which I have to loop over. Each > polygon can have from about 10 to thousands of points (mostly 10-20, > however). One way I have dealt with this is to store a polygon set as a > large array of all the points, and another array with the indexes of the > start and end of each polygon. That way I can transform the coordinates > of all the polygons in one operation. It works OK, but sometimes it is > more useful to have them in a sequence. > This is a good example of an ensemble of variable sized arrays. > > As mentioned, > > we tend to deal with large data sets and so I don't think we have > > a lot of such examples ourselves. > > I know large datasets were one of your driving factors, but I really > don't want to make performance on smaller datasets secondary. > > -- > Christopher Barker, That's why we are asking, and it seems so far that there are enough of those that do care about small arrays to spend the effort to significantly improve the performance. Perry From chrishbarker at home.net Mon Nov 26 13:03:02 2001 From: chrishbarker at home.net (Chris Barker) Date: Mon Nov 26 13:03:02 2001 Subject: [Numpy-discussion] Meta: too many numerical libraries doing the same thing? References: <000301c176b7$26da0680$3d01a8c0@plstn1.sfba.home.com> Message-ID: <3C02B298.E1F0E661@home.net> "Paul F. Dubois" wrote: > We had some meetings to discuss using blitz and the truth is that as > wrapped by Python there is not much to gain. 
The efficiency of blitz > comes up when you do an array expression in C++. Then x = y + z + w + a > + b gets compiled into one loop with no temporary objects created. But > this trick is possible because you can bind the assignment. In python > you cannot bind the assignment so you cannot do a lazy evaluation of the > operations, unless you are willing to go with some sort of function call > like x = evaluate(y + z + w). Immediate evaluation means creating > temporaries, and performance is dead. > > The only gain then would be when you passed a Python-wrapped blitz array > back to C++ and did a bunch of operations there. Personally, I think this could be a big gain. At the moment, if you don't get the performance you need with NumPy, you have to write some of your code in C, and using the Numeric and Python C API is a whole lot of work, particularly if you want your function to work on non-contiguous arrays and/or arrays of any type. I don't know much C++, and I have no idea if Blitz++ fits this bill, but it seemed to me that using an object oriented framework that could take care of reference counting, and allow you to work with generic arrays, and index them naturally, etc., would be a great improvement, even if the performance was the same as the current C API. Perhaps NumPy2 has accomplished that; it sounds like it is a step in the right direction, at least. In a sentence: the most important reason for using a C++ object oriented multi-dimensional array package would be ease of use, not speed. It's nice to hear Blitz++ was considered; it was probably rejected for good reason, but it just looked very promising to me. -Chris -- Christopher Barker, Ph.D. ChrisHBarker at home.net --- --- --- http://members.home.net/barkerlohmann ---@@ -----@@ -----@@ ------@@@ ------@@@ ------@@@ Oil Spill Modeling ------ @ ------ @ ------ @ Water Resources Engineering ------- --------- -------- Coastal and Fluvial Hydrodynamics -------------------------------------- ------------------------------------------------------------------------ From oliphant at ee.byu.edu Mon Nov 26 13:24:11 2001 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Nov 26 13:24:11 2001 Subject: [Numpy-discussion] Meta: too many numerical libraries doing the same thing? In-Reply-To: <3C02B298.E1F0E661@home.net> Message-ID: > In a sentence: the most important reason for using a C++ object oriented > multi-dimensional array package would be ease of use, not speed. > > It's nice to hear Blitz++ was considered; it was probably rejected for > good reason, but it just looked very promising to me. I believe that Eric's "compiler" module included in SciPy uses Blitz++ to optimize Numeric expressions. You have others who also share your admiration of Blitz++. -Travis From chrishbarker at home.net Mon Nov 26 15:31:05 2001 From: chrishbarker at home.net (Chris Barker) Date: Mon Nov 26 15:31:05 2001 Subject: [Numpy-discussion] Meta: too many numerical libraries doing the same thing? References: Message-ID: <3C02D510.E7454CCA@home.net> Travis Oliphant wrote: > I believe that Eric's "compiler" module included in SciPy uses Blitz++ to > optimize Numeric expressions. You have others who also share your > admiration of Blitz++. Yes, it does. That's where I heard about it. That also brings up a good point. Paul mentioned that using something like Blitz++ would only help performance if you could pass it an entire expression, like: x = a+b+c+d.
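As I understand it, plain Numeric has no choice but to evaluate that pairwise, allocating a full-size temporary at every step -- in effect (an untested sketch of what happens under the hood):

t1 = a + b      # full-size temporary
t2 = t1 + c     # another temporary
x = t2 + d      # and finally the result array

You can dodge some of the temporaries by hand with the ufunc output argument -- if I remember the trick right, add(x, c, x) stores the result back into x -- but the real win comes from handing the whole expression to something that can fuse it into a single loop.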
That is exactly what Eric's compiler module does, and it would sure be easier if NumPy already used Blitz++! In fact, I suppose Eric's compiler is a start towards a tool that could compile an entire NumPy function or module. I'd love to be able to just do that (with some tweaking perhaps) rather than having to code it all by hand. My fantasies continue... -Chris -- Christopher Barker, Ph.D. ChrisHBarker at home.net --- --- --- http://members.home.net/barkerlohmann ---@@ -----@@ -----@@ ------@@@ ------@@@ ------@@@ Oil Spill Modeling ------ @ ------ @ ------ @ Water Resources Engineering ------- --------- -------- Coastal and Fluvial Hydrodynamics -------------------------------------- ------------------------------------------------------------------------ From jochen at jochen-kuepper.de Mon Nov 26 16:34:01 2001 From: jochen at jochen-kuepper.de (Jochen =?iso-8859-1?q?K=FCpper?=) Date: Mon Nov 26 16:34:01 2001 Subject: [Numpy-discussion] Re: Numpy2 and GSL In-Reply-To: References: Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Mon, 26 Nov 2001 08:21:40 +0100 Martin Wiechert wrote: Martin> Are there any plans to wrap GSL for Numpy2? Martin> I did not actually try it (It's not Python ;-)), Martin> but it looks clean and powerful. There is actually a project to wrap gsl for python: http://pygsl.sourceforge.net/ It only provides wrappers for the special functions, but more is to come. (Hopefully Achim will put the cvs on sf soon.) Yes, I agree, PyGSL should be fully integrated with Numpy2, but it should probably also remain a separate project -- as Numpy should stay a base layer for all kinds of numerical stuff and hopefully make it into core python at some point (my personal wish, no more, AFAICT!). I think when PyGSL fully goes to SF (or anything similar) more people will start contributing and we should have a fine general numerical algorithms library for python soon! Greetings, Jochen - -- Einigkeit und Recht und Freiheit http://www.Jochen-Kuepper.de Liberté, Égalité, Fraternité GnuPG key: 44BCCD8E Sex, drugs and rock-n-roll -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.6 (GNU/Linux) Comment: Processed by Mailcrypt and GnuPG iD8DBQE8At88iJ/aUUS8zY4RAikdAJ9184yaCSH+GtkDz2mLVlrSh7mjEQCdGSqA 2uhmBKRCFBb9eeq3gmmn9/Q= =64gm -----END PGP SIGNATURE----- From europax at home.com Mon Nov 26 17:36:16 2001 From: europax at home.com (Rob) Date: Mon Nov 26 17:36:16 2001 Subject: [Numpy-discussion] Meta: too many numerical libraries doing the same thing? References: <200111251244.fAPCiIj01855@localhost.localdomain> <3C028E87.82C57211@home.net> <200111261938.fAQJcmd01426@localhost.localdomain> <3C02ADB3.E314B8FB@home.net> Message-ID: <3C02ED76.F02F17D8@home.com> I'm currently testing the SciPy Blitz++ features with FDTD. Should have some comparisons soon. Right now my statements are compiling, but not giving the right answers :( I think they might have it fixed soon. Rob. Chris Barker wrote: > > Konrad Hinsen wrote: > > Chris Barker writes: > > > On another note, it looks like the blitz++ library might be a good basis > > > for a general Numerical library (and NumPy 3) as well. It does come > > > with a flexible license. Any thoughts? > > > I think the major question is whether we are willing to move to C++. > > And if we want to keep up any pretentions for Numeric becoming part of > > the Python core, this translates into whether Guido will accept C++ > > code in the Python core. > > Actually, It's worse than that.
Blitz++ makes heavy use of templates, > and thus only works with compilers that support that well. The current > Python core can compile under a very wide variety of compilers. I doubt > that Guido would want to change that. > > Personally, I'm torn. I would very much like to see NumPy arrays become > part of the core Python, but don't want to have to compromise what it > could be to do that. Another idea is to extend the SciPy project to > become a complete Python distribution, that would clearly include > Numeric. One download, and you have all you need. > > > >From a more pragmatic point of view, I wonder what the implications > > for efficiency would be. C++ used to be very different in their > > optimization abilities, is that still the case? Even more > > pragmatically, is blitz++ reasonably efficient with g++? > > I know g++ is supported (and I think it is their primary development > platform). From the web site: > > Is there a way to soup up C++ so that we can keep the advanced language > features but ditch the poor performance? This is the goal of the > Blitz++ project: to develop techniques which will enable C++ to rival -- > and in some cases even exceed -- the speed of Fortran for numerical > computing, while preserving an object-oriented interface. The Blitz++ > Numerical Library is being constructed as a testbed for these > techniques. > > Recent benchmarks show C++ encroaching steadily on Fortran's > high-performance monopoly, and for some benchmarks, C++ is even faster > than Fortran! These results are being obtained not through better > optimizing compilers, preprocessors, or language extensions, but through > the > use of template techniques. By using templates cleverly, optimizations > such as loop fusion, unrolling, tiling, and algorithm specialization can > be > performed automatically at compile time. > > see: http://www.oonumerics.org/blitz/whatis.html for more info. > > I havn't messed with it myself, but from the web page, it seems the > answer is yes, C++ can produce high performance code. > > -- > Christopher Barker, > Ph.D. > ChrisHBarker at home.net --- --- --- > http://members.home.net/barkerlohmann ---@@ -----@@ -----@@ > ------@@@ ------@@@ ------@@@ > Oil Spill Modeling ------ @ ------ @ ------ @ > Water Resources Engineering ------- --------- -------- > Coastal and Fluvial Hydrodynamics -------------------------------------- > ------------------------------------------------------------------------ > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion -- The Numeric Python EM Project www.members.home.net/europax From Achim.Gaedke at uni-koeln.de Tue Nov 27 00:20:02 2001 From: Achim.Gaedke at uni-koeln.de (Achim Gaedke) Date: Tue Nov 27 00:20:02 2001 Subject: [Numpy-discussion] Re: Numpy2 and GSL References: Message-ID: <3C034BFA.FBB64E94@uni-koeln.de> Ok, there is a clear need for the facility of easy contribution. Please be patient until Friday, December 7th. Then I have time to let it happen. It is right that the oficial site for this project is at pygsl.sourcefogrge.net (Brian Gough, can you change the link on the gsl homepage, thanks :-) ) But I will show some discussion points that must be clear before a cvs release: - Is the file and directory structure fully expandable, can several persons work parallel? - Should classes be created with excellent working objects or should it be a 1:1 wrapper? 
- Should there be one interface dynamic library or more than one? - Is there another way except that of the GPL (personally preferred, but other opinions should be discussed before the contribution of source)? Some questions of minor weight: - Is the tuple return value for (value,error) ok in the sf module? - Test cases are needed. These questions are the reason why I do not simply "copy" my code into cvs. Jochen Küpper wrote: > > It only provides wrappers for the special functions, but more is to > come. (Hopefully Achim will put the cvs on sf soon.) > > Yes, I agree, PyGSL should be fully integrated with Numpy2, but it > should probably also remain a separate project -- as Numpy should stay > a base layer for all kinds of numerical stuff and hopefully make it > into core python at some point (my personal wish, no more, AFAICT!). > > I think when PyGSL fully goes to SF (or anything similar) more > people will start contributing and we should have a fine general > numerical algorithms library for python soon! > I agree with Jochen and I'd like to move to the core of Python too. But this is far away and I hate monolithic distributions. If there is the need to discuss separately about PyGSL we can do that here or at the gsl-discuss list mailto:gsl-discuss at sources.redhat.com . But there is also the possibility of a mailing list at pygsl.sourceforge.net . Please let me know. From neelk at cswcasa.com Tue Nov 27 05:52:05 2001 From: neelk at cswcasa.com (Krishnaswami, Neel) Date: Tue Nov 27 05:52:05 2001 Subject: [Numpy-discussion] Re: Re-implementation of Python Numerical arrays (Numeric) available for download Message-ID: Perry Greenfield [mailto:perry at stsci.edu] wrote: > > > > I know large datasets were one of your driving factors, but I really > > don't want to make performance on smaller datasets secondary. > > That's why we are asking, and it seems so far that there are enough > of those that do care about small arrays to spend the effort to > significantly improve the performance. Well, here's my application. I do data mining work, and one of the techniques I want to use Numpy for is to implement robust regression algorithms like least-trimmed-squares. Now for a k-variable regression, the best-of-breed algorithm for this involves taking hundreds of thousands of k-element samples and calculating the fitting hyperplane through them. Small matrix performance is thus something this program lives or dies by, and right now it seems like 'dies' is the right measure -- it is about 10x slower than the Gauss program that does the same thing. :( When I profiled it, it seemed like Numpy was spending almost all of its time in _castCopyAndTranspose. Switching to the Intel MKL LAPACK had no performance effect, but changing _castCopyAndTranspose into a C function was a 20% speed increase. If Numpy2 is even slower on small matrices I'd have to give up using it, and that's a shame: it's a *much* nicer environment than Gauss is. -- Neel Krishnaswami neelk at cswcasa.com From hungjunglu at yahoo.com Tue Nov 27 08:28:06 2001 From: hungjunglu at yahoo.com (Hung Jung Lu) Date: Tue Nov 27 08:28:06 2001 Subject: [Numpy-discussion] Hardware for Monte Carlo simulation In-Reply-To: Message-ID: <20011127162705.40865.qmail@web12604.mail.yahoo.com> Hi, Thanks to Jon Saenz and Chris Barker for helping out with fast linear algebra and statistical distribution routines. Again, I have a tangential question.
I am hitting the physical limit of the CPU (meaning things have been optimized down to assembly level), in order to achieve even higher performance, the only way to go is hardware. Is there any recommendation for fast machines at the price range of a few thousand dollars? (I cannot afford supercomputers or connection machines.) My purpose is to run Monte Carlo simulation. This means that a lot of scenarios can be run in parallel fashion. Of course I can just use regular cheap Pentium boxes... but they are kind of bulky, and I don't need any of the video, audio, USB features (I think 10 machines at 1GHz each would be the size of calculation power I need, or equivalently, a single machine at an equivalent 10GHz. Heck, if there are some specialized racks/boxes, I can wire the motherboards myself.) I am wondering what you people do for heavy number crunching? Are there any cheap yet specialized machines? What about machines with dual processor? I would imagine a lot of people in the number crunching world run into my situation, and since the number crunching machines don't require much beyond a motherboard and a small hard-drive, maybe there are already some cheap solutions out there. thanks! Hung Jung __________________________________________________ Do You Yahoo!? Yahoo! GeoCities - quick and easy web site hosting, just $8.95/month. http://geocities.yahoo.com/ps/info1 From rossini at blindglobe.net Tue Nov 27 09:44:02 2001 From: rossini at blindglobe.net (A.J. Rossini) Date: Tue Nov 27 09:44:02 2001 Subject: [Numpy-discussion] Hardware for Monte Carlo simulation In-Reply-To: <20011127162705.40865.qmail@web12604.mail.yahoo.com> References: <20011127162705.40865.qmail@web12604.mail.yahoo.com> Message-ID: <87vgfwdsao.fsf@jeeves.blindglobe.net> >>>>> "HJL" == Hung Jung Lu writes: HJL> Again, I have a tangential question. I am hitting the HJL> physical limit of the CPU (meaning things have been optimized HJL> down to assembly level), in order to achieve even higher HJL> performance, the only way to go is hardware. HJL> Is there any recommendation for fast machines at the price HJL> range of a few thousand dollars? (I cannot afford HJL> supercomputers or connection machines.) My purpose is to run HJL> Monte Carlo simulation. This means that a lot of scenarios HJL> can be run in parallel fashion. Of course I can just use HJL> regular cheap Pentium boxes... but they are kind of bulky, HJL> and I don't need any of the video, audio, USB features (I HJL> think 10 machines at 1GHz each would be the size of HJL> calculation power I need, or equivalently, a single machine HJL> at an equivalent 10GHz. Heck, if there are some specialized HJL> racks/boxes, I can wire the motherboards myself.) I am HJL> wondering what you people do for heavy number crunching? Are HJL> there any cheap yet specialized machines? What about machines HJL> with dual processor? I would imagine a lot of people in the HJL> number crunching world run into my situation, and since the HJL> number crunching machines don't require much beyond a HJL> motherboard and a small hard-drive, maybe there are already HJL> some cheap solutions out there. The usual way is to build some "blackboxes", i.e. mobo/cpu/memory/NIC, diskless or nearly diskless (you don't want to maintain machines :-). Connect them using 100bT or faster networks (though 100bT should be fine). Do such things exist? Sort of -- they tend to be more expensive than building them yourself, but if you've got a reliable local supplier, they can build them fairly cheaply for you. 
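The software half is the easy part, since this kind of Monte Carlo is embarrassingly parallel: each batch is an independent job that can be shipped to whichever box is free. A toy, untested sketch of what a single job might look like (names and numbers made up for illustration):

import random

def batch(seed, n):
    # one independent Monte Carlo batch (here: a crude pi estimate)
    rng = random.Random(seed)          # per-job generator state
    hits = 0
    for i in xrange(n):
        x, y = rng.random(), rng.random()
        if x*x + y*y < 1.0:
            hits = hits + 1
    return 4.0 * hits / n

# one (seed, n) pair per CPU; MOSIX or PyPVM would do the farming
results = [batch(seed, 100000) for seed in range(10)]

Whether the per-seed streams are independent enough is exactly the parallel RNG question I come back to below.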
I'd go with single or dual athlons, myself :-). If power and maintenance is an issue, duals, and if not, maybe singles. We use MOSIX (www.mosix.org) for transparent load balancing between linux machines, and it could be used on the machines I described (using a floppy or CD to boot). The next question is whether some form of parallel RNG will help. The answer is "maybe". I worked with a student who evaluated coupled chains, and we couldn't do too much better. And then after that, is whether you want to figure out how to post-process the results. If you want to automate the whole thing (and it isn't clear that it would be worth it, but...), you could use PyPVM to front-end the sub-processes distributed on the network, load-balanced at the system level by MOSIX. Now for the problems -- MOSIX seems to have difficulties with Python. Severe difficulties. I don't know if it still holds true for recent MOSIX releases. (note that I use R (www.r-project.org) for most of my simulation work these days, but am looking at Python for stat analyses, of which MCMC tools are of interest). best, -tony -- A.J. Rossini Rsrch. Asst. Prof. of Biostatistics U. of Washington Biostatistics rossini at u.washington.edu FHCRC/SCHARP/HIV Vaccine Trials Net rossini at scharp.org -------------- http://software.biostat.washington.edu/ -------------- FHCRC: M-W: 206-667-7025 (fax=4812)|Voicemail is pretty sketchy/use Email UW: T-Th: 206-543-1044 (fax=3286)|Change last 4 digits of phone to FAX Rosen: (Mullins' Lab) Fridays, and I'm unreachable except by email. From chrishbarker at home.net Tue Nov 27 10:28:01 2001 From: chrishbarker at home.net (Chris Barker) Date: Tue Nov 27 10:28:01 2001 Subject: [Numpy-discussion] Hardware for Monte Carlo simulation References: <20011127162705.40865.qmail@web12604.mail.yahoo.com> Message-ID: <3C03DF8D.3725E2A2@home.net> Hung Jung Lu wrote: > Is there any recommendation for fast machines at the > price range of a few thousand dollars? (I cannot > afford supercomputers or connection machines.) My > purpose is to run Monte Carlo simulation. This means > that a lot of scenarios can be run in parallel > fashion. Of course I can just use regular cheap > Pentium boxes... but they are kind of bulky, and I > don't need any of the video, audio, USB features (I I've been looking into setting up a system to do similar work, and it looks to me like the best bang for the buck right now are dual Athlon systems. If space is an important consideration, you can get dual Athlon 1U rack mount systems for less than $2000. I'm pretty sure the only dual Athlon board currently available (Tyan K7 thunder) has on board video, ethernet and SCSI, which means it cost a little more than it could, but these systems are still a pretty good deal if you get one without a hard drive (or a very cheap one). I just did quick web search, and epox is supposed to be coming out with a dual board as well, so there may be cheaper options soon. -Chris -- Christopher Barker, Ph.D. 
ChrisHBarker at home.net --- --- --- http://members.home.net/barkerlohmann ---@@ -----@@ -----@@ ------@@@ ------@@@ ------@@@ Oil Spill Modeling ------ @ ------ @ ------ @ Water Resources Engineering ------- --------- -------- Coastal and Fluvial Hydrodynamics -------------------------------------- ------------------------------------------------------------------------ From wsryu at fas.harvard.edu Tue Nov 27 15:52:04 2001 From: wsryu at fas.harvard.edu (William Ryu) Date: Tue Nov 27 15:52:04 2001 Subject: [Numpy-discussion] Hardware for Monte Carlo simulation In-Reply-To: <3C03DF8D.3725E2A2@home.net> References: <20011127162705.40865.qmail@web12604.mail.yahoo.com> Message-ID: <5.1.0.14.2.20011127184457.00aa3850@pop.fas.harvard.edu> At 10:46 AM 11/27/2001 -0800, Chris Barker wrote: >Hung Jung Lu wrote: > > Is there any recommendation for fast machines at the > > price range of a few thousand dollars? (I cannot > > afford supercomputers or connection machines.) My > > purpose is to run Monte Carlo simulation. This means > > that a lot of scenarios can be run in parallel > > fashion. Of course I can just use regular cheap > > Pentium boxes... but they are kind of bulky, and I > > don't need any of the video, audio, USB features (I > >I've been looking into setting up a system to do similar work, and it >looks to me like the best bang for the buck right now are dual Athlon >systems. If space is an important consideration, you can get dual Athlon >1U rack mount systems for less than $2000. I'm pretty sure the only dual >Athlon board currently available (Tyan K7 thunder) has on board video, >ethernet and SCSI, which means it cost a little more than it could, but >these systems are still a pretty good deal if you get one without a hard >drive (or a very cheap one). I just did quick web search, and epox is >supposed to be coming out with a dual board as well, so there may be >cheaper options soon. > >-Chris There is a cheaper dual CPU Tyan board which uses the same motherboard chipset. Its the Tyan Tiger-MP S2460, which doesn't have SCSI, onboard video, or Ethernet, but is half the price (around $200). -willryu From eric at enthought.com Tue Nov 27 16:16:02 2001 From: eric at enthought.com (eric) Date: Tue Nov 27 16:16:02 2001 Subject: [Numpy-discussion] Meta: too many numerical libraries doing thesame Message-ID: <051001c17799$8bfa68b0$777ba8c0@ericlaptop> Hey group, Blitz++ is very cool, but I'm not sure it would make a very good underpinning for reimplementing Numeric. There are 2 (well maybe 3) main points. 1. Blitz++ declares arrays in the following way: The first issue deals with how you declare arrays in Blitz++. Array A(N,N,N); The big deal here is that the dimensionality of Array is a template parameter, not a constructor parameter. In other words, 2D arrays are effectively a different type than 3D arrays. Numeric, on the other hand represents arrays of all dimensions with a single class/type. For Python, this makes the most sense. I think you could fanagle some way of getting blitz to work, but I'm not sure it would be the desired elegant solution. I've also tinkered with building a simple C++ templated (non-blitz) implementation of Numeric for kicks, but kept coming back to using the dreaded void* to store the data arrays. I still haven't completely given up on a templated solution, but it wasn't as obvious as I thought it would be. 2. Compiling Blitz++ is slooooow. scipy.compiler spits out 200-300 line extension modules at the most. 
Depending on how complicated expressions are, it can take .5-1.5 minutes to compile a single extension function on an 850 MHz PIII. I can't imagine how long it would take to compile Numeric arrays for 1 through 11 dimensions (the most blitz supports as I remember) for all the different data types with 100s of extension functions. The cost wouldn't be linear because you do pay a one-time hit for some of the template instantiation. Also, I've heard gcc 3.0 might be better. Still, it'd be a painful development process. 3. Portability. This comes at two levels. The first is that blitz++ has heavy duty requirements of the compiler. gcc works fine, which is a huge plus, but a lot of other compilers don't. MSVC is the most notable of these because it is so heavily used on windows. The second level is the portability of C++ extension modules in general. I've run into this on windows, but I think it is an issue pretty much everywhere. For example, MSVC and GCC compiled C extension libraries can call each other on Windows because they are binary compatible. C++ classes are _not_ binary compatible. This has come up for me with wxPython. The standard version that Robin Dunn distributes is compiled with MSVC. If you build a small extension with gcc that makes wxPython calls, it'll link just fine, but seg-faults during execution. Does anyone know if the same sorta thing is true on the Unices? If it is, and Numeric was written in C++, then you'd have to compile extension modules that use Numeric arrays with the same compiler that was used to compile Numeric. This can lead to all sorts of hassles, and it has made me lean back towards C as the preferred language for something as fundamental as Numeric. (Note that I do like C++ for modules that don't really define an API called by other modules.) Ok, so maybe there's a 4th point. Paul D. pointed out that blitz isn't much of a win unless you have lazy evaluation (which scipy.compiler already provides). I also think improved speed _isn't_ the biggest goal of a reimplementation (although it can't be sacrificed either). I'm more excited about a code base that more people can comprehend. Perry G. et al's mixed Python/C implementation with the code generators is a very good idea and a step in this direction. I hope the speed issues for small arrays can be solved. I also hope the memory-mapped aspect doesn't complicate the code base much. see ya, eric From hinsen at cnrs-orleans.fr Wed Nov 28 00:09:03 2001 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Wed Nov 28 00:09:03 2001 Subject: [Numpy-discussion] Meta: too many numerical libraries doing the same thing? Message-ID: <200111280808.fAS889g08217@localhost.localdomain> "eric" writes: > The standard version that Robin Dunn distributes is compiled with MSVC. If > you build a small > extension with gcc that makes wxPython calls, it'll link just fine, but > seg-faults during execution. > Does anyone know if the same sorta thing is true on the Unices? If it is, > and Numeric was written in C++, then you'd have to compile extension modules > that use Numeric arrays with the same compiler that was used to compile > Numeric. This can lead to all sorts of hassles, and it has made me lean If you rely on dynamic linking for cross-module calls, you'd have the same problem with Unix, as different compilers use different name-mangling schemes. One way around this would be to limit cross-module calls to C functions compiled with "C" linking.
Better yet, don't rely on dynamic linking at all and export a module's C API via a Python CObject, as described in the extension manual, and declare all symbols as static (except for the module initialization function of course). In my experience that is the only method that works on all platforms, with all compilers. Of course this also assumes that interfaces are at the C level. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From sag at hydrosphere.com Wed Nov 28 09:02:05 2001 From: sag at hydrosphere.com (Sue Giller) Date: Wed Nov 28 09:02:05 2001 Subject: [Numpy-discussion] Using Reduce with Multi-dimensional Masked array Message-ID: <20011128170140921.AAA253@mail.climatedata.com@SUEW2000> I posted the following inquiry to python-list at python.org earlier this week, but got no responses, so I thought I'd try a more focused group. I assume MA module falls under NumPy area. I am using 2 (and more) dimensional masked arrays with some numeric data, and using the reduce functionality on the arrays. I use the masking because some of the values in the arrays are 'missing' and should not be included in the results of the reduction. For example, assume a 5 x 2 array, with masked values for the 4th entry for both of the 2nd dimension cells. If I want to sum along the 2nd dimension, I would expect to get a 'missing' value for the 4th entry because both of the entries for the sum are 'missing'. Instead, I get 0, which might be a valid number in my data space, and the returned 1 dimensional array has no mask associated with it. Is this expected behavior for masked arrays or a bug or am I misusing the mask concept? Does anyone know how to get the reduction to produce a masked value? Example Code: >>> import MA >>> a = MA.array([[1,2,3,-99,5],[10,20,30,-99,50]]) >>> a [[ 1, 2, 3,-99, 5,] [ 10, 20, 30,-99, 50,]] >>> m = MA.masked_values(a, -99) >>> m array(data = [[ 1, 2, 3,-99, 5,] [ 10, 20, 30,-99, 50,]], mask = [[0,0,0,1,0,] [0,0,0,1,0,]], fill_value=-99) >>> r = MA.sum(m) >>> r array([11,22,33, 0,55,]) >>> t = MA.getmask(r) >>> print t None From paul at pfdubois.com Wed Nov 28 20:31:03 2001 From: paul at pfdubois.com (Paul F. Dubois) Date: Wed Nov 28 20:31:03 2001 Subject: [Numpy-discussion] Using Reduce with Multi-dimensional Masked array In-Reply-To: <20011128170140921.AAA253@mail.climatedata.com@SUEW2000> Message-ID: <000201c1788e$60359ce0$3d01a8c0@plstn1.sfba.home.com> [dubois at ldorritt ~]$ pydoc MA.sum Python Library Documentation: function sum in MA sum(a, axis=0, fill_value=0) Sum of elements along a certain axis using fill_value for missing. If you use add.reduce, you'll get what you want. >>> print m [[1 ,2 ,3 ,-- ,5 ,] [10 ,20 ,30 ,-- ,50 ,]] >>> MA.sum(m) array([11,22,33, 0,55,]) >>> MA.add.reduce(m) array(data = [ 11, 22, 33,-99, 55,], mask = [0,0,0,1,0,], fill_value=-99) In other words, sum(m, axis, fill_value) = add.reduce(filled(m, fill_value), axis) Surprising in your case. Still, both uses are quite common, so I probably was thinking to myself that since add.reduce already does one of the jobs, I might as well make sum do the other one. 
One could have just as well argued that one was a synonym for the other and so it is revolting to have them be different. Well, MA users, is this something I should change, or not? -----Original Message----- From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-discussion-admin at lists.sourceforge.net] On Behalf Of Sue Giller Sent: Wednesday, November 28, 2001 9:03 AM To: numpy-discussion at lists.sourceforge.net Subject: [Numpy-discussion] Using Reduce with Multi-dimensional Masked array I posted the following inquiry to python-list at python.org earlier this week, but got no responses, so I thought I'd try a more focused group. I assume MA module falls under NumPy area. I am using 2 (and more) dimensional masked arrays with some numeric data, and using the reduce functionality on the arrays. I use the masking because some of the values in the arrays are 'missing' and should not be included in the results of the reduction. For example, assume a 5 x 2 array, with masked values for the 4th entry for both of the 2nd dimension cells. If I want to sum along the 2nd dimension, I would expect to get a 'missing' value for the 4th entry because both of the entries for the sum are 'missing'. Instead, I get 0, which might be a valid number in my data space, and the returned 1 dimensional array has no mask associated with it. Is this expected behavior for masked arrays or a bug or am I misusing the mask concept? Does anyone know how to get the reduction to produce a masked value? Example Code: >>> import MA >>> a = MA.array([[1,2,3,-99,5],[10,20,30,-99,50]]) >>> a [[ 1, 2, 3,-99, 5,] [ 10, 20, 30,-99, 50,]] >>> m = MA.masked_values(a, -99) >>> m array(data = [[ 1, 2, 3,-99, 5,] [ 10, 20, 30,-99, 50,]], mask = [[0,0,0,1,0,] [0,0,0,1,0,]], fill_value=-99) >>> r = MA.sum(m) >>> r array([11,22,33, 0,55,]) >>> t = MA.getmask(r) >>> print t None _______________________________________________ Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/numpy-discussion From giulio.bottazzi at libero.it Thu Nov 29 02:10:03 2001 From: giulio.bottazzi at libero.it (Giulio Bottazzi) Date: Thu Nov 29 02:10:03 2001 Subject: [Numpy-discussion] Using Reduce with Multi-dimensional Masked array References: <000201c1788e$60359ce0$3d01a8c0@plstn1.sfba.home.com> Message-ID: <3C05FDA2.AD9C5DCC@libero.it> My answer is yes: the difference between the two behaviors could be confusing for the user. If I can dare to express a "general rule", I would say that the masks in MA arrays should not disappear if not EXPLICITLY required to do so! Of course you can interpret a provided value for the fill_value parameter in the sum function as such a request... but if value is not provided, than I would say that the correct approach would be to keep the mask on (after all, what special about the value 0? For instance, if you have to take logarithm in the next step of the calculation, it is a rather bad choice!) Giulio. "Paul F. Dubois" wrote: > > [dubois at ldorritt ~]$ pydoc MA.sum > Python Library Documentation: function sum in MA > > sum(a, axis=0, fill_value=0) > Sum of elements along a certain axis using fill_value for missing. > > If you use add.reduce, you'll get what you want. 
> >>> print m > [[1 ,2 ,3 ,-- ,5 ,] > [10 ,20 ,30 ,-- ,50 ,]] > >>> MA.sum(m) > array([11,22,33, 0,55,]) > >>> MA.add.reduce(m) > array(data = > [ 11, 22, 33,-99, 55,], > mask = > [0,0,0,1,0,], > fill_value=-99) > > In other words, > sum(m, axis, fill_value) = add.reduce(filled(m, fill_value), axis) > > Surprising in your case. Still, both uses are quite common, so I > probably was thinking to myself that since add.reduce already does one > of the jobs, I might as well make sum do the other one. One could have > just as well argued that one was a synonym for the other and so it is > revolting to have them be different. > > Well, MA users, is this something I should change, or not? > > -----Original Message----- > From: numpy-discussion-admin at lists.sourceforge.net > [mailto:numpy-discussion-admin at lists.sourceforge.net] On Behalf Of Sue > Giller > Sent: Wednesday, November 28, 2001 9:03 AM > To: numpy-discussion at lists.sourceforge.net > Subject: [Numpy-discussion] Using Reduce with Multi-dimensional Masked > array > > I posted the following inquiry to python-list at python.org earlier this > week, but got no responses, so I thought I'd try a more focused > group. I assume MA module falls under NumPy area. > > I am using 2 (and more) dimensional masked arrays with some > numeric data, and using the reduce functionality on the arrays. I > use the masking because some of the values in the arrays are > 'missing' and should not be included in the results of the reduction. > > For example, assume a 5 x 2 array, with masked values for the 4th > entry for both of the 2nd dimension cells. If I want to sum along the > 2nd dimension, I would expect to get a 'missing' value for the 4th > entry because both of the entries for the sum are 'missing'. Instead, > I get 0, which might be a valid number in my data space, and the > returned 1 dimensional array has no mask associated with it. > > Is this expected behavior for masked arrays or a bug or am I > misusing the mask concept? Does anyone know how to get the > reduction to produce a masked value? > > Example Code: > >>> import MA > >>> a = MA.array([[1,2,3,-99,5],[10,20,30,-99,50]]) > >>> a > [[ 1, 2, 3,-99, 5,] > [ 10, 20, 30,-99, 50,]] > >>> m = MA.masked_values(a, -99) > >>> m > array(data = > [[ 1, 2, 3,-99, 5,] > [ 10, 20, 30,-99, 50,]], > mask = > [[0,0,0,1,0,] > [0,0,0,1,0,]], > fill_value=-99) > > >>> r = MA.sum(m) > >>> r > array([11,22,33, 0,55,]) > >>> t = MA.getmask(r) > >>> print t > None > > _______________________________________________ > Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From sag at hydrosphere.com Thu Nov 29 09:49:02 2001 From: sag at hydrosphere.com (Sue Giller) Date: Thu Nov 29 09:49:02 2001 Subject: [Numpy-discussion] Re: Using Reduce with Multi-dimensional Masked array In-Reply-To: <3C05FDA2.AD9C5DCC@libero.it> Message-ID: <20011129174809062.AAA210@mail.climatedata.com@SUEW2000> Thanks for the pointer. The example I gave using the sum operation is merely an example - I could also be doing other manipulations such as min, max, average, etc. I see that the MA..reduce functions will do what I want, but to do an average, I will need to do two steps since the MA.average function will have the original 'unexpected' behavior that I don't want. 
That raises the question of how to determine a count of valid values in a masked array. Can I assume that I can do 'math' on the mask array itself, for example to sum along a given axis and have the masked cells add up? In my original example, I would expect a sum along the second axis to return [0,0,0,2,0]. Can I rely on this? I would suggest that a .count operator would be very useful in working with masked arrays (count valid and count masked). >>> m = MA.masked_values(a, -99) >>> m array(data = [[ 1, 2, 3,-99, 5,] [ 10, 20, 30,-99, 50,]], mask = [[0,0,0,1,0,] [0,0,0,1,0,]], fill_value=-99) To add an opinion on the question from Paul about 'expected' behavior, I was working off the documentation for Numerical Python, and there were no caveats in there about MA. working one way, and MA..reduce working another. The answer is always in the documentation, especially for users like me who don't have time or knkowledge to go reading thru all the code modules to try and figure out what is happening. From a purely user standpoint, I would expect a masked array to retain it's mask-edness at all times, unless I explicitly tell it not to. In that case, I would still expect it to replace the 'masked' cells with the original masked value, and not just arbitrarily assign some other value, such as 0. Thanks again for the prompt reply. From reggie at merfinllc.com Thu Nov 29 10:36:01 2001 From: reggie at merfinllc.com (Reggie Dugard) Date: Thu Nov 29 10:36:01 2001 Subject: [Numpy-discussion] Re: Using Reduce with Multi-dimensional Masked array In-Reply-To: <20011129174809062.AAA210@mail.climatedata.com@SUEW2000> Message-ID: > That raises the question of how to determine a count of valid values > in a masked array. Can I assume that I can do 'math' on the mask > array itself, for example to sum along a given axis and have the > masked cells add up? > > In my original example, I would expect a sum along the second axis > to return [0,0,0,2,0]. Can I rely on this? I would suggest that a > .count operator would be very useful in working with masked arrays > (count valid and count masked). Actually masked arrays already have a count method that does what you want: Python 2.2b2 (#26, Nov 16 2001, 11:44:11) [MSC 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> from pydoc import help >>> import MA >>> x = MA.arange(10) >>> help(x.count) Help on method count in module MA.MA: count(self, axis=None) method of MA.MA.MaskedArray instance Count of the non-masked elements in a, or along a certain axis. >>> x.count() 10 >>> From paul at pfdubois.com Thu Nov 29 12:54:02 2001 From: paul at pfdubois.com (Paul F. Dubois) Date: Thu Nov 29 12:54:02 2001 Subject: [Numpy-discussion] Re: Using Reduce with Multi-dimensional Masked array In-Reply-To: <20011129174809062.AAA210@mail.climatedata.com@SUEW2000> Message-ID: <000201c17917$ac5efec0$3d01a8c0@plstn1.sfba.home.com> You have misread my reply. It is not true that MA.op works one way and MA.op.reduce is different. sum and add.reduce are different, and the documentation for sum DOES say the right thing for sum. The function sum is a special case in that its native meaning was the same as add.reduce and so the function is redundant. I believe you are in error wrt average; average works the way you want. Function count can tell you the number of non-masked values either in the whole array or axis-wise if you give an axis argument. Function size gives you the total number, so #invalid is size(x)-count(x). 
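For the 2x5 example earlier in this thread, that would go something like this (untested, from memory, so take the exact output format with a grain of salt):

>>> m.count()
8
>>> m.count(axis=0)
array([2,2,2,0,2,])
>>> size(m) - m.count()
2

so the per-column number of masked cells you asked about is just 2 - m.count(axis=0), i.e. [0,0,0,2,0].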
maximum and minimum (don't use max and min, they are built-ins that don't know about Numeric) have two forms. When called with one argument they return the overall max or min of the whole array, returning masked only if all entries are masked. For two arguments, you get element-wise extrema, and the mask is on where any one of the arguments was masked. >>> print x [[1 ,-- ,3 ,] [11 ,-- ,-- ,]] >>> print average(x) [6.0 ,-- ,3.0 ,] >>> y array( [[ 6, 7, 8,] [ 9,10,11,]]) >>> print maximum(x,y) [[6 ,-- ,8 ,] [11 ,-- ,-- ,]] >>> y[0,0]=masked >>> print maximum(x,y) [[-- ,-- ,8 ,] [11 ,-- ,-- ,]] -----Original Message----- From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-discussion-admin at lists.sourceforge.net] On Behalf Of Sue Giller Sent: Thursday, November 29, 2001 9:50 AM To: numpy-discussion at lists.sourceforge.net Subject: [Numpy-discussion] Re: Using Reduce with Multi-dimensional Masked array Thanks for the pointer. The example I gave using the sum operation is merely an example - I could also be doing other manipulations such as min, max, average, etc. I see that the MA..reduce functions will do what I want, but to do an average, I will need to do two steps since the MA.average function will have the original 'unexpected' behavior that I don't want. That raises the question of how to determine a count of valid values in a masked array. Can I assume that I can do 'math' on the mask array itself, for example to sum along a given axis and have the masked cells add up? In my original example, I would expect a sum along the second axis to return [0,0,0,2,0]. Can I rely on this? I would suggest that a .count operator would be very useful in working with masked arrays (count valid and count masked). >>> m = MA.masked_values(a, -99) >>> m array(data = [[ 1, 2, 3,-99, 5,] [ 10, 20, 30,-99, 50,]], mask = [[0,0,0,1,0,] [0,0,0,1,0,]], fill_value=-99) To add an opinion on the question from Paul about 'expected' behavior, I was working off the documentation for Numerical Python, and there were no caveats in there about MA. working one way, and MA..reduce working another. The answer is always in the documentation, especially for users like me who don't have time or knkowledge to go reading thru all the code modules to try and figure out what is happening. From a purely user standpoint, I would expect a masked array to retain it's mask-edness at all times, unless I explicitly tell it not to. In that case, I would still expect it to replace the 'masked' cells with the original masked value, and not just arbitrarily assign some other value, such as 0. Thanks again for the prompt reply. _______________________________________________ Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/numpy-discussion From sag at hydrosphere.com Thu Nov 29 15:21:04 2001 From: sag at hydrosphere.com (Sue Giller) Date: Thu Nov 29 15:21:04 2001 Subject: [Numpy-discussion] Re: Using Reduce with Multi-dimensional Masked array In-Reply-To: <000201c17917$ac5efec0$3d01a8c0@plstn1.sfba.home.com> References: <20011129174809062.AAA210@mail.climatedata.com@SUEW2000> Message-ID: <20011129232011546.AAA269@mail.climatedata.com@SUEW2000> Paul, Well, you're right. I did misunderstand your reply, as well as what the various functions were supposed to do. I was mis-using the sum, minimum, maximum as tho they were MA..reduce, and my test case didn't point out the difference. I should always have been doing the .reduce version. I apologize for this! 
I found a section on page 45 of the Numerical Python text (PDF form, July 13, 2001) that defines sum as 'The sum function is a synonym for the reduce method of the add ufunc. It returns the sum of all the elements in the sequence given along the specified axis (first axis by default).' This is where I would expect to see a caveat about it not retaining any mask-edness. I was misussing the MA.minimum and MA.maximum as tho they were .reduce version. My bad. The MA.average does produce a masked array, but it has changed the 'missing value' to fill_value=[ 1.00000002e+020,]). I do find this a bit odd, since the other reductions didn't change the fill value. Anyway, I can now get the stats I want in a format I want, and I understand better the various functions for array/masked array. Thanks for the comments/input. sue From romberg at fsl.noaa.gov Fri Nov 30 11:30:04 2001 From: romberg at fsl.noaa.gov (Mike Romberg) Date: Fri Nov 30 11:30:04 2001 Subject: [Numpy-discussion] equal() and complex Message-ID: <15367.56879.54329.654575@smaug.fsl.noaa.gov> I'm wondering if there is some good reason why equal(), not_equal(), nonzero() and the like do not work with numeric arrays of tyep complex. I can see why operators like less() and less_equal() do not work. But the pure equality ones seem like they should work. Or am I missing something :). Thanks, Mike Romberg (romberg at fsl.noaa.gov) From hinsen at cnrs-orleans.fr Fri Nov 30 12:17:04 2001 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Fri Nov 30 12:17:04 2001 Subject: [Numpy-discussion] equal() and complex References: <15367.56879.54329.654575@smaug.fsl.noaa.gov> Message-ID: <200111302016.fAUKG9X01351@localhost.localdomain> Mike Romberg writes: > I'm wondering if there is some good reason why equal(), not_equal(), > nonzero() and the like do not work with numeric arrays of tyep > complex. I can see why operators like less() and less_equal() do not > work. But the pure equality ones seem like they should work. Or am I > missing something :). Before Python 2.1, comparison couldn't be implemented for equality only. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From europax at home.com Fri Nov 30 17:35:03 2001 From: europax at home.com (Rob) Date: Fri Nov 30 17:35:03 2001 Subject: [Numpy-discussion] Numeric Python EM Project has moved Message-ID: <3C083356.31E66685@home.com> Its now at www.pythonemproject.com. I can be reached at rob at pythonemproject.com. All this has come about since @home is possibly suspending operation at midnite tonight :( Rob. Looks like I need to change my sig too :) -- The Numeric Python EM Project www.members.home.net/europax