From sanner at scripps.edu Thu Feb 1 15:29:47 2001 From: sanner at scripps.edu (Michel Sanner) Date: Thu, 1 Feb 2001 12:29:47 -0800 Subject: [Numpy-discussion] New packages In-Reply-To: numpy-discussion-request@lists.sourceforge.net "Numpy-discussion digest, Vol 1 #171 - 2 msgs" (Jan 24, 3:58pm) References: Message-ID: <1010201122947.ZM12238@noah.scripps.edu> Hello, Is there any reason why Numeric never became a "New" Python package with a __init__.py ? -Michel PS: I have been adding __init__.py to my installation for a long time now and it works just fine. For those who want to be able to import directly we could extend the python path in __init__ so that after an import Numeric all .so would be directly loadable -- ----------------------------------------------------------------------- >>>>>>>>>> AREA CODE CHANGE <<<<<<<<< we are now 858 !!!!!!! Michel F. Sanner Ph.D. The Scripps Research Institute Assistant Professor Department of Molecular Biology 10550 North Torrey Pines Road Tel. (858) 784-2341 La Jolla, CA 92037 Fax. (858) 784-2860 sanner at scripps.edu http://www.scripps.edu/sanner ----------------------------------------------------------------------- From paul at pfdubois.com Fri Feb 2 10:16:17 2001 From: paul at pfdubois.com (Paul F. Dubois) Date: Fri, 2 Feb 2001 07:16:17 -0800 Subject: [Numpy-discussion] New packages In-Reply-To: <1010201122947.ZM12238@noah.scripps.edu> Message-ID: Well, we talked about it some but didn't want to break existing code. To my recollection nobody has suggested the trick you suggest here. I think it would work, although there are cases where people import Precision in a given module but not Numeric (the numeric objects they deal with get returned by C or Fortran calls). Anybody see any real downside here? -----Original Message----- Is there any reason why Numeric never became a "New" Python package with a __init__.py ? -Michel PS: I have been adding __init__.py to my installation for a long time now and it works just fine. For those who want to be able to import directly we could extend the python path in __init__ so that after an import Numeric all .so would be directly loadable -- ----------------------------------------------------------------------- >>>>>>>>>> AREA CODE CHANGE <<<<<<<<< we are now 858 !!!!!!! Michel F. Sanner Ph.D. The Scripps Research Institute Assistant Professor Department of Molecular Biology 10550 North Torrey Pines Road Tel. (858) 784-2341 La Jolla, CA 92037 Fax. (858) 784-2860 sanner at scripps.edu http://www.scripps.edu/sanner ----------------------------------------------------------------------- _______________________________________________ Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net http://lists.sourceforge.net/lists/listinfo/numpy-discussion From jbmoody at oakland.edu Sat Feb 3 16:16:39 2001 From: jbmoody at oakland.edu (Jon Moody) Date: Sat, 3 Feb 2001 16:16:39 -0500 Subject: [Numpy-discussion] cephes.arraymap & multipack Message-ID: <20010203161639.A4309@oakland.edu> Two questions: * Has anyone been using the arraymap function that's in Travis's cephes 1.2 module? I think this is an interesting idea: giving arbitrary python functions ufunc-like (array broadcasting) properties when given numpy array arguments. (I noticed the Pearu's multipack CVS module has only cephes 1.1 which seems to be missing arraymap) * Maybe I'm missing something, but is there any reason why multipack's functions are not implemented as ufuncs? 
For example, it would be useful to be able to use multipack.leastsq() along an axis of a 3-d array, or to use multipack.quad() over a 2 or 3-d space of integration parameters. -- Jon Moody From phrxy at csv.warwick.ac.uk Sat Feb 3 21:45:51 2001 From: phrxy at csv.warwick.ac.uk (John J. Lee) Date: Sun, 4 Feb 2001 02:45:51 +0000 (GMT) Subject: [Numpy-discussion] cephes.arraymap & multipack In-Reply-To: <20010203161639.A4309@oakland.edu> Message-ID: On Sat, 3 Feb 2001, Jon Moody wrote: [...] > * Maybe I'm missing something, but is there any reason why multipack's > functions are not implemented as ufuncs? For example, it would be > useful to be able to use multipack.leastsq() along an axis of a 3-d > array, or to use multipack.quad() over a 2 or 3-d space of > integration parameters. [...] Not sure aboout the latter, but couldn't the former just be done by slicing? How does this relate to ufuncs? They were only intended to be simple wrappings around the FORTRAN / C I think, so I suspect Travis would say 'feel free to add it'. John From nwagner at isd.uni-stuttgart.de Mon Feb 5 09:38:07 2001 From: nwagner at isd.uni-stuttgart.de (Nils Wagner) Date: Mon, 05 Feb 2001 15:38:07 +0100 Subject: [Numpy-discussion] Python routines for evaluation of special functions Message-ID: <3A7EBACF.4C1678D5@isd.uni-stuttgart.de> Hi, I am looking for Python routines for evaluation of special functions, including Airy, Bessel, beta, exponential integrals, logarithmic integrals. Thanks Nils From jbmoody at oakland.edu Mon Feb 5 11:56:02 2001 From: jbmoody at oakland.edu (Jon Moody) Date: Mon, 5 Feb 2001 11:56:02 -0500 Subject: [Numpy-discussion] cephes.arraymap & multipack In-Reply-To: ; from phrxy@csv.warwick.ac.uk on Sun, Feb 04, 2001 at 02:45:51AM +0000 References: <20010203161639.A4309@oakland.edu> Message-ID: <20010205115602.A6611@oakland.edu> On Sun, Feb 04, 2001 at 02:45:51AM +0000, John J. Lee wrote: > Not sure aboout the latter, but couldn't the former just be done by > slicing? How does this relate to ufuncs? I'm probably being a little foggy on the distinction between the ufuncs (element-wise operations on arrays) and the array functions that sometimes allow you to specify an axis along which to apply the function. The problem I'm having with multipack.leastsq() is that the python function I supply as the model for the fit is expected to return a 1-d array or a single value. So if, for example, the independent variable is a 1-d array of shape (4,) and the data is a 3-d array of shape (4,256,256), to apply leastsq() along axis 0 you have to either loop in python or set things up so that you can map(leastsq, ....). It seems this kind of thing should properly be in the wrapper, and I would guess it should be in the C half of the wrapper for speed, unless there's some clever way to phrase it using native Numeric functions from python. > > They were only intended to be simple wrappings around the FORTRAN / C I > think, so I suspect Travis would say 'feel free to add it'. Maybe I should try. Is there any general objection (don't all barf at once) to using Fortran for an wrapper via pyfort (I don't know C)? -- Jon Moody From hoel at germanlloyd.org Tue Feb 6 04:31:15 2001 From: hoel at germanlloyd.org (Berthold =?iso-8859-1?q?H=F6llmann?=) Date: 06 Feb 2001 10:31:15 +0100 Subject: [Numpy-discussion] more general LAPACK support for NumPy Message-ID: Hello, >From time to time we need an additional linear algebra routine to be avaible in Python. I find myself wrapping these functions then. 
As these are FORTRAN routines, doing this for the Python version on Solaris (Sun CC/Sun F77), linux (gcc/g77) and Windows (VC++/Digital VF) becomes nontrivial. Neither f2py nor pyfort provide Win support and I doubt that automatic generation is useful for many LAPACK routines, especially those that need workspace, because usually we want LWORK to be the optimal size. So my question is: are there other users wrapping LAPACK routines for NumPy? If so, how are you doing it? For the C/FORTRAN wrapping I somehow like the approach used for cfortran.h (see http://www-zeus.desy.de/~burow/cfortran/index.html), but I'm afraid the license is not acceptable. Is anyone aware of a C version of LAPACK besides the f2c version on netlib? I do like the approach used in the ATLAS clapack part, but there are only a very few LAPACK routines handled there. If there is greater need for additional LAPACK routines for NumPy, should we bundle the efforts in (a) developing guidelines for how to write Python wrappers for LAPACK routines and (b) collecting routines provided by different users to provide a hopefully growing support for LAPACK in Python? Greetings Berthold -- email: hoel at GermanLloyd.org ) tel. : +49 (40) 3 61 49 - 73 74 ( C[_] These opinions might be mine, but never those of my employer.
From jhauser at ifm.uni-kiel.de Tue Feb 6 07:07:30 2001 From: jhauser at ifm.uni-kiel.de (Janko Hauser) Date: Tue, 6 Feb 2001 13:07:30 +0100 (CET) Subject: [Numpy-discussion] more general LAPACK support for NumPy In-Reply-To: References: Message-ID: <20010206120730.29928.qmail@lisboa.ifm.uni-kiel.de> There is an old binding to the complete CLapack package, which can be found at ftp://dirac.cnrs-orleans.fr/pub/PyLapack.tar.gz It does not seem to have special support for Windows, but one can perhaps start from there. I have tried cfortran ones and the wrapped function signatures become quite long and verbose. I think one important extension would be to make the automatic wrapper generators pyfort and f2py support some windows compilers, although I must admit, I have not looked into them for windows support yet. __Janko
From hoel at germanlloyd.org Tue Feb 6 08:55:04 2001 From: hoel at germanlloyd.org (Berthold =?iso-8859-1?q?H=F6llmann?=) Date: 06 Feb 2001 14:55:04 +0100 Subject: [Numpy-discussion] more general LAPACK support for NumPy In-Reply-To: Janko Hauser's message of "Tue, 6 Feb 2001 13:07:30 +0100 (CET)" References: <20010206120730.29928.qmail@lisboa.ifm.uni-kiel.de> Message-ID: OK, I got, compiled and installed PyLapack.tar.gz. It looks quite complete for LAPACK 1 or 2, but, of course, those routines I need are new in LAPACK 3 :-(. It seems Doug Heisterkamp used some kind of a script or program to generate the wrapper. Is there anyone who knows whether this program is available anywhere, so it could be extended for LAPACK 3 and Win? Thanks Berthold -- email: hoel at GermanLloyd.org ) tel. : +49 (40) 3 61 49 - 73 74 ( C[_] These opinions might be mine, but never those of my employer.
From paul at pfdubois.com Tue Feb 6 10:26:09 2001 From: paul at pfdubois.com (Paul F. Dubois) Date: Tue, 6 Feb 2001 07:26:09 -0800 Subject: [Numpy-discussion] more general LAPACK support for NumPy In-Reply-To: <20010206120730.29928.qmail@lisboa.ifm.uni-kiel.de> Message-ID: Two notes in regard to this thread: 1. I am in the process of making Pyfort support Digital Visual Fortran on Windows. 2.
Pyfort does have a facility for automatic allocation of work space, at least in the case that the size can be computed using ordinary arithmetic from the sizes of other arguments or other integer arguments. -----Original Message----- From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-discussion-admin at lists.sourceforge.net]On Behalf Of Janko Hauser Sent: Tuesday, February 06, 2001 4:08 AM To: Berthold Hollmann Cc: numpy-discussion at lists.sourceforge.net Subject: Re: [Numpy-discussion] more general LAPACK support for NumPy There is an old binding to the complete CLapack package, which can be found at ftp://dirac.cnrs-orleans.fr/pub/PyLapack.tar.gz It does not seem to have special support for Windows, but one can perhaps start from there. I have tried cfortran ones and the wrapped function signatures become quite long and verbose. I think one important extension would be to make the automatic wrapper generators pyfort and f2py support some windows compilers, although I must admit, I have not looked into them for windows support yet. __Janko _______________________________________________ Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net http://lists.sourceforge.net/lists/listinfo/numpy-discussion From hoel at germanlloyd.org Tue Feb 6 10:47:15 2001 From: hoel at germanlloyd.org (Berthold =?iso-8859-1?q?H=F6llmann?=) Date: 06 Feb 2001 16:47:15 +0100 Subject: [Numpy-discussion] more general LAPACK support for NumPy In-Reply-To: "Paul F. Dubois"'s message of "Tue, 6 Feb 2001 07:26:09 -0800" References: Message-ID: "Paul F. Dubois" writes: > Two notes in regard to this thread: > 1. I am in progress making Pyfort support Digital Visual Fortran on > Windows. GREAT > 2. Pyfort does have a facility for automatic allocation of work > space, at least in the case that the size can be computed using > ordinary arithmetic from the sizes of other arguments or other > integer arguments. I know of that, but the optimal workspace size for LAPACK routines is for optimal efficiency. The size can be returned by the routine or by calling the FORTRAN function ILAENV. It would be great, if workspacesize could be made depending on the result of functions. Thanks Berthold -- email: hoel at GermanLloyd.org ) tel. : +49 (40) 3 61 49 - 73 74 ( C[_] These opinions might be mine, but never those of my employer. From hinsen at cnrs-orleans.fr Wed Feb 7 09:16:34 2001 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Wed, 7 Feb 2001 15:16:34 +0100 Subject: [Numpy-discussion] Is this a wheel? In-Reply-To: (message from Jon Saenz on Mon, 29 Jan 2001 15:46:44 +0100 (MET)) References: Message-ID: <200102071416.PAA10170@chinon.cnrs-orleans.fr> > element of a NumPy array. I seeked through the documentation and found the > argmax/argmin functions. However, they must be called recursively to find > the greatest(smallest) element of a multidimendional array. As I needed to You could run it on Numeric.ravel(array) (which shouldn't make a copy), and then reconstruct the multidimensional indices from the single index into the flattened array. The additional overhead should be minimal, and you don't need any C code. 
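For instance, a minimal sketch of that approach for argmax (the helper name is only illustrative, and it assumes a contiguous, C-ordered array so that ravel does not copy):

import Numeric

def argmax_nd(a):
    # Position of the largest element in the flattened array.
    flat = Numeric.argmax(Numeric.ravel(a))
    # Rebuild the multidimensional index, peeling off one axis at a
    # time from the last dimension to the first.
    index = []
    dims = list(a.shape)
    dims.reverse()
    for length in dims:
        flat, i = divmod(flat, length)
        index.insert(0, i)
    return tuple(index)

The same unravelling works for argmin, and for max/min you can simply index the array with the result.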
Konrad -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From hinsen at cnrs-orleans.fr Wed Feb 7 09:19:02 2001 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Wed, 7 Feb 2001 15:19:02 +0100 Subject: [Numpy-discussion] New packages In-Reply-To: References: Message-ID: <200102071419.PAA10175@chinon.cnrs-orleans.fr> > Well, we talked about it some but didn't want to break existing code. To my > recollection nobody has suggested the trick you suggest here. I think it > would work, although there are cases where people import Precision in a > given module but not Numeric (the numeric objects they deal with get > returned by C or Fortran calls). Anybody see any real downside here? At least not immediately. Importing Numeric involves almost no overhead when you use array-generating modules anyway (they need to import at least multiarray). I'll make this modification to my installation and see if I get any bad surprises. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From dpgrote at lbl.gov Wed Feb 7 14:41:48 2001 From: dpgrote at lbl.gov (David P Grote) Date: Wed, 07 Feb 2001 11:41:48 -0800 Subject: [Numpy-discussion] Is this a wheel? References: <200102071416.PAA10170@chinon.cnrs-orleans.fr> Message-ID: <3A81A4FC.70704@lbl.gov> An HTML attachment was scrubbed... URL: From phrxy at csv.warwick.ac.uk Thu Feb 8 03:58:23 2001 From: phrxy at csv.warwick.ac.uk (John J. Lee) Date: Thu, 8 Feb 2001 08:58:23 +0000 (GMT) Subject: [Numpy-discussion] Is this a wheel? In-Reply-To: <3A81A4FC.70704@lbl.gov> Message-ID: On Wed, 7 Feb 2001, David P Grote wrote: > Ravel does make a copy when the array is not contiguous. I asked this > question before but didn't get any response - is there a way to get the > argmax/min or max/min of a non-contiguous multi-dimensional array without > making a contiguous copy? I use python as an interface to fortran code > and so I am constantly dealing with arrays that are not contiguous, i.e. > not with C ordering. Any help is appreciated. Aren't FORTRAN arrays just stored in the reverse order to C? Isn't this just dealt with by having the stride lengths of your Numeric array in the opposite order? Or does FORTRAN sometimes allocate multidimensional arrays with gaps in memory?? I don't see why they should not be contiguous. John From dpgrote at lbl.gov Thu Feb 8 12:39:18 2001 From: dpgrote at lbl.gov (David P Grote) Date: Thu, 08 Feb 2001 09:39:18 -0800 Subject: [Numpy-discussion] Is this a wheel? References: Message-ID: <3A82D9C6.40409@lbl.gov> An HTML attachment was scrubbed... URL: From phrxy at csv.warwick.ac.uk Thu Feb 8 16:01:39 2001 From: phrxy at csv.warwick.ac.uk (John J. Lee) Date: Thu, 8 Feb 2001 21:01:39 +0000 (GMT) Subject: [Numpy-discussion] Is this a wheel? 
In-Reply-To: <3A82D9C6.40409@lbl.gov> Message-ID: On Thu, 8 Feb 2001, David P Grote wrote: > What I meant by "not contiguous" is that the Numeric flag "contiguous" > is set to false. This flag is only true when Numeric arrays have their > strides in C ordering. Any rearrangement of the strides causes the flag > to be set to false - a transpose for example. The data in the fortran > arrays is contiguous in memory. Here's an example using ravel. [...] Oh, I see. > Ravel does make a copy when the array is not contiguous. I asked this > question before but didn't get any response - is there a way to get the > argmax/min or max/min of a non-contiguous multi-dimensional array without > making a contiguous copy? I use python as an interface to fortran code > and so I am constantly dealing with arrays that are not contiguous, i.e. > not with C ordering. Any help is appreciated. I don't know about doing it with one of the Numeric functions, but it's very easy to write in C -- just this week I wrote a max() that works on (contiguous or not) Numeric arrays. I think I wrote it as a C function (not callable from Python) for the function I was wrapping to use, but it would be easy to change it to be a proper Python function. I'll mail you a copy if you like. John
From Barrett at stsci.edu Fri Feb 9 10:45:50 2001 From: Barrett at stsci.edu (Paul Barrett) Date: Fri, 9 Feb 2001 10:45:50 -0500 (EST) Subject: [Numpy-discussion] A Numerical Python BoF at Python 9 Message-ID: <14980.2832.659186.913578@nem-srvr.stsci.edu> I've been encouraged to set up a BoF at Python 9 to discuss Numerical Python issues, specifically the design and implementation of Numeric 2. I'd like to get a head count of those interested in attending such a BoF. So far there are 3 of us at STScI who are interested. -- Dr. Paul Barrett Space Telescope Science Institute Phone: 410-338-4475 ESS/Science Software Group FAX: 410-338-4767 Baltimore, MD 21218
From eq3pvl at eq.uc.pt Fri Feb 9 11:14:46 2001 From: eq3pvl at eq.uc.pt (Pedro Vale Lima) Date: Fri, 09 Feb 2001 16:14:46 +0000 Subject: [Numpy-discussion] Travis Oliphant optimization.py References: <14980.2832.659186.913578@nem-srvr.stsci.edu> Message-ID: <3A841776.37AD73F1@eq.uc.pt> Travis Oliphant's website seems to be having problems (maybe the starship virus). I wanted to download his optimization.py. Could Travis or someone else please mail me that routine? thank you pedro lima
From nwagner at isd.uni-stuttgart.de Fri Feb 9 11:51:19 2001 From: nwagner at isd.uni-stuttgart.de (Nils Wagner) Date: Fri, 09 Feb 2001 17:51:19 +0100 Subject: [Numpy-discussion] Jordan's normal form of matrices Message-ID: <3A842007.A93BB722@isd.uni-stuttgart.de> Hi, I am looking for a program to calculate the Jordan normal form of a real or complex matrix. Thanks in advance. Nils Wagner
From jhauser at ifm.uni-kiel.de Fri Feb 9 18:08:46 2001 From: jhauser at ifm.uni-kiel.de (Janko Hauser) Date: Sat, 10 Feb 2001 00:08:46 +0100 (CET) Subject: [Numpy-discussion] A Numerical Python BoF at Python 9 In-Reply-To: <14980.2832.659186.913578@nem-srvr.stsci.edu> References: <14980.2832.659186.913578@nem-srvr.stsci.edu> Message-ID: <20010209230846.1655.qmail@lisboa.ifm.uni-kiel.de> May I suggest that you repost your PEP to the matrix-sig? This PEP is more fleshed out than the last mails from Travis regarding Numeric2. IMHO, __Janko Paul Barrett writes: > > I've been encouraged to set up a BoF at Python 9 to discuss Numerical > Python issues, specifically the design and implementation of Numeric 2.
> I'd like to get a head count of those interested in attending such a > BoF. So far there are 3 of us at STScI who are interested. > > -- > Dr. Paul Barrett Space Telescope Science Institute > Phone: 410-338-4475 ESS/Science Software Group > FAX: 410-338-4767 Baltimore, MD 21218 > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > http://lists.sourceforge.net/lists/listinfo/numpy-discussion
From gillet at scripps.edu Fri Feb 9 20:37:54 2001 From: gillet at scripps.edu (Alexandre Gillet) Date: Fri, 09 Feb 2001 17:37:54 -0800 Subject: [Numpy-discussion] problem with numeric array on window. Message-ID: <3A849B72.28AEAA22@scripps.edu> Hi all, We are having problems when creating numeric arrays in C extensions under windows. We narrowed the problem down to a very simple example (shown below) where we simply allocate some memory and create a Numeric array using PyArray_FromDimsAndData. If we call that function, then as soon as we delete the returned array the python interpreter crashes as it tries to free self->data. If we do not set the OWN_DATA flag in the array it works fine, but we have a memory leak. We tried this using both release and Debug versions of Python 1.5.2, and Numeric 17.1.1. I have been using this mechanism under Unix for a long time and have not had this problem before! Using the same extension and test under unix works fine. Has something changed? Any help is welcome ... Thank you I join the two files we are using. bug.c: a C module that creates a Numeric array with the flags set to OWN_DATA, so that the memory is freed when the array is garbage collected.
################# bug.c#####################################################
#ifdef WIN32
#include
#include
#endif
#include "Python.h"
#include "arrayobject.h"

static PyObject* createArray(PyObject* self, PyObject* args)
{
  int *dims;
  float *data;
  PyArrayObject *out;

  dims = (int *)malloc(2 * sizeof(int));
  dims[0] = 500;
  dims[1] = 2;
  data = (float *)malloc(100000 * sizeof(float));
  out = (PyArrayObject *)PyArray_FromDimsAndData(2, dims, PyArray_FLOAT,
                                                 (char *)data);
  if (!out) {
    PyErr_SetString(PyExc_RuntimeError,
                    "Failed to allocate memory for normals");
    return NULL;
  }
  out->flags |= OWN_DATA;  /* so we'll free this memory when this array
                              will be garbage collected */
  return (PyObject *)out;
}

static PyMethodDef bug_methods[] = {
  {"createArray", createArray, 1},
  {NULL, NULL}  /* Sentinel */
};

static char bug_documentation[] = "No Doc";

#ifdef WIN32
extern __declspec(dllexport)
#endif
void initbug()
{
  PyObject *m, *d;
  m = Py_InitModule4("bug", bug_methods, bug_documentation,
                     (PyObject *)NULL, PYTHON_API_VERSION);
  d = PyModule_GetDict(m);
  import_array();
  if (PyErr_Occurred())
    Py_FatalError("can't initialize module bug");
}
#####################################################################
###############testbug.py############################################
import bug
ar = bug.createArray()
del ar # fatal if ar owns the data member
#####################################################################
-- ********************************** Alexandre Gillet The Scripps Research Institute, tel: (858) 784-9557 Dept. Molecular Biology, MB-5, fax: (858) 784-2860 10550 North Torrey Pines Road, email: gillet at scripps.edu La Jolla, CA 92037-1000, USA.
From Barrett at stsci.edu Tue Feb 13 10:52:18 2001 From: Barrett at stsci.edu (Paul Barrett) Date: Tue, 13 Feb 2001 10:52:18 -0500 (EST) Subject: [Numpy-discussion] PEP 209: Multi-dimensional Arrays Message-ID: <14985.22458.685587.538866@nem-srvr.stsci.edu> The first draft of PEP 209: Multi-dimensional Arrays is ready for comment. Its primary emphasis is on array operations, but its design is intended to provide a general framework for working with multi-dimensional arrays. This PEP covers a lot of ground and so does not go into much detail at this stage. The hope is that we can fill in the details as time goes on. It also presents several Open Issues that need to be discussed. Cheers, Paul -- Dr. Paul Barrett Space Telescope Science Institute Phone: 410-338-4475 ESS/Science Software Group FAX: 410-338-4767 Baltimore, MD 21218 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ PEP: 209 Title: Multi-dimensional Arrays Version: Author: barrett at stsci.edu (Paul Barrett), oliphant at ee.byu.edu (Travis Oliphant) Python-Version: 2.2 Status: Draft Type: Standards Track Created: 03-Jan-2001 Post-History: Abstract This PEP proposes a redesign and re-implementation of the multi-dimensional array module, Numeric, to make it easier to add new features and functionality to the module. Aspects of Numeric 2 that will receive special attention are efficient access to arrays exceeding a gigabyte in size and composed of inhomogeneous data structures or records. The proposed design uses four Python classes: ArrayType, UFunc, Array, and ArrayView; and a low-level C-extension module, _ufunc, to handle the array operations efficiently. In addition, each array type has its own C-extension module which defines the coercion rules, operations, and methods for that type. This design enables new types, features, and functionality to be added in a modular fashion. The new version will introduce some incompatibilities with the current Numeric. Motivation Multi-dimensional arrays are commonly used to store and manipulate data in science, engineering, and computing. Python currently has an extension module, named Numeric (henceforth called Numeric 1), which provides a satisfactory set of functionality for users manipulating homogeneous arrays of data of moderate size (of order 10 MB). For access to larger arrays (of order 100 MB or more) of possibly inhomogeneous data, the implementation of Numeric 1 is inefficient and cumbersome. In the future, requests by the Numerical Python community for additional functionality are also likely, as PEPs 211: Adding New Linear Algebra Operators to Python, and 225: Elementwise/Objectwise Operators illustrate. Proposal This proposal recommends a re-design and re-implementation of Numeric 1, henceforth called Numeric 2, which will enable new types, features, and functionality to be added in an easy and modular manner. The initial design of Numeric 2 should focus on providing a generic framework for manipulating arrays of various types and should enable a straightforward mechanism for adding new array types and UFuncs. Functional methods that are more specific to various disciplines can then be layered on top of this core. This new module will still be called Numeric and most of the behavior found in Numeric 1 will be preserved. The proposed design uses four Python classes: ArrayType, UFunc, Array, and ArrayView; and a low-level C-extension module to handle the array operations efficiently.
In addition, each array type has its own C-extension module which defines the coercion rules, operations, and methods for that type. At a later date, when core functionality is stable, some Python classes can be converted to C-extension types. Some planned features are: 1. Improved memory usage This feature is particularly important when handling large arrays and can produce significant improvements in performance as well as memory usage. We have identified several areas where memory usage can be improved: a. Use a local coercion model Instead of using Python's global coercion model which creates temporary arrays, Numeric 2, like Numeric 1, will implement a local coercion model as described in PEP 208 which defers the responsibility of coercion to the operator. By using internal buffers, a coercion operation can be done for each array (including output arrays), if necessary, at the time of the operation. Benchmarks [1] have shown that performance is at most degraded only slightly and is improved in cases where the internal buffers are less than the L2 cache size and the processor is under load. To avoid array coercion altogether, C functions having arguments of mixed type are allowed in Numeric 2. b. Avoid creation of temporary arrays In complex array expressions (i.e. having more than one operation), each operation will create a temporary array which will be used and then deleted by the succeeding operation. A better approach would be to identify these temporary arrays and reuse their data buffers when possible, namely when the array shape and type are the same as the temporary array being created. This can be done by checking the temparory array's reference count. If it is 1, then it will be deleted once the operation is done and is a candidate for reuse. c. Optional use of memory-mapped files Numeric users sometimes need to access data from very large files or to handle data that is greater than the available memory. Memory-mapped arrays provide a mechanism to do this by storing the data on disk while making it appear to be in memory. Memory- mapped arrays should improve access to all files by eliminating one of two copy steps during a file access. Numeric should be able to access in-memory and memory-mapped arrays transparently. d. Record access In some fields of science, data is stored in files as binary records. For example in astronomy, photon data is stored as a 1 dimensional list of photons in order of arrival time. These records or C-like structures contain information about the detected photon, such as its arrival time, its position on the detector, and its energy. Each field may be of a different type, such as char, int, or float. Such arrays introduce new issues that must be dealt with, in particular byte alignment or byte swapping may need to be performed for the numeric values to be properly accessed (though byte swapping is also an issue for memory mapped data). Numeric 2 is designed to automatically handle alignment and representational issues when data is accessed or operated on. There are two approaches to implementing records; as either a derived array class or a special array type, depending on your point-of- view. We defer this discussion to the Open Issues section. 2. Additional array types Numeric 1 has 11 defined types: char, ubyte, sbyte, short, int, long, float, double, cfloat, cdouble, and object. 
There are no ushort, uint, or ulong types, nor are there more complex types such as a bit type which is of use to some fields of science and possibly for implementing masked-arrays. The design of Numeric 1 makes the addition of these and other types a difficult and error-prone process. To enable the easy addition (and deletion) of new array types such as a bit type described below, a re-design of Numeric is necessary. a. Bit type The result of a rich comparison between arrays is an array of boolean values. The result can be stored in an array of type char, but this is an unnecessary waste of memory. A better implementation would use a bit or boolean type, compressing the array size by a factor of eight. This is currently being implemented for Numeric 1 (by Travis Oliphant) and should be included in Numeric 2. 3. Enhanced array indexing syntax The extended slicing syntax was added to Python to provide greater flexibility when manipulating Numeric arrays by allowing step-sizes greater than 1. This syntax works well as a shorthand for a list of regularly spaced indices. For those situations where a list of irregularly spaced indices are needed, an enhanced array indexing syntax would allow 1-D arrays to be arguments. 4. Rich comparisons The implementation of PEP 207: Rich Comparisons in Python 2.1 provides additional flexibility when manipulating arrays. We intend to implement this feature in Numeric 2. 5. Array broadcasting rules When an operation between a scalar and an array is done, the implied behavior is to create a new array having the same shape as the array operand containing the scalar value. This is called array broadcasting. It also works with arrays of lesser rank, such as vectors. This implicit behavior is implemented in Numeric 1 and will also be implemented in Numeric 2. Design and Implementation The design of Numeric 2 has four primary classes: 1. ArrayType: This is a simple class that describes the fundamental properties of an array-type, e.g. its name, its size in bytes, its coercion relations with respect to other types, etc., e.g. > Int32 = ArrayType('Int32', 4, 'doc-string') Its relation to the other types is defined when the C-extension module for that type is imported. The corresponding Python code is: > Int32.astype[Real64] = Real64 This says that the Real64 array-type has higher priority than the Int32 array-type. The following attributes and methods are proposed for the core implementation. Additional attributes can be added on an individual basis, e.g. .bitsize or .bitstrides for the bit type. Attributes: .name: e.g. "Int32", "Float64", etc. .typecode: e.g. 'i', 'f', etc. (for backward compatibility) .size (in bytes): e.g. 4, 8, etc. .array_rules (mapping): rules between array types .pyobj_rules (mapping): rules between array and python types .doc: documentation string Methods: __init__(): initialization __del__(): destruction __repr__(): representation C-API: This still needs to be fleshed-out. 2. UFunc: This class is the heart of Numeric 2. Its design is similar to that of ArrayType in that the UFunc creates a singleton callable object whose attributes are name, total and input number of arguments, a document string, and an empty CFunc dictionary; e.g. > add = UFunc('add', 3, 2, 'doc-string') When defined the add instance has no C functions associated with it and therefore can do no work. The CFunc dictionary is populated or registerd later when the C-extension module for an array-type is imported. 
The arguments of the regiser method are: function name, function descriptor, and the CUFunc object. The corresponding Python code is > add.register('add', (Int32, Int32, Int32), cfunc-add) In the initialization function of an array type module, e.g. Int32, there are two C API functions: one to initialize the coercion rules and the other to register the CFunc objects. When an operation is applied to some arrays, the __call__ method is invoked. It gets the type of each array (if the output array is not given, it is created from the coercion rules) and checks the CFunc dictionary for a key that matches the argument types. If it exists the operation is performed immediately, otherwise the coercion rules are used to search for a related operation and set of conversion functions. The __call__ method then invokes a compute method written in C to iterate over slices of each array, namely: > _ufunc.compute(slice, data, func, swap, conv) The 'func' argument is a CFuncObject, while the 'swap' and 'conv' arguments are lists of CFuncObjects for those arrays needing pre- or post-processing, otherwise None is used. The data argument is a list of buffer objects, and the slice argument gives the number of iterations for each dimension along with the buffer offset and step size for each array and each dimension. We have predefined several UFuncs for use by the __call__ method: cast, swap, getobj, and setobj. The cast and swap functions do coercion and byte-swapping, resp. and the getobj and setobj functions do coercion between Numeric arrays and Python sequences. The following attributes and methods are proposed for the core implementation. Attributes: .name: e.g. "add", "subtract", etc. .nargs: number of total arguments .iargs: number of input arguments .cfuncs (mapping): the set C functions .doc: documentation string Methods: __init__(): initialization __del__(): destruction __repr__(): representation __call__(): look-up and dispatch method initrule(): initialize coercion rule uninitrule(): uninitialize coercion rule register(): register a CUFunc unregister(): unregister a CUFunc C-API: This still needs to be fleshed-out. 3. Array: This class contains information about the array, such as shape, type, endian-ness of the data, etc.. Its operators, '+', '-', etc. just invoke the corresponding UFunc function, e.g. > def __add__(self, other): > return ufunc.add(self, other) The following attributes, methods, and functions are proposed for the core implementation. Attributes: .shape: shape of the array .format: type of the array .real (only complex): real part of a complex array .imag (only complex): imaginary part of a complex array Methods: __init__(): initialization __del__(): destruction __repr_(): representation __str__(): pretty representation __cmp__(): rich comparison __len__(): __getitem__(): __setitem__(): __getslice__(): __setslice__(): numeric methods: copy(): copy of array aslist(): create list from array asstring(): create string from array Functions: fromlist(): create array from sequence fromstring(): create array from string array(): create array with shape and value concat(): concatenate two arrays resize(): resize array C-API: This still needs to be fleshed-out. 4. ArrayView This class is similar to the Array class except that the reshape and flat methods will raise exceptions, since non-contiguous arrays cannot be reshaped or flattened using just pointer and step-size information. C-API: This still needs to be fleshed-out. 5. 
C-extension modules: Numeric2 will have several C-extension modules. a. _ufunc: The primary module of this set is the _ufuncmodule.c. The intention of this module is to do the bare minimum, i.e. iterate over arrays using a specified C function. The interface of these functions is the same as Numeric 1, i.e. int (*CFunc)(char *data, int *steps, int repeat, void *func); and their functionality is expected to be the same, i.e. they iterate over the inner-most dimension. The following attributes and methods are proposed for the core implementation. Attibutes: Methods: compute(): C-API: This still needs to be fleshed-out. b. _int32, _real64, etc.: There will also be C-extension modules for each array type, e.g. _int32module.c, _real64module.c, etc. As mentioned previously, when these modules are imported by the UFunc module, they will automatically register their functions and coercion rules. New or improved versions of these modules can be easily implemented and used without affecting the rest of Numeric 2. Open Issues 1. Does slicing syntax default to copy or view behavior? The default behavior of Python is to return a copy of a sub-list or tuple when slicing syntax is used, whereas Numeric 1 returns a view into the array. The choice made for Numeric 1 is apparently for reasons of performance: the developers wish to avoid the penalty of allocating and copying the data buffer during each array operation and feel that the need for a deepcopy of an array to be rare. Yet, some have argued that Numeric's slice notation should also have copy behavior to be consistent with Python lists. In this case the performance penalty associated with copy behavior can be minimized by implementing copy-on-write. This scheme has both arrays sharing one data buffer (as in view behavior) until either array is assigned new data at which point a copy of the data buffer is made. View behavior would then be implemented by an ArrayView class, whose behavior be similar to Numeric 1 arrays, i.e. .shape is not settable for non-contiguous arrays. The use of an ArrayView class also makes explicit what type of data the array contains. 2. Does item syntax default to copy or view behavior? A similar question arises with the item syntax. For example, if a = [[0,1,2], [3,4,5]] and b = a[0], then changing b[0] also changes a[0][0], because a[0] is a reference or view of the first row of a. Therefore, if c is a 2-d array, it would appear that c[i] should return a 1-d array which is a view into, instead of a copy of, c for consistency. Yet, c[i] can be considered just a shorthand for c[i,:] which would imply copy behavior assuming slicing syntax returns a copy. Should Numeric 2 behave the same way as lists and return a view or should it return a copy. 3. How is scalar coercion implemented? Python has fewer numeric types than Numeric which can cause coercion problems. For example when multiplying a Python scalar of type float and a Numeric array of type float, the Numeric array is converted to a double, since the Python float type is actually a double. This is often not the desired behavior, since the Numeric array will be doubled in size which is likely to be annoying, particularly for very large arrays. We prefer that the array type trumps the python type for the same type class, namely integer, float, and complex. Therefore an operation between a Python integer and an Int16 (short) array will return an Int16 array. Whereas an operation between a Python float and an Int16 array would return a Float64 (double) array. 
Operations between two arrays use normal coercion rules. 4. How is integer division handled? In a future version of Python, the behavior of integer division will change. The operands will be converted to floats, so the result will be a float. If we implement the proposed scalar coercion rules where arrays have precedence over Python scalars, then dividing an array by an integer will return an integer array and will not be consistent with a future version of Python which would return an array of type double. Scientific programmers are familiar with the distinction between integer and float-point division, so should Numeric 2 continue with this behavior? 5. How should records be implemented? There are two approaches to implementing records depending on your point-of-view. The first is two divide arrays into separate classes depending on the behavior of their types. For example numeric arrays are one class, strings a second, and records a third, because the range and type of operations of each class differ. As such, a record array is not a new type, but a mechanism for a more flexible form of array. To easily access and manipulate such complex data, the class is comprised of numeric arrays having different byte offsets into the data buffer. For example, one might have a table consisting of an array of Int16, Real32 values. Two numeric arrays, one with an offset of 0 bytes and a stride of 6 bytes to be interpeted as Int16, and one with an offset of 2 bytes and a stride of 6 bytes to be interpreted as Real32 would represent the record array. Both numeric arrays would refer to the same data buffer, but have different offset and stride attributes, and a different numeric type. The second approach is to consider a record as one of many array types, albeit with fewer, and possibly different, array operations than for numeric arrays. This approach considers an array type to be a mapping of a fixed-length string. The mapping can either be simple, like integer and floating-point numbers, or complex, like a complex number, a byte string, and a C-structure. The record type effectively merges the struct and Numeric modules into a multi-dimensional struct array. This approach implies certain changes to the array interface. For example, the 'typecode' keyword argument should probably be changed to the more descriptive 'format' keyword. a. How are record semantics defined and implemented? Which ever implementation approach is taken for records, the syntax and semantics of how they are to be accessed and manipulated must be decided, if one wishes to have access to sub-fields of records. In this case, the record type can essentially be considered an inhomogeneous list, like a tuple returned by the unpack method of the struct module; and a 1-d array of records may be interpreted as a 2-d array with the second dimension being the index into the list of fields. This enhanced array semantics makes access to an array of one or more of the fields easy and straightforward. It also allows a user to do array operations on a field in a natural and intuitive way. If we assume that records are implemented as an array type, then last dimension defaults to 0 and can therefore be neglected for arrays comprised of simple types, like numeric. 6. How are masked-arrays implemented? Masked-arrays in Numeric 1 are implemented as a separate array class. With the ability to add new array types to Numeric 2, it is possible that masked-arrays in Numeric 2 could be implemented as a new array type instead of an array class. 7. 
How are numerical errors handled (IEEE floating-point errors in particular)? It is not clear to the proposers (Paul Barrett and Travis Oliphant) what is the best or preferred way of handling errors. Most of the C functions that do the operation iterate over the inner-most (last) dimension of the array. This dimension could contain a thousand or more items having one or more errors of differing type, such as divide-by-zero, underflow, and overflow. Additionally, keeping track of these errors may come at the expense of performance. Therefore, we suggest several options: a. Print a message of the most severe error, leaving it to the user to locate the errors. b. Print a message of all errors that occurred and the number of occurrences, leaving it to the user to locate the errors. c. Print a message of all errors that occurred and a list of where they occurred. d. Or use a hybrid approach, printing only the most severe error, yet keeping track of what and where the errors occurred. This would allow the user to locate the errors while keeping the error message brief. 8. What features are needed to ease the integration of FORTRAN libraries and code? It would be a good idea at this stage to consider how to ease the integration of FORTRAN libraries and user code in Numeric 2. Implementation Steps 1. Implement basic UFunc capability a. Minimal Array class: Necessary class attributes and methods, e.g. .shape, .data, .type, etc. b. Minimal ArrayType class: Int32, Real64, Complex64, Char, Object c. Minimall UFunc class: UFunc instantiation, CFunction registration, UFunc call for 1-D arrays including the rules for doing alignment, byte-swapping, and coercion. d. Minimal C-extension module: _UFunc, which does the innermost array loop in C. This step implements whatever is needed to do: 'c = add(a, b)' where a, b, and c are 1-D arrays. It teaches us how to add new UFuncs, to coerce the arrays, to pass the necessary information to a C iterator method and to do the actual computation. 2. Continue enhancing the UFunc iterator and Array class a. Implement some access methods for the Array class: print, repr, getitem, setitem, etc. b. Implement multidimensional arrays c. Implement some of the basic Array methods using UFuncs: +, -, *, /, etc. d. Enable UFuncs to use Python sequences. 3. Complete the standard UFunc and Array class behavior a. Implement getslice and setslice behavior b. Work on Array broadcasting rules c. Implement Record type 4. Add additional functionality a. Add more UFuncs b. Implement buffer or mmap access Incompatibilities The following is a list of incompatibilities in behavior between Numeric 1 and Numeric 2. 1. Scalar coercion rules Numeric 1 has a single set of coercion rules for array and Python numeric types. This can cause unexpected and annoying problems during the calculation of an array expression. Numeric 2 intends to overcome these problems by having two sets of coercion rules: one for arrays and Python numeric types, and another just for arrays. 2. No savespace attribute The savespace attribute in Numeric 1 makes arrays with this attribute set take precedence over those that do not have it set. Numeric 2 will not have such an attribute and therefore normal array coercion rules will be in effect. 3. Slicing syntax returns a copy The slicing syntax in Numeric 1 returns a view into the original array. The slicing behavior for Numeric 2 will be a copy. You should use the ArrayView class to get a view into an array. 4.
Boolean comparisons return a boolean array A comparison between arrays in Numeric 1 results in a Boolean scalar, because of current limitations in Python. The advent of Rich Comparisons in Python 2.1 will allow an array of Booleans to be returned. 5. Type characters are depricated Numeric 2 will have an ArrayType class composed of Type instances, for example Int8, Int16, Int32, and Int for signed integers. The typecode scheme in Numeric 1 will be available for backward compatibility, but will be depricated. Appendices A. Implicit sub-arrays iteration A computer animation is composed of a number of 2-D images or frames of identical shape. By stacking these images into a single block of memory, a 3-D array is created. Yet the operations to be performed are not meant for the entire 3-D array, but on the set of 2-D sub-arrays. In most array languages, each frame has to be extracted, operated on, and then reinserted into the output array using a for-like loop. The J language allows the programmer to perform such operations implicitly by having a rank for the frame and array. By default these ranks will be the same during the creation of the array. It was the intention of the Numeric 1 developers to implement this feature, since it is based on the language J. The Numeric 1 code has the required variables for implementing this behavior, but was never implemented. We intend to implement implicit sub-array iteration in Numeric 2, if the array broadcasting rules found in Numeric 1 do not fully support this behavior. Copyright This document is placed in the public domain. Related PEPs PEP 207: Rich Comparisons by Guido van Rossum and David Ascher PEP 208: Reworking the Coercion Model by Neil Schemenauer and Marc-Andre' Lemburg PEP 211: Adding New Linear Algebra Operators to Python by Greg Wilson PEP 225: Elementwise/Objectwise Operators by Huaiyu Zhu PEP 228: Reworking Python's Numeric Model by Moshe Zadka References [1] P. Greenfield 2000. private communication. From rob at hooft.net Wed Feb 14 02:42:36 2001 From: rob at hooft.net (Rob W. W. Hooft) Date: Wed, 14 Feb 2001 08:42:36 +0100 Subject: [Numpy-discussion] PEP 209: Multi-dimensional Arrays In-Reply-To: <14985.22458.685587.538866@nem-srvr.stsci.edu> References: <14985.22458.685587.538866@nem-srvr.stsci.edu> Message-ID: <14986.14060.238048.161366@temoleh.chem.uu.nl> Some random PEP talk. >>>>> "PB" == Paul Barrett writes: PB> 2. Additional array types PB> Numeric 1 has 11 defined types: char, ubyte, sbyte, short, int, PB> long, float, double, cfloat, cdouble, and object. There are no PB> ushort, uint, or ulong types, nor are there more complex types PB> such as a bit type which is of use to some fields of science and PB> possibly for implementing masked-arrays. True: I would have had a much easier life with a ushort type. PB> Its relation to the other types is defined when the C-extension PB> module for that type is imported. The corresponding Python code PB> is: >> Int32.astype[Real64] = Real64 I understand this is to be done by the Int32 C extension module. But how does it know about Real64? PB> Attributes: PB> .name: e.g. "Int32", "Float64", etc. PB> .typecode: e.g. 'i', 'f', etc. PB> (for backward compatibility) .typecode() is a method now. PB> .size (in bytes): e.g. 4, 8, etc. "element size?" >> add.register('add', (Int32, Int32, Int32), cfunc-add) Typo: cfunc-add is an expression, not an identifier. An implementation of a (Int32, Float32, Float32) add is possible and desirable as mentioned earlier in the document. 
Which C module is going to declare such a combination? PB> asstring(): create string from array Not "tostring" like now? PB> 4. ArrayView PB> This class is similar to the Array class except that the reshape PB> and flat methods will raise exceptions, since non-contiguous PB> arrays cannot be reshaped or flattened using just pointer and PB> step-size information. This was completely unclear to me until here. I must say I find this a strange way of handling things. I haven't looked into implementation details, but wouldn't it feel more natural if an Array would just be the "data", and an ArrayView would contain the dimensions and strides. Completely separated. One would always need a pair, but more than one ArrayView could use the same Array. PB> a. _ufunc: PB> 1. Does slicing syntax default to copy or view behavior? Numeric 1 uses slicing for view, and a method for copy. "Feeling" compatible with core python would require copy on rhs, and view on lhs of an assignment. Is that distinction possible? If copy is the default for slicing, how would one make a view? PB> 2. Does item syntax default to copy or view behavior? view. PB> Yet, c[i] can be considered just a shorthand for c[i,:] which PB> would imply copy behavior assuming slicing syntax returns a copy. If you reason that way, then c is just a shorthand for c[...] too. PB> 3. How is scalar coercion implemented? PB> Python has fewer numeric types than Numeric which can cause PB> coercion problems. For example when multiplying a Python scalar PB> of type float and a Numeric array of type float, the Numeric array PB> is converted to a double, since the Python float type is actually PB> a double. This is often not the desired behavior, since the PB> Numeric array will be doubled in size which is likely to be PB> annoying, particularly for very large arrays. Sure. That is handled reasonably well by the current Numeric 1. To extend this, I'd like to comment that I have never really understood the philosophy of taking the largest type for coercion in all languages. Being a scientist, I have learned that when you multiply a very accurate number with a very approximate number, your result is going to be very approximate, not very accurate! It would thus be more logical to have Float32*Float64 return a Float32! PB> In a future version of Python, the behavior of integer division PB> will change. The operands will be converted to floats, so the PB> result will be a float. If we implement the proposed scalar PB> coercion rules where arrays have precedence over Python scalars, PB> then dividing an array by an integer will return an integer array PB> and will not be consistent with a future version of Python which PB> would return an array of type double. Scientific programmers are PB> familiar with the distinction between integer and float-point PB> division, so should Numeric 2 continue with this behavior? Numeric 2 should be as compatible as reasonably possible with core python. But my question is: how would we do integer division of arrays? A ufunc for which no operator shortcut exists? PB> 7. How are numerical errors handled (IEEE floating-point errors in PB> particular)? I am developing my code on Linux and IRIX. I have seen that where Numeric code on Linux runs fine, the same code on IRIX may "core dump" on a FPE (e.g. arctan2(0,0)). That difference should be avoided. PB> a. Print a message of the most severe error, leaving it to PB> the user to locate the errors. What is the most severe error? PB> c. Minimall UFunc class: Typo: Minimal? 
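To make the copy-versus-view question raised above concrete, here is how Numeric 1 behaves today (a small illustrative sketch): slicing gives a view, and an explicit array() call gives a copy.

import Numeric

a = Numeric.arange(10)
b = a[2:5]                 # Numeric 1: a view sharing a's data buffer
b[0] = 99                  # ... so this also changes a[2]
c = Numeric.array(a[2:5])  # an explicit, independent copy

Whichever default Numeric 2 picks, the other behavior presumably needs an equally short spelling.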
Regards, Rob Hooft -- ===== rob at hooft.net http://www.hooft.net/people/rob/ ===== ===== R&D, Nonius BV, Delft http://www.nonius.nl/ ===== ===== PGPid 0xFA19277D ========================== Use Linux! ========= From Barrett at stsci.edu Wed Feb 14 12:09:45 2001 From: Barrett at stsci.edu (Paul Barrett) Date: Wed, 14 Feb 2001 12:09:45 -0500 (EST) Subject: [Numpy-discussion] PEP 209: Multi-dimensional Arrays In-Reply-To: <14986.14060.238048.161366@temoleh.chem.uu.nl> References: <14985.22458.685587.538866@nem-srvr.stsci.edu> <14986.14060.238048.161366@temoleh.chem.uu.nl> Message-ID: <14986.43001.213738.708354@nem-srvr.stsci.edu> Rob W. W. Hooft writes: > Some random PEP talk. > > >>>>> "PB" == Paul Barrett writes: > > PB> Its relation to the other types is defined when the C-extension > PB> module for that type is imported. The corresponding Python code > PB> is: > > >> Int32.astype[Real64] = Real64 > > I understand this is to be done by the Int32 C extension module. > But how does it know about Real64? This approach assumes that there are a basic set of predefined types. In the above example, the Real64 type is one of them. But let's consider creating a completely new type, say Real128. This type knows its relation to the other previously defined types, namely Real32, Real64, etc., but they do not know their relationship to it. That's still OK, because the Real128 type is imbued with this required information and is willing to share it with the other types. By way of bootstrapping, only one predefined type need be known, say, Int32. The operations associated with this type can only be Int32 operations, because this is the only type it knows about. Yet, we can add another type, say Real64, which has not only Real64 operations, BUT also Int32 and Real64 mixed operations, since it knows about Int32. The Real64 type provides the necessary information to relate the Int32 and Int64 types. Let's now add a third type, then a fourth, etc., each knowing about its predecessor types but not its successors. This approach is identical to the way core Python adds new classes or C-extension types, so this is nothing new. The current types do not know about the new type, but the new type knows about them. As long as one type knows the relationship between the two that is sufficient for the scheme to work. > PB> Attributes: > PB> .name: e.g. "Int32", "Float64", etc. > PB> .typecode: e.g. 'i', 'f', etc. > PB> (for backward compatibility) > > .typecode() is a method now. Yes, I propose that it become a settable attribute. > PB> .size (in bytes): e.g. 4, 8, etc. > > "element size?" Yes. > >> add.register('add', (Int32, Int32, Int32), cfunc-add) > > Typo: cfunc-add is an expression, not an identifier. No, it is a Python object that encompasses and describes a C function that adds two Int32 arrays and returns an Int32 array. It is essentially a Python wrapper of a C-function UFunc. It has been suggested that you should also be able to register Python expressions using the same interface. > An implementation of a (Int32, Float32, Float32) add is possible and > desirable as mentioned earlier in the document. Which C module is > going to declare such a combination? > > PB> asstring(): create string from array > > Not "tostring" like now? This is proposed so as to be a little more consistent with Core Python which uses 'from-' and 'as-' prefixes. But I'm don't have strong opinions either way. > PB> 4. 
ArrayView > > PB> This class is similar to the Array class except that the reshape > PB> and flat methods will raise exceptions, since non-contiguous > PB> arrays cannot be reshaped or flattened using just pointer and > PB> step-size information. > > This was completely unclear to me until here. I must say I find this a > strange way of handling things. I haven't looked into implementation > details, but wouldn't it feel more natural if an Array would just be > the "data", and an ArrayView would contain the dimensions and > strides. Completely separated. One would always need a pair, but more > than one ArrayView could use the same Array. In my definition, an Array that has no knowledge of its shape and type is not an Array, it's a data or character buffer. An array in my definition is a data buffer with information on how that buffer is to be mapped, i.e. shape, type, etc. An ArrayView is an Array that shares its data buffer with another Array, but may contain a different mapping of that Array, ie. its shape and type are different. If this is what you mean, then the answer is "Yes". This is how we intend to implement Arrays and ArrayViews. > PB> a. _ufunc: > > PB> 1. Does slicing syntax default to copy or view behavior? > > Numeric 1 uses slicing for view, and a method for copy. "Feeling" > compatible with core python would require copy on rhs, and view on lhs > of an assignment. Is that distinction possible? > > If copy is the default for slicing, how would one make a view? B = A.V[:10] or A.view[:10] are some possibilities. B is now an ArrayView class. > PB> 2. Does item syntax default to copy or view behavior? > > view. > > PB> Yet, c[i] can be considered just a shorthand for c[i,:] which > PB> would imply copy behavior assuming slicing syntax returns a copy. > > If you reason that way, then c is just a shorthand for c[...] too. Yes, that is correct, but that is not how Python currently behaves. The motivation for these questions is consistency with core Python behavior. The current Numeric does not follow this pattern for reasons of performance. If we assume performance is NOT an issue (ie. we can get similar performance by using various tricks), then what behavior is more intuitive for the average, and novice, user? > PB> 3. How is scalar coercion implemented? > > PB> Python has fewer numeric types than Numeric which can cause > PB> coercion problems. For example when multiplying a Python scalar > PB> of type float and a Numeric array of type float, the Numeric array > PB> is converted to a double, since the Python float type is actually > PB> a double. This is often not the desired behavior, since the > PB> Numeric array will be doubled in size which is likely to be > PB> annoying, particularly for very large arrays. > > Sure. That is handled reasonably well by the current Numeric 1. > > To extend this, I'd like to comment that I have never really understood > the philosophy of taking the largest type for coercion in all languages. > Being a scientist, I have learned that when you multiply a very accurate > number with a very approximate number, your result is going to be very > approximate, not very accurate! It would thus be more logical to have > Float32*Float64 return a Float32! If numeric precision was all that mattered, then you would be correct. But numeric range is also important. I would hate to take the chance of overflowing the above multiplication because I stored the result as a Float32, instead of a Float64, even though the Float64 is overkill in terms of precision. 
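To make the trade-off concrete: under Numeric 1's current coercion rules a Float32 array silently becomes Float64 as soon as a Python float enters the expression. A minimal sketch (typecode 'f' is Float32, 'd' is Float64):

    from Numeric import array, Float32
    a = array([1.0, 2.0, 3.0], Float32)
    b = 2.0 * a            # the Python float is a C double, so the result is upcast
    tc = b.typecode()      # 'd' -- the result has become Float64
    nb = b.itemsize()      # 8, versus 4 for a -- the array has doubled in size

Under the proposed rule that array types take precedence over Python scalars, b would keep typecode 'f'.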
FORTRAN has made an attempt to address this issue in FORTRAN 9X by allowing the user to indicate the range and precision of the calculation. > PB> In a future version of Python, the behavior of integer division > PB> will change. The operands will be converted to floats, so the > PB> result will be a float. If we implement the proposed scalar > PB> coercion rules where arrays have precedence over Python scalars, > PB> then dividing an array by an integer will return an integer array > PB> and will not be consistent with a future version of Python which > PB> would return an array of type double. Scientific programmers are > PB> familiar with the distinction between integer and float-point > PB> division, so should Numeric 2 continue with this behavior? > > Numeric 2 should be as compatible as reasonably possible with core python. > But my question is: how would we do integer division of arrays? A ufunc > for which no operator shortcut exists? I don't understand either question. We have devised a scheme where there are two sets of coercion rules. One for coercion between array types, and one for array and Python scalar types. This latter set of rules can either have higher precedence for array types or Python scalar types. We favor array types having precedence. A more complex set of coercion rules is also possible, if you prefer. > PB> 7. How are numerical errors handled (IEEE floating-point errors in > PB> particular)? > > I am developing my code on Linux and IRIX. I have seen that where > Numeric code on Linux runs fine, the same code on IRIX may "core dump" > on a FPE (e.g. arctan2(0,0)). That difference should be avoided. > > PB> a. Print a message of the most severe error, leaving it to > PB> the user to locate the errors. > > What is the most severe error? Well, divide by zero and overflow come to mind. Underflows are often considered less severe. Yet this is up to you to decide. > PB> c. Minimall UFunc class: > > Typo: Minimal? Got it! Thanks for your comments. I-obviously-have-a-lot-of-explaining-to-do-ly yours, Paul -- Dr. Paul Barrett Space Telescope Science Institute Phone: 410-338-4475 ESS/Science Software Group FAX: 410-338-4767 Baltimore, MD 21218 From hinsen at cnrs-orleans.fr Wed Feb 14 13:03:20 2001 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Wed, 14 Feb 2001 19:03:20 +0100 Subject: [Numpy-discussion] PEP 209: Multi-dimensional Arrays In-Reply-To: <14985.22458.685587.538866@nem-srvr.stsci.edu> (message from Paul Barrett on Tue, 13 Feb 2001 10:52:18 -0500 (EST)) References: <14985.22458.685587.538866@nem-srvr.stsci.edu> Message-ID: <200102141803.TAA20224@chinon.cnrs-orleans.fr> > Design and Implementation Some parts of this look a bit imprecise and I don't claim to understand them. For example: > Its relation to the other types is defined when the C-extension > module for that type is imported. The corresponding Python code > is: > > > Int32.astype[Real64] = Real64 > > This says that the Real64 array-type has higher priority than the > Int32 array-type. I'd choose a clearer name than "astype" for this, but that's a minor detail. More important is how this is supposed to work. Suppose that in Int32 you say that Real64 has higher priority, and in Real64 you say that Int32 has higher priority. Would this raise an exception, and if so, when? Perhaps the coercion question should be treated in a separate PEP that also covers standard Python types and provides a mechanism that any type implementer can use. 
I could think of a number of cases where I have wished I could define coercions between my own and some other types properly. > 3. Array: > > This class contains information about the array, such as shape, > type, endian-ness of the data, etc.. Its operators, '+', '-', What about the data itself? > 4. ArrayView > > This class is similar to the Array class except that the reshape > and flat methods will raise exceptions, since non-contiguous There are no reshape and flat methods in this proposal... > 1. Does slicing syntax default to copy or view behavior? > > The default behavior of Python is to return a copy of a sub-list > or tuple when slicing syntax is used, whereas Numeric 1 returns a > view into the array. The choice made for Numeric 1 is apparently > for reasons of performance: the developers wish to avoid the Yes, performance was the main reason. But there is another one: if slicing returns a view, you can make a copy based on it, but if slicing returns a copy, there's no way to make a view. So if you change this, you must provide some other way to generate a view, and please keep the syntax simple (there are many practical cases where a view is required). > In this case the performance penalty associated with copy behavior > can be minimized by implementing copy-on-write. This scheme has Indeed, that's what most APL implementations do. > data buffer is made. View behavior would then be implemented by > an ArrayView class, whose behavior be similar to Numeric 1 arrays, So users would have to write something like ArrayView(array, indices) That looks a bit cumbersome, and any straightforward way to write the indices is illegal according to the current syntax rules. > 2. Does item syntax default to copy or view behavior? If compatibility with lists is a criterion at all, then I'd apply it consistently and use view semantics. Otherwise let's forget about lists and discuss 1. and 2. from a purely array-oriented point of view. And then I'd argue that view semantics is more frequent and should thus be the default for both slicing and item extraction. > 3. How is scalar coercion implemented? The old discussion again... > annoying, particularly for very large arrays. We prefer that the > array type trumps the python type for the same type class, namely That is a completely arbitrary rule from any but the "large array performance" point of view. And it's against the Principle of Least Surprise. Now that we have the PEP procedure for proposing any change whatsoever, why not lobby for the addition of a float scalar type to Python, with its own syntax for constants? That looks like the best solution from everybody's point of view. > 4. How is integer division handled? > > In a future version of Python, the behavior of integer division > will change. The operands will be converted to floats, so the Has that been decided already? > 7. How are numerical errors handled (IEEE floating-point errors in > particular)? > > It is not clear to the proposers (Paul Barrett and Travis > Oliphant) what is the best or preferred way of handling errors. > Since most of the C functions that do the operation, iterate over > the inner-most (last) dimension of the array. This dimension > could contain a thousand or more items having one or more errors > of differing type, such as divide-by-zero, underflow, and > overflow. Additionally, keeping track of these errors may come at > the expense of performance. Therefore, we suggest several > options: I'd like to add another one: e. 
Keep some statistics about the errors that occur during the operation, and if at the end the error count is > 0, raise an exception containing as much useful information as possible. I would certainly not want any Python program to *print* anything unless I have explicitly told it to do so. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From hinsen at cnrs-orleans.fr Wed Feb 14 13:09:59 2001 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Wed, 14 Feb 2001 19:09:59 +0100 Subject: [Numpy-discussion] PEP 209: Multi-dimensional Arrays In-Reply-To: <14986.14060.238048.161366@temoleh.chem.uu.nl> (rob@hooft.net) References: <14985.22458.685587.538866@nem-srvr.stsci.edu> <14986.14060.238048.161366@temoleh.chem.uu.nl> Message-ID: <200102141809.TAA20233@chinon.cnrs-orleans.fr> > Being a scientist, I have learned that when you multiply a very accurate > number with a very approximate number, your result is going to be very > approximate, not very accurate! It would thus be more logical to have > Float32*Float64 return a Float32! Accuracy is not the right concept, but storage capacity. A Float64 can store any value that can be stored in a Float32, but the inverse is not true. Accuracy is not a property of a number, but of a value and its representation in the computer. The float value "1." can be perfectly accurate, even in 32 bits, or it can be an approximation for 1.-1.e-50, which cannot be represented precisely. BTW, Float64 also has a larger range of magnitudes than Float32, not just more significant digits. > Numeric 2 should be as compatible as reasonably possible with core python. > But my question is: how would we do integer division of arrays? A ufunc > for which no operator shortcut exists? Sounds fine. On the other hand, if and when Python's integer division behaviour is changed, there will be some new syntax for integer division, which should then also work on arrays. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From rob at hooft.net Wed Feb 14 16:17:18 2001 From: rob at hooft.net (Rob W. W. Hooft) Date: Wed, 14 Feb 2001 22:17:18 +0100 Subject: [Numpy-discussion] PEP 209: Multi-dimensional Arrays In-Reply-To: <14986.43001.213738.708354@nem-srvr.stsci.edu> References: <14985.22458.685587.538866@nem-srvr.stsci.edu> <14986.14060.238048.161366@temoleh.chem.uu.nl> <14986.43001.213738.708354@nem-srvr.stsci.edu> Message-ID: <14986.62942.460585.961514@temoleh.chem.uu.nl> >>>>> "PB" == Paul Barrett writes: PB> By way of bootstrapping, only one predefined type need be known, PB> say, Int32. The operations associated with this type can only be PB> Int32 operations, because this is the only type it knows about. PB> Yet, we can add another type, say Real64, which has not only PB> Real64 operations, BUT also Int32 and Real64 mixed operations, PB> since it knows about Int32. 
The Real64 type provides the PB> necessary information to relate the Int32 and Int64 types. Let's PB> now add a third type, then a fourth, etc., each knowing about its PB> predecessor types but not its successors. PB> This approach is identical to the way core Python adds new PB> classes or C-extension types, so this is nothing new. The PB> current types do not know about the new type, but the new type PB> knows about them. As long as one type knows the relationship PB> between the two that is sufficient for the scheme to work. Yuck. I'm thinking how long it would take to load the Int256 class, because it will need to import all other types before defining the relations.... [see below for another idea] PB> Attributes: .name: e.g. "Int32", "Float64", etc. .typecode: PB> e.g. 'i', 'f', etc. (for backward compatibility) >> .typecode() is a method now. PB> Yes, I propose that it become a settable attribute. Then it is not backwards compatible anyway, and you could leave it out. PB> .size (in bytes): e.g. 4, 8, etc. >> "element size?" PB> Yes. I think it should be called like that in that case. I dnt lk abbrvs. size could be misread as the size of the total object. >> >> add.register('add', (Int32, Int32, Int32), cfunc-add) >> >> Typo: cfunc-add is an expression, not an identifier. PB> No, it is a Python object that encompasses and describes a C PB> function that adds two Int32 arrays and returns an Int32 array. I understand that, but in general a "-" in pseudo-code is the minus operator. I'd write cfunc_add instead. >> An implementation of a (Int32, Float32, Float32) add is possible >> and desirable as mentioned earlier in the document. Which C module >> is going to declare such a combination? Now that I re-think this: would it be possible for the type-loader to check for each type that it loads whether a cross-type module is available with a previously loaded type? That way all types can be independent. There would be a Int32 module knowing only Int32 types, and Float32 only knowing Float32 types. Then there would be a Int32Float32 type that handles cross-type functions. When Int32 or Float32 is loaded, the loader can see whether the other has been loaded earlier, and if it is, load the cross-definitions as well. Only problem I can think of is functions linking 3 or more types. PB> asstring(): create string from array >> Not "tostring" like now? PB> This is proposed so as to be a little more consistent with Core PB> Python which uses 'from-' and 'as-' prefixes. But I'm don't have PB> strong opinions either way. PIL uses tostring as well. Anyway, I understand the buffer interface is a nicer way to communicate. PB> 4. ArrayView >> PB> This class is similar to the Array class except that the reshape PB> and flat methods will raise exceptions, since non-contiguous PB> arrays cannot be reshaped or flattened using just pointer and PB> step-size information. >> This was completely unclear to me until here. I must say I find >> this a strange way of handling things. I haven't looked into >> implementation details, but wouldn't it feel more natural if an >> Array would just be the "data", and an ArrayView would contain the >> dimensions and strides. Completely separated. One would always >> need a pair, but more than one ArrayView could use the same Array. PB> In my definition, an Array that has no knowledge of its shape and PB> type is not an Array, it's a data or character buffer. An array PB> in my definition is a data buffer with information on how that PB> buffer is to be mapped, i.e. 
shape, type, etc. An ArrayView is PB> an Array that shares its data buffer with another Array, but may PB> contain a different mapping of that Array, ie. its shape and type PB> are different. PB> If this is what you mean, then the answer is "Yes". This is how PB> we intend to implement Arrays and ArrayViews. No, it is not what I meant. Reading your answer I'd say that I wouldn't see the need for an Array. We only need a data buffer and an ArrayView. If there are two parts of the functionality, it is much cleaner to make the cut in an orthogonal way. PB> B = A.V[:10] or A.view[:10] are some possibilities. B is now an PB> ArrayView class. I hate magic attributes like this. I do not like abbrevs at all. It is not at all obvious what A.T or A.V mean. PB> 2. Does item syntax default to copy or view behavior? >> view. >> PB> Yet, c[i] can be considered just a shorthand for c[i,:] which PB> would imply copy behavior assuming slicing syntax returns a copy. >> If you reason that way, then c is just a shorthand for c[...] >> too. PB> Yes, that is correct, but that is not how Python currently PB> behaves. Current python also doesn't treat c[i] as a shorthand for c[i,:] or c[i,...] Regards, Rob Hooft -- ===== rob at hooft.net http://www.hooft.net/people/rob/ ===== ===== R&D, Nonius BV, Delft http://www.nonius.nl/ ===== ===== PGPid 0xFA19277D ========================== Use Linux! ========= From Barrett at stsci.edu Wed Feb 14 18:05:17 2001 From: Barrett at stsci.edu (Paul Barrett) Date: Wed, 14 Feb 2001 18:05:17 -0500 (EST) Subject: [Numpy-discussion] PEP 209: Multi-dimensional Arrays In-Reply-To: <14986.62942.460585.961514@temoleh.chem.uu.nl> References: <14985.22458.685587.538866@nem-srvr.stsci.edu> <14986.14060.238048.161366@temoleh.chem.uu.nl> <14986.43001.213738.708354@nem-srvr.stsci.edu> <14986.62942.460585.961514@temoleh.chem.uu.nl> Message-ID: <14987.246.29693.379005@nem-srvr.stsci.edu> Rob W. W. Hooft writes: > >>>>> "PB" == Paul Barrett writes: > > PB> By way of bootstrapping, only one predefined type need be known, > PB> say, Int32. The operations associated with this type can only be > PB> Int32 operations, because this is the only type it knows about. > PB> Yet, we can add another type, say Real64, which has not only > PB> Real64 operations, BUT also Int32 and Real64 mixed operations, > PB> since it knows about Int32. The Real64 type provides the > PB> necessary information to relate the Int32 and Int64 types. Let's > PB> now add a third type, then a fourth, etc., each knowing about its > PB> predecessor types but not its successors. > > PB> This approach is identical to the way core Python adds new > PB> classes or C-extension types, so this is nothing new. The > PB> current types do not know about the new type, but the new type > PB> knows about them. As long as one type knows the relationship > PB> between the two that is sufficient for the scheme to work. > > Yuck. I'm thinking how long it would take to load the Int256 class, > because it will need to import all other types before defining the > relations.... [see below for another idea] First, I'm not proposing that we use this method of bootstapping from just one type. I was just demonstrating that it could be done. Users could then create their own types and dynamically add them to the module by the above scheme. Second, I think your making the situation more complex than it really is. It doesn't take that long to initialize the type rules and register the functions, because both arrays are sparsely populated. 
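A toy sketch of the two sparse tables being described may help; the dictionary-based names below are purely illustrative and are not the proposed implementation:

    # coercion table: an entry exists only where a rule has been registered
    coercion = {('Int32', 'Float64'): 'Float64'}

    # function table: mostly diagonal entries, plus any registered mixed-type ones
    functions = {('add', 'Int32', 'Int32'): lambda a, b: a + b,
                 ('add', 'Float64', 'Float64'): lambda a, b: a + b}

    def lookup(op, t1, t2):
        # use a registered off-diagonal entry if it exists; otherwise coerce
        # both operands to the common type and use the diagonal entry
        f = functions.get((op, t1, t2))
        if f is not None:
            return f, t1, t2
        common = coercion.get((t1, t2)) or coercion.get((t2, t1))
        return functions.get((op, common, common)), common, common

    f, ta, tb = lookup('add', 'Int32', 'Float64')   # falls back to the Float64 entry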
If there isn't a rule between two types, you don't have to create a dictionary entry. The size of the coercion table is equal to or less than the number of types, so that's small. The function table is a sparsely populated square array. We just envision populating its diagonal elements and using coercion rules for the empty off-diagonal elements. The point is that if an off-diagonal element is filled, then it will be used. I'll include our proposed implementation in the PEP for clarification. > PB> Attributes: .name: e.g. "Int32", "Float64", etc. .typecode: > PB> e.g. 'i', 'f', etc. (for backward compatibility) > >> .typecode() is a method now. > > PB> Yes, I propose that it become a settable attribute. > > Then it is not backwards compatible anyway, and you could leave it out. I'd like to, but others have strongly objected to leaving out typecodes. > PB> .size (in bytes): e.g. 4, 8, etc. > >> "element size?" > > PB> Yes. > > I think it should be called like that in that case. I dnt lk abbrvs. > size could be misread as the size of the total object. How about item_size? > >> >> add.register('add', (Int32, Int32, Int32), cfunc-add) > >> > >> Typo: cfunc-add is an expression, not an identifier. > > PB> No, it is a Python object that encompasses and describes a C > PB> function that adds two Int32 arrays and returns an Int32 array. > > I understand that, but in general a "-" in pseudo-code is the > minus operator. I'd write cfunc_add instead. Yes. I understand now. > PB> 4. ArrayView > >> > PB> This class is similar to the Array class except that the reshape > PB> and flat methods will raise exceptions, since non-contiguous > PB> arrays cannot be reshaped or flattened using just pointer and > PB> step-size information. > >> This was completely unclear to me until here. I must say I find > >> this a strange way of handling things. I haven't looked into > >> implementation details, but wouldn't it feel more natural if an > >> Array would just be the "data", and an ArrayView would contain the > >> dimensions and strides. Completely separated. One would always > >> need a pair, but more than one ArrayView could use the same Array. > > PB> In my definition, an Array that has no knowledge of its shape and > PB> type is not an Array, it's a data or character buffer. An array > PB> in my definition is a data buffer with information on how that > PB> buffer is to be mapped, i.e. shape, type, etc. An ArrayView is > PB> an Array that shares its data buffer with another Array, but may > PB> contain a different mapping of that Array, ie. its shape and type > PB> are different. > > PB> If this is what you mean, then the answer is "Yes". This is how > PB> we intend to implement Arrays and ArrayViews. > > No, it is not what I meant. Reading your answer I'd say that I wouldn't > see the need for an Array. We only need a data buffer and an ArrayView. > If there are two parts of the functionality, it is much cleaner to make > the cut in an orthogonal way. I just don't see what you are getting at here! What attributes does your Array have, if it doesn't have a shape or type? If Arrays only have view behavior, then yes, there is no need for the ArrayView class. Whereas if Arrays have copy behavior, it might be a good idea to distinguish between an ordinary Array and an ArrayView. An alternative would be to have a view attribute. > PB> B = A.V[:10] or A.view[:10] are some possibilities. B is now an > PB> ArrayView class. > > I hate magic attributes like this. I do not like abbrevs at all.
It is > not at all obvious what A.T or A.V mean. I'm not a fan of them either, but I'm looking for concensus on these issues. > PB> 2. Does item syntax default to copy or view behavior? > >> view. > >> > PB> Yet, c[i] can be considered just a shorthand for c[i,:] which > PB> would imply copy behavior assuming slicing syntax returns a copy. > >> If you reason that way, then c is just a shorthand for c[...] > >> too. > > PB> Yes, that is correct, but that is not how Python currently > PB> behaves. > > Current python also doesn't treat c[i] as a shorthand for c[i,:] or > c[i,...] Because there aren't any multi-dimensional lists in Python, only nested 1-dimensional lists. There is a structural difference. -- Dr. Paul Barrett Space Telescope Science Institute Phone: 410-338-4475 ESS/Science Software Group FAX: 410-338-4767 Baltimore, MD 21218 From Robert.Harrison at pnl.gov Wed Feb 14 18:06:23 2001 From: Robert.Harrison at pnl.gov (Harrison, Robert J) Date: Wed, 14 Feb 2001 15:06:23 -0800 Subject: [Numpy-discussion] PEP 209: Multi-dimensional Arrays Message-ID: <4F638A86A844A148876B65CC635038860E8844@pnlmse16.pnl.gov> Paul Barrett writes: > Rob W. W. Hooft writes: > > Being a scientist, I have learned that when you multiply a > very accurate > > number with a very approximate number, your result is > going to be very > > approximate, not very accurate! It would thus be more > logical to have > > Float32*Float64 return a Float32! > > If numeric precision was all that mattered, then you would be correct. > But numeric range is also important. I would hate to take the chance > of overflowing the above multiplication because I stored the > result as > a Float32, instead of a Float64, even though the Float64 is overkill > in terms of precision. FORTRAN has made an attempt to address this > issue in FORTRAN 9X by allowing the user to indicate the range and > precision of the calculation. > A number in a floating point representation is not necessarily represented inexactly. The discussion of Barrett and Hooft is confusing the distinct concepts of precision and accuracy. Well worth reading is Kahan's scathing critcism of Java's floating-point model, at least some of which relates directly to that of Python or proposals in PEPs 209 and 228. http://www.cs.berkeley.edu/~wkahan/JAVAhurt.pdf See p18 for "definitions" of precision and accuracy. There's a lot more material in the literature, on Kahan's web-site, and the following is an excellent discussion of floating point arithmetic and the IEEE standards. http://cch.loria.fr/documentation/IEEE754/ACM/goldberg.pdf With regard to the treatment of errors: Correct and detailed handling of floating-point exceptions need not impact speed, provided that a mechanism is provided to (en/dis)able each exception. Users not interested in exceptions can simply mask them. I recall relevant prior discussion including constructive comments from Tim Peters. Many modern and efficient numerical algorithms, and also effective debugging of numerical programs that use large datasets, *require* accurate and prompt identification of exceptions. Accurate meaning that the arrays, their indices, the operation, traceback and type of exception must be reported. Delayed reporting of errors is not satisfactory since operations performed in the interim may destroy valuable data, or take a very long time (esp. if many exceptions are being generated). 
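As a small illustration of the point about locating errors: with current Numeric the offending elements have to be hunted down by hand, for example before the operation even runs (a sketch):

    from Numeric import array, equal, nonzero
    num = array([1.0, 2.0, 3.0])
    den = array([4.0, 0.0, 5.0])
    bad = nonzero(equal(den, 0.0))   # indices that would divide by zero: [1]
    # the exception mechanism described above would instead report the array,
    # the indices and the operation automatically when num/den is evaluated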
It is probably unreasonable to ask for more than the capabilities provided by some subset of the still platform dependent optimizing compilers used to implement Python/Numpy, but I don't see why we should have much less. I would encourage the developers of PEPs 209 and 228 to submit their designs for review by a panel of professional numerical analysts (not just numerically literate programmers or scientists). While full IEEE 754 within Python or NumPy may still be just a pipe-dream (for some at least), we can at least take a step closer. Robert Robert Harrison Pacific Northwest National Laboratory Richland, Washington 99352 (509) 375-2037 robert.harrison at pnl.gov From frohne at gci.net Wed Feb 14 21:29:25 2001 From: frohne at gci.net (Ivan Frohne) Date: Wed, 14 Feb 2001 17:29:25 -0900 Subject: [Numpy-discussion] Numeric 2 : Arrays and Floating Point in C# Message-ID: <001401c096f7$2312d860$e498ed18@d4100> Microsoft's new language C# (c-sharp) implements the IEEE-754 floating point standard. There are positive and negative infinities and zeros, NaNs, and arithmetic operations involving these values behave properly. C# also has both multidimensional rectangular and ragged arrays, and combinations thereof. Since a version of Python based on C# will soon be released, (by ActiveState), any Numeric-2 development that doesn't take these accomplishments seriously is in danger of becoming obsolete before it gets documented. The C# language specification is at the web site below (make one line out of it). See, in particular, sections 4.1.5 and 12.1. --Ivan Frohne http://msdn.microsoft.com/library/default.asp?URL=/library/dotnet/csspec/vcl rfcsharpspec_Start.htm From paul at pfdubois.com Wed Feb 14 23:22:32 2001 From: paul at pfdubois.com (Paul F. Dubois) Date: Wed, 14 Feb 2001 20:22:32 -0800 Subject: [Numpy-discussion] Numeric 2 : Arrays and Floating Point in C# In-Reply-To: <001401c096f7$2312d860$e498ed18@d4100> Message-ID: Thank you for pointing this out. I have two questions. 1. Note that we could not reach a consensus about using C++ for future versions, even though C++ is quite aged by now, because of complaints that acceptable (ie, standard-conforming) compilers were not available (a) for free and (b) on all platforms. When would C# likely be able to meet these conditions? 2. Java flunked the Kindergarten test -- it did not like to play with others. Will C# pass it? If I want to use many of the available algorithms, I have to be able to call C and Fortran. The fact that Python itself is implemented in a given language is of almost no value in and of itself. Nobody is going to rewrite Linpack and Spherepack in C# next month. My questions may sound rhetorical, but they are not. Although I have glanced through the C# spec, and am somewhat pleased with it, I do not know the answers to these questions. -----Original Message----- From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-discussion-admin at lists.sourceforge.net]On Behalf Of Ivan Frohne Sent: Wednesday, February 14, 2001 6:29 PM To: Numpy-discussion at lists.sourceforge.net Subject: [Numpy-discussion] Numeric 2 : Arrays and Floating Point in C# Microsoft's new language C# (c-sharp) implements the IEEE-754 floating point standard. There are positive and negative infinities and zeros, NaNs, and arithmetic operations involving these values behave properly. C# also has both multidimensional rectangular and ragged arrays, and combinations thereof. 
Since a version of Python based on C# will soon be released, (by ActiveState), any Numeric-2 development that doesn't take these accomplishments seriously is in danger of becoming obsolete before it gets documented. The C# language specification is at the web site below (make one line out of it). See, in particular, sections 4.1.5 and 12.1. --Ivan Frohne http://msdn.microsoft.com/library/default.asp?URL=/library/dotnet/csspec/vcl rfcsharpspec_Start.htm _______________________________________________ Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net http://lists.sourceforge.net/lists/listinfo/numpy-discussion From frohne at gci.net Thu Feb 15 14:22:08 2001 From: frohne at gci.net (Ivan Frohne) Date: Thu, 15 Feb 2001 10:22:08 -0900 Subject: [Numpy-discussion] Numeric 2 : Arrays and Floating Point in C# References: Message-ID: <002601c09784$d0955a20$5599ed18@d4100> ----- Original Message ----- From: "Paul F. Dubois" To: "Ivan Frohne" ; Sent: Wednesday, February 14, 2001 19:22 Subject: RE: [Numpy-discussion] Numeric 2 : Arrays and Floating Point in C# > Thank you for pointing this out. I have two questions. > > 1. Note that we could not reach a consensus about using C++ for future > versions, even though C++ is quite aged by now, because of complaints that > acceptable (ie, standard-conforming) compilers were not available (a) for > free and (b) on all platforms. When would C# likely be able to meet these > conditions? > > 2. Java flunked the Kindergarten test -- it did not like to play with > others. Will C# pass it? If I want to use many of the available algorithms, > I have to be able to call C and Fortran. The fact that Python itself is > implemented in a given language is of almost no value in and of itself. > Nobody is going to rewrite Linpack and Spherepack in C# next month. > > My questions may sound rhetorical, but they are not. Although I have glanced > through the C# spec, and am somewhat pleased with it, I do not know the > answers to these questions. > Microsoft has a long list of languages which they claim will support C# and the .NET Framework, including C++, Python, Perl, Eiffel, Oberon, Haskell, Smalltalk, and even COBOL. Fortran is conspicuous by its absence on the list, but Fujitsu is doing the COBOL port. Fujitsu and Lahey Fortran are working partners. Or maybe Compaq/Digital has something on the back burner? http://msdn.microsoft.com/net/thirdparty/default.asp#lang http://msdn.microsoft.com/library/default.asp?URL=/library/techart/Interopdo tNET.htm What's encouraging about C# and the .NET Framework is that they appear to have been designed to address some of the more serious shortcomings of JAVA: (0) Many languages will be supported. (1) The C# language specification has been submitted to the international standards body ECMA for standardization. (2) Built-in types (ints, longs, doubles, arrays, etc.) are objects. (3) Unsigned integer types are included. (4) There is full IEEE 754 floating point support. (5) There is native support for multidimensional arrays, not just awkward ragged arrays. (6) Most operators can be overloaded. (7) If you must, pointers are supported. Python supports complex arithmetic out of the box. But to invert a matrix you have to twist yourself into a pretzel. Ivan Frohne From wsryu at fas.harvard.edu Thu Feb 15 22:59:14 2001 From: wsryu at fas.harvard.edu (William Ryu) Date: Thu, 15 Feb 2001 22:59:14 -0500 Subject: [Numpy-discussion] Curve fitting routines? 
In-Reply-To: <002601c09784$d0955a20$5599ed18@d4100> References: Message-ID: <4.3.2.7.2.20010215225445.00b563d8@pop.fas.harvard.edu> An HTML attachment was scrubbed... URL: From pplumlee at omnigon.com Fri Feb 16 00:31:14 2001 From: pplumlee at omnigon.com (Phlip) Date: Thu, 15 Feb 2001 21:31:14 -0800 Subject: [Numpy-discussion] Curve fitting routines? In-Reply-To: <4.3.2.7.2.20010215225445.00b563d8@pop.fas.harvard.edu> References: <4.3.2.7.2.20010215225445.00b563d8@pop.fas.harvard.edu> Message-ID: <01021521311401.11060@cuzco.concentric.net> [Could someone configure this mailing list so ReplyTo goes to the list not the most recent participant?] Proclaimed William Ryu from the mountaintops: > > Was wondering if there is a "standard" library of > curve fitting routines that people use with Numeric Python. I'm > especially interested in high order Bezier curves, but would like to save > some time if someone has put together a good curve fitting > packaging.
The ScientificPython should have this... http://starship.python.net/crew/hinsen/scientific.html ...but it appears the closest it has is Least Squares line fitting, and curved lines in its displays. Maybe you could add your results to it. So much for saving time ;-) -- Phlip phlip_cpp at my-deja.com ============ http://c2.com/cgi/wiki?PhlIp ============ -- Please state the nature of the programming emergency -- From phrxy at csv.warwick.ac.uk Fri Feb 16 03:24:45 2001 From: phrxy at csv.warwick.ac.uk (John J. Lee) Date: Fri, 16 Feb 2001 08:24:45 +0000 (GMT) Subject: [Numpy-discussion] Curve fitting routines? In-Reply-To: <4.3.2.7.2.20010215225445.00b563d8@pop.fas.harvard.edu> Message-ID: On Thu, 15 Feb 2001, William Ryu wrote: > Was wondering if there is a "standard" library of curve fitting routines > that people use with Numeric Python. I'm especially interested in high > order Bezier curves, but would like to save some time if someone has put > together a good curve fitting packaging. There are simple wrappers of minpack (which includes non-linear least squares) and dierckx (splines, I think) libraries in Travis Oliphant's Multipack. Travis is still in the process of moving its homepage ATM I think, but they are available somewhere near here: cens.ioc.ee/cgi-bin/cvsweb/python/multipack/ If you search back in the archives of this list, you'll find a pointer to instructions for how to get them with cvs. John From rob at hooft.net Fri Feb 16 04:18:11 2001 From: rob at hooft.net (Rob W. W. Hooft) Date: Fri, 16 Feb 2001 10:18:11 +0100 Subject: [Numpy-discussion] PEP 209: Multi-dimensional Arrays In-Reply-To: <14987.246.29693.379005@nem-srvr.stsci.edu> References: <14985.22458.685587.538866@nem-srvr.stsci.edu> <14986.14060.238048.161366@temoleh.chem.uu.nl> <14986.43001.213738.708354@nem-srvr.stsci.edu> <14986.62942.460585.961514@temoleh.chem.uu.nl> <14987.246.29693.379005@nem-srvr.stsci.edu> Message-ID: <14988.61523.833334.328664@temoleh.chem.uu.nl> >>>>> "PB" == Paul Barrett writes: PB> .size (in bytes): e.g. 4, 8, etc. >> >> "element size?" PB> How about item_size? OK. >> No, it is not what I meant. Reading your answer I'd say that I >> wouldn't see the need for an Array. We only need a data buffer and >> an ArrayView. If there are two parts of the functionality, it is >> much cleaner to make the cut in an orthogonal way. PB> I just don't see what you are getting at here! What attributes PB> does your Array have, if it doesn't have a shape or type? A piece of memory. It needs nothing more. A buffer[1]. You'd always need an ArrayView. The Arrayview contains information like dimensions, strides, data type, endianness. Making a new _view_ would consist of making a new ArrayView, and pointing its data object to the same data array. Making a new _copy_ would consist of making a new ArrayView, and marking the "copy-on-write" features (however that needs to be implemented, I have never done that. Does it involve weak references?). Different Views on the same data can even have different data types: e.g. character and byte, or even floating point and integer (I am a happy user of the fortran EQUIVALENCE statement that way too). The speed up by re-use of temporary arrays becomes very easy this way too: one can even re-use a floating point data array as integer result if the reference count of both the data array and its (only) view is one. [1] Could the python buffer interface be used as a pre-existing implementation here? Would that make it possible to implement Array.append()? 
I don't always know beforehand how large my numeric arrays will become. Rob -- ===== rob at hooft.net http://www.hooft.net/people/rob/ ===== ===== R&D, Nonius BV, Delft http://www.nonius.nl/ ===== ===== PGPid 0xFA19277D ========================== Use Linux! ========= From Barrett at stsci.edu Fri Feb 16 11:18:53 2001 From: Barrett at stsci.edu (Paul Barrett) Date: Fri, 16 Feb 2001 11:18:53 -0500 (EST) Subject: [Numpy-discussion] PEP 209: Multi-dimensional Arrays In-Reply-To: <14988.61523.833334.328664@temoleh.chem.uu.nl> References: <14985.22458.685587.538866@nem-srvr.stsci.edu> <14986.14060.238048.161366@temoleh.chem.uu.nl> <14986.43001.213738.708354@nem-srvr.stsci.edu> <14986.62942.460585.961514@temoleh.chem.uu.nl> <14987.246.29693.379005@nem-srvr.stsci.edu> <14988.61523.833334.328664@temoleh.chem.uu.nl> Message-ID: <14989.18319.218930.846896@nem-srvr.stsci.edu> Rob W. W. Hooft writes: > > >> No, it is not what I meant. Reading your answer I'd say that I > >> wouldn't see the need for an Array. We only need a data buffer and > >> an ArrayView. If there are two parts of the functionality, it is > >> much cleaner to make the cut in an orthogonal way. > > > PB> I just don't see what you are getting at here! What attributes > PB> does your Array have, if it doesn't have a shape or type? > > A piece of memory. It needs nothing more. A buffer[1]. You'd always > need an ArrayView. The Arrayview contains information like > dimensions, strides, data type, endianness. > > Making a new _view_ would consist of making a new ArrayView, and pointing > its data object to the same data array. > > Making a new _copy_ would consist of making a new ArrayView, and > marking the "copy-on-write" features (however that needs to be > implemented, I have never done that. Does it involve weak > references?). > > Different Views on the same data can even have different data types: > e.g. character and byte, or even floating point and integer (I am > a happy user of the fortran EQUIVALENCE statement that way too). I think our approaches are very similar. It's the meaning that we ascribe to Array and ArrayView that appears to be causing the confusion. Your Array object is our Data object and your ArrayView object is our Array attributes, ie. the information to map/interpret the Data object. We view an Array as being composed of two entities, its attributes and a Data object. And we entirely agree with the above definitions of _view_ and _copy_. But you haven't told us what object associates your Array and ArrayView to make a usable array that can be sliced, diced, and Julian fried. My impression of your slice method would be: slice(Array, ArrayView, slice expression) I'm not too keen on this approach. :-) > The speed up by re-use of temporary arrays becomes very easy this way > too: one can even re-use a floating point data array as integer result > if the reference count of both the data array and its (only) view is > one. Yes! This is our intended implementation. But instead of re-using your Array object, we will be re-using a (data-) buffer object, or a memory-mapped object, or whatever else in which the data is stored. > [1] Could the python buffer interface be used as a pre-existing > implementation here? Would that make it possible to implement > Array.append()? I don't always know beforehand how large my > numeric arrays will become. In a way, yes. I've considered creating an in-memory object that has similar properties to the memory-mapped object (e.g. 
it might have a read-only property), so that the two data objects can be used interchangeably. The in-memory object would replace the string object as a data store, since the string object is meant to be read-only. -- Paul -- Dr. Paul Barrett Space Telescope Science Institute Phone: 410-338-4475 ESS/Science Software Group FAX: 410-338-4767 Baltimore, MD 21218 From clee at gnwy100.wuh.wustl.edu Fri Feb 16 14:32:49 2001 From: clee at gnwy100.wuh.wustl.edu (Christopher Lee) Date: Fri, 16 Feb 2001 13:32:49 -0600 Subject: [Numpy-discussion] configuration ideas Message-ID: <200102161932.NAA16245@gnwy100.wuh.wustl.edu> I am preparing a patch to Numeric 17.3.0 that allows for easier integration of native BLAS/Lapack libraries with Numeric's dot() function and with the LAPACK package. What I would like to know is how/where should I specify build preferences. The current situation is that I have added a config.py to the top directory. Inside this file, python variables like HAVE_CBLAS and/or HAVE_FBLAS control linking and preprocessor flags. By default, the distribution would build w/o the native libraries. Necessary info like library directories, includes and link flags would be listed as well and available to distutils for Numeric and any of it's sub-packages. How does this approach sound? -chris From beausol at hpl.hp.com Fri Feb 16 15:38:10 2001 From: beausol at hpl.hp.com (Raymond Beausoleil) Date: Fri, 16 Feb 2001 12:38:10 -0800 Subject: [Numpy-discussion] configuration ideas In-Reply-To: <200102161932.NAA16245@gnwy100.wuh.wustl.edu> Message-ID: <5.0.2.1.2.20010216123234.00aaa8c8@hplex1.hpl.hp.com> Actually, this approach sounds more convenient than the one I've been using to integrate the Intel native BLAS and (most of) LAPACK provided for Windows. I built my original version(s) using a visual IDE, and I've been fiddling around with the standard distribution to try to get the paths right. Could you please send me your scripts so that I can modify them for my application? = Ray At 01:32 PM 2/16/2001 -0600, Christopher Lee wrote: >I am preparing a patch to Numeric 17.3.0 that allows for easier integration >of native BLAS/Lapack libraries with Numeric's dot() function and with the >LAPACK package. > >What I would like to know is how/where should I specify build preferences. > >The current situation is that I have added a config.py to the top >directory. Inside this file, python variables like HAVE_CBLAS and/or >HAVE_FBLAS control linking and preprocessor flags. By default, the >distribution would build w/o the native libraries. Necessary info like >library directories, includes and link flags would be listed as well and >available to distutils for Numeric and any of it's sub-packages. > >How does this approach sound? > >-chris > >_______________________________________________ >Numpy-discussion mailing list >Numpy-discussion at lists.sourceforge.net >http://lists.sourceforge.net/lists/listinfo/numpy-discussion ============================ Ray Beausoleil Hewlett-Packard Laboratories mailto:beausol at hpl.hp.com 425-883-6648 Office 425-957-4951 Telnet 425-941-2566 Mobile ============================ From phrxy at csv.warwick.ac.uk Fri Feb 16 18:16:07 2001 From: phrxy at csv.warwick.ac.uk (John J. Lee) Date: Fri, 16 Feb 2001 23:16:07 +0000 (GMT) Subject: [Numpy-discussion] Curve fitting routines? In-Reply-To: Message-ID: On Fri, 16 Feb 2001, John J. 
Lee wrote: > On Thu, 15 Feb 2001, William Ryu wrote: > > > Was wondering if there is a "standard" library of curve fitting routines > > that people use with Numeric Python. I'm especially interested in high > > order Bezier curves, but would like to save some time if someone has put > > together a good curve fitting packaging. > > There are simple wrappers of minpack (which includes non-linear least > squares) and dierckx (splines, I think) libraries in Travis Oliphant's > Multipack. Travis is still in the process of moving its homepage ATM I > think, but they are available somewhere near here: > > cens.ioc.ee/cgi-bin/cvsweb/python/multipack/ > > If you search back in the archives of this list, you'll find a pointer to > instructions for how to get them with cvs. Just occurred to me there might be drawing programs out there with bezier fitting routines. I'd been (probably falsely) assuming that everyone here is doing science or engineering (come to think of it, I don't know if there *are* any applications of bezier curves in science). Google is very useful: http://www.google.fr/search?q=bezier+python+fitting&hq=&hl=en&safe=off&csr= http://sketch.sourceforge.net/devnotes.html > Sketch 0.7.4 (December 23rd, 1999) > [...] > Moved more of the curve fitting code for the freehand tool to C to > make it faster. > > [...] > Sketch 0.7.3 (October 17th, 1999) > > A freehand tool. The implementation of the curve fitting is a bit slow > at the moment, because much of the computation is done in Python. > Moving more parts to C should improve performance substantially. OTOH, perhaps you are working on this very program?? John From rob at hooft.net Mon Feb 19 02:50:47 2001 From: rob at hooft.net (Rob W. W. Hooft) Date: Mon, 19 Feb 2001 08:50:47 +0100 Subject: [Numpy-discussion] PEP 209: Multi-dimensional Arrays In-Reply-To: <14989.18319.218930.846896@nem-srvr.stsci.edu> References: <14985.22458.685587.538866@nem-srvr.stsci.edu> <14986.14060.238048.161366@temoleh.chem.uu.nl> <14986.43001.213738.708354@nem-srvr.stsci.edu> <14986.62942.460585.961514@temoleh.chem.uu.nl> <14987.246.29693.379005@nem-srvr.stsci.edu> <14988.61523.833334.328664@temoleh.chem.uu.nl> <14989.18319.218930.846896@nem-srvr.stsci.edu> Message-ID: <14992.53335.696707.589726@temoleh.chem.uu.nl> >>>>> "PB" == Paul Barrett writes: PB> we entirely agree with the above definitions of _view_ and PB> _copy_. But you haven't told us what object associates your PB> Array and ArrayView to make a usable array that can be sliced, PB> diced, and Julian fried. Hm. You know, I am not so deep into the python internals. I am a fairly high-level programmer. Not a CS type, but a chemist.... There might be much of implementation detail that escapes me. But I'm just trying to keep things beautiful (as in Erich Gamma et.al.) I thought an ArrayView would have a pointer to the data array. Like the either like the .data attribute in the Numeric 1 API, or as a python object pointer. PB> My impression of your slice method would be: PB> slice(Array, ArrayView, slice expression) If ArrayView.HasA(Array), that would not be required. Regards, Rob Hooft. -- ===== rob at hooft.net http://www.hooft.net/people/rob/ ===== ===== R&D, Nonius BV, Delft http://www.nonius.nl/ ===== ===== PGPid 0xFA19277D ========================== Use Linux! ========= From sdhyok at email.unc.edu Tue Feb 20 23:04:28 2001 From: sdhyok at email.unc.edu (Daehyok Shin) Date: Tue, 20 Feb 2001 20:04:28 -0800 Subject: [Numpy-discussion] Sparse matrix support? 
References: <14985.22458.685587.538866@nem-srvr.stsci.edu> Message-ID: <017101c09bbb$649d6480$56111918@nc.rr.com> Is there any plan to support sparse matrices in NumPy? Peter From victor at idaccr.org Thu Feb 22 16:47:30 2001 From: victor at idaccr.org (Victor S. Miller) Date: 22 Feb 2001 16:47:30 -0500 Subject: [Numpy-discussion] Handling underflow Message-ID: Is there some way of having calculations which cause underflow automatically set their result to 0.0? For example when I take exp(a), where a is a floating point array. -- Victor S. Miller | " ... Meanwhile, those of us who can compute can hardly victor at idaccr.org | be expected to keep writing papers saying 'I can do the CCR, Princeton, NJ | following useless calculation in 2 seconds', and indeed 08540 USA | what editor would publish them?" -- Oliver Atkin From phrxy at csv.warwick.ac.uk Fri Feb 23 07:38:54 2001 From: phrxy at csv.warwick.ac.uk (John J. Lee) Date: Fri, 23 Feb 2001 12:38:54 +0000 (GMT) Subject: [Numpy-discussion] Handling underflow In-Reply-To: Message-ID: On 22 Feb 2001, Victor S. Miller wrote: > Is there some way of having calculations which cause underflow > automatically set their result to 0.0? For example when I take > exp(a), where a is a floating point array. Not 'automatically', but: a = whatever() a = choose(greater(a, MAX), (a, MAX)) answer = exp(-a) any good? This is with a 1D array -- I haven't used higher dimensions much. John From phrxy at csv.warwick.ac.uk Fri Feb 23 10:10:11 2001 From: phrxy at csv.warwick.ac.uk (John J. Lee) Date: Fri, 23 Feb 2001 15:10:11 +0000 (GMT) Subject: [Numpy-discussion] checking identity of arrays? In-Reply-To: Message-ID: I must be missing something obvious: how does one check if two variables refer to the same array object? John From kern at its.caltech.edu Fri Feb 23 10:35:51 2001 From: kern at its.caltech.edu (Robert Kern) Date: Fri, 23 Feb 2001 07:35:51 -0800 (PST) Subject: [Numpy-discussion] checking identity of arrays? In-Reply-To: Message-ID: On Fri, 23 Feb 2001, John J. Lee wrote: > I must be missing something obvious: how does one check if two variables > refer to the same array object? Python's id() builtin function? > John -- Robert Kern kern at caltech.edu "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From paul at pfdubois.com Fri Feb 23 10:35:13 2001 From: paul at pfdubois.com (Paul F. Dubois) Date: Fri, 23 Feb 2001 07:35:13 -0800 Subject: [Numpy-discussion] checking identity of arrays? In-Reply-To: Message-ID: if a is b: ... -----Original Message----- From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-discussion-admin at lists.sourceforge.net]On Behalf Of John J. Lee Sent: Friday, February 23, 2001 7:10 AM To: numpy-discussion at lists.sourceforge.net Subject: [Numpy-discussion] checking identity of arrays? I must be missing something obvious: how does one check if two variables refer to the same array object? John _______________________________________________ Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net http://lists.sourceforge.net/lists/listinfo/numpy-discussion From rlw at stsci.edu Fri Feb 23 10:51:11 2001 From: rlw at stsci.edu (rlw at stsci.edu) Date: Fri, 23 Feb 2001 10:51:11 -0500 (EST) Subject: [Numpy-discussion] checking identity of arrays? Message-ID: <200102231551.KAA17392@sundog.stsci.edu> John J. Lee: >I must be missing something obvious: how does one check if two variables >refer to the same array object?
Paul Dubois: >if a is b: I thought he was asking something different. Suppose I do this: a = zeros(100) b = a[10:20] Now b is a view of a's data. Is there any way to test that a and b refer to the same data? (Even if this was not John's question, I'm curious to know the answer.) From vanandel at atd.ucar.edu Fri Feb 23 15:02:40 2001 From: vanandel at atd.ucar.edu (Joe Van Andel) Date: Fri, 23 Feb 2001 13:02:40 -0700 Subject: [Numpy-discussion] Threading, multi-processors, and Numeric Python Message-ID: <3A96C1E0.1497E41E@atd.ucar.edu> I've written Numeric Python code (with Python 1.5.2) to analyze weather radar data. In an attempt to speed up this code, I used threads to perform some of the computations. I'm running on a dual processor Linux machine running 2.4.1 with SMP enabled. I'm using Numeric 17.3.0 with Python 1.5.2. When I run the threaded code, and monitor the system with 'top', 1 processor spends much of its time idle, and I rarely see two copies of my 'compute' thread executing. Each thread is computing its results from different arrays. However, all arrays are referenced from the same dictionary. Any ideas on how to get both threads computing at the same time? Thanks for your help! -- Joe VanAndel National Center for Atmospheric Research http://www.atd.ucar.edu/~vanandel/ Internet: vanandel at ucar.edu From vanandel at atd.ucar.edu Fri Feb 23 17:18:35 2001 From: vanandel at atd.ucar.edu (Joe Van Andel) Date: Fri, 23 Feb 2001 15:18:35 -0700 Subject: [Numpy-discussion] Threading, multi-processors, and Numeric Python References: <3A96C1E0.1497E41E@atd.ucar.edu> Message-ID: <3A96E1BA.772C9CD0@atd.ucar.edu> OK, I guess I've just discovered the answer to my own question. [Don't you just hate that! :-) ] My C++ extensions that perform the real calculations needed to be modified to support multiple threads. By adding the Py_BEGIN_ALLOW_THREADS/Py_END_ALLOW_THREADS macros, the Python interpreter knew it was safe to allow other threads to execute during computations, just like it allows other threads to execute during I/O. To accommodate the Python global thread lock, I needed to change my code as follows: my_func() { /* python calls */ Py_BEGIN_ALLOW_THREADS /* computations that don't use python API */ Py_END_ALLOW_THREADS /* python API calls */ } Now, I can see both processors being used. -- Joe VanAndel National Center for Atmospheric Research http://www.atd.ucar.edu/~vanandel/ Internet: vanandel at ucar.edu From crag at arsdigita.com Tue Feb 27 04:03:24 2001 From: crag at arsdigita.com (crag wolfe) Date: Tue, 27 Feb 2001 04:03:24 -0500 Subject: [Numpy-discussion] Trouble importing LinearAlgebra Message-ID: <3A9B6D5B.B8B16598@arsdigita.com> Well, this is a problem that is similar to the one discussed in the "Da Blas" thread in September 2000. My first command which works fine is "from Numeric import *". The problem is when I do "from LinearAlgebra import *" I get " Traceback (most recent call last): File "", line 1, in ? File "/usr/local/lib/python2.0/site-packages/Numeric/LinearAlgebra.py", line 8, in ? import lapack_lite ImportError: /usr/lib/liblapack.so.3: undefined symbol: e_wsfe" I'm using Numeric-17.1.2. I installed the lapack and blas rpms for Red Hat 6.2, which I'm running. I edited the setup.py file in the LALITE directory as indicated. I run "python setup_all.py install" which seems to work fine (and I'm clearing out the build directories in between install attempts). " gcc -shared build/temp.linux-i686-2.0/Src/lapack_litemodule.o -lblas -llapack -o build/lib.linux-i686-2.0/lapack_lite.so" which scrolls by after running the setup_all.py script, compiles without errors. After the above failed, I tried manually running g77 in place of gcc (a hint from the "Da Blas" thread) but then instead of the "undefined symbol: e_wsfe" error I just get a seg fault. Any help to get this package installed is much appreciated. --Crag From Barrett at stsci.edu Tue Feb 27 13:25:26 2001 From: Barrett at stsci.edu (Paul Barrett) Date: Tue, 27 Feb 2001 13:25:26 -0500 (EST) Subject: [Numpy-discussion] Proposed agenda for Numeric Bof at Python 9 Message-ID: <15003.60611.366223.805114@nem-srvr.stsci.edu> Here is the preliminary agenda for the Enhancing Numeric Python BoF at Python 9. We have requested the room for 2 hours (which may not be enough time to discuss these contentious issues). Please let me know if you would like a topic added. -- Paul Enhancing Numeric Python Agenda 1. Behavior (~1 hr) a. Copy behavior for slice and item syntax b. Scalar coercion c. Record semantics d. Enhanced indexing 2. Implementation (~1/2 hr) a. Type versus class approach b. C versus C++ From kern at its.caltech.edu Tue Feb 27 13:36:26 2001 From: kern at its.caltech.edu (Robert Kern) Date: Tue, 27 Feb 2001 10:36:26 -0800 (PST) Subject: [Numpy-discussion] Trouble importing LinearAlgebra In-Reply-To: <3A9B6D5B.B8B16598@arsdigita.com> Message-ID: On Tue, 27 Feb 2001, crag wolfe wrote: > Well, this is a problem that is similar to the one discussed in the "Da > Blas" thread in September 2000. > > My first command which works fine is "from Numeric import *". The > problem is when I do "from LinearAlgebra import *" I get " > Traceback (most recent call last): > File "", line 1, in ? > File > "/usr/local/lib/python2.0/site-packages/Numeric/LinearAlgebra.py", line > 8, in ? > import lapack_lite > ImportError: /usr/lib/liblapack.so.3: undefined symbol: e_wsfe" Try adding 'g2c' to the list of libraries. You may just end up with the same problem as linking with g77, but it's worth a shot. [snip] -- Robert Kern kern at caltech.edu "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter
From hoel at germanlloyd.org Tue Feb 6 04:31:15 2001 From: hoel at germanlloyd.org (Berthold =?iso-8859-1?q?H=F6llmann?=) Date: 06 Feb 2001 10:31:15 +0100 Subject: [Numpy-discussion] more general LAPACK support for NumPy Message-ID:

Hello, from time to time we need an additional linear algebra routine to be available in Python, and I find myself wrapping these functions then. As these are FORTRAN routines, doing this for the Python versions on Solaris (Sun CC/Sun F77), Linux (gcc/g77) and Windows (VC++/Digital VF) becomes nontrivial. Neither f2py nor pyfort provides Win support, and I doubt that automatic generation is useful for many LAPACK routines, especially those that need workspace, because usually we want LWORK to be the optimal size. So my question is: are there other users wrapping LAPACK routines for NumPy? If so, how are you doing it? For the C/FORTRAN wrapping I somehow like the approach used for cfortran.h (see http://www-zeus.desy.de/~burow/cfortran/index.html), but I'm afraid the license is not acceptable. Is anyone aware of a C version of LAPACK besides the f2c version on netlib? I do like the approach used in the ATLAS clapack part, but there are only a very few LAPACK routines handled there.
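For comparison, the LAPACK coverage bundled with Numeric itself is also small: essentially what the LinearAlgebra module exposes through lapack_lite. A short illustration, assuming only a standard Numeric installation:

    # A few of the LAPACK-backed operations available out of the box;
    # routines outside this set currently need a hand-written wrapper.
    from Numeric import array, Float
    import LinearAlgebra

    a = array([[4., 1.], [1., 3.]], Float)
    b = array([1., 2.], Float)

    x = LinearAlgebra.solve_linear_equations(a, b)
    w = LinearAlgebra.eigenvalues(a)
    u, s, vt = LinearAlgebra.singular_value_decomposition(a)
    print x, w, s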
If there is greater need for additional LAPACK routines for NumPy, should we bundle the efforts in (a) developing guidelines for how to write Python wrappers for LAPACK routines and (b) collecting routines provided by different users to provide a hopefully growing support for LAPACK in Python. Greetings Berthold -- email: hoel at GermanLloyd.org ) tel. : +49 (40) 3 61 49 - 73 74 ( C[_] These opinions might be mine, but never those of my employer. From jhauser at ifm.uni-kiel.de Tue Feb 6 07:07:30 2001 From: jhauser at ifm.uni-kiel.de (Janko Hauser) Date: Tue, 6 Feb 2001 13:07:30 +0100 (CET) Subject: [Numpy-discussion] more general LAPACK support for NumPy In-Reply-To: References: Message-ID: <20010206120730.29928.qmail@lisboa.ifm.uni-kiel.de> There is an old binding to the complete CLapack package, which can be found at ftp://dirac.cnrs-orleans.fr/pub/PyLapack.tar.gz It does not seem to have special support for Windows, but one can perhaps start from there. I have tried cfortran ones and the wrapped function signatures become quite long and verbose. I think one important extension would be to make the automatic wrapper generators pyfort and f2py support some windows compilers, although I must admit, I have not looked into them for windows support yet. __Janko From hoel at germanlloyd.org Tue Feb 6 08:55:04 2001 From: hoel at germanlloyd.org (Berthold =?iso-8859-1?q?H=F6llmann?=) Date: 06 Feb 2001 14:55:04 +0100 Subject: [Numpy-discussion] more general LAPACK support for NumPy In-Reply-To: Janko Hauser's message of "Tue, 6 Feb 2001 13:07:30 +0100 (CET)" References: <20010206120730.29928.qmail@lisboa.ifm.uni-kiel.de> Message-ID: OK, I got, compiled and installed PyLapack.tar.gz. It looks quite complete, for LAPACK 1 or 2, but, of course those routines I need are new in LAPACK 3 :-(. It seemes, Doug Heisterkamp used some kind of a script or program to generate the wrapper. Is there anyone who knows, whether this program is avaible anywhere, so it could be extended for LAPACK 3 and Win? Thanks Berthold -- email: hoel at GermanLloyd.org ) tel. : +49 (40) 3 61 49 - 73 74 ( C[_] These opinions might be mine, but never those of my employer. From paul at pfdubois.com Tue Feb 6 10:26:09 2001 From: paul at pfdubois.com (Paul F. Dubois) Date: Tue, 6 Feb 2001 07:26:09 -0800 Subject: [Numpy-discussion] more general LAPACK support for NumPy In-Reply-To: <20010206120730.29928.qmail@lisboa.ifm.uni-kiel.de> Message-ID: Two notes in regard to this thread: 1. I am in progress making Pyfort support Digital Visual Fortran on Windows. 2. Pyfort does have a facility for automatic allocation of work space, at least in the case that the size can be computed using ordinary arithmetic from the sizes of other arguments or other integer arguments. -----Original Message----- From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-discussion-admin at lists.sourceforge.net]On Behalf Of Janko Hauser Sent: Tuesday, February 06, 2001 4:08 AM To: Berthold Hollmann Cc: numpy-discussion at lists.sourceforge.net Subject: Re: [Numpy-discussion] more general LAPACK support for NumPy There is an old binding to the complete CLapack package, which can be found at ftp://dirac.cnrs-orleans.fr/pub/PyLapack.tar.gz It does not seem to have special support for Windows, but one can perhaps start from there. I have tried cfortran ones and the wrapped function signatures become quite long and verbose. 
I think one important extension would be to make the automatic wrapper generators pyfort and f2py support some windows compilers, although I must admit, I have not looked into them for windows support yet. __Janko _______________________________________________ Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net http://lists.sourceforge.net/lists/listinfo/numpy-discussion From hoel at germanlloyd.org Tue Feb 6 10:47:15 2001 From: hoel at germanlloyd.org (Berthold =?iso-8859-1?q?H=F6llmann?=) Date: 06 Feb 2001 16:47:15 +0100 Subject: [Numpy-discussion] more general LAPACK support for NumPy In-Reply-To: "Paul F. Dubois"'s message of "Tue, 6 Feb 2001 07:26:09 -0800" References: Message-ID: "Paul F. Dubois" writes: > Two notes in regard to this thread: > 1. I am in progress making Pyfort support Digital Visual Fortran on > Windows. GREAT > 2. Pyfort does have a facility for automatic allocation of work > space, at least in the case that the size can be computed using > ordinary arithmetic from the sizes of other arguments or other > integer arguments. I know of that, but the optimal workspace size for LAPACK routines is for optimal efficiency. The size can be returned by the routine or by calling the FORTRAN function ILAENV. It would be great, if workspacesize could be made depending on the result of functions. Thanks Berthold -- email: hoel at GermanLloyd.org ) tel. : +49 (40) 3 61 49 - 73 74 ( C[_] These opinions might be mine, but never those of my employer. From hinsen at cnrs-orleans.fr Wed Feb 7 09:16:34 2001 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Wed, 7 Feb 2001 15:16:34 +0100 Subject: [Numpy-discussion] Is this a wheel? In-Reply-To: (message from Jon Saenz on Mon, 29 Jan 2001 15:46:44 +0100 (MET)) References: Message-ID: <200102071416.PAA10170@chinon.cnrs-orleans.fr> > element of a NumPy array. I seeked through the documentation and found the > argmax/argmin functions. However, they must be called recursively to find > the greatest(smallest) element of a multidimendional array. As I needed to You could run it on Numeric.ravel(array) (which shouldn't make a copy), and then reconstruct the multidimensional indices from the single index into the flattened array. The additional overhead should be minimal, and you don't need any C code. Konrad -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From hinsen at cnrs-orleans.fr Wed Feb 7 09:19:02 2001 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Wed, 7 Feb 2001 15:19:02 +0100 Subject: [Numpy-discussion] New packages In-Reply-To: References: Message-ID: <200102071419.PAA10175@chinon.cnrs-orleans.fr> > Well, we talked about it some but didn't want to break existing code. To my > recollection nobody has suggested the trick you suggest here. I think it > would work, although there are cases where people import Precision in a > given module but not Numeric (the numeric objects they deal with get > returned by C or Fortran calls). Anybody see any real downside here? At least not immediately. Importing Numeric involves almost no overhead when you use array-generating modules anyway (they need to import at least multiarray). 
I'll make this modification to my installation and see if I get any bad surprises. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From dpgrote at lbl.gov Wed Feb 7 14:41:48 2001 From: dpgrote at lbl.gov (David P Grote) Date: Wed, 07 Feb 2001 11:41:48 -0800 Subject: [Numpy-discussion] Is this a wheel? References: <200102071416.PAA10170@chinon.cnrs-orleans.fr> Message-ID: <3A81A4FC.70704@lbl.gov> An HTML attachment was scrubbed... URL: From phrxy at csv.warwick.ac.uk Thu Feb 8 03:58:23 2001 From: phrxy at csv.warwick.ac.uk (John J. Lee) Date: Thu, 8 Feb 2001 08:58:23 +0000 (GMT) Subject: [Numpy-discussion] Is this a wheel? In-Reply-To: <3A81A4FC.70704@lbl.gov> Message-ID: On Wed, 7 Feb 2001, David P Grote wrote: > Ravel does make a copy when the array is not contiguous. I asked this > question before but didn't get any response - is there a way to get the > argmax/min or max/min of a non-contiguous multi-dimensional array without > making a contiguous copy? I use python as an interface to fortran code > and so I am constantly dealing with arrays that are not contiguous, i.e. > not with C ordering. Any help is appreciated. Aren't FORTRAN arrays just stored in the reverse order to C? Isn't this just dealt with by having the stride lengths of your Numeric array in the opposite order? Or does FORTRAN sometimes allocate multidimensional arrays with gaps in memory?? I don't see why they should not be contiguous. John From dpgrote at lbl.gov Thu Feb 8 12:39:18 2001 From: dpgrote at lbl.gov (David P Grote) Date: Thu, 08 Feb 2001 09:39:18 -0800 Subject: [Numpy-discussion] Is this a wheel? References: Message-ID: <3A82D9C6.40409@lbl.gov> An HTML attachment was scrubbed... URL: From phrxy at csv.warwick.ac.uk Thu Feb 8 16:01:39 2001 From: phrxy at csv.warwick.ac.uk (John J. Lee) Date: Thu, 8 Feb 2001 21:01:39 +0000 (GMT) Subject: [Numpy-discussion] Is this a wheel? In-Reply-To: <3A82D9C6.40409@lbl.gov> Message-ID: On Thu, 8 Feb 2001, David P Grote wrote: > What I meant by "not contiguous" is that the? Numeric flag "contiguous" > is set to false. This flag is only true when Numeric arrays have their > strides in C ordering. Any rearrangement of the strides causes the flag > to be set to false - a transpose for example. The data in the fortran > arrays is contiguous in memory. Here's an example using ravel. [...] Oh, I see. > Ravel does make a copy when the array is not contiguous. I asked this > question before but didn't get any response - is there a way to get the > argmax/min or max/min of a non-contiguous multi-dimensional array without > making a contiguous copy? I use python as an interface to fortran code > and so I am constantly dealing with arrays that are not contiguous, i.e. > not with C ordering. Any help is appreciated. I don't know about doing it with one of the Numeric functions, but it's very easy to write in C -- just this week I wrote a max() that works on (contiguous or not) Numeric arrays. I think I wrote it as a C function (not callable from Python) for the function I was wrapping to use, but it would be easy to change it to be a proper Python function. I'll mail you a copy if you like. 
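For those who do not want to drop into C at all, one Numeric-level workaround is to reduce the array one axis at a time; each step only allocates the smaller intermediate result rather than a contiguous copy of the whole array. A sketch (not the C routine mentioned above):

    from Numeric import maximum

    def array_max(a):
        # maximum.reduce iterates using the array's strides, so this works on
        # transposed/non-contiguous arrays without an explicit ravel() copy.
        while hasattr(a, 'shape') and len(a.shape) > 0:
            a = maximum.reduce(a)
        return a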
John From Barrett at stsci.edu Fri Feb 9 10:45:50 2001 From: Barrett at stsci.edu (Paul Barrett) Date: Fri, 9 Feb 2001 10:45:50 -0500 (EST) Subject: [Numpy-discussion] A Numerical Python BoF at Python 9 Message-ID: <14980.2832.659186.913578@nem-srvr.stsci.edu> I've been encouraged to set-up a BoF at Python 9 to discuss Numerical Python issues, specifically the design and implemenation of Numeric 2. I'd like to get a head count of those interested in attending such a BoF. So far there are 3 of us at STScI who are interested. -- Dr. Paul Barrett Space Telescope Science Institute Phone: 410-338-4475 ESS/Science Software Group FAX: 410-338-4767 Baltimore, MD 21218 From eq3pvl at eq.uc.pt Fri Feb 9 11:14:46 2001 From: eq3pvl at eq.uc.pt (Pedro Vale Lima) Date: Fri, 09 Feb 2001 16:14:46 +0000 Subject: [Numpy-discussion] Travis Oliphant optimization.py References: <14980.2832.659186.913578@nem-srvr.stsci.edu> Message-ID: <3A841776.37AD73F1@eq.uc.pt> Travis Oliphant website seems to be with problems (maybe the starship virus). I wanted to download his optimization.py. Could pleaseTravis or someone else mail me that routine. thank you pedro lima From nwagner at isd.uni-stuttgart.de Fri Feb 9 11:51:19 2001 From: nwagner at isd.uni-stuttgart.de (Nils Wagner) Date: Fri, 09 Feb 2001 17:51:19 +0100 Subject: [Numpy-discussion] Jordan's normal form of matrices Message-ID: <3A842007.A93BB722@isd.uni-stuttgart.de> Hi, I am looking for a program to calculate the Jordan normal form of a real or complex matrix. Thanks in advance. Nils Wagner From jhauser at ifm.uni-kiel.de Fri Feb 9 18:08:46 2001 From: jhauser at ifm.uni-kiel.de (Janko Hauser) Date: Sat, 10 Feb 2001 00:08:46 +0100 (CET) Subject: [Numpy-discussion] A Numerical Python BoF at Python 9 In-Reply-To: <14980.2832.659186.913578@nem-srvr.stsci.edu> References: <14980.2832.659186.913578@nem-srvr.stsci.edu> Message-ID: <20010209230846.1655.qmail@lisboa.ifm.uni-kiel.de> May I suggest that you repost your PEP to the matrix-sig. This PEP is more fleshed out, than the last mails from Travis, regarding Numeric2. IMHO, __Janko Paul Barrett writes: > > I've been encouraged to set-up a BoF at Python 9 to discuss Numerical > Python issues, specifically the design and implemenation of Numeric 2. > I'd like to get a head count of those interested in attending such a > BoF. So far there are 3 of us at STScI who are interested. > > -- > Dr. Paul Barrett Space Telescope Science Institute > Phone: 410-338-4475 ESS/Science Software Group > FAX: 410-338-4767 Baltimore, MD 21218 > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > http://lists.sourceforge.net/lists/listinfo/numpy-discussion From gillet at scripps.edu Fri Feb 9 20:37:54 2001 From: gillet at scripps.edu (Alexandre Gillet) Date: Fri, 09 Feb 2001 17:37:54 -0800 Subject: [Numpy-discussion] problem with numeric array on window. Message-ID: <3A849B72.28AEAA22@scripps.edu> Hi all, We are having problems when creating numeric arrays in C extensions under windows. We narrowed the problem down to a very simple example (shown below) where we simply allocate some memory, create a Numeric array using PyArrayFromDimsAndData. If we call that function as soon as we delete the returned array the python interpreter crashes as it tries to free self->data. If we do not the the OWN_DATA flag in the array it works fine but we have a memory leak. We tried this using both release and Debug versions of Python1.5.2, and Numeric 17.1.1. 
I have been using this mechanism under Unix for a long time and have not had this problem before ! Using the same extension and test under unix works fine. Has something changed ? Any help is welcome ... Thank you I join the two files we are using. bug.c : a C module that create an array numeric with the flags set to own_data, so when the array is garbage collected, the memory is free. ################# bug.c##################################################### #ifdef WIN32 #include #include #endif #include "Python.h" #include "arrayobject.h" static PyObject* createArray(PyObject* self, PyObject* args) { int *dims; float *data; PyArrayObject *out; dims = (int *)malloc(2 * sizeof(int)); dims[0] = 500; dims[1] = 2; data = (float *)malloc(100000 * sizeof(float)); out = (PyArrayObject *)PyArray_FromDimsAndData(2, dims, PyArray_FLOAT, (char *)data); if (!out) { PyErr_SetString(PyExc_RuntimeError, "Failed to allocate memory for normals"); return NULL; } out->flags |= OWN_DATA; /*so we'll free this memory when this array will be garbage collected */ return (PyObject *)out; } static PyMethodDef bug_methods[] = { {"createArray", createArray, 1}, {NULL, NULL} /* Sentinel */ }; static char bug_documentation[] = "No Doc"; #ifdef WIN32 extern __declspec(dllexport) #endif void initbug() { PyObject *m, *d; m = Py_InitModule4("bug", bug_methods, bug_documentation, (PyObject *)NULL, PYTHON_API_VERSION); d = PyModule_GetDict(m); import_array(); if (PyErr_Occurred()) Py_FatalError("can't initialize module bug"); } ##################################################################### ###############testbug.py############################################ import bug ar = bug.createArray() del ar # fatal if ar owns the data member ##################################################################### -- ********************************** Alexandre Gillet The Scripps Research Institute, tel: (858) 784-9557 Dept. Molecular Biology, MB-5, fax: (858) 784-2860 10550 North Torrey Pines Road, email: gillet at scripps.edu La Jolla, CA 92037-1000, USA. From Barrett at stsci.edu Tue Feb 13 10:52:18 2001 From: Barrett at stsci.edu (Paul Barrett) Date: Tue, 13 Feb 2001 10:52:18 -0500 (EST) Subject: [Numpy-discussion] PEP 209: Multi-dimensional Arrays Message-ID: <14985.22458.685587.538866@nem-srvr.stsci.edu> The first draft of PEP 209: Multi-dimensional Arrays is ready for comment. It's primary emphasis is aimed at array operations, but its design is intended to provide a general framework for working with multi-dimensional arrays. This PEP covers a lot of ground and so does not go into much detail at this stage. The hope is that we can fill them in as time goes on. It also presents several Open Issues that need to be discussed. Cheers, Paul -- Dr. Paul Barrett Space Telescope Science Institute Phone: 410-338-4475 ESS/Science Software Group FAX: 410-338-4767 Baltimore, MD 21218 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ PEP: 209 Title: Multi-dimensional Arrays Version: Author: barrett at stsci.edu (Paul Barrett), oliphant at ee.byu.edu (Travis Oliphant) Python-Version: 2.2 Status: Draft Type: Standards Track Created: 03-Jan-2001 Post-History: Abstract This PEP proposes a redesign and re-implementation of the multi- dimensional array module, Numeric, to make it easier to add new features and functionality to the module. Aspects of Numeric 2 that will receive special attention are efficient access to arrays exceeding a gigabyte in size and composed of inhomogeneous data structures or records. 
The proposed design uses four Python classes: ArrayType, UFunc, Array, and ArrayView; and a low-level C-extension module, _ufunc, to handle the array operations efficiently. In addition, each array type has its own C-extension module which defines the coercion rules, operations, and methods for that type. This design enables new types, features, and functionality to be added in a modular fashion. The new version will introduce some incompatibilities with the current Numeric. Motivation Multi-dimensional arrays are commonly used to store and manipulate data in science, engineering, and computing. Python currently has an extension module, named Numeric (henceforth called Numeric 1), which provides a satisfactory set of functionality for users manipulating homogeneous arrays of data of moderate size (of order 10 MB). For access to larger arrays (of order 100 MB or more) of possibly inhomogeneous data, the implementation of Numeric 1 is inefficient and cumbersome. In the future, requests by the Numerical Python community for additional functionality is also likely as PEPs 211: Adding New Linear Operators to Python, and 225: Elementwise/Objectwise Operators illustrate. Proposal This proposal recommends a re-design and re-implementation of Numeric 1, henceforth called Numeric 2, which will enable new types, features, and functionality to be added in an easy and modular manner. The initial design of Numeric 2 should focus on providing a generic framework for manipulating arrays of various types and should enable a straightforward mechanism for adding new array types and UFuncs. Functional methods that are more specific to various disciplines can then be layered on top of this core. This new module will still be called Numeric and most of the behavior found in Numeric 1 will be preserved. The proposed design uses four Python classes: ArrayType, UFunc, Array, and ArrayView; and a low-level C-extension module to handle the array operations efficiently. In addition, each array type has its own C-extension module which defines the coercion rules, operations, and methods for that type. At a later date, when core functionality is stable, some Python classes can be converted to C-extension types. Some planned features are: 1. Improved memory usage This feature is particularly important when handling large arrays and can produce significant improvements in performance as well as memory usage. We have identified several areas where memory usage can be improved: a. Use a local coercion model Instead of using Python's global coercion model which creates temporary arrays, Numeric 2, like Numeric 1, will implement a local coercion model as described in PEP 208 which defers the responsibility of coercion to the operator. By using internal buffers, a coercion operation can be done for each array (including output arrays), if necessary, at the time of the operation. Benchmarks [1] have shown that performance is at most degraded only slightly and is improved in cases where the internal buffers are less than the L2 cache size and the processor is under load. To avoid array coercion altogether, C functions having arguments of mixed type are allowed in Numeric 2. b. Avoid creation of temporary arrays In complex array expressions (i.e. having more than one operation), each operation will create a temporary array which will be used and then deleted by the succeeding operation. 
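For example, the expression d = a*b + c is evaluated roughly as:

    from Numeric import ones, Float64

    a = ones((1000, 1000), Float64)
    b = ones((1000, 1000), Float64)
    c = ones((1000, 1000), Float64)

    # d = a*b + c  allocates an unnamed temporary for the product:
    t = a * b        # temporary buffer, reference count 1
    d = t + c        # used once by the addition...
    del t            # ...then thrown away
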
A better approach would be to identify these temporary arrays and reuse their data buffers when possible, namely when the array shape and type are the same as the temporary array being created. This can be done by checking the temparory array's reference count. If it is 1, then it will be deleted once the operation is done and is a candidate for reuse. c. Optional use of memory-mapped files Numeric users sometimes need to access data from very large files or to handle data that is greater than the available memory. Memory-mapped arrays provide a mechanism to do this by storing the data on disk while making it appear to be in memory. Memory- mapped arrays should improve access to all files by eliminating one of two copy steps during a file access. Numeric should be able to access in-memory and memory-mapped arrays transparently. d. Record access In some fields of science, data is stored in files as binary records. For example in astronomy, photon data is stored as a 1 dimensional list of photons in order of arrival time. These records or C-like structures contain information about the detected photon, such as its arrival time, its position on the detector, and its energy. Each field may be of a different type, such as char, int, or float. Such arrays introduce new issues that must be dealt with, in particular byte alignment or byte swapping may need to be performed for the numeric values to be properly accessed (though byte swapping is also an issue for memory mapped data). Numeric 2 is designed to automatically handle alignment and representational issues when data is accessed or operated on. There are two approaches to implementing records; as either a derived array class or a special array type, depending on your point-of- view. We defer this discussion to the Open Issues section. 2. Additional array types Numeric 1 has 11 defined types: char, ubyte, sbyte, short, int, long, float, double, cfloat, cdouble, and object. There are no ushort, uint, or ulong types, nor are there more complex types such as a bit type which is of use to some fields of science and possibly for implementing masked-arrays. The design of Numeric 1 makes the addition of these and other types a difficult and error-prone process. To enable the easy addition (and deletion) of new array types such as a bit type described below, a re-design of Numeric is necessary. a. Bit type The result of a rich comparison between arrays is an array of boolean values. The result can be stored in an array of type char, but this is an unnecessary waste of memory. A better implementation would use a bit or boolean type, compressing the array size by a factor of eight. This is currently being implemented for Numeric 1 (by Travis Oliphant) and should be included in Numeric 2. 3. Enhanced array indexing syntax The extended slicing syntax was added to Python to provide greater flexibility when manipulating Numeric arrays by allowing step-sizes greater than 1. This syntax works well as a shorthand for a list of regularly spaced indices. For those situations where a list of irregularly spaced indices are needed, an enhanced array indexing syntax would allow 1-D arrays to be arguments. 4. Rich comparisons The implementation of PEP 207: Rich Comparisons in Python 2.1 provides additional flexibility when manipulating arrays. We intend to implement this feature in Numeric 2. 5. 
Array broadcasting rules When an operation between a scalar and an array is done, the implied behavior is to create a new array having the same shape as the array operand containing the scalar value. This is called array broadcasting. It also works with arrays of lesser rank, such as vectors. This implicit behavior is implemented in Numeric 1 and will also be implemented in Numeric 2. Design and Implementation The design of Numeric 2 has four primary classes: 1. ArrayType: This is a simple class that describes the fundamental properties of an array-type, e.g. its name, its size in bytes, its coercion relations with respect to other types, etc., e.g. > Int32 = ArrayType('Int32', 4, 'doc-string') Its relation to the other types is defined when the C-extension module for that type is imported. The corresponding Python code is: > Int32.astype[Real64] = Real64 This says that the Real64 array-type has higher priority than the Int32 array-type. The following attributes and methods are proposed for the core implementation. Additional attributes can be added on an individual basis, e.g. .bitsize or .bitstrides for the bit type. Attributes: .name: e.g. "Int32", "Float64", etc. .typecode: e.g. 'i', 'f', etc. (for backward compatibility) .size (in bytes): e.g. 4, 8, etc. .array_rules (mapping): rules between array types .pyobj_rules (mapping): rules between array and python types .doc: documentation string Methods: __init__(): initialization __del__(): destruction __repr__(): representation C-API: This still needs to be fleshed-out. 2. UFunc: This class is the heart of Numeric 2. Its design is similar to that of ArrayType in that the UFunc creates a singleton callable object whose attributes are name, total and input number of arguments, a document string, and an empty CFunc dictionary; e.g. > add = UFunc('add', 3, 2, 'doc-string') When defined the add instance has no C functions associated with it and therefore can do no work. The CFunc dictionary is populated or registerd later when the C-extension module for an array-type is imported. The arguments of the regiser method are: function name, function descriptor, and the CUFunc object. The corresponding Python code is > add.register('add', (Int32, Int32, Int32), cfunc-add) In the initialization function of an array type module, e.g. Int32, there are two C API functions: one to initialize the coercion rules and the other to register the CFunc objects. When an operation is applied to some arrays, the __call__ method is invoked. It gets the type of each array (if the output array is not given, it is created from the coercion rules) and checks the CFunc dictionary for a key that matches the argument types. If it exists the operation is performed immediately, otherwise the coercion rules are used to search for a related operation and set of conversion functions. The __call__ method then invokes a compute method written in C to iterate over slices of each array, namely: > _ufunc.compute(slice, data, func, swap, conv) The 'func' argument is a CFuncObject, while the 'swap' and 'conv' arguments are lists of CFuncObjects for those arrays needing pre- or post-processing, otherwise None is used. The data argument is a list of buffer objects, and the slice argument gives the number of iterations for each dimension along with the buffer offset and step size for each array and each dimension. We have predefined several UFuncs for use by the __call__ method: cast, swap, getobj, and setobj. The cast and swap functions do coercion and byte-swapping, resp. 
and the getobj and setobj functions do coercion between Numeric arrays and Python sequences. The following attributes and methods are proposed for the core implementation. Attributes: .name: e.g. "add", "subtract", etc. .nargs: number of total arguments .iargs: number of input arguments .cfuncs (mapping): the set C functions .doc: documentation string Methods: __init__(): initialization __del__(): destruction __repr__(): representation __call__(): look-up and dispatch method initrule(): initialize coercion rule uninitrule(): uninitialize coercion rule register(): register a CUFunc unregister(): unregister a CUFunc C-API: This still needs to be fleshed-out. 3. Array: This class contains information about the array, such as shape, type, endian-ness of the data, etc.. Its operators, '+', '-', etc. just invoke the corresponding UFunc function, e.g. > def __add__(self, other): > return ufunc.add(self, other) The following attributes, methods, and functions are proposed for the core implementation. Attributes: .shape: shape of the array .format: type of the array .real (only complex): real part of a complex array .imag (only complex): imaginary part of a complex array Methods: __init__(): initialization __del__(): destruction __repr_(): representation __str__(): pretty representation __cmp__(): rich comparison __len__(): __getitem__(): __setitem__(): __getslice__(): __setslice__(): numeric methods: copy(): copy of array aslist(): create list from array asstring(): create string from array Functions: fromlist(): create array from sequence fromstring(): create array from string array(): create array with shape and value concat(): concatenate two arrays resize(): resize array C-API: This still needs to be fleshed-out. 4. ArrayView This class is similar to the Array class except that the reshape and flat methods will raise exceptions, since non-contiguous arrays cannot be reshaped or flattened using just pointer and step-size information. C-API: This still needs to be fleshed-out. 5. C-extension modules: Numeric2 will have several C-extension modules. a. _ufunc: The primary module of this set is the _ufuncmodule.c. The intention of this module is to do the bare minimum, i.e. iterate over arrays using a specified C function. The interface of these functions is the same as Numeric 1, i.e. int (*CFunc)(char *data, int *steps, int repeat, void *func); and their functionality is expected to be the same, i.e. they iterate over the inner-most dimension. The following attributes and methods are proposed for the core implementation. Attibutes: Methods: compute(): C-API: This still needs to be fleshed-out. b. _int32, _real64, etc.: There will also be C-extension modules for each array type, e.g. _int32module.c, _real64module.c, etc. As mentioned previously, when these modules are imported by the UFunc module, they will automatically register their functions and coercion rules. New or improved versions of these modules can be easily implemented and used without affecting the rest of Numeric 2. Open Issues 1. Does slicing syntax default to copy or view behavior? The default behavior of Python is to return a copy of a sub-list or tuple when slicing syntax is used, whereas Numeric 1 returns a view into the array. The choice made for Numeric 1 is apparently for reasons of performance: the developers wish to avoid the penalty of allocating and copying the data buffer during each array operation and feel that the need for a deepcopy of an array to be rare. 
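The difference is easy to demonstrate with current Numeric 1 behavior:

    from Numeric import arange

    a = arange(10)
    b = a[2:5]       # Numeric 1: b is a view into a's data buffer
    b[0] = 99
    print a[2]       # prints 99 -- the slice shares data with a

    l = range(10)
    m = l[2:5]       # core Python: m is a copy
    m[0] = 99
    print l[2]       # prints 2 -- the list is untouched
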
Yet, some have argued that Numeric's slice notation should also have copy behavior to be consistent with Python lists. In this case the performance penalty associated with copy behavior can be minimized by implementing copy-on-write. This scheme has both arrays sharing one data buffer (as in view behavior) until either array is assigned new data at which point a copy of the data buffer is made. View behavior would then be implemented by an ArrayView class, whose behavior be similar to Numeric 1 arrays, i.e. .shape is not settable for non-contiguous arrays. The use of an ArrayView class also makes explicit what type of data the array contains. 2. Does item syntax default to copy or view behavior? A similar question arises with the item syntax. For example, if a = [[0,1,2], [3,4,5]] and b = a[0], then changing b[0] also changes a[0][0], because a[0] is a reference or view of the first row of a. Therefore, if c is a 2-d array, it would appear that c[i] should return a 1-d array which is a view into, instead of a copy of, c for consistency. Yet, c[i] can be considered just a shorthand for c[i,:] which would imply copy behavior assuming slicing syntax returns a copy. Should Numeric 2 behave the same way as lists and return a view or should it return a copy. 3. How is scalar coercion implemented? Python has fewer numeric types than Numeric which can cause coercion problems. For example when multiplying a Python scalar of type float and a Numeric array of type float, the Numeric array is converted to a double, since the Python float type is actually a double. This is often not the desired behavior, since the Numeric array will be doubled in size which is likely to be annoying, particularly for very large arrays. We prefer that the array type trumps the python type for the same type class, namely integer, float, and complex. Therefore an operation between a Python integer and an Int16 (short) array will return an Int16 array. Whereas an operation between a Python float and an Int16 array would return a Float64 (double) array. Operations between two arrays use normal coercion rules. 4. How is integer division handled? In a future version of Python, the behavior of integer division will change. The operands will be converted to floats, so the result will be a float. If we implement the proposed scalar coercion rules where arrays have precedence over Python scalars, then dividing an array by an integer will return an integer array and will not be consistent with a future version of Python which would return an array of type double. Scientific programmers are familiar with the distinction between integer and float-point division, so should Numeric 2 continue with this behavior? 5. How should records be implemented? There are two approaches to implementing records depending on your point-of-view. The first is two divide arrays into separate classes depending on the behavior of their types. For example numeric arrays are one class, strings a second, and records a third, because the range and type of operations of each class differ. As such, a record array is not a new type, but a mechanism for a more flexible form of array. To easily access and manipulate such complex data, the class is comprised of numeric arrays having different byte offsets into the data buffer. For example, one might have a table consisting of an array of Int16, Real32 values. 
Two numeric arrays, one with an offset of 0 bytes and a stride of 6 bytes to be interpeted as Int16, and one with an offset of 2 bytes and a stride of 6 bytes to be interpreted as Real32 would represent the record array. Both numeric arrays would refer to the same data buffer, but have different offset and stride attributes, and a different numeric type. The second approach is to consider a record as one of many array types, albeit with fewer, and possibly different, array operations than for numeric arrays. This approach considers an array type to be a mapping of a fixed-length string. The mapping can either be simple, like integer and floating-point numbers, or complex, like a complex number, a byte string, and a C-structure. The record type effectively merges the struct and Numeric modules into a multi-dimensional struct array. This approach implies certain changes to the array interface. For example, the 'typecode' keyword argument should probably be changed to the more descriptive 'format' keyword. a. How are record semantics defined and implemented? Which ever implementation approach is taken for records, the syntax and semantics of how they are to be accessed and manipulated must be decided, if one wishes to have access to sub-fields of records. In this case, the record type can essentially be considered an inhomogeneous list, like a tuple returned by the unpack method of the struct module; and a 1-d array of records may be interpreted as a 2-d array with the second dimension being the index into the list of fields. This enhanced array semantics makes access to an array of one or more of the fields easy and straightforward. It also allows a user to do array operations on a field in a natural and intuitive way. If we assume that records are implemented as an array type, then last dimension defaults to 0 and can therefore be neglected for arrays comprised of simple types, like numeric. 6. How are masked-arrays implemented? Masked-arrays in Numeric 1 are implemented as a separate array class. With the ability to add new array types to Numeric 2, it is possible that masked-arrays in Numeric 2 could be implemented as a new array type instead of an array class. 7. How are numerical errors handled (IEEE floating-point errors in particular)? It is not clear to the proposers (Paul Barrett and Travis Oliphant) what is the best or preferred way of handling errors. Since most of the C functions that do the operation, iterate over the inner-most (last) dimension of the array. This dimension could contain a thousand or more items having one or more errors of differing type, such as divide-by-zero, underflow, and overflow. Additionally, keeping track of these errors may come at the expense of performance. Therefore, we suggest several options: a. Print a message of the most severe error, leaving it to the user to locate the errors. b. Print a message of all errors that occurred and the number of occurrences, leaving it to the user to locate the errors. c. Print a message of all errors that occurred and a list of where they occurred. d. Or use a hybrid approach, printing only the most severe error, yet keeping track of what and where the errors occurred. This would allow the user to locate the errors while keeping the error message brief. 8. What features are needed to ease the integration of FORTRAN libraries and code? It would be a good idea at this stage to consider how to ease the integration of FORTRAN libraries and user code in Numeric 2. Implementation Steps 1. 
Implement basic UFunc capability a. Minimal Array class: Necessary class attributes and methods, e.g. .shape, .data, .type, etc. b. Minimal ArrayType class: Int32, Real64, Complex64, Char, Object c. Minimall UFunc class: UFunc instantiation, CFunction registration, UFunc call for 1-D arrays including the rules for doing alignment, byte-swapping, and coercion. d. Minimal C-extension module: _UFunc, which does the innermost array loop in C. This step implements whatever is needed to do: 'c = add(a, b)' where a, b, and c are 1-D arrays. It teaches us how to add new UFuncs, to coerce the arrays, to pass the necessary information to a C iterator method and to do the actually computation. 2. Continue enhancing the UFunc iterator and Array class a. Implement some access methods for the Array class: print, repr, getitem, setitem, etc. b. Implement multidimensional arrays c. Implement some of basic Array methods using UFuncs: +, -, *, /, etc. d. Enable UFuncs to use Python sequences. 3. Complete the standard UFunc and Array class behavior a. Implement getslice and setslice behavior b. Work on Array broadcasting rules c. Implement Record type 4. Add additional functionality a. Add more UFuncs b. Implement buffer or mmap access Incompatibilities The following is a list of incompatibilities in behavior between Numeric 1 and Numeric 2. 1. Scalar corcion rules Numeric 1 has single set of coercion rules for array and Python numeric types. This can cause unexpected and annoying problems during the calculation of an array expression. Numeric 2 intends to overcome these problems by having two sets of coercion rules: one for arrays and Python numeric types, and another just for arrays. 2. No savespace attribute The savespace attribute in Numeric 1 makes arrays with this attribute set take precedence over those that do not have it set. Numeric 2 will not have such an attribute and therefore normal array coercion rules will be in effect. 3. Slicing syntax returns a copy The slicing syntax in Numeric 1 returns a view into the original array. The slicing behavior for Numeric 2 will be a copy. You should use the ArrayView class to get a view into an array. 4. Boolean comparisons return a boolean array A comparison between arrays in Numeric 1 results in a Boolean scalar, because of current limitations in Python. The advent of Rich Comparisons in Python 2.1 will allow an array of Booleans to be returned. 5. Type characters are depricated Numeric 2 will have an ArrayType class composed of Type instances, for example Int8, Int16, Int32, and Int for signed integers. The typecode scheme in Numeric 1 will be available for backward compatibility, but will be depricated. Appendices A. Implicit sub-arrays iteration A computer animation is composed of a number of 2-D images or frames of identical shape. By stacking these images into a single block of memory, a 3-D array is created. Yet the operations to be performed are not meant for the entire 3-D array, but on the set of 2-D sub-arrays. In most array languages, each frame has to be extracted, operated on, and then reinserted into the output array using a for-like loop. The J language allows the programmer to perform such operations implicitly by having a rank for the frame and array. By default these ranks will be the same during the creation of the array. It was the intention of the Numeric 1 developers to implement this feature, since it is based on the language J. The Numeric 1 code has the required variables for implementing this behavior, but was never implemented. 
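In current Numeric that iteration has to be written out explicitly, along these lines:

    from Numeric import zeros, Float32

    frames = zeros((16, 64, 64), Float32)   # 16 frames of 64x64 pixels
    out = zeros((16, 64, 64), Float32)

    # Without implicit sub-array iteration, each 2-D frame is handled by an
    # explicit Python loop (the '* 2.0' stands in for a real 2-D operation):
    for i in range(frames.shape[0]):
        out[i] = frames[i] * 2.0
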
We intend to implement implicit sub-array iteration in Numeric 2, if the array broadcasting rules found in Numeric 1 do not fully support this behavior. Copyright This document is placed in the public domain. Related PEPs PEP 207: Rich Comparisons by Guido van Rossum and David Ascher PEP 208: Reworking the Coercion Model by Neil Schemenauer and Marc-Andre' Lemburg PEP 211: Adding New Linear Algebra Operators to Python by Greg Wilson PEP 225: Elementwise/Objectwise Operators by Huaiyu Zhu PEP 228: Reworking Python's Numeric Model by Moshe Zadka References [1] P. Greenfield 2000. private communication. From rob at hooft.net Wed Feb 14 02:42:36 2001 From: rob at hooft.net (Rob W. W. Hooft) Date: Wed, 14 Feb 2001 08:42:36 +0100 Subject: [Numpy-discussion] PEP 209: Multi-dimensional Arrays In-Reply-To: <14985.22458.685587.538866@nem-srvr.stsci.edu> References: <14985.22458.685587.538866@nem-srvr.stsci.edu> Message-ID: <14986.14060.238048.161366@temoleh.chem.uu.nl> Some random PEP talk. >>>>> "PB" == Paul Barrett writes: PB> 2. Additional array types PB> Numeric 1 has 11 defined types: char, ubyte, sbyte, short, int, PB> long, float, double, cfloat, cdouble, and object. There are no PB> ushort, uint, or ulong types, nor are there more complex types PB> such as a bit type which is of use to some fields of science and PB> possibly for implementing masked-arrays. True: I would have had a much easier life with a ushort type. PB> Its relation to the other types is defined when the C-extension PB> module for that type is imported. The corresponding Python code PB> is: >> Int32.astype[Real64] = Real64 I understand this is to be done by the Int32 C extension module. But how does it know about Real64? PB> Attributes: PB> .name: e.g. "Int32", "Float64", etc. PB> .typecode: e.g. 'i', 'f', etc. PB> (for backward compatibility) .typecode() is a method now. PB> .size (in bytes): e.g. 4, 8, etc. "element size?" >> add.register('add', (Int32, Int32, Int32), cfunc-add) Typo: cfunc-add is an expression, not an identifier. An implementation of a (Int32, Float32, Float32) add is possible and desirable as mentioned earlier in the document. Which C module is going to declare such a combination? PB> asstring(): create string from array Not "tostring" like now? PB> 4. ArrayView PB> This class is similar to the Array class except that the reshape PB> and flat methods will raise exceptions, since non-contiguous PB> arrays cannot be reshaped or flattened using just pointer and PB> step-size information. This was completely unclear to me until here. I must say I find this a strange way of handling things. I haven't looked into implementation details, but wouldn't it feel more natural if an Array would just be the "data", and an ArrayView would contain the dimensions and strides. Completely separated. One would always need a pair, but more than one ArrayView could use the same Array. PB> a. _ufunc: PB> 1. Does slicing syntax default to copy or view behavior? Numeric 1 uses slicing for view, and a method for copy. "Feeling" compatible with core python would require copy on rhs, and view on lhs of an assignment. Is that distinction possible? If copy is the default for slicing, how would one make a view? PB> 2. Does item syntax default to copy or view behavior? view. PB> Yet, c[i] can be considered just a shorthand for c[i,:] which PB> would imply copy behavior assuming slicing syntax returns a copy. If you reason that way, then c is just a shorthand for c[...] too. PB> 3. How is scalar coercion implemented? 
PB> Python has fewer numeric types than Numeric which can cause PB> coercion problems. For example when multiplying a Python scalar PB> of type float and a Numeric array of type float, the Numeric array PB> is converted to a double, since the Python float type is actually PB> a double. This is often not the desired behavior, since the PB> Numeric array will be doubled in size which is likely to be PB> annoying, particularly for very large arrays. Sure. That is handled reasonably well by the current Numeric 1. To extend this, I'd like to comment that I have never really understood the philosophy of taking the largest type for coercion in all languages. Being a scientist, I have learned that when you multiply a very accurate number with a very approximate number, your result is going to be very approximate, not very accurate! It would thus be more logical to have Float32*Float64 return a Float32! PB> In a future version of Python, the behavior of integer division PB> will change. The operands will be converted to floats, so the PB> result will be a float. If we implement the proposed scalar PB> coercion rules where arrays have precedence over Python scalars, PB> then dividing an array by an integer will return an integer array PB> and will not be consistent with a future version of Python which PB> would return an array of type double. Scientific programmers are PB> familiar with the distinction between integer and float-point PB> division, so should Numeric 2 continue with this behavior? Numeric 2 should be as compatible as reasonably possible with core python. But my question is: how would we do integer division of arrays? A ufunc for which no operator shortcut exists? PB> 7. How are numerical errors handled (IEEE floating-point errors in PB> particular)? I am developing my code on Linux and IRIX. I have seen that where Numeric code on Linux runs fine, the same code on IRIX may "core dump" on a FPE (e.g. arctan2(0,0)). That difference should be avoided. PB> a. Print a message of the most severe error, leaving it to PB> the user to locate the errors. What is the most severe error? PB> c. Minimall UFunc class: Typo: Minimal? Regards, Rob Hooft -- ===== rob at hooft.net http://www.hooft.net/people/rob/ ===== ===== R&D, Nonius BV, Delft http://www.nonius.nl/ ===== ===== PGPid 0xFA19277D ========================== Use Linux! ========= From Barrett at stsci.edu Wed Feb 14 12:09:45 2001 From: Barrett at stsci.edu (Paul Barrett) Date: Wed, 14 Feb 2001 12:09:45 -0500 (EST) Subject: [Numpy-discussion] PEP 209: Multi-dimensional Arrays In-Reply-To: <14986.14060.238048.161366@temoleh.chem.uu.nl> References: <14985.22458.685587.538866@nem-srvr.stsci.edu> <14986.14060.238048.161366@temoleh.chem.uu.nl> Message-ID: <14986.43001.213738.708354@nem-srvr.stsci.edu> Rob W. W. Hooft writes: > Some random PEP talk. > > >>>>> "PB" == Paul Barrett writes: > > PB> Its relation to the other types is defined when the C-extension > PB> module for that type is imported. The corresponding Python code > PB> is: > > >> Int32.astype[Real64] = Real64 > > I understand this is to be done by the Int32 C extension module. > But how does it know about Real64? This approach assumes that there are a basic set of predefined types. In the above example, the Real64 type is one of them. But let's consider creating a completely new type, say Real128. This type knows its relation to the other previously defined types, namely Real32, Real64, etc., but they do not know their relationship to it. 
That's still OK, because the Real128 type is imbued with this required information and is willing to share it with the other types. By way of bootstrapping, only one predefined type need be known, say, Int32. The operations associated with this type can only be Int32 operations, because this is the only type it knows about. Yet, we can add another type, say Real64, which has not only Real64 operations, BUT also Int32 and Real64 mixed operations, since it knows about Int32. The Real64 type provides the necessary information to relate the Int32 and Int64 types. Let's now add a third type, then a fourth, etc., each knowing about its predecessor types but not its successors. This approach is identical to the way core Python adds new classes or C-extension types, so this is nothing new. The current types do not know about the new type, but the new type knows about them. As long as one type knows the relationship between the two that is sufficient for the scheme to work. > PB> Attributes: > PB> .name: e.g. "Int32", "Float64", etc. > PB> .typecode: e.g. 'i', 'f', etc. > PB> (for backward compatibility) > > .typecode() is a method now. Yes, I propose that it become a settable attribute. > PB> .size (in bytes): e.g. 4, 8, etc. > > "element size?" Yes. > >> add.register('add', (Int32, Int32, Int32), cfunc-add) > > Typo: cfunc-add is an expression, not an identifier. No, it is a Python object that encompasses and describes a C function that adds two Int32 arrays and returns an Int32 array. It is essentially a Python wrapper of a C-function UFunc. It has been suggested that you should also be able to register Python expressions using the same interface. > An implementation of a (Int32, Float32, Float32) add is possible and > desirable as mentioned earlier in the document. Which C module is > going to declare such a combination? > > PB> asstring(): create string from array > > Not "tostring" like now? This is proposed so as to be a little more consistent with Core Python which uses 'from-' and 'as-' prefixes. But I'm don't have strong opinions either way. > PB> 4. ArrayView > > PB> This class is similar to the Array class except that the reshape > PB> and flat methods will raise exceptions, since non-contiguous > PB> arrays cannot be reshaped or flattened using just pointer and > PB> step-size information. > > This was completely unclear to me until here. I must say I find this a > strange way of handling things. I haven't looked into implementation > details, but wouldn't it feel more natural if an Array would just be > the "data", and an ArrayView would contain the dimensions and > strides. Completely separated. One would always need a pair, but more > than one ArrayView could use the same Array. In my definition, an Array that has no knowledge of its shape and type is not an Array, it's a data or character buffer. An array in my definition is a data buffer with information on how that buffer is to be mapped, i.e. shape, type, etc. An ArrayView is an Array that shares its data buffer with another Array, but may contain a different mapping of that Array, ie. its shape and type are different. If this is what you mean, then the answer is "Yes". This is how we intend to implement Arrays and ArrayViews. > PB> a. _ufunc: > > PB> 1. Does slicing syntax default to copy or view behavior? > > Numeric 1 uses slicing for view, and a method for copy. "Feeling" > compatible with core python would require copy on rhs, and view on lhs > of an assignment. Is that distinction possible? 
> > If copy is the default for slicing, how would one make a view? B = A.V[:10] or A.view[:10] are some possibilities. B is now an ArrayView class. > PB> 2. Does item syntax default to copy or view behavior? > > view. > > PB> Yet, c[i] can be considered just a shorthand for c[i,:] which > PB> would imply copy behavior assuming slicing syntax returns a copy. > > If you reason that way, then c is just a shorthand for c[...] too. Yes, that is correct, but that is not how Python currently behaves. The motivation for these questions is consistency with core Python behavior. The current Numeric does not follow this pattern for reasons of performance. If we assume performance is NOT an issue (ie. we can get similar performance by using various tricks), then what behavior is more intuitive for the average, and novice, user? > PB> 3. How is scalar coercion implemented? > > PB> Python has fewer numeric types than Numeric which can cause > PB> coercion problems. For example when multiplying a Python scalar > PB> of type float and a Numeric array of type float, the Numeric array > PB> is converted to a double, since the Python float type is actually > PB> a double. This is often not the desired behavior, since the > PB> Numeric array will be doubled in size which is likely to be > PB> annoying, particularly for very large arrays. > > Sure. That is handled reasonably well by the current Numeric 1. > > To extend this, I'd like to comment that I have never really understood > the philosophy of taking the largest type for coercion in all languages. > Being a scientist, I have learned that when you multiply a very accurate > number with a very approximate number, your result is going to be very > approximate, not very accurate! It would thus be more logical to have > Float32*Float64 return a Float32! If numeric precision was all that mattered, then you would be correct. But numeric range is also important. I would hate to take the chance of overflowing the above multiplication because I stored the result as a Float32, instead of a Float64, even though the Float64 is overkill in terms of precision. FORTRAN has made an attempt to address this issue in FORTRAN 9X by allowing the user to indicate the range and precision of the calculation. > PB> In a future version of Python, the behavior of integer division > PB> will change. The operands will be converted to floats, so the > PB> result will be a float. If we implement the proposed scalar > PB> coercion rules where arrays have precedence over Python scalars, > PB> then dividing an array by an integer will return an integer array > PB> and will not be consistent with a future version of Python which > PB> would return an array of type double. Scientific programmers are > PB> familiar with the distinction between integer and float-point > PB> division, so should Numeric 2 continue with this behavior? > > Numeric 2 should be as compatible as reasonably possible with core python. > But my question is: how would we do integer division of arrays? A ufunc > for which no operator shortcut exists? I don't understand either question. We have devised a scheme where there are two sets of coercion rules. One for coercion between array types, and one for array and Python scalar types. This latter set of rules can either have higher precedence for array types or Python scalar types. We favor array types having precedence. A more complex set of coercion rules is also possible, if you prefer. > PB> 7. 
How are numerical errors handled (IEEE floating-point errors in > PB> particular)? > > I am developing my code on Linux and IRIX. I have seen that where > Numeric code on Linux runs fine, the same code on IRIX may "core dump" > on a FPE (e.g. arctan2(0,0)). That difference should be avoided. > > PB> a. Print a message of the most severe error, leaving it to > PB> the user to locate the errors. > > What is the most severe error? Well, divide by zero and overflow come to mind. Underflows are often considered less severe. Yet this is up to you to decide. > PB> c. Minimall UFunc class: > > Typo: Minimal? Got it! Thanks for your comments. I-obviously-have-a-lot-of-explaining-to-do-ly yours, Paul -- Dr. Paul Barrett Space Telescope Science Institute Phone: 410-338-4475 ESS/Science Software Group FAX: 410-338-4767 Baltimore, MD 21218 From hinsen at cnrs-orleans.fr Wed Feb 14 13:03:20 2001 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Wed, 14 Feb 2001 19:03:20 +0100 Subject: [Numpy-discussion] PEP 209: Multi-dimensional Arrays In-Reply-To: <14985.22458.685587.538866@nem-srvr.stsci.edu> (message from Paul Barrett on Tue, 13 Feb 2001 10:52:18 -0500 (EST)) References: <14985.22458.685587.538866@nem-srvr.stsci.edu> Message-ID: <200102141803.TAA20224@chinon.cnrs-orleans.fr> > Design and Implementation Some parts of this look a bit imprecise and I don't claim to understand them. For example: > Its relation to the other types is defined when the C-extension > module for that type is imported. The corresponding Python code > is: > > > Int32.astype[Real64] = Real64 > > This says that the Real64 array-type has higher priority than the > Int32 array-type. I'd choose a clearer name than "astype" for this, but that's a minor detail. More important is how this is supposed to work. Suppose that in Int32 you say that Real64 has higher priority, and in Real64 you say that Int32 has higher priority. Would this raise an exception, and if so, when? Perhaps the coercion question should be treated in a separate PEP that also covers standard Python types and provides a mechanism that any type implementer can use. I could think of a number of cases where I have wished I could define coercions between my own and some other types properly. > 3. Array: > > This class contains information about the array, such as shape, > type, endian-ness of the data, etc.. Its operators, '+', '-', What about the data itself? > 4. ArrayView > > This class is similar to the Array class except that the reshape > and flat methods will raise exceptions, since non-contiguous There are no reshape and flat methods in this proposal... > 1. Does slicing syntax default to copy or view behavior? > > The default behavior of Python is to return a copy of a sub-list > or tuple when slicing syntax is used, whereas Numeric 1 returns a > view into the array. The choice made for Numeric 1 is apparently > for reasons of performance: the developers wish to avoid the Yes, performance was the main reason. But there is another one: if slicing returns a view, you can make a copy based on it, but if slicing returns a copy, there's no way to make a view. So if you change this, you must provide some other way to generate a view, and please keep the syntax simple (there are many practical cases where a view is required). > In this case the performance penalty associated with copy behavior > can be minimized by implementing copy-on-write. This scheme has Indeed, that's what most APL implementations do. > data buffer is made. 
View behavior would then be implemented by > an ArrayView class, whose behavior be similar to Numeric 1 arrays, So users would have to write something like ArrayView(array, indices) That looks a bit cumbersome, and any straightforward way to write the indices is illegal according to the current syntax rules. > 2. Does item syntax default to copy or view behavior? If compatibility with lists is a criterion at all, then I'd apply it consistently and use view semantics. Otherwise let's forget about lists and discuss 1. and 2. from a purely array-oriented point of view. And then I'd argue that view semantics is more frequent and should thus be the default for both slicing and item extraction. > 3. How is scalar coercion implemented? The old discussion again... > annoying, particularly for very large arrays. We prefer that the > array type trumps the python type for the same type class, namely That is a completely arbitrary rule from any but the "large array performance" point of view. And it's against the Principle of Least Surprise. Now that we have the PEP procedure for proposing any change whatsoever, why not lobby for the addition of a float scalar type to Python, with its own syntax for constants? That looks like the best solution from everybody's point of view. > 4. How is integer division handled? > > In a future version of Python, the behavior of integer division > will change. The operands will be converted to floats, so the Has that been decided already? > 7. How are numerical errors handled (IEEE floating-point errors in > particular)? > > It is not clear to the proposers (Paul Barrett and Travis > Oliphant) what is the best or preferred way of handling errors. > Since most of the C functions that do the operation, iterate over > the inner-most (last) dimension of the array. This dimension > could contain a thousand or more items having one or more errors > of differing type, such as divide-by-zero, underflow, and > overflow. Additionally, keeping track of these errors may come at > the expense of performance. Therefore, we suggest several > options: I'd like to add another one: e. Keep some statistics about the errors that occur during the operation, and if at the end the error count is > 0, raise an exception containing as much useful information as possible. I would certainly not want any Python program to *print* anything unless I have explicitly told it to do so. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From hinsen at cnrs-orleans.fr Wed Feb 14 13:09:59 2001 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Wed, 14 Feb 2001 19:09:59 +0100 Subject: [Numpy-discussion] PEP 209: Multi-dimensional Arrays In-Reply-To: <14986.14060.238048.161366@temoleh.chem.uu.nl> (rob@hooft.net) References: <14985.22458.685587.538866@nem-srvr.stsci.edu> <14986.14060.238048.161366@temoleh.chem.uu.nl> Message-ID: <200102141809.TAA20233@chinon.cnrs-orleans.fr> > Being a scientist, I have learned that when you multiply a very accurate > number with a very approximate number, your result is going to be very > approximate, not very accurate! It would thus be more logical to have > Float32*Float64 return a Float32! 
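To make the coercion problem under discussion concrete, here is a small illustrative session (the array values are arbitrary; the typecodes shown are Numeric 1's behaviour as described in the PEP text quoted above):

    from Numeric import array, Float32, Float64

    a32 = array([1.0, 2.0, 3.0], Float32)
    a64 = array([1.0, 2.0, 3.0], Float64)

    # Mixing two array types: the "larger" type wins under Numeric 1's rules.
    print((a32 * a64).typecode())    # 'd'

    # Mixing a Float32 array with a Python float (a C double underneath)
    # also upcasts to Float64 -- the memory blow-up complained about above.
    print((a32 * 2.0).typecode())    # 'd'
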
Accuracy is not the right concept, but storage capacity. A Float64 can store any value that can be stored in a Float32, but the inverse is not true. Accuracy is not a property of a number, but of a value and its representation in the computer. The float value "1." can be perfectly accurate, even in 32 bits, or it can be an approximation for 1.-1.e-50, which cannot be represented precisely. BTW, Float64 also has a larger range of magnitudes than Float32, not just more significant digits. > Numeric 2 should be as compatible as reasonably possible with core python. > But my question is: how would we do integer division of arrays? A ufunc > for which no operator shortcut exists? Sounds fine. On the other hand, if and when Python's integer division behaviour is changed, there will be some new syntax for integer division, which should then also work on arrays. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From rob at hooft.net Wed Feb 14 16:17:18 2001 From: rob at hooft.net (Rob W. W. Hooft) Date: Wed, 14 Feb 2001 22:17:18 +0100 Subject: [Numpy-discussion] PEP 209: Multi-dimensional Arrays In-Reply-To: <14986.43001.213738.708354@nem-srvr.stsci.edu> References: <14985.22458.685587.538866@nem-srvr.stsci.edu> <14986.14060.238048.161366@temoleh.chem.uu.nl> <14986.43001.213738.708354@nem-srvr.stsci.edu> Message-ID: <14986.62942.460585.961514@temoleh.chem.uu.nl> >>>>> "PB" == Paul Barrett writes: PB> By way of bootstrapping, only one predefined type need be known, PB> say, Int32. The operations associated with this type can only be PB> Int32 operations, because this is the only type it knows about. PB> Yet, we can add another type, say Real64, which has not only PB> Real64 operations, BUT also Int32 and Real64 mixed operations, PB> since it knows about Int32. The Real64 type provides the PB> necessary information to relate the Int32 and Int64 types. Let's PB> now add a third type, then a fourth, etc., each knowing about its PB> predecessor types but not its successors. PB> This approach is identical to the way core Python adds new PB> classes or C-extension types, so this is nothing new. The PB> current types do not know about the new type, but the new type PB> knows about them. As long as one type knows the relationship PB> between the two that is sufficient for the scheme to work. Yuck. I'm thinking how long it would take to load the Int256 class, because it will need to import all other types before defining the relations.... [see below for another idea] PB> Attributes: .name: e.g. "Int32", "Float64", etc. .typecode: PB> e.g. 'i', 'f', etc. (for backward compatibility) >> .typecode() is a method now. PB> Yes, I propose that it become a settable attribute. Then it is not backwards compatible anyway, and you could leave it out. PB> .size (in bytes): e.g. 4, 8, etc. >> "element size?" PB> Yes. I think it should be called like that in that case. I dnt lk abbrvs. size could be misread as the size of the total object. >> >> add.register('add', (Int32, Int32, Int32), cfunc-add) >> >> Typo: cfunc-add is an expression, not an identifier. 
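As an aside, the registration call quoted just above is only proposed API. A rough pure-Python sketch of the table such a call would populate might look like the following; every name here is hypothetical, and cfunc_add is merely a stand-in for the wrapped C function object the PEP describes:

    class UFunc:
        # Sketch of the proposed lookup table: (in-type, in-type) -> (func, out-type).
        def __init__(self, name):
            self.name = name
            self.table = {}

        def register(self, name, signature, func):
            in1, in2, out = signature
            self.table[(in1, in2)] = (func, out)

        def __call__(self, x, y, in1, in2):
            # A real implementation would fall back on the coercion rules
            # (e.g. Int32.astype[Real64]) when the exact pair is not
            # registered; that part is omitted from this sketch.
            func, out = self.table[(in1, in2)]
            return func(x, y)

    def cfunc_add(x, y):
        # Stand-in for the C function object described in the thread.
        return x + y

    add = UFunc('add')
    add.register('add', ('Int32', 'Int32', 'Int32'), cfunc_add)
    print(add(2, 3, 'Int32', 'Int32'))    # 5
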
PB> No, it is a Python object that encompasses and describes a C PB> function that adds two Int32 arrays and returns an Int32 array. I understand that, but in general a "-" in pseudo-code is the minus operator. I'd write cfunc_add instead. >> An implementation of a (Int32, Float32, Float32) add is possible >> and desirable as mentioned earlier in the document. Which C module >> is going to declare such a combination? Now that I re-think this: would it be possible for the type-loader to check for each type that it loads whether a cross-type module is available with a previously loaded type? That way all types can be independent. There would be a Int32 module knowing only Int32 types, and Float32 only knowing Float32 types. Then there would be a Int32Float32 type that handles cross-type functions. When Int32 or Float32 is loaded, the loader can see whether the other has been loaded earlier, and if it is, load the cross-definitions as well. Only problem I can think of is functions linking 3 or more types. PB> asstring(): create string from array >> Not "tostring" like now? PB> This is proposed so as to be a little more consistent with Core PB> Python which uses 'from-' and 'as-' prefixes. But I'm don't have PB> strong opinions either way. PIL uses tostring as well. Anyway, I understand the buffer interface is a nicer way to communicate. PB> 4. ArrayView >> PB> This class is similar to the Array class except that the reshape PB> and flat methods will raise exceptions, since non-contiguous PB> arrays cannot be reshaped or flattened using just pointer and PB> step-size information. >> This was completely unclear to me until here. I must say I find >> this a strange way of handling things. I haven't looked into >> implementation details, but wouldn't it feel more natural if an >> Array would just be the "data", and an ArrayView would contain the >> dimensions and strides. Completely separated. One would always >> need a pair, but more than one ArrayView could use the same Array. PB> In my definition, an Array that has no knowledge of its shape and PB> type is not an Array, it's a data or character buffer. An array PB> in my definition is a data buffer with information on how that PB> buffer is to be mapped, i.e. shape, type, etc. An ArrayView is PB> an Array that shares its data buffer with another Array, but may PB> contain a different mapping of that Array, ie. its shape and type PB> are different. PB> If this is what you mean, then the answer is "Yes". This is how PB> we intend to implement Arrays and ArrayViews. No, it is not what I meant. Reading your answer I'd say that I wouldn't see the need for an Array. We only need a data buffer and an ArrayView. If there are two parts of the functionality, it is much cleaner to make the cut in an orthogonal way. PB> B = A.V[:10] or A.view[:10] are some possibilities. B is now an PB> ArrayView class. I hate magic attributes like this. I do not like abbrevs at all. It is not at all obvious what A.T or A.V mean. PB> 2. Does item syntax default to copy or view behavior? >> view. >> PB> Yet, c[i] can be considered just a shorthand for c[i,:] which PB> would imply copy behavior assuming slicing syntax returns a copy. >> If you reason that way, then c is just a shorthand for c[...] >> too. PB> Yes, that is correct, but that is not how Python currently PB> behaves. Current python also doesn't treat c[i] as a shorthand for c[i,:] or c[i,...] 
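For reference while the defaults are being argued over, this is what Numeric 1 does today (a throw-away session, values arbitrary):

    from Numeric import arange, array

    a = arange(10)
    b = a[2:5]           # slicing returns a *view* on a's data
    b[0] = 99
    print(a[2])          # 99 -- the assignment shows through a

    c = array(a[2:5])    # constructing a new array forces a real copy
    c[0] = -1
    print(a[2])          # still 99 -- a is untouched this time
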
Regards, Rob Hooft -- ===== rob at hooft.net http://www.hooft.net/people/rob/ ===== ===== R&D, Nonius BV, Delft http://www.nonius.nl/ ===== ===== PGPid 0xFA19277D ========================== Use Linux! ========= From Barrett at stsci.edu Wed Feb 14 18:05:17 2001 From: Barrett at stsci.edu (Paul Barrett) Date: Wed, 14 Feb 2001 18:05:17 -0500 (EST) Subject: [Numpy-discussion] PEP 209: Multi-dimensional Arrays In-Reply-To: <14986.62942.460585.961514@temoleh.chem.uu.nl> References: <14985.22458.685587.538866@nem-srvr.stsci.edu> <14986.14060.238048.161366@temoleh.chem.uu.nl> <14986.43001.213738.708354@nem-srvr.stsci.edu> <14986.62942.460585.961514@temoleh.chem.uu.nl> Message-ID: <14987.246.29693.379005@nem-srvr.stsci.edu> Rob W. W. Hooft writes: > >>>>> "PB" == Paul Barrett writes: > > PB> By way of bootstrapping, only one predefined type need be known, > PB> say, Int32. The operations associated with this type can only be > PB> Int32 operations, because this is the only type it knows about. > PB> Yet, we can add another type, say Real64, which has not only > PB> Real64 operations, BUT also Int32 and Real64 mixed operations, > PB> since it knows about Int32. The Real64 type provides the > PB> necessary information to relate the Int32 and Int64 types. Let's > PB> now add a third type, then a fourth, etc., each knowing about its > PB> predecessor types but not its successors. > > PB> This approach is identical to the way core Python adds new > PB> classes or C-extension types, so this is nothing new. The > PB> current types do not know about the new type, but the new type > PB> knows about them. As long as one type knows the relationship > PB> between the two that is sufficient for the scheme to work. > > Yuck. I'm thinking how long it would take to load the Int256 class, > because it will need to import all other types before defining the > relations.... [see below for another idea] First, I'm not proposing that we use this method of bootstapping from just one type. I was just demonstrating that it could be done. Users could then create their own types and dynamically add them to the module by the above scheme. Second, I think your making the situation more complex than it really is. It doesn't take that long to initialize the type rules and register the functions, because both arrays are sparsely populated. If there isn't a rule between two types, you don't have to create a dictionary entry. The size of the coecion table is equal to or less than the number of types, so that's small. The function table is a sparsely populated square array. We just envision populating its diagonal elements and using coercion rules for the empty off-diagonal elements. The point is that if an off-diagonal element is filled, then it will be used. I'll include our proposed implementation in the PEP for clarification. > PB> Attributes: .name: e.g. "Int32", "Float64", etc. .typecode: > PB> e.g. 'i', 'f', etc. (for backward compatibility) > >> .typecode() is a method now. > > PB> Yes, I propose that it become a settable attribute. > > Then it is not backwards compatible anyway, and you could leave it out. I'd like to, but others have strongly objected to leaving out typecodes. > PB> .size (in bytes): e.g. 4, 8, etc. > >> "element size?" > > PB> Yes. > > I think it should be called like that in that case. I dnt lk abbrvs. > size could be misread as the size of the total object. How about item_size? > >> >> add.register('add', (Int32, Int32, Int32), cfunc-add) > >> > >> Typo: cfunc-add is an expression, not an identifier. 
> > PB> No, it is a Python object that encompasses and describes a C > PB> function that adds two Int32 arrays and returns an Int32 array. > > I understand that, but in general a "-" in pseudo-code is the > minus operator. I'd write cfunc_add instead. Yes. I understand now. > PB> 4. ArrayView > >> > PB> This class is similar to the Array class except that the reshape > PB> and flat methods will raise exceptions, since non-contiguous > PB> arrays cannot be reshaped or flattened using just pointer and > PB> step-size information. > >> This was completely unclear to me until here. I must say I find > >> this a strange way of handling things. I haven't looked into > >> implementation details, but wouldn't it feel more natural if an > >> Array would just be the "data", and an ArrayView would contain the > >> dimensions and strides. Completely separated. One would always > >> need a pair, but more than one ArrayView could use the same Array. > > PB> In my definition, an Array that has no knowledge of its shape and > PB> type is not an Array, it's a data or character buffer. An array > PB> in my definition is a data buffer with information on how that > PB> buffer is to be mapped, i.e. shape, type, etc. An ArrayView is > PB> an Array that shares its data buffer with another Array, but may > PB> contain a different mapping of that Array, ie. its shape and type > PB> are different. > > PB> If this is what you mean, then the answer is "Yes". This is how > PB> we intend to implement Arrays and ArrayViews. > > No, it is not what I meant. Reading your answer I'd say that I wouldn't > see the need for an Array. We only need a data buffer and an ArrayView. > If there are two parts of the functionality, it is much cleaner to make > the cut in an orthogonal way. I just don't see what you are getting at here! What attributes does your Array have, if it doesn't have a shape or type? If Arrays only have view behavior; then Yes, there is no need for the ArrayView class. Whereas if Arrays have copy behavior, it might be a good idea to distinguish between an ordinary Array and a ArrayView. An alternative would be to have a view attribute. > PB> B = A.V[:10] or A.view[:10] are some possibilities. B is now an > PB> ArrayView class. > > I hate magic attributes like this. I do not like abbrevs at all. It is > not at all obvious what A.T or A.V mean. I'm not a fan of them either, but I'm looking for concensus on these issues. > PB> 2. Does item syntax default to copy or view behavior? > >> view. > >> > PB> Yet, c[i] can be considered just a shorthand for c[i,:] which > PB> would imply copy behavior assuming slicing syntax returns a copy. > >> If you reason that way, then c is just a shorthand for c[...] > >> too. > > PB> Yes, that is correct, but that is not how Python currently > PB> behaves. > > Current python also doesn't treat c[i] as a shorthand for c[i,:] or > c[i,...] Because there aren't any multi-dimensional lists in Python, only nested 1-dimensional lists. There is a structural difference. -- Dr. Paul Barrett Space Telescope Science Institute Phone: 410-338-4475 ESS/Science Software Group FAX: 410-338-4767 Baltimore, MD 21218 From Robert.Harrison at pnl.gov Wed Feb 14 18:06:23 2001 From: Robert.Harrison at pnl.gov (Harrison, Robert J) Date: Wed, 14 Feb 2001 15:06:23 -0800 Subject: [Numpy-discussion] PEP 209: Multi-dimensional Arrays Message-ID: <4F638A86A844A148876B65CC635038860E8844@pnlmse16.pnl.gov> Paul Barrett writes: > Rob W. W. 
Hooft writes: > > Being a scientist, I have learned that when you multiply a > very accurate > > number with a very approximate number, your result is > going to be very > > approximate, not very accurate! It would thus be more > logical to have > > Float32*Float64 return a Float32! > > If numeric precision was all that mattered, then you would be correct. > But numeric range is also important. I would hate to take the chance > of overflowing the above multiplication because I stored the > result as > a Float32, instead of a Float64, even though the Float64 is overkill > in terms of precision. FORTRAN has made an attempt to address this > issue in FORTRAN 9X by allowing the user to indicate the range and > precision of the calculation. > A number in a floating point representation is not necessarily represented inexactly. The discussion of Barrett and Hooft is confusing the distinct concepts of precision and accuracy. Well worth reading is Kahan's scathing critcism of Java's floating-point model, at least some of which relates directly to that of Python or proposals in PEPs 209 and 228. http://www.cs.berkeley.edu/~wkahan/JAVAhurt.pdf See p18 for "definitions" of precision and accuracy. There's a lot more material in the literature, on Kahan's web-site, and the following is an excellent discussion of floating point arithmetic and the IEEE standards. http://cch.loria.fr/documentation/IEEE754/ACM/goldberg.pdf With regard to the treatment of errors: Correct and detailed handling of floating-point exceptions need not impact speed, provided that a mechanism is provided to (en/dis)able each exception. Users not interested in exceptions can simply mask them. I recall relevant prior discussion including constructive comments from Tim Peters. Many modern and efficient numerical algorithms, and also effective debugging of numerical programs that use large datasets, *require* accurate and prompt identification of exceptions. Accurate meaning that the arrays, their indices, the operation, traceback and type of exception must be reported. Delayed reporting of errors is not satisfactory since operations performed in the interim may destroy valuable data, or take a very long time (esp. if many exceptions are being generated). It is probably unreasonable to ask for more than the capabilities provided by some subset of the still platform dependent optimizing compilers used to implement Python/Numpy, but I don't see why we should have much less. I would encourage the developers of PEPs 209 and 228 to submit their designs for review by a panel of professional numerical analysts (not just numerically literate programmers or scientists). While full IEEE 754 within Python or NumPy may still be just a pipe-dream (for some at least), we can at least take a step closer. Robert Robert Harrison Pacific Northwest National Laboratory Richland, Washington 99352 (509) 375-2037 robert.harrison at pnl.gov From frohne at gci.net Wed Feb 14 21:29:25 2001 From: frohne at gci.net (Ivan Frohne) Date: Wed, 14 Feb 2001 17:29:25 -0900 Subject: [Numpy-discussion] Numeric 2 : Arrays and Floating Point in C# Message-ID: <001401c096f7$2312d860$e498ed18@d4100> Microsoft's new language C# (c-sharp) implements the IEEE-754 floating point standard. There are positive and negative infinities and zeros, NaNs, and arithmetic operations involving these values behave properly. C# also has both multidimensional rectangular and ragged arrays, and combinations thereof. 
Since a version of Python based on C# will soon be released, (by ActiveState), any Numeric-2 development that doesn't take these accomplishments seriously is in danger of becoming obsolete before it gets documented. The C# language specification is at the web site below (make one line out of it). See, in particular, sections 4.1.5 and 12.1. --Ivan Frohne http://msdn.microsoft.com/library/default.asp?URL=/library/dotnet/csspec/vcl rfcsharpspec_Start.htm From paul at pfdubois.com Wed Feb 14 23:22:32 2001 From: paul at pfdubois.com (Paul F. Dubois) Date: Wed, 14 Feb 2001 20:22:32 -0800 Subject: [Numpy-discussion] Numeric 2 : Arrays and Floating Point in C# In-Reply-To: <001401c096f7$2312d860$e498ed18@d4100> Message-ID: Thank you for pointing this out. I have two questions. 1. Note that we could not reach a consensus about using C++ for future versions, even though C++ is quite aged by now, because of complaints that acceptable (ie, standard-conforming) compilers were not available (a) for free and (b) on all platforms. When would C# likely be able to meet these conditions? 2. Java flunked the Kindergarten test -- it did not like to play with others. Will C# pass it? If I want to use many of the available algorithms, I have to be able to call C and Fortran. The fact that Python itself is implemented in a given language is of almost no value in and of itself. Nobody is going to rewrite Linpack and Spherepack in C# next month. My questions may sound rhetorical, but they are not. Although I have glanced through the C# spec, and am somewhat pleased with it, I do not know the answers to these questions. -----Original Message----- From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-discussion-admin at lists.sourceforge.net]On Behalf Of Ivan Frohne Sent: Wednesday, February 14, 2001 6:29 PM To: Numpy-discussion at lists.sourceforge.net Subject: [Numpy-discussion] Numeric 2 : Arrays and Floating Point in C# Microsoft's new language C# (c-sharp) implements the IEEE-754 floating point standard. There are positive and negative infinities and zeros, NaNs, and arithmetic operations involving these values behave properly. C# also has both multidimensional rectangular and ragged arrays, and combinations thereof. Since a version of Python based on C# will soon be released, (by ActiveState), any Numeric-2 development that doesn't take these accomplishments seriously is in danger of becoming obsolete before it gets documented. The C# language specification is at the web site below (make one line out of it). See, in particular, sections 4.1.5 and 12.1. --Ivan Frohne http://msdn.microsoft.com/library/default.asp?URL=/library/dotnet/csspec/vcl rfcsharpspec_Start.htm _______________________________________________ Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net http://lists.sourceforge.net/lists/listinfo/numpy-discussion From frohne at gci.net Thu Feb 15 14:22:08 2001 From: frohne at gci.net (Ivan Frohne) Date: Thu, 15 Feb 2001 10:22:08 -0900 Subject: [Numpy-discussion] Numeric 2 : Arrays and Floating Point in C# References: Message-ID: <002601c09784$d0955a20$5599ed18@d4100> ----- Original Message ----- From: "Paul F. Dubois" To: "Ivan Frohne" ; Sent: Wednesday, February 14, 2001 19:22 Subject: RE: [Numpy-discussion] Numeric 2 : Arrays and Floating Point in C# > Thank you for pointing this out. I have two questions. > > 1. 
Note that we could not reach a consensus about using C++ for future > versions, even though C++ is quite aged by now, because of complaints that > acceptable (ie, standard-conforming) compilers were not available (a) for > free and (b) on all platforms. When would C# likely be able to meet these > conditions? > > 2. Java flunked the Kindergarten test -- it did not like to play with > others. Will C# pass it? If I want to use many of the available algorithms, > I have to be able to call C and Fortran. The fact that Python itself is > implemented in a given language is of almost no value in and of itself. > Nobody is going to rewrite Linpack and Spherepack in C# next month. > > My questions may sound rhetorical, but they are not. Although I have glanced > through the C# spec, and am somewhat pleased with it, I do not know the > answers to these questions. > Microsoft has a long list of languages which they claim will support C# and the .NET Framework, including C++, Python, Perl, Eiffel, Oberon, Haskell, Smalltalk, and even COBOL. Fortran is conspicuous by its absence on the list, but Fujitsu is doing the COBOL port. Fujitsu and Lahey Fortran are working partners. Or maybe Compaq/Digital has something on the back burner? http://msdn.microsoft.com/net/thirdparty/default.asp#lang http://msdn.microsoft.com/library/default.asp?URL=/library/techart/Interopdo tNET.htm What's encouraging about C# and the .NET Framework is that they appear to have been designed to address some of the more serious shortcomings of JAVA: (0) Many languages will be supported. (1) The C# language specification has been submitted to the international standards body ECMA for standardization. (2) Built-in types (ints, longs, doubles, arrays, etc.) are objects. (3) Unsigned integer types are included. (4) There is full IEEE 754 floating point support. (5) There is native support for multidimensional arrays, not just awkward ragged arrays. (6) Most operators can be overloaded. (7) If you must, pointers are supported. Python supports complex arithmetic out of the box. But to invert a matrix you have to twist yourself into a pretzel. Ivan Frohne From wsryu at fas.harvard.edu Thu Feb 15 22:59:14 2001 From: wsryu at fas.harvard.edu (William Ryu) Date: Thu, 15 Feb 2001 22:59:14 -0500 Subject: [Numpy-discussion] Curve fitting routines? In-Reply-To: <002601c09784$d0955a20$5599ed18@d4100> References: Message-ID: <4.3.2.7.2.20010215225445.00b563d8@pop.fas.harvard.edu> An HTML attachment was scrubbed... URL: From pplumlee at omnigon.com Fri Feb 16 00:31:14 2001 From: pplumlee at omnigon.com (Phlip) Date: Thu, 15 Feb 2001 21:31:14 -0800 Subject: [Numpy-discussion] Curve fitting routines? In-Reply-To: <4.3.2.7.2.20010215225445.00b563d8@pop.fas.harvard.edu> References: <4.3.2.7.2.20010215225445.00b563d8@pop.fas.harvard.edu> Message-ID: <01021521311401.11060@cuzco.concentric.net> [Could someone configure this mailing list so ReplyTo goes to the list not the most recent participant?] Proclaimed William Ryu from the mountaintops: > > Was wondering if there is a "standard" library of > curve fitting routines that people use with Numeric Python. I'm > especially interested in high order Bezier curves, but would like to save > some time if someone has put together a good curve fitting > packaging.
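Not Bezier-specific, but for the plain least-squares end of curve fitting the LinearAlgebra module distributed with Numeric already goes a long way. A sketch with made-up data points, fitting y = c0 + c1*x + c2*x**2:

    from Numeric import array, transpose
    import LinearAlgebra

    x = array([0.0, 1.0, 2.0, 3.0, 4.0])
    y = array([1.1, 2.0, 4.9, 10.2, 17.1])

    design = transpose(array([x**0, x, x**2]))    # 5x3 design matrix
    coeffs, resid, rank, sv = LinearAlgebra.linear_least_squares(design, y)
    print(coeffs)    # roughly [1., 0., 1.] for this made-up data
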
The ScientificPython should have this... http://starship.python.net/crew/hinsen/scientific.html ...but it appears the closest it has is Least Squares line fitting, and curved lines in its displays. Maybe you could add your results to it. So much for saving time ;-) -- Phlip phlip_cpp at my-deja.com ============ http://c2.com/cgi/wiki?PhlIp ============ -- Please state the nature of the programming emergency -- From phrxy at csv.warwick.ac.uk Fri Feb 16 03:24:45 2001 From: phrxy at csv.warwick.ac.uk (John J. Lee) Date: Fri, 16 Feb 2001 08:24:45 +0000 (GMT) Subject: [Numpy-discussion] Curve fitting routines? In-Reply-To: <4.3.2.7.2.20010215225445.00b563d8@pop.fas.harvard.edu> Message-ID: On Thu, 15 Feb 2001, William Ryu wrote: > Was wondering if there is a "standard" library of curve fitting routines > that people use with Numeric Python. I'm especially interested in high > order Bezier curves, but would like to save some time if someone has put > together a good curve fitting packaging. There are simple wrappers of minpack (which includes non-linear least squares) and dierckx (splines, I think) libraries in Travis Oliphant's Multipack. Travis is still in the process of moving its homepage ATM I think, but they are available somewhere near here: cens.ioc.ee/cgi-bin/cvsweb/python/multipack/ If you search back in the archives of this list, you'll find a pointer to instructions for how to get them with cvs. John From rob at hooft.net Fri Feb 16 04:18:11 2001 From: rob at hooft.net (Rob W. W. Hooft) Date: Fri, 16 Feb 2001 10:18:11 +0100 Subject: [Numpy-discussion] PEP 209: Multi-dimensional Arrays In-Reply-To: <14987.246.29693.379005@nem-srvr.stsci.edu> References: <14985.22458.685587.538866@nem-srvr.stsci.edu> <14986.14060.238048.161366@temoleh.chem.uu.nl> <14986.43001.213738.708354@nem-srvr.stsci.edu> <14986.62942.460585.961514@temoleh.chem.uu.nl> <14987.246.29693.379005@nem-srvr.stsci.edu> Message-ID: <14988.61523.833334.328664@temoleh.chem.uu.nl> >>>>> "PB" == Paul Barrett writes: PB> .size (in bytes): e.g. 4, 8, etc. >> >> "element size?" PB> How about item_size? OK. >> No, it is not what I meant. Reading your answer I'd say that I >> wouldn't see the need for an Array. We only need a data buffer and >> an ArrayView. If there are two parts of the functionality, it is >> much cleaner to make the cut in an orthogonal way. PB> I just don't see what you are getting at here! What attributes PB> does your Array have, if it doesn't have a shape or type? A piece of memory. It needs nothing more. A buffer[1]. You'd always need an ArrayView. The Arrayview contains information like dimensions, strides, data type, endianness. Making a new _view_ would consist of making a new ArrayView, and pointing its data object to the same data array. Making a new _copy_ would consist of making a new ArrayView, and marking the "copy-on-write" features (however that needs to be implemented, I have never done that. Does it involve weak references?). Different Views on the same data can even have different data types: e.g. character and byte, or even floating point and integer (I am a happy user of the fortran EQUIVALENCE statement that way too). The speed up by re-use of temporary arrays becomes very easy this way too: one can even re-use a floating point data array as integer result if the reference count of both the data array and its (only) view is one. [1] Could the python buffer interface be used as a pre-existing implementation here? Would that make it possible to implement Array.append()? 
I don't always know beforehand how large my numeric arrays will become. Rob -- ===== rob at hooft.net http://www.hooft.net/people/rob/ ===== ===== R&D, Nonius BV, Delft http://www.nonius.nl/ ===== ===== PGPid 0xFA19277D ========================== Use Linux! ========= From Barrett at stsci.edu Fri Feb 16 11:18:53 2001 From: Barrett at stsci.edu (Paul Barrett) Date: Fri, 16 Feb 2001 11:18:53 -0500 (EST) Subject: [Numpy-discussion] PEP 209: Multi-dimensional Arrays In-Reply-To: <14988.61523.833334.328664@temoleh.chem.uu.nl> References: <14985.22458.685587.538866@nem-srvr.stsci.edu> <14986.14060.238048.161366@temoleh.chem.uu.nl> <14986.43001.213738.708354@nem-srvr.stsci.edu> <14986.62942.460585.961514@temoleh.chem.uu.nl> <14987.246.29693.379005@nem-srvr.stsci.edu> <14988.61523.833334.328664@temoleh.chem.uu.nl> Message-ID: <14989.18319.218930.846896@nem-srvr.stsci.edu> Rob W. W. Hooft writes: > > >> No, it is not what I meant. Reading your answer I'd say that I > >> wouldn't see the need for an Array. We only need a data buffer and > >> an ArrayView. If there are two parts of the functionality, it is > >> much cleaner to make the cut in an orthogonal way. > > > PB> I just don't see what you are getting at here! What attributes > PB> does your Array have, if it doesn't have a shape or type? > > A piece of memory. It needs nothing more. A buffer[1]. You'd always > need an ArrayView. The Arrayview contains information like > dimensions, strides, data type, endianness. > > Making a new _view_ would consist of making a new ArrayView, and pointing > its data object to the same data array. > > Making a new _copy_ would consist of making a new ArrayView, and > marking the "copy-on-write" features (however that needs to be > implemented, I have never done that. Does it involve weak > references?). > > Different Views on the same data can even have different data types: > e.g. character and byte, or even floating point and integer (I am > a happy user of the fortran EQUIVALENCE statement that way too). I think our approaches are very similar. It's the meaning that we ascribe to Array and ArrayView that appears to be causing the confusion. Your Array object is our Data object and your ArrayView object is our Array attributes, ie. the information to map/interpret the Data object. We view an Array as being composed of two entities, its attributes and a Data object. And we entirely agree with the above definitions of _view_ and _copy_. But you haven't told us what object associates your Array and ArrayView to make a usable array that can be sliced, diced, and Julian fried. My impression of your slice method would be: slice(Array, ArrayView, slice expression) I'm not too keen on this approach. :-) > The speed up by re-use of temporary arrays becomes very easy this way > too: one can even re-use a floating point data array as integer result > if the reference count of both the data array and its (only) view is > one. Yes! This is our intended implementation. But instead of re-using your Array object, we will be re-using a (data-) buffer object, or a memory-mapped object, or whatever else in which the data is stored. > [1] Could the python buffer interface be used as a pre-existing > implementation here? Would that make it possible to implement > Array.append()? I don't always know beforehand how large my > numeric arrays will become. In a way, yes. I've considered creating an in-memory object that has similar properties to the memory-mapped object (e.g. 
it might have a read-only property), so that the two data objects can be used interchangeably. The in-memory object would replace the string object as a data store, since the string object is meant to be read-only. -- Paul -- Dr. Paul Barrett Space Telescope Science Institute Phone: 410-338-4475 ESS/Science Software Group FAX: 410-338-4767 Baltimore, MD 21218 From clee at gnwy100.wuh.wustl.edu Fri Feb 16 14:32:49 2001 From: clee at gnwy100.wuh.wustl.edu (Christopher Lee) Date: Fri, 16 Feb 2001 13:32:49 -0600 Subject: [Numpy-discussion] configuration ideas Message-ID: <200102161932.NAA16245@gnwy100.wuh.wustl.edu> I am preparing a patch to Numeric 17.3.0 that allows for easier integration of native BLAS/Lapack libraries with Numeric's dot() function and with the LAPACK package. What I would like to know is how/where should I specify build preferences. The current situation is that I have added a config.py to the top directory. Inside this file, python variables like HAVE_CBLAS and/or HAVE_FBLAS control linking and preprocessor flags. By default, the distribution would build w/o the native libraries. Necessary info like library directories, includes and link flags would be listed as well and available to distutils for Numeric and any of it's sub-packages. How does this approach sound? -chris From beausol at hpl.hp.com Fri Feb 16 15:38:10 2001 From: beausol at hpl.hp.com (Raymond Beausoleil) Date: Fri, 16 Feb 2001 12:38:10 -0800 Subject: [Numpy-discussion] configuration ideas In-Reply-To: <200102161932.NAA16245@gnwy100.wuh.wustl.edu> Message-ID: <5.0.2.1.2.20010216123234.00aaa8c8@hplex1.hpl.hp.com> Actually, this approach sounds more convenient than the one I've been using to integrate the Intel native BLAS and (most of) LAPACK provided for Windows. I built my original version(s) using a visual IDE, and I've been fiddling around with the standard distribution to try to get the paths right. Could you please send me your scripts so that I can modify them for my application? = Ray At 01:32 PM 2/16/2001 -0600, Christopher Lee wrote: >I am preparing a patch to Numeric 17.3.0 that allows for easier integration >of native BLAS/Lapack libraries with Numeric's dot() function and with the >LAPACK package. > >What I would like to know is how/where should I specify build preferences. > >The current situation is that I have added a config.py to the top >directory. Inside this file, python variables like HAVE_CBLAS and/or >HAVE_FBLAS control linking and preprocessor flags. By default, the >distribution would build w/o the native libraries. Necessary info like >library directories, includes and link flags would be listed as well and >available to distutils for Numeric and any of it's sub-packages. > >How does this approach sound? > >-chris > >_______________________________________________ >Numpy-discussion mailing list >Numpy-discussion at lists.sourceforge.net >http://lists.sourceforge.net/lists/listinfo/numpy-discussion ============================ Ray Beausoleil Hewlett-Packard Laboratories mailto:beausol at hpl.hp.com 425-883-6648 Office 425-957-4951 Telnet 425-941-2566 Mobile ============================ From phrxy at csv.warwick.ac.uk Fri Feb 16 18:16:07 2001 From: phrxy at csv.warwick.ac.uk (John J. Lee) Date: Fri, 16 Feb 2001 23:16:07 +0000 (GMT) Subject: [Numpy-discussion] Curve fitting routines? In-Reply-To: Message-ID: On Fri, 16 Feb 2001, John J. 
Lee wrote: > On Thu, 15 Feb 2001, William Ryu wrote: > > > Was wondering if there is a "standard" library of curve fitting routines > > that people use with Numeric Python. I'm especially interested in high > > order Bezier curves, but would like to save some time if someone has put > > together a good curve fitting packaging. > > There are simple wrappers of minpack (which includes non-linear least > squares) and dierckx (splines, I think) libraries in Travis Oliphant's > Multipack. Travis is still in the process of moving its homepage ATM I > think, but they are available somewhere near here: > > cens.ioc.ee/cgi-bin/cvsweb/python/multipack/ > > If you search back in the archives of this list, you'll find a pointer to > instructions for how to get them with cvs. Just occurred to me there might be drawing programs out there with bezier fitting routines. I'd been (probably falsely) assuming that everyone here is doing science or engineering (come to think of it, I don't know if there *are* any applications of bezier curves in science). Google is very useful: http://www.google.fr/search?q=bezier+python+fitting&hq=&hl=en&safe=off&csr= http://sketch.sourceforge.net/devnotes.html > Sketch 0.7.4 (December 23rd, 1999) > [...] > Moved more of the curve fitting code for the freehand tool to C to > make it faster. > > [...] > Sketch 0.7.3 (October 17th, 1999) > > A freehand tool. The implementation of the curve fitting is a bit slow > at the moment, because much of the computation is done in Python. > Moving more parts to C should improve performance substantially. OTOH, perhaps you are working on this very program?? John From rob at hooft.net Mon Feb 19 02:50:47 2001 From: rob at hooft.net (Rob W. W. Hooft) Date: Mon, 19 Feb 2001 08:50:47 +0100 Subject: [Numpy-discussion] PEP 209: Multi-dimensional Arrays In-Reply-To: <14989.18319.218930.846896@nem-srvr.stsci.edu> References: <14985.22458.685587.538866@nem-srvr.stsci.edu> <14986.14060.238048.161366@temoleh.chem.uu.nl> <14986.43001.213738.708354@nem-srvr.stsci.edu> <14986.62942.460585.961514@temoleh.chem.uu.nl> <14987.246.29693.379005@nem-srvr.stsci.edu> <14988.61523.833334.328664@temoleh.chem.uu.nl> <14989.18319.218930.846896@nem-srvr.stsci.edu> Message-ID: <14992.53335.696707.589726@temoleh.chem.uu.nl> >>>>> "PB" == Paul Barrett writes: PB> we entirely agree with the above definitions of _view_ and PB> _copy_. But you haven't told us what object associates your PB> Array and ArrayView to make a usable array that can be sliced, PB> diced, and Julian fried. Hm. You know, I am not so deep into the python internals. I am a fairly high-level programmer. Not a CS type, but a chemist.... There might be much of implementation detail that escapes me. But I'm just trying to keep things beautiful (as in Erich Gamma et.al.) I thought an ArrayView would have a pointer to the data array. Like the either like the .data attribute in the Numeric 1 API, or as a python object pointer. PB> My impression of your slice method would be: PB> slice(Array, ArrayView, slice expression) If ArrayView.HasA(Array), that would not be required. Regards, Rob Hooft. -- ===== rob at hooft.net http://www.hooft.net/people/rob/ ===== ===== R&D, Nonius BV, Delft http://www.nonius.nl/ ===== ===== PGPid 0xFA19277D ========================== Use Linux! ========= From sdhyok at email.unc.edu Tue Feb 20 23:04:28 2001 From: sdhyok at email.unc.edu (Daehyok Shin) Date: Tue, 20 Feb 2001 20:04:28 -0800 Subject: [Numpy-discussion] Sparse matrix support? 
References: <14985.22458.685587.538866@nem-srvr.stsci.edu> Message-ID: <017101c09bbb$649d6480$56111918@nc.rr.com> Is there any plan to support sparse matrices in NumPy? Peter From victor at idaccr.org Thu Feb 22 16:47:30 2001 From: victor at idaccr.org (Victor S. Miller) Date: 22 Feb 2001 16:47:30 -0500 Subject: [Numpy-discussion] Handling underflow Message-ID: Is there some way of having calculations which cause underflow automatically set their result to 0.0? For example when I take exp(a), where a is a floating point array. -- Victor S. Miller | " ... Meanwhile, those of us who can compute can hardly victor at idaccr.org | be expected to keep writing papers saying 'I can do the CCR, Princeton, NJ | following useless calculation in 2 seconds', and indeed 08540 USA | what editor would publish them?" -- Oliver Atkin From phrxy at csv.warwick.ac.uk Fri Feb 23 07:38:54 2001 From: phrxy at csv.warwick.ac.uk (John J. Lee) Date: Fri, 23 Feb 2001 12:38:54 +0000 (GMT) Subject: [Numpy-discussion] Handling underflow In-Reply-To: Message-ID: On 22 Feb 2001, Victor S. Miller wrote: > Is there some way of having calculations which cause underflow > automatically set their result to 0.0? For example when I take > exp(a), where a is a floating point array. Not 'automatically', but: a = whatever() choose(greater(a, MAX), (a, MAX)) answer = exp(-a) any good? This is with a 1D array -- I haven't used higher dimensions much. John From phrxy at csv.warwick.ac.uk Fri Feb 23 10:10:11 2001 From: phrxy at csv.warwick.ac.uk (John J. Lee) Date: Fri, 23 Feb 2001 15:10:11 +0000 (GMT) Subject: [Numpy-discussion] checking identity of arrays? In-Reply-To: Message-ID: I must be missing something obvious: how does one check if two variables refer to the same array object? John From kern at its.caltech.edu Fri Feb 23 10:35:51 2001 From: kern at its.caltech.edu (Robert Kern) Date: Fri, 23 Feb 2001 07:35:51 -0800 (PST) Subject: [Numpy-discussion] checking identity of arrays? In-Reply-To: Message-ID: On Fri, 23 Feb 2001, John J. Lee wrote: > I must be missing something obvious: how does one check if two variables > refer to the same array object? Python's id() builtin function? > John -- Robert Kern kern at caltech.edu "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From paul at pfdubois.com Fri Feb 23 10:35:13 2001 From: paul at pfdubois.com (Paul F. Dubois) Date: Fri, 23 Feb 2001 07:35:13 -0800 Subject: [Numpy-discussion] checking identity of arrays? In-Reply-To: Message-ID: if a is b: ... -----Original Message----- From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-discussion-admin at lists.sourceforge.net]On Behalf Of John J. Lee Sent: Friday, February 23, 2001 7:10 AM To: numpy-discussion at lists.sourceforge.net Subject: [Numpy-discussion] checking identity of arrays? I must be missing something obvious: how does one check if two variables refer to the same array object? John _______________________________________________ Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net http://lists.sourceforge.net/lists/listinfo/numpy-discussion From rlw at stsci.edu Fri Feb 23 10:51:11 2001 From: rlw at stsci.edu (rlw at stsci.edu) Date: Fri, 23 Feb 2001 10:51:11 -0500 (EST) Subject: [Numpy-discussion] checking identity of arrays? Message-ID: <200102231551.KAA17392@sundog.stsci.edu> John J. Lee: >I must be missing something obvious: how does one check if two variables >refer to the same array object? 
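A quick illustration of the two possible readings of the question (scratch arrays only; the mutation at the end is destructive):

    from Numeric import zeros

    a = zeros(100)
    b = a[10:20]             # a view onto a's data buffer
    c = a

    print(c is a)            # true: two names for one array object (same as id())
    print(b is a)            # false: b is a distinct array object...

    b[0] = 1                 # ...yet mutating through b
    print(a[10] == 1)        # true: b shares a's underlying data buffer
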
Paul Dubois: >if a is b: I thought he was asking something different. Suppose I do this: a = zeros(100) b = a[10:20] Now b is a view of a's data. Is there any way to test that a and b refer to the same data? (Even if this was not John's question, I'm curious to know the answer.) From vanandel at atd.ucar.edu Fri Feb 23 15:02:40 2001 From: vanandel at atd.ucar.edu (Joe Van Andel) Date: Fri, 23 Feb 2001 13:02:40 -0700 Subject: [Numpy-discussion] Threading, multi-processors, and Numeric Python Message-ID: <3A96C1E0.1497E41E@atd.ucar.edu> I've written Numeric Python code (with Python 1.5.2) to analyze weather radar data. In an attempt to speed up this code, I used threads to perform some of the computations. I'm running on a dual processor Linux machine running 2.4.1 with SMP enabled. I'm using Numeric -17.3.0 with Python 1.5.2 When I run the threaded code, and monitor the system with 'top', 1 processor spends much of its time idle, and I rarely see two copies of my 'compute' thread executing. Each thread is computing its results from different arrays. However, all arrays are referenced from the same dictionary. Any ideas on how to get both threads computing at the same time? Thanks for your help! -- Joe VanAndel National Center for Atmospheric Research http://www.atd.ucar.edu/~vanandel/ Internet: vanandel at ucar.edu From vanandel at atd.ucar.edu Fri Feb 23 17:18:35 2001 From: vanandel at atd.ucar.edu (Joe Van Andel) Date: Fri, 23 Feb 2001 15:18:35 -0700 Subject: [Numpy-discussion] Threading, multi-processors, and Numeric Python References: <3A96C1E0.1497E41E@atd.ucar.edu> Message-ID: <3A96E1BA.772C9CD0@atd.ucar.edu> OK, I guess I've just discovered the answer to my own question. [Don't you just hate that! :-) ] My C++ extensions that perform the real calculations needed to be modified to support multiple threads. By adding the Py_BEGIN_ALLOW_THREADS/Py_END_ALLOW_THREADS macros, the Python interpreter knew it was safe to allow other threads to execute during computations, just like it allows other threads to execute during I/O. To accomodate the Python global thread lock, I needed to change my code as follows: my_func() { /* python calls */ Py_BEGIN_ALLOW_THREADS /* computations that don't use python API */ Py_END_ALLOW_THREADS /* python API calls */ } Now, I can see both processors being used. -- Joe VanAndel National Center for Atmospheric Research http://www.atd.ucar.edu/~vanandel/ Internet: vanandel at ucar.edu From crag at arsdigita.com Tue Feb 27 04:03:24 2001 From: crag at arsdigita.com (crag wolfe) Date: Tue, 27 Feb 2001 04:03:24 -0500 Subject: [Numpy-discussion] Trouble importing LinearAlgebra Message-ID: <3A9B6D5B.B8B16598@arsdigita.com> Well, this a problem that is similar to the one discussed in the "Da Blas" thread in September 2000. My first command which works fine is "from Numeric import *". The problem is when I "import * from LinearAlgebra" I get " Traceback (most recent call last): File "", line 1, in ? File "/usr/local/lib/python2.0/site-packages/Numeric/LinearAlgebra.py", line 8, in ? import lapack_lite ImportError: /usr/lib/liblapack.so.3: undefined symbol: e_wsfe" I'm using Numeric-17.1.2. I installed the lapack and blas rpms for Red Hat 6.2, which I'm running. I edited the setup.py file in the LALITE directory as indicated. I run "python setup_all.py install" which seems to work fine (and I'm clearing out the build directories in between install attempts). 
"gcc -shared build/temp.linux-i686-2.0/Src/lapack_litemodule.o -lblas -llapack -o build/lib.linux-i686-2.0/lapack_lite.so" which scrolls by after running the setup_all.py script, compiles without errors. After the above failed, I tried manually running g77 in place of gcc (a hint from the "Da Blas" thread) but then instead of the "undefined symbol: e_wsfe" error I just get a seg fault. Any help to get this package installed is much appreciated. --Crag From Barrett at stsci.edu Tue Feb 27 13:25:26 2001 From: Barrett at stsci.edu (Paul Barrett) Date: Tue, 27 Feb 2001 13:25:26 -0500 (EST) Subject: [Numpy-discussion] Proposed agenda for Numeric Bof at Python 9 Message-ID: <15003.60611.366223.805114@nem-srvr.stsci.edu> Here is the preliminary agenda for the Enhancing Numeric Python BoF at Python 9. We have requested the room for 2 hours (which may not be enough time to discusss these contentious issues). Please let me know if you would like a topic added. -- Paul Enhancing Numeric Python Agenda 1. Behavior (~1 hr) a. Copy behavior for slice and item syntax b. Scalar coercion c. Record semantics d. Enhanced indexing 2. Implementation (~1/2 hr) a. Type versus class approach b. C versus C++ From kern at its.caltech.edu Tue Feb 27 13:36:26 2001 From: kern at its.caltech.edu (Robert Kern) Date: Tue, 27 Feb 2001 10:36:26 -0800 (PST) Subject: [Numpy-discussion] Trouble importing LinearAlgebra In-Reply-To: <3A9B6D5B.B8B16598@arsdigita.com> Message-ID: On Tue, 27 Feb 2001, crag wolfe wrote: > Well, this a problem that is similar to the one discussed in the "Da > Blas" thread in September 2000. > > My first command which works fine is "from Numeric import *". The > problem is when I "import * from LinearAlgebra" I get " > Traceback (most recent call last): > File "", line 1, in ? > File > "/usr/local/lib/python2.0/site-packages/Numeric/LinearAlgebra.py", line > 8, in ? > import lapack_lite > ImportError: /usr/lib/liblapack.so.3: undefined symbol: e_wsfe" Try adding 'g2c' to the list of libraries. You may just end up with the same problem as linking with g77, but it's worth a shot. [snip] -- Robert Kern kern at caltech.edu "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter