From vanandel at atd.ucar.edu  Wed Feb  2 13:35:34 2000
From: vanandel at atd.ucar.edu (Joe Van Andel)
Date: Wed, 02 Feb 2000 11:35:34 -0700
Subject: [Numpy-discussion] single precision routines in NumPy?
Message-ID: <389878F6.B7F2DAF@atd.ucar.edu>

I would like a single precision version of 'interp' in the Numeric Core. (I want such a routine because I'm operating on huge single precision arrays that I don't want promoted to double precision.)

I've written such a routine, but Paul Dubois and I are discussing the best way of integrating it into the core. One solution is to simply add a new function 'interpf' to arrayfnsmodule.c. Another solution is to add a typecode=Float option to interp.

Any opinions on how this single precision version should be handled?

--
Joe VanAndel
National Center for Atmospheric Research
http://www.atd.ucar.edu/~vanandel/
Internet: vanandel at ucar.edu

From tla at research.nj.nec.com  Thu Feb  3 16:57:41 2000
From: tla at research.nj.nec.com (Tom Adelman)
Date: Thu, 03 Feb 2000 16:57:41 -0500
Subject: [Numpy-discussion] newbie: PyArray_Check difficulties
Message-ID: <3.0.1.32.20000203165741.00958d00@zingo.nj.nec.com>

I'm having a problem with PyArray_Check. If I just call PyArray_Check(args) I don't have a problem, but if I try to assign the result to anything, etc., it crashes (due to an access violation). So, for example, the code at the end of this note doesn't work, yet I know an array is being passed and I can, for example, calculate its trace correctly if I type cast it as a PyArrayObject*.

Also, a more general question: is this the recommended way to input NumPy arrays when using SWIG, or do most people find it easier to use more elaborate typemaps, or something else?

Finally, I apologize if this is the wrong forum to post this question. Please let me know.

Thanks, Tom

Method from C++ class:

    PyObject * Test01::trace(PyObject * args)
    {
        if (!(PyArray_Check(args))) {   // <- crashes here
            PyErr_SetString(PyExc_ValueError, "must use NumPy array");
            return NULL;
        }
        return NULL;
    }

Swig file (where typemaps are the ones included with the most recent SWIG):

    /* TMatrix.i */
    %module Ptest
    %include "typemaps.i"
    %{
    #include "Test01.h"
    %}

    class Test01 {
    public:
        PyObject * trace(PyObject *INPUT);
        Test01();
        virtual ~Test01();
    };

Python code:

    import Ptest
    t = Ptest.Test01()
    import Numeric
    a = Numeric.arange(1.1, 2.7, .1)
    b = Numeric.reshape(a, (4,4))
    x = t.trace(b)

From Oliphant.Travis at mayo.edu  Fri Feb  4 15:49:34 2000
From: Oliphant.Travis at mayo.edu (Travis Oliphant)
Date: Fri, 4 Feb 2000 14:49:34 -0600 (CST)
Subject: [Numpy-discussion] Re: Numpy-discussion digest, Vol 1 #7 - 1 msg
In-Reply-To: <200002042005.MAA15424@lists.sourceforge.net>
Message-ID: 

> I'm having a problem with PyArray_Check. If I just call
> PyArray_Check(args) I don't have a problem, but if I try to assign the
> result to anything, etc., it crashes (due to an access violation). So, for
> example the code at the end of this note doesn't work, yet I know an array
> is being passed and I can, for example, calculate its trace correctly if I
> type cast it as a PyArrayObject*.
>
> Also, a more general question: is this the recommended way to input NumPy
> arrays when using SWIG, or do most people find it easier to use more
> elaborate typemaps, or something else?

I have some experience with SWIG, but it is not my favorite method for using Numerical Python with C, since you have so little control over how things get allocated.

Your problem is probably due to the fact that you do not run import_array() in the module header.
There is a directive in SWIG that lets you specify commands to run at module initialization. Try this in your *.i file:

    %init %{
    import_array();
    %}

This may help.

Best,

Travis

From Oliphant.Travis at mayo.edu  Mon Feb  7 19:08:43 2000
From: Oliphant.Travis at mayo.edu (Travis Oliphant)
Date: Mon, 7 Feb 2000 18:08:43 -0600 (CST)
Subject: [Numpy-discussion] An Experiment in code-cleanup.
Message-ID: 

I wanted to let users of the community know (so they can help if they want, or offer criticism or comments) that over the next several months I will be experimenting with a branch of the main Numerical source tree and endeavoring to "clean up" the code for Numerical Python.

I have in mind a few (in my opinion minor) alterations to the current code base which necessitate a branch. Guido has made some good suggestions for improving the code base, and both David Ascher and Paul Dubois have expressed concerns over the current state of the source code and given suggestions as to how to improve it. That said, I should emphasize that my work is not authorized, or endorsed, by any of the people mentioned above. It is simply my little experiment.

My intent is not to re-create Numerical Python --- I like most of the current functionality --- but merely to clean up the code, comment it, change the underlying structure just a bit, and add some features I want. One goal I have is to create something that can go into Python 1.7 at some future point, so this incarnation of Numerical Python may not be completely C-source compatible with current Numerical Python (but it will be close). This means C extensions that access the underlying structure of the current arrayobject may need some alterations to use this experimental branch if it ever becomes useful.

I don't know how long this will take me. I'm not promising anything. The purpose of this announcement is just to invite interested parties into the discussion.

These are the (somewhat negotiable) directions I will be pursuing:

1) Still written in C but heavily (in my opinion) commented.

2) Addition of bit-types and unsigned integer types.

3) Facility for memory-mapped dataspace in arrays.

4) Slices become copies, with the addition of methods for the current strict referencing behavior.

5) Handling of sliceobjects which consist of sequences of indices (so that setting and getting elements of arrays using their index is possible).

6) Rank-0 arrays will not be autoconverted to Python scalars, but will still behave as Python scalars whenever Python allows general scalar-like objects in its operations. Methods will allow user-controlled conversion to the Python scalars.

7) Addition of attributes so that different users can configure aspects of the math behavior, to their heart's content.

If there is anyone interested in helping in this "unofficial branch work", let me know and we'll see about setting up someplace to work. Be warned, however, that I like actual code or code-templates more than just great ideas (truly great ideas are never turned away however ;-) )

If something I do benefits the current NumPy source in a non-invasive, backwards compatible way, I will try to place it in the current CVS tree, but that won't be a priority, as my time does have limitations and I'm scratching my own itch at this point.

Best regards,

Travis Oliphant

From dubois1 at llnl.gov  Mon Feb  7 19:22:45 2000
From: dubois1 at llnl.gov (Paul F. Dubois)
Date: Mon, 7 Feb 2000 16:22:45 -0800
Subject: [Numpy-discussion] RE: [Matrix-SIG] An Experiment in code-cleanup.
In-Reply-To: 
Message-ID: 

Travis says that I don't necessarily endorse his goals, but in fact I do, strongly. If I understand right, he intends to make a CVS branch for this experiment, and that is fine with me.

The only goal I didn't quite understand was:

    Addition of attributes so that different users can configure aspects
    of the math behavior, to their heart's content.

In a world of reusable components the situation is complicated. I would not like to support a dot-product routine, for example, if the user could turn off any double precision behind my back. My needs for precision are local to my algorithm.

From archiver at db.geocrawler.com  Tue Feb  8 10:52:47 2000
From: archiver at db.geocrawler.com (John Travers)
Date: Tue, 8 Feb 2000 07:52:47 -0800
Subject: [Numpy-discussion] Re: A proposal for dot product
Message-ID: <200002081552.HAA10267@www.geocrawler.com>

This message was sent from Geocrawler.com by "John Travers". Be sure to reply to that address.

If the above was implemented, I would be very happy indeed. As a maths student, I use NumPy a lot, and get infuriated with the current implementation.

John.

Geocrawler.com - The Knowledge Archive

From hinsen at cnrs-orleans.fr  Tue Feb  8 12:12:56 2000
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Tue, 8 Feb 2000 18:12:56 +0100
Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup.
In-Reply-To: (message from Travis Oliphant on Mon, 7 Feb 2000 18:08:43 -0600 (CST))
Message-ID: <200002081712.SAA03158@chinon.cnrs-orleans.fr>

> 3) Facility for memory-mapped dataspace in arrays.

I'd really like to have that...

> 4) Slices become copies with the addition of methods for current strict
> referencing behavior.

This will break a lot of code, and in a way that will be difficult to debug. In fact, this is the only point you mention which would be reason enough for me not to use your modified version; going through all of my code to check what effect this might have sounds like a nightmare.

I see the point of having a copying version as well, but why not implement the copying behaviour as methods and leave indexing as it is?

> 5) Handling of sliceobjects which consist of sequences of indices (so that
> setting and getting elements of arrays using their index is possible).

Sounds good as well...

> 6) Rank-0 arrays will not be autoconverted to Python scalars, but will
> still behave as Python scalars whenever Python allows general scalar-like
> objects in its operations. Methods will allow user-controlled
> conversion to the Python scalars.

I suspect that full behaviour-compatibility with scalars is impossible, but I am willing to be proven wrong. For example, Python scalars are immutable, arrays aren't. This also means that rank-0 arrays can't be used as keys in dictionaries.

How do you plan to implement mixed arithmetic with scalars? If the return value is a rank-0 array, then a single library returning a rank-0 array somewhere could mess up a program well enough that debugging becomes a nightmare.

> 7) Addition of attributes so that different users can configure aspects of
> the math behavior, to their heart's content.

You mean global attributes? That could be the end of universally usable library modules, supposing that people actually use them.

> If there is anyone interested in helping in this "unofficial branch
> work", let me know and we'll see about setting up someplace to work.

I don't have much time at the moment, but I could still help out with testing etc.

Konrad.
--
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen at cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------
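Konrad's dictionary-key point is easy to check with plain Python today: keys must be hashable, and mutable objects are not. A minimal sketch (the values are arbitrary):

    d = {}
    d[(1, 2)] = 'ok'      # a tuple is immutable, hence hashable
    d[[1, 2]] = 'boom'    # a list is mutable: TypeError, unhashable type

A mutable rank-0 array would fail in exactly the way the list does, which is why full scalar compatibility is doubtful.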
From Oliphant.Travis at mayo.edu  Tue Feb  8 12:38:26 2000
From: Oliphant.Travis at mayo.edu (Travis Oliphant)
Date: Tue, 8 Feb 2000 11:38:26 -0600 (CST)
Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup.
In-Reply-To: <200002081712.SAA03158@chinon.cnrs-orleans.fr>
Message-ID: 

> > 3) Facility for memory-mapped dataspace in arrays.
>
> I'd really like to have that...

This is pretty easy to add, but it does require some changes to the underlying structure, so you can expect it.

> > 4) Slices become copies with the addition of methods for current strict
> > referencing behavior.
>
> This will break a lot of code, and in a way that will be difficult to
> debug. In fact, this is the only point you mention which would be
> reason enough for me not to use your modified version; going through
> all of my code to check what effect this might have sounds like a
> nightmare.

I know this will be a sticky point. I'm not sure what to do exactly, but the current behavior and implementation make the semantics for slicing an array using a sequence problematic, since I don't see a way to represent a reference to a sequence of indices in the underlying structure of an array. So such slices would have to be copies and not references, which makes for inconsistent code.

> I see the point of having a copying version as well, but why not
> implement the copying behaviour as methods and leave indexing as it
> is?

I want to agree with you, but I think we may need to change the behavior eventually, so when is it going to happen?

> > 5) Handling of sliceobjects which consist of sequences of indices (so that
> > setting and getting elements of arrays using their index is possible).
>
> Sounds good as well...

This facility is already embedded in the underlying structure. My plan is to go with the original idea that Jim Hugunin and Chris Chase had for slice objects. The sliceobject in Python is already general enough for this to work.

> > 6) Rank-0 arrays will not be autoconverted to Python scalars, but will
> > still behave as Python scalars whenever Python allows general scalar-like
> > objects in its operations. Methods will allow user-controlled
> > conversion to the Python scalars.
>
> I suspect that full behaviour-compatibility with scalars is
> impossible, but I am willing to be proven wrong. For example, Python
> scalars are immutable, arrays aren't. This also means that rank-0
> arrays can't be used as keys in dictionaries.
>
> How do you plan to implement mixed arithmetic with scalars? If the
> return value is a rank-0 array, then a single library returning
> a rank-0 array somewhere could mess up a program well enough that
> debugging becomes a nightmare.

Mixed arithmetic in general is another sticky point. I went back and read the discussion of this point which occurred in 1995-1996. It was very interesting reading and a lot of points were made. Now we have several years of experience and we should apply what we've learned (of course we've all learned different things :-) ).

Konrad, you had a lot to say on this point 4 years ago. I've had a long discussion with a colleague who is starting to "get in" to Numerical Python, and he has really been annoyed with the current mixed arithmetic rules. They seem to try to outguess the user. The spacesaving concept helps, but it still seems like a hack to me.

I know there are several opinions, so I'll offer mine. We need simple rules that are easy to teach a newcomer. Right now the rule is fairly simple in that coercion always proceeds up. But mixed arithmetic with a float and a double does not produce something with genuinely double precision -- yet that's our rule. I think any automatic conversion should go the other way.

Konrad, 4 years ago you talked about unexpected losses of precision if this were allowed to happen, but I couldn't understand how.
To me, it is unexpected to have double precision arrays which are really only carrying single-precision results. My idea of the coercion hierarchy is shown below, with conversion always happening down when called for. The Python scalars get mapped to the "largest precision" in their category and then normal coercion rules take place. The casual user will never use single precision arrays and so will not even notice they are there unless they request them. If they do request them, they don't want them suddenly changing precision. That is my take anyway.

    Boolean
    Character
    Unsigned
        long
        int
        short
    Signed
        long
        int
        short
    Real
        /* long double */
        double
        float
    Complex
        /* __complex__ long double */
        __complex__ double
        __complex__ float
    Object

> > 7) Addition of attributes so that different users can configure aspects of
> > the math behavior, to their heart's content.
>
> You mean global attributes? That could be the end of universally
> usable library modules, supposing that people actually use them.

I thought I did, but I've changed my mind after reading the discussion in 1995. I don't like global attributes either, so I'm not going there.

> > If there is anyone interested in helping in this "unofficial branch
> > work", let me know and we'll see about setting up someplace to work.
>
> I don't have much time at the moment, but I could still help out with
> testing etc.

Konrad, you were very instrumental in getting NumPy off the ground in the first place and I will always appreciate your input.

From pauldubois at home.com  Tue Feb  8 12:56:11 2000
From: pauldubois at home.com (Paul F. Dubois)
Date: Tue, 8 Feb 2000 09:56:11 -0800
Subject: [Numpy-discussion] precision isn't just precision
In-Reply-To: 
Message-ID: 

Before we all rattle on too long about precision, I'd like to point out that selecting a precision actually carries two consequences in the context of computer languages:

1. Expected: the number of digits of accuracy in the representation of a floating point number.

2. Unexpected: the range of numbers that can be represented by this type.

Thus, to a scientist it is perfectly logical that if d is a double and f is a single, d * f has only single precision validity. Unfortunately, in a computer, if you hold this answer in a single, it may fail if the contents of d include numbers outside the single range, even if f is 1.0. Thus the rules in C and Fortran that coercion is UP had to do as much with range as with precision.

From pearu at ioc.ee  Tue Feb  8 14:46:16 2000
From: pearu at ioc.ee (Pearu Peterson)
Date: Tue, 8 Feb 2000 21:46:16 +0200 (EET)
Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup.
In-Reply-To: 
Message-ID: 

On Tue, 8 Feb 2000, Travis Oliphant wrote:

> I know there are several opinions, so I'll offer mine. We need
> simple rules that are easy to teach a newcomer. Right now the rule is
> fairly simple in that coercion always proceeds up. But mixed arithmetic
> with a float and a double does not produce something with genuinely double
> precision -- yet that's our rule. I think any automatic conversion should
> go the other way.

Remark: If you are consistent, then you say here that mixed arithmetic with an int and a float/double produces an int?! Right? (I hope that I am wrong.)

> Boolean
> Character
> Unsigned
>     long
>     int
>     short
> Signed
>     long
>     int
>     short

How about `/* long long */'? Is this left out intentionally?

> Real
>     /* long double */
>     double
>     float

Travis, while you are doing revision on NumPy, could you also estimate the degree of difficulty of introducing column-major order arrays?

Pearu
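For readers who haven't bumped into the coercion-up behavior being debated, a minimal sketch of the status quo (assuming a stock Numeric build of this era, without the spacesaver flag; 'f' is single precision, 'd' double):

    import Numeric

    a = Numeric.array([1.0, 2.0, 3.0], 'f')  # single precision array
    b = a + 1.0          # a Python float scalar is a C double...
    print b.typecode()   # ...so coercion up yields 'd';
                         # Travis's proposal would keep 'f' here

Pearu's remark probes the edge of the proposal: if conversion always goes "down" the hierarchy, an int mixed with a float would also have to go down, which is presumably not intended.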
From hinsen at cnrs-orleans.fr  Tue Feb  8 14:56:21 2000
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Tue, 8 Feb 2000 20:56:21 +0100
Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup.
In-Reply-To: (message from Travis Oliphant on Tue, 8 Feb 2000 11:38:26 -0600 (CST))
Message-ID: <200002081956.UAA03241@chinon.cnrs-orleans.fr>

> I know this will be a sticky point. I'm not sure what to do exactly, but
> the current behavior and implementation make the semantics for slicing an
> array using a sequence problematic, since I don't see a way to represent a

You are right there. But is it really necessary to extend the meaning of slices? Of course everyone wants the functionality of indexing with a sequence, but I'd be perfectly happy to have it implemented as a method. Indexing would remain as it is (by reference), and a new method would provide copying behaviour for element extraction and also permit more generalized sequence indices.

In addition to backwards compatibility, there is another argument for keeping indexing behaviour as it is: compatibility with other Python sequence types. If you have a list of lists, which in many ways behaves like a 2D array, and extract the third element (which is thus a list), then this data is shared with the full nested list.
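Konrad's sharing point in miniature, with plain Python lists:

    x = [[1, 2], [3, 4], [5, 6]]
    row = x[2]     # extracting an element does not copy it...
    row[0] = 99
    print x        # [[1, 2], [3, 4], [99, 6]] -- the change shows through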
> > How do you plan to implement mixed arithmetic with scalars? If the
> > return value is a rank-0 array, then a single library returning
> > a rank-0 array somewhere could mess up a program well enough that
> > debugging becomes a nightmare.
>
> Mixed arithmetic in general is another sticky point. I went back and read
> the discussion of this point which occurred in 1995-1996. It was very

What I meant was not mixed-precision arithmetic, but arithmetic in which one operand is a scalar and the other one a rank-0 array.

Which reminds me: rank-0 arrays are also incompatible with the nested-list view of arrays. The elements of a list of numbers are numbers, not number-like sequence objects.

But back to precision, which is also a popular subject:

> discussion with a colleague who is starting to "get in" to Numerical
> Python, and he has really been annoyed with the current mixed arithmetic
> rules. They seem to try to outguess the user. The spacesaving concept
> helps, but it still seems like a hack to me.

I wouldn't say that the current system tries to outguess the user. It simply gives precision a higher priority than memory space. That might not coincide with what a particular user wants, but it is consistent and easy to understand.

> I know there are several opinions, so I'll offer mine. We need
> simple rules that are easy to teach a newcomer. Right now the rule is
> fairly simple in that coercion always proceeds up. But mixed arithmetic

Like in Python (for scalars), C, Fortran, and all other languages that I can think of.

> Konrad, 4 years ago you talked about unexpected losses of precision if
> this were allowed to happen, but I couldn't understand how. To me, it is
> unexpected to have double precision arrays which are really only
> carrying single-precision results. My idea of the coercion hierarchy is

I think this is a confusion of two different meanings of "precision". In numerical algorithms, precision refers to the deviation between an ideal and a real numerical value. In programming languages, it refers to the *maximum* precision that can be stored in a given data type (and is in fact often combined with a difference in range).

The upcasting rule thus ensures that

1) No precision is lost accidentally. If you multiply a float by a double, the float might contain the exact number 2, and thus have infinite precision. The language can't know this, so it acts conservatively and chooses the "bigger" type.

2) No overflow occurs unless it is unavoidable (the range problem).

> The casual user will never use single precision arrays and so will not
> even notice they are there unless they request them. If they do request

There are many ways in which single-precision arrays can creep into a program without a user's attention. Suppose you send me some data in a pickled array, which happens to be single-precision. Or I call a library routine that does some internal calculation on huge data arrays, which it keeps at single precision, and (intentionally or by error) returns a single-precision result.

I think your "active flag" solution is a rather good solution to the casting problem, because it gives access to a different behaviour in a very explicit way. So unless future experience points out problems, I'd propose to keep it.

Konrad.
--
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen at cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------

From Barrett at stsci.edu  Tue Feb  8 15:10:39 2000
From: Barrett at stsci.edu (Paul Barrett)
Date: Tue, 8 Feb 2000 15:10:39 -0500 (EST)
Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup.
In-Reply-To: 
References: <14496.16890.698835.619131@nem-srvr.stsci.edu>
Message-ID: <14496.26037.829754.450187@nem-srvr.stsci.edu>

Travis Oliphant writes:
> >
> > 1) The re-use of temporary arrays -- to conserve memory.
>
> Please elaborate about this request.

When Python evaluates the expression:

>>> Y = B*X + A

where A, B, X, and Y are all arrays, B*X creates a temporary array, T. A new array, Y, will be created to hold the result of T + A, and T will be deleted. If T and Y have the same shape and typecode, then instead of creating Y, T can be re-used to conserve memory.
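Some of this saving is already available by hand, at the cost of readability. A sketch, assuming the optional output-array argument that Numeric's ufuncs accept:

    import Numeric

    A = Numeric.ones((5,), 'd')
    B = Numeric.ones((5,), 'd')
    X = Numeric.ones((5,), 'd')

    T = B * X               # one temporary, created explicitly...
    Numeric.add(T, A, T)    # ...then reused as the output array
    Y = T                   # Y = B*X + A with a single allocation

The proposal is for the interpreter to perform this reuse automatically when shapes and typecodes match.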
> >
> > 2) A copy-on-write option -- to enhance performance.
>
> I need more explanation of this as well.

This would be an advanced feature of arrays that use memory-mapping or access their arrays from disk. It is similar to the secondary cache of a CPU. The data is held in memory until a write request is made.

> >
> > 3) The initialization of arrays by default -- to help novices.
>
> What kind of initialization are you talking about (we have zeros and ones
> and random already).

For mixed-type (or object) arrays containing strings, zeros() and ones() would be confusing. Therefore, by default, integer and floating types would be initialized to 0 and string types to ' ', and the option would be available to not initialize the array, for performance.

> >
> > 4) The creation of a standard API -- which I guess is assumed, if it
> > is to be part of the Python standard distribution.
>
> Any suggestions as to what needs to be changed in the already somewhat
> standard API.

No, not exactly. But the last time I looked, I thought some improvements could be made to it.

> >
> > 5) The inclusion of IEEE support.
>
> This was supposed to be there from the beginning, but it didn't get
> finished. Jim's original idea was to have two math modules, one which
> checked and gave errors for 1/0 and another that returned IEEE inf for
> 1/0.
>
> The current umath does both with different types, which is annoying.

When I last spoke to Jim about this at IPC6, I was under the impression that IEEE support was not fully implemented and much work still needed to be done. Has this situation changed since then?

> >
> > And
> >
> > 6) Enhanced support for mixed-types or objects.
> >
> > This last issue is very important to me and the astronomical community,
> > since we routinely store data as (multi-dimensional) arrays of fixed
> > length records or C-structures. A current deficiency of NumPy is that
> > the object typecode does not work with the fromstring() method, so
> > importing arrays of records from a binary file is just not possible.
> > I've been developing my own C-extension type to handle this situation
> > and have come to realize that my record type is really just a
> > generalization of NumPy's types.
>
> I would like to see the code for your generalized type, which would help me
> see if there were some relatively painless way the two could be merged.

recordmodule.c is part of my PyFITS module for dealing with FITS files. You can find it here:

ftp://ra.stsci.edu/pub/barrett/PyFITS_0.3.tgz

I use NumPy to access fixed-type arrays and the record type for accessing mixed-type arrays. A common example is accessing the second element of a mixed-type (i.e. an object) from the entire array. This returns a record type with a single element, which is equivalent to a NumPy array of fixed type. Therefore users expect this object to be a NumPy array, and it isn't. They have to convert it to one.

> > two C-extension types merged. I think this enhancement can be done
> > with minimal change to the current NumPy behavior and minor changes to
> > the typecode system.
>
> If you already see how to do it, then great.

Note that NumPy already has some support for an Object type. It has been proposed that it be removed, because it is not well supported and hence few people use it. I have the contrary opinion and feel we should enhance the Object type and make it much more usable. If you don't need it, then you don't have to use it. This enhancement really shouldn't get in the way of those who only use fixed-type arrays.

So what changes to NumPy are needed?

1) Instead of a typecode (or in addition to the typecode for backward compatibility), I suggest an optional format keyword, which can be used to specify the mixed-type or object format. Namely, format = 'i, f, s10', where 'i' is an integer type, 'f' a floating point type, and s10 is a string of 10 characters.

2) Array access will be the same as it is now. For example:

    # Create a 10x10 mixed-type array.
    A = array((10, 10), format = 'i, f, s10')

    # Create a 10x10 fixed-type array.
    B = array((10, 10), typecode = 'i')

    # Print a 5x5 subarray of mixed-type.
    print A[:5,:5]

    # Print a 5x5 subarray of fixed-type.
    print B[:5,:5]
    # Or
    # (Note that the 3rd index is optional for fixed-type arrays; it
    # always defaults to 0.)
    print B[:5,:5,0]

    # Print the second element of the mixed-type of the entire array.
    # Note that this is now an array of fixed-type.
    print A[:,:,1]

The major thorn that I see at this point is how to reconcile the behavior of numbers and strings during operations. But I don't see this as an intractable problem.
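Until something like this exists, the only pure-Python route to such records is the struct module, one record at a time -- workable, but no substitute for an array type. A sketch (the field values are made up, and the struct format string 'if10s' stands in for the proposed 'i, f, s10'):

    import struct

    # One record: a C int, a C float, and a 10-byte string field.
    fmt = 'if10s'
    rec = struct.pack(fmt, 42, 3.25, 'NGC1365')
    print struct.unpack(fmt, rec)   # (42, 3.25, 'NGC1365\x00\x00\x00')

Reading a whole binary file of such records this way is exactly the loop that fromstring() support for mixed types would eliminate.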
I actually believe this enhancement will encourage us to create a better and more generic multi-dimensional array module by concentrating on the behavioral aspects of this extension type. Note that J, which NumPy is based upon, allows such mixed types.

--
Dr. Paul Barrett        Space Telescope Science Institute
Phone: 410-516-6714     DESD/DPT
FAX:   410-516-8615     Baltimore, MD 21218

From pauldubois at home.com  Tue Feb  8 15:16:55 2000
From: pauldubois at home.com (Paul F. Dubois)
Date: Tue, 8 Feb 2000 12:16:55 -0800
Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup.
In-Reply-To: <200002081956.UAA03241@chinon.cnrs-orleans.fr>
Message-ID: 

Konrad wrote:
>
> In addition to backwards compatibility, there is another argument for
> keeping indexing behaviour as it is: compatibility with other Python
> sequence types.

I claim the current Numeric is INconsistent with other Python sequence types:

>>> x = [1, 2, 3, 4, 5]
>>> y = x[2:5]
>>> x
[1, 2, 3, 4, 5]
>>> y
[3, 4, 5]
>>> y[1] = 7
>>> y
[3, 7, 5]
>>> x
[1, 2, 3, 4, 5]

So, y is a copy of x[2:5], not a reference.

From DavidA at ActiveState.com  Tue Feb  8 15:30:23 2000
From: DavidA at ActiveState.com (David Ascher)
Date: Tue, 8 Feb 2000 12:30:23 -0800
Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup.
In-Reply-To: <14496.26037.829754.450187@nem-srvr.stsci.edu>
Message-ID: <000101bf7273$531e3f80$c355cfc0@ski.org>

> So what changes to NumPy are needed?
>
> 1) Instead of a typecode (or in addition to the typecode for backward
> compatibility), I suggest an optional format keyword, which can be
> used to specify the mixed-type or object format. Namely, format =
> 'i, f, s10', where 'i' is an integer type, 'f' a floating point
> type, and s10 is a string of 10 characters.

I'd suggest going all the way and making it a real object, not just a string. That object can then have useful attributes, like size in bytes, maxval, minval, some indication of precision, etc. Logically, itemsize should be an attribute of the numeric type of an array, not of the array itself.

--david ascher

From beausol at exch.hpl.hp.com  Tue Feb  8 16:31:30 2000
From: beausol at exch.hpl.hp.com (Beausoleil, Raymond)
Date: Tue, 8 Feb 2000 13:31:30 -0800
Subject: [Numpy-discussion] RE: [Matrix-SIG] An Experiment in code-cleanup.
Message-ID: <34E36C05935CD311AE5000A0C9B6B0BF07D16D@hplex3.hpl.hp.com>

I've been reading the posts on this topic with considerable interest. For a moment, I want to emphasize the "code-cleanup" angle more literally than the functionality mods suggested so far.

Several months ago, I hacked my personal copy of the NumPy distribution so that I could use the Intel Math Kernel Library for Windows. The IMKL is
(1) freely available from Intel at http://developer.intel.com/vtune/perflibst/mkl/index.htm;
(2) basically BLAS and LAPACK, with an FFT or two thrown in for good measure;
(3) optimized for the different x86 processors (e.g., generic x86, Pentium II & III);
(4) configured to use 1, 2, or 4 processors; and
(5) configured to use multithreading.
It is an impressive, fast implementation. I'm sure there are similar native libraries available on other platforms.

Probably due to my inexperience with both Python and NumPy, it took me a couple of days to successfully tear out the f2c'd stuff and get the IMKL linking correctly. The parts I've used work fine, but there are probably other features that I haven't tested yet that still aren't up to snuff. In any case, the resulting code wasn't very pretty.
As long as the NumPy code is going to be commented and cleaned up, I'd be glad to help make sure that the process of using a native BLAS/LAPACK distribution (which was probably compiled using Fortran storage and naming conventions) is more straightforward. Among the more tedious issues to consider are:

(1) The extent of the support for LAPACK. Do we want to stick with LAPACK Lite?

(2) The storage format. If we've still got row-ordered matrices under the hood, and we want to use native LAPACK libraries that were compiled using column-major format, then we'll have to be careful to set all of the flags correctly. This isn't going to be a big deal, _unless_ NumPy will support more of LAPACK when a native library is available. Then, of course, there are the special cases: the IMKL has both a C and a Fortran interface to the BLAS.

(3) Through the judicious use of header files with compiler-dependent flags, we could accommodate the various naming conventions used when the FORTRAN libraries were compiled (e.g., sgetrf_ or SGETRF).

The primary output of this effort would be an expansion of the "Compilation Notes" subsection of Section 15 of the NumPy documentation, and some header files that made the recompilation easier than it is now.

Regards,

Ray

============================
Ray Beausoleil
Hewlett-Packard Laboratories
mailto:beausol at hpl.hp.com
Vox: 425-883-6648
Fax: 425-883-2535
HP Telnet: 957-4951
============================

From Oliphant.Travis at mayo.edu  Tue Feb  8 16:32:57 2000
From: Oliphant.Travis at mayo.edu (Travis Oliphant)
Date: Tue, 8 Feb 2000 15:32:57 -0600 (CST)
Subject: [Numpy-discussion] Come take an informal survey.
In-Reply-To: <200002082004.MAA26529@lists.sourceforge.net>
Message-ID: 

In an effort to get data about what users' attitudes are toward Numerical Python, I'm conducting a survey at sourceforge.net.

If you would like to participate in the survey, please go to http://www.sourceforge.net, log in with your sourceforge id, and go to the numpy page:

http://sourceforge.net/project/?group_id=1369

In the Public Survey section there is a short survey you can fill out.

Thank you,

Travis Oliphant
NumPy Developer

From phil at geog.ubc.ca  Tue Feb  8 18:33:18 2000
From: phil at geog.ubc.ca (Phil Austin)
Date: Tue, 8 Feb 2000 15:33:18 -0800 (PST)
Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup.
In-Reply-To: 
References: 
Message-ID: <14496.42942.5355.849670@brant.geog.ubc.ca>

Travis Oliphant writes:
>
> 3) Facility for memory-mapped dataspace in arrays.
>

For the NumPy users who are as ignorant about mmap, msync, and madvise as I am, I've put a couple of documents on my web site:

1) http://www.geog.ubc.ca/~phil/mmap/mmap.pdf

A pdf version of Kevin Sheehan's paper "Why aren't you using mmap yet?" (19-page Frame postscript original, page order back to front). He gives a good discussion of the SV4 VM model, with some mmap examples in C.

2) http://www.geog.ubc.ca/~phil/mmap/threads.html

An archived email exchange (initially on the F90 mailing list) between Kevin (who is an independent Solaris consultant) and Brian Sumner (SGI) about the pros and cons of using mmap.

Executive summary:

i) mmap on Solaris can be a very big win (see the bottom of http://www.geog.ubc.ca/~phil/mmap/msg00003.html) when used in combination with WILLNEED/WONTNEED madvise calls to guide the page prefetching.
ii) IRIX and some other Unices (Linux 2.2 in particular) haven't implemented madvise, and naive use of mmap without madvise can produce lots of page faulting and much slower I/O than, say, asynchronous I/O calls on IRIX. (http://www.geog.ubc.ca/~phil/mmap/msg00009.html)

So I'd love to see mmap in Numpy, but we may need to produce a tutorial outlining the tradeoffs, and giving some examples of madvise/msync/mmap used together (with a few benchmarks). Any mmap module would need to include member functions that call madvise/msync for the mmapped array (but these may be no-ops on several popular OSes).

Regards, Phil

From jrwebb at goodnet.com  Tue Feb  8 01:03:42 2000
From: jrwebb at goodnet.com (James R. Webb)
Date: Mon, 7 Feb 2000 23:03:42 -0700
Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup.
References: <34E36C05935CD311AE5000A0C9B6B0BF07D16D@hplex3.hpl.hp.com>
Message-ID: <001801bf71fa$41f681a0$01f936d1@janus>

There is now a native Linux BLAS available through links at http://www.cs.utk.edu/~ghenry/distrib/, courtesy of the ASCI Option Red Project.

There is also ATLAS (http://www.netlib.org/atlas/).

Either library seems to link into NumPy without a hitch.

----- Original Message -----
From: "Beausoleil, Raymond"
To: 
Cc: 
Sent: Tuesday, February 08, 2000 2:31 PM
Subject: RE: [Matrix-SIG] An Experiment in code-cleanup.

> I've been reading the posts on this topic with considerable interest. For a
> moment, I want to emphasize the "code-cleanup" angle more literally than the
> functionality mods suggested so far.
>
> Several months ago, I hacked my personal copy of the NumPy distribution so
> that I could use the Intel Math Kernel Library for Windows. The IMKL is
> (1) freely available from Intel at
> http://developer.intel.com/vtune/perflibst/mkl/index.htm;
> (2) basically BLAS and LAPACK, with an FFT or two thrown in for good
> measure;
> (3) optimized for the different x86 processors (e.g., generic x86, Pentium
> II & III);
> (4) configured to use 1, 2, or 4 processors; and
> (5) configured to use multithreading.
> It is an impressive, fast implementation. I'm sure there are similar native
> libraries available on other platforms.
>
> Probably due to my inexperience with both Python and NumPy, it took me a
> couple of days to successfully tear out the f2c'd stuff and get the IMKL
> linking correctly. The parts I've used work fine, but there are probably
> other features that I haven't tested yet that still aren't up to snuff. In
> any case, the resulting code wasn't very pretty.
>
> As long as the NumPy code is going to be commented and cleaned up, I'd be
> glad to help make sure that the process of using a native BLAS/LAPACK
> distribution (which was probably compiled using Fortran storage and naming
> conventions) is more straightforward. Among the more tedious issues to
> consider are:
> (1) The extent of the support for LAPACK. Do we want to stick with LAPACK
> Lite?
> (2) The storage format. If we've still got row-ordered matrices under the
> hood, and we want to use native LAPACK libraries that were compiled using
> column-major format, then we'll have to be careful to set all of the flags
> correctly. This isn't going to be a big deal, _unless_ NumPy will support
> more of LAPACK when a native library is available. Then, of course, there
> are the special cases: the IMKL has both a C and a Fortran interface to the
> BLAS.
> (3) Through the judicious use of header files with compiler-dependent flags,
> we could accommodate the various naming conventions used when the FORTRAN
> libraries were compiled (e.g., sgetrf_ or SGETRF).
>
> The primary output of this effort would be an expansion of the "Compilation
> Notes" subsection of Section 15 of the NumPy documentation, and some header
> files that made the recompilation easier than it is now.
>
> Regards,
>
> Ray
>
> ============================
> Ray Beausoleil
> Hewlett-Packard Laboratories
> mailto:beausol at hpl.hp.com
> Vox: 425-883-6648
> Fax: 425-883-2535
> HP Telnet: 957-4951
> ============================
>
> _______________________________________________
> Matrix-SIG maillist - Matrix-SIG at python.org
> http://www.python.org/mailman/listinfo/matrix-sig

From amullhau at zen-pharaohs.com  Wed Feb  9 01:51:09 2000
From: amullhau at zen-pharaohs.com (Andrew P. Mullhaupt)
Date: Wed, 9 Feb 2000 01:51:09 -0500
Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup.
References: <200002081956.UAA03241@chinon.cnrs-orleans.fr>
Message-ID: <03f401bf72ca$0e0608e0$5063cb0a@amullhau>

> I'd be perfectly happy to have it implemented as a
> method. Indexing would remain as it is (by reference), and a new
> method would provide copying behaviour for element extraction and also
> permit more generalized sequence indices.

I think I can live with that, as long as it _syntactically_ looks like indexing. This is one case where the syntax is more important than the functionality. There are things you want to index with indices, etc., and composition with parenthesis-like (Dyck language) syntax has proved to be one of the few readable ways to do it.

> In addition to backwards compatibility, there is another argument for
> keeping indexing behaviour as it is: compatibility with other Python
> sequence types. If you have a list of lists, which in many ways
> behaves like a 2D array, and extract the third element (which is thus
> a list), then this data is shared with the full nested list.

_Avoiding_ data sharing will eventually be more important than supporting data sharing, since memory continues to get cheaper but memory bandwidth and latency do not improve at the same rate. Locality of reference is hard to control when there is a lot of default data sharing, and performance suffers, yet it becomes important on more and more scales as memory systems become more and more hierarchical. Ultimately, the _semantics_ we like will be implemented efficiently by code which copies and references as it sees fit, keeping track of which copies are "really" references and which references are really "copies". I've thought this through for the "everything gets copied" languages and it isn't too mentally distressing - you simply reference count the fake copies. The "everything is a reference" languages are less clean, but the database people have confronted that problem.

> Which reminds me: rank-0 arrays are also incompatible with the
> nested-list view of arrays.

There are ways out of that trap. Most post-ISO APLs provide examples of how to cope.

> > I know there are several opinions, so I'll offer mine. We need
> > simple rules that are easy to teach a newcomer. Right now the rule is
> > fairly simple in that coercion always proceeds up. But mixed arithmetic
>
> Like in Python (for scalars), C, Fortran, and all other languages that
> I can think of.

And that is not a bad thing. But which way is "up"? (See example below.)
> > Konrad, 4 years ago you talked about unexpected losses of precision if
> > this were allowed to happen, but I couldn't understand how. To me, it is
> > unexpected to have double precision arrays which are really only
> > carrying single-precision results.

Most people always hate, and only sometimes detect, when that happens. It specifically contravenes the Geneva conventions on programming mental hygiene.

> The upcasting rule thus ensures that
>
> 1) No precision is lost accidentally.

More or less.

More precisely, it depends on what you call an accident. What happens when you add the IEEE single precision floating point value 1.0 to the 32-bit integer 2^30? A _lot_ of people don't expect to get the IEEE single precision floating point value 2.0^30, but that is what happens in some languages. Is that an "upcast"? Would the 32-bit integer 2^30 make more sense? Now what about the case where the 32-bit integer is signed and adding one to it will "wrap around" if the value remains an integer? Because these two examples might make double precision or a wider integer (if available) seem the correct answer, suppose it's only one element of a gigantic array? Let's now talk about complex values....

There are plenty of rough edges like this when you mix numerical types. It's guaranteed that everybody's ox will get gored somewhere.

> 2) No overflow occurs unless it is unavoidable (the range problem).
>
> > The casual user will never use single precision arrays and so will not
> > even notice they are there unless they request them. If they do request
>
> There are many ways in which single-precision arrays can creep into a
> program without a user's attention.

Absolutely.

> Suppose you send me some data in a
> pickled array, which happens to be single-precision. Or I call a
> library routine that does some internal calculation on huge data
> arrays, which it keeps at single precision, and (intentionally or by
> error) returns a single-precision result.

And the worst one is when the accuracy of the result is single precision, but the _type_ of the result is double precision. There is a function in S-plus which does this (without documenting it, of course), and man, was that a pain in the neck to sort out. Today I found another bug in one of the S-plus functions - it turns out that if you hand a complex triangular matrix and a real right hand side to the triangular solver (backsubstitution), it doesn't cast the right hand side to complex, and it uses whatever values are subsequent in memory to the right hand side as if they were part of the vector. Obviously, when testing the function, they didn't try this mixed-type case.

But interpreters are really convenient for writing code so that you _don't_ have to think about types all the time and do your own casting. Which is why stubbing your head on an unexpected cast is so unlooked for.

> I think your "active flag" solution is a rather good solution to the
> casting problem, because it gives access to a different behaviour in a
> very explicit way. So unless future experience points out problems,
> I'd propose to keep it.

Is there a simple way to ensure that no active arrays are ever activated at any time when I use Numerical Python?

Later,
Andrew Mullhaupt
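Andrew's 1.0 + 2^30 surprise is easy to reproduce in plain Python by round-tripping the sum through an IEEE single -- a minimal sketch using only the struct module:

    import struct

    x = float(2**30) + 1.0     # exact as a C double (53-bit mantissa)
    # Store it in an IEEE single (24-bit mantissa) and read it back:
    y = struct.unpack('f', struct.pack('f', x))[0]
    print x, y                 # 1073741825.0 1073741824.0 -- the 1.0 is gone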
From amullhau at zen-pharaohs.com  Wed Feb  9 02:17:39 2000
From: amullhau at zen-pharaohs.com (Andrew P. Mullhaupt)
Date: Wed, 9 Feb 2000 02:17:39 -0500
Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup.
References: <14496.42942.5355.849670@brant.geog.ubc.ca>
Message-ID: <03fe01bf72cd$c040d640$5063cb0a@amullhau>

> Travis Oliphant writes:
> >
> > 3) Facility for memory-mapped dataspace in arrays.
>
> For the NumPy users who are as ignorant about mmap, msync,
> and madvise as I am, I've put a couple of documents on
> my web site:

I have Kevin's "Why Aren't You Using mmap() Yet?" on my site. Kevin is working on a new edition (11th anniversary? 1xth anniversary?). By the way, Uresh Vahalia's book on Unix internals is a very good read for anyone not yet familiar with modern operating systems, especially Unices. Kevin is extremely knowledgeable on this subject, and several others.

> Executive summary:
>
> i) mmap on Solaris can be a very big win

Orders of magnitude.

> (see bottom of
> http://www.geog.ubc.ca/~phil/mmap/msg00003.html) when
> used in combination with WILLNEED/WONTNEED madvise calls to
> guide the page prefetching.

And with the newer versions of Solaris, madvise() is a good way to go. madvise is _not_ SVR4 (not in SVID3), but it _is_ in the OSF/1 AES, which means it is _not_ vendor specific. The standard part of madvise is that it is a "hint"; everything it actually _does_ when you hint the kernel with madvise is usually specific to certain versions of an operating system. There are tricks to get around madvise not doing everything you want. (WONTNEED didn't work in Solaris for a long time; Kevin found a trick that worked really well instead. Kevin knows people at Sun, since he was one of the very earliest employees there, and the trick Kevin used to suggest has since been found to be the implementation of WONTNEED in Solaris.)

And that trick is well worth understanding. It happens that msync() is a good call to know. It has an undocumented behavior on Solaris: when you msync a memory region with MS_INVALIDATE | MS_ASYNC, the dirty pages are queued for writing and backing store is available immediately, or, if dirty, as soon as written out. This means that the pager doesn't have to run at all to scavenge the pages. Linux didn't do this last time I looked. I suggested it to the kernel guys and the idea got some positive response, but I don't know if they did it.

> ii) IRIX and some other Unices (Linux 2.2 in particular), haven't
> implemented madvise, and naive use of mmap without madvise can produce
> lots of page faulting and much slower io than, say, asynchronous io
> calls on IRIX. (http://www.geog.ubc.ca/~phil/mmap/msg00009.html)

IRIX has an awful implementation of mmap, and SGI people go around badmouthing mmap; not that they don't have cause, but they are usually very surprised to see how big the win is with a good implementation. Of course, the msync() trick doesn't work on IRIX last I looked, which leads the SGI people to believe that mmap() is brain damaged because it runs the pager into the ground. It's a point of view that is bound to come up.

HP/UX was really whacked last time I looked. They had a version (10) which supported the full mmap() on one series of workstations (700, 7000, I forget, let's say 7e+?) and didn't support it, except in the non-useful SVR3.2 way, on another series of workstations (8e+?). The reason was that the 8e+? workstations were multiprocessor and they hadn't figured out how to get the newer kernel flying on the multiprocessors. I know Konrad had HP systems at one point; maybe he has the scoop on those.
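For the curious, one can already get partway there by hand. A sketch, assuming a POSIX system, a Python recent enough to ship the mmap module, and a hypothetical file data.bin full of C doubles (note that fromstring() still copies -- a true memory-mapped array, Travis's point 3, would not):

    import mmap, os
    import Numeric

    fd = os.open('data.bin', os.O_RDONLY)
    nbytes = os.fstat(fd)[6]    # stat tuple index 6 is st_size
    m = mmap.mmap(fd, nbytes, mmap.MAP_SHARED, mmap.PROT_READ)
    a = Numeric.fromstring(m[:], 'd')   # the copy a mapped array would avoid
    print a.shape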
> So I'd love to see mmap in Numpy, but we may need to produce a
> tutorial outlining the tradeoffs, and giving some examples of
> madvise/msync/mmap used together (with a few benchmarks). Any mmap
> module would need to include member functions that call madvise/msync
> for the mmapped array (but these may be no-ops on several popular OSes.)

I don't know if you want a separate module; maybe what you want is for the normal allocation of memory for all Numerical Python objects to be handled in a way that makes sense for each operating system. The approach I took when I was writing portable code for this sort of thing was to write a wrapper for the memory operation semantics and then implement the operations as a small library that would be OS specific, although not _that_ specific. It was possible to write single source code for SVID3 and OSF/AES1 systems with sparing use of conditional defines.

Unfortunately, that code is the intellectual property of another firm, or else I'd donate it as an example for people who want to learn stuff about mmap. As it stands, there was some similar code I was able to produce at some point. I forget who here has a copy, maybe Konrad, maybe David Ascher.

Later,
Andrew Mullhaupt

From skaller at maxtal.com.au  Wed Feb  9 11:12:49 2000
From: skaller at maxtal.com.au (skaller)
Date: Thu, 10 Feb 2000 03:12:49 +1100
Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup.
References: <200002081956.UAA03241@chinon.cnrs-orleans.fr>
Message-ID: <38A19201.8A43EC01@maxtal.com.au>

Konrad Hinsen wrote:

> But back to precision, which is also a popular subject:

but one which even numerical programmers don't seem to understand ...

> The upcasting rule thus ensures that
>
> 1) No precision is lost accidentally. If you multiply a float by
> a double, the float might contain the exact number 2, and thus
> have infinite precision. The language can't know this, so it
> acts conservatively and chooses the "bigger" type.
>
> 2) No overflow occurs unless it is unavoidable (the range problem).

.. which is all wrong. It is NOT safe to convert floating point from a lower to a higher number of bits. ALL such conversions should be removed for this reason: any conversions should have to be explicit.

The reason is that whether a conversion to a larger number is safe or not is context dependent (and so it should NEVER be done silently). Consider a function:

    k0 = 100
    k = 99
    while k < k0:
        ...
        k0 = k
        k = ...

which refines a calculation until the measure k stops decreasing. This algorithm may terminate when k is a float, but _fail_ when k is a double -- the extra precision may cause the algorithm
-- John (Max) Skaller, mailto:skaller at maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia voice: 61-2-9660-0850 homepage: http://www.maxtal.com.au/~skaller download: ftp://ftp.cs.usyd.edu/au/jskaller From gpk at bell-labs.com Wed Feb 9 11:23:47 2000 From: gpk at bell-labs.com (Greg Kochanski) Date: Wed, 09 Feb 2000 11:23:47 -0500 Subject: [Numpy-discussion] Re: Numpy-discussion digest, Vol 1 #10 - 10 msgs References: <200002091617.IAA28931@lists.sourceforge.net> Message-ID: <38A19493.6E2394E6@bell-labs.com> > From: "Andrew P. Mullhaupt" > > The upcasting rule thus ensures that > > > > 1) No precision is lost accidentally. > > More or less. > > More precisely, it depends on what you call an accident. What happens when > you add the IEEE single precision floating point value 1.0 to the 32-bit > integer 2^30? A _lot_ of people don't expect to get the IEEE single > precision floating point value 2.0^30, but that is what happens in some > languages. Is that an "upcast"? Would the 32 bit integer 2^30 make more > sense? Now what about the case where the 32 bit integer is signed and adding > one to it will "wrap around" if the value remains an integer? Because these > two examples might make double precision or a wider integer (if available) > seem the correct answer, suppose it's only one element of a gigantic array? > Let's now talk about complex values.... > It's most important that the rules be simple, and (preferably) close to common languages. I'd suggest C. In my book, anyone who carelessly mixes floats and ints deserves whatever punishment the language metes out. I've done numeric work in languages where casting was by request _only_ (e.g., Limbo, for Inferno), and I found, to my surprise, that automatic type casting these type casting is only a mild convenience. Writing code with manual typecasting is surprisingly easy. Since automatic typecasting only buys a small improvement in ease of use, I'd want to be extremely sure that it doesn't cause many problems. It's very easy to write some complicated set of rules that wastes more time (in the form of unexpected, untraceable bugs) than it saves. By the way, automatic downcasting has a hidden problems if python is ever set to trap underflow errors. I had a program that would randomly crash every 10th (or so) time I ran it with a large dataset (1000x1000 linear algebra). After days of hair-pulling, I found that the matrix was being converted from double to float at one step, and about 1 in 10,000,000 of the entries was too small to represent as a single precision number. That very rare event would underflow, be trapped, and crash the program with a floating point exception. From hinsen at cnrs-orleans.fr Wed Feb 9 12:17:30 2000 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Wed, 9 Feb 2000 18:17:30 +0100 Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup. In-Reply-To: <38A19201.8A43EC01@maxtal.com.au> (message from skaller on Thu, 10 Feb 2000 03:12:49 +1100) References: <200002081956.UAA03241@chinon.cnrs-orleans.fr> <38A19201.8A43EC01@maxtal.com.au> Message-ID: <200002091717.SAA10604@chinon.cnrs-orleans.fr> > silently). Consider a function > > k0 = 100 > k = 99 > while k < k0: > .. > k0 = k > k = ... > > which refines a calculation until the measure k stops decreasing. > This algorithm may terminate when k is a float, but _fail_ when > k is a double -- the extra precision may cause the algorithm I'd call this a buggy implementation. 
Convergence criteria should be explicit and not rely on the internal representation of data types. Neither Python nor C guarantees you any absolute bounds for precision and value range, and even languages that do (such as Fortran 9x) only promise to give you a data type that is *at least* as big as your specification. > programming is all about. Numerical programmers need to know > how big numbers are, and how much significance they have, > and optimise calculations accordingly -- sometimes by _using_ > the precision of the working types to advantage. If you care at all about portability, you shouldn't even think about this. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From amullhau at zen-pharaohs.com Wed Feb 9 12:21:37 2000 From: amullhau at zen-pharaohs.com (Andrew P. Mullhaupt) Date: Wed, 9 Feb 2000 12:21:37 -0500 Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup. References: <200002081956.UAA03241@chinon.cnrs-orleans.fr> <38A19201.8A43EC01@maxtal.com.au> Message-ID: <054301bf7322$23cbbbe0$5063cb0a@amullhau> > Konrad Hinsen wrote: > > > But back to precision, which is also a popular subject: > > but one which even numerical programmers don't seem to > understand ... Some do, some don't. > It is NOT safe to convert floating point from a lower to a higher > number > of bits. It is usually safe. Extremely safe. Safe enough that code in which it is _not_ safe is badly designed. > ALL such conversions should be removed for this reason: any > conversions should have to be explicit. I really hope not. A generic function with six different arguments becomes an interesting object in a language without automatic conversions. Usually, a little table driven piece of code has to cast the arguments into conformance, and then multiple versions of similar code are applied. > which refines a calculation until the measure k stops decreasing. > This algorithm may terminate when k is a float, but _fail_ when > k is a double -- the extra precision may cause the algorithm > to perform many useless iterations, in which the precision > of the result is in fact _lost_ due to rounding error. This is a classic bad programming practice and _it_ is what should be eliminated. It is a good, (and required, if you work for me), practice that: 1. All iterations should have termination conditions which are correct; that is, prevent extra iterations. This is typically precision sensitive. But that is simply something that has to be taken into account when writing the termination condition. 2. All iterations should be protected against an unexpectedly large number of iterates taking place. There are examples of iterations which are intrinsically stable in lower precision and not in higher precision (Brun's algorithm) but those are quite rare in practice. (Note that the Fergueson-Forcade algorithm, as implemented by Lenstra, Odlyzko, and others, has completely supplanted any need to use Brun's algorithm as well.) When an algorithm converges because of lack of precision, it is because the rounding error regularizes the problem. This is normally referred to in the trade as "idiot regularization". 
It is in my experience, invariably better to actually choose a regularization that is specific to the computation than to rely on rounding effects which might be different from machine to machine. In particular, your type of example is in for serious programmer enjoyment hours on Intel or AMD machines, which have 80 bit wide registers for all the floating point arithmetic. Supporting needless machine dependency is not something to argue for, either, since the Cray style floating point arithmetic has a bad error model. Even Cray has been beaten into submission on this, finally releasing IEEE compliant processors, but only just recently. > to put this another way, it is generally bad to keep more digits (bits) > or precision than you actually have I couldn't agree less. The exponential function and inner product accumulation are famous examples of why extra bits are important in intermediate computations. It's almost impossible to have an accurate exponential function without using extra precision - which is one reason why so many machines have extra bits in their FPUs and there is an IEEE "extended" precision type. The storage history effects which result from temporarily increased precision are well understood, mild in that they violate no common error models used in numerical analysis. And for those few cases where testing for equality is needed for debugging purposes, many systems permit you to impose truncation and eliminate storage history issues. Later, Andrew Mullhaupt From hinsen at cnrs-orleans.fr Wed Feb 9 12:31:00 2000 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Wed, 9 Feb 2000 18:31:00 +0100 Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup. In-Reply-To: References: Message-ID: <200002091731.SAA10614@chinon.cnrs-orleans.fr> > > In addition to backwards compatibility, there is another argument for > > keeping indexing behaviour as it is: compatibility with other Python > > sequence types. > > I claim the current Numeric is INconsistent with other Python sequence > types: > > >>> x = [1, 2, 3, 4, 5] > >>> y = x[2:5] > >>> x > [1, 2, 3, 4, 5] > >>> y > [3, 4, 5] > >>> y[1] = 7 > >>> y > [3, 7, 5] > >>> x > [1, 2, 3, 4, 5] > > So, y is a copy of x[2:5], not a reference. Good point. So we can't be consistent with all properties of other Python sequence types. Which reminds me of some very different compatibility problem in NumPy that can (and should) be removed: the rules for integer division and remainders for negative arguments are not the same. NumPy inherits the C rules, Python has its own. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From gpk at bell-labs.com Wed Feb 9 13:25:00 2000 From: gpk at bell-labs.com (Greg Kochanski) Date: Wed, 09 Feb 2000 13:25:00 -0500 Subject: [Numpy-discussion] Re: [Matrix-SIG] Re: Matrix-SIG digest, Vol 1 #364 - 9 msgs References: <20000209064911.9404C1CD3C@dinsdale.python.org> <38A19236.51D1879F@bell-labs.com> <053b01bf731f$30118c20$5063cb0a@amullhau> Message-ID: <38A1B0FC.168FDCB5@bell-labs.com> "Andrew P. Mullhaupt" wrote: > > It's most important that the rules be simple, and (preferably) close > > to common languages. I'd suggest C. 
> > That is a good example of a language which has a pretty weird history on > this particular matter. True. The only real advantage of C is that so many people are used to it. Don't forget the human element. FORTRAN would also be a reasonable choice. There's a big cost to learning a new language; if it gets too big, people simply won't use Python. > > Since automatic typecasting only buys a small improvement > > in ease of use, I'd want to be extremely sure that it doesn't cause > > many problems. > > Au contraire. It is a huge win. Try writing a "generic" function with six > arguments which can sensibly be integers, or single or double precision > variables. If you have to test variables to see what they are, then you have > to essentially write a table driven typecaster. If, as in Fortran, you have > to write different functions for different argument types then you have the > dangerous programming practice of having several different pieces of code > which do essentially the same computation. While that's nice to say, it doesn't really translate completely to practice. A lot of functions don't make sense with arbitrary objects; and some require subtle changes. For instance, who wants a matrix inversion function that operates on integers, using integer division inside? Lots of functions have DOUBLE_EPS or FLOAT_EPS embedded inside them. One has to change the small number when you change the data type. I'll grant you that running things with both doubles or floats is often useful. I'd be happy with automatic upcasting among them. I'd be moderately happy with upcasting among the integers. I really don't see any crying need to mix integers with floating point numbers. I'd like some examples to make me believe that mixing ints and floats is a 'huge win'. From hinsen at cnrs-orleans.fr Wed Feb 9 13:39:46 2000 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Wed, 9 Feb 2000 19:39:46 +0100 Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup. In-Reply-To: <34E36C05935CD311AE5000A0C9B6B0BF07D16D@hplex3.hpl.hp.com> (beausol@exch.hpl.hp.com) References: <34E36C05935CD311AE5000A0C9B6B0BF07D16D@hplex3.hpl.hp.com> Message-ID: <200002091839.TAA10737@chinon.cnrs-orleans.fr> > (1) The extent of the support for LAPACK. Do we want to stick with LAPACK > Lite? There has been a full LAPACK interface for a long while, of which LAPACK Lite is just the subset that is needed for supporting the high-level routines in the module LinearAlgebra. I seem to have lost the URL to the full version, but it's on my disk, so I can put it onto my FTP server if there is a need. > (2) The storage format. If we've still got row-ordered matrices under the > hood, and we want to use native LAPACK libraries that were compiled using > column-major format, then we'll have to be careful to set all of the flags > correctly. This isn't going to be a big deal, _unless_ NumPy will support > more of LAPACK when a native library is available. Then, of course, there The low-level interface routines don't take care of this. It's the high-level Python code (module LinearAlgebra) that sets the transposition argument correctly. That looks like a good compromise to me. > (3) Through the judicious use of header files with compiler-dependent flags, > we could accommodate the various naming conventions used when the FORTRAN > libraries were compiled (e.g., sgetrf_ or SGETRF). That's already done! Konrad. 
-- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From pitts.todd at mayo.edu Wed Feb 9 14:05:02 2000 From: pitts.todd at mayo.edu (Pitts, Todd A., Ph.D.) Date: Wed, 9 Feb 2000 13:05:02 -0600 (CST) Subject: [Numpy-discussion] Upcasting Message-ID: Here are my two cents worth on the subject... Most of what has been said in this thread (at least what I have read) I find very valuable. Apparently, many people have been thinking about the subject. I view this problem as inherent in a language without an Lvalue (like C) that allows a very explicit and clear definition from the programmer's point of view as to the size of the container you are going put things in. The language in many cases simply returns an object to you and has made some decision as to what you "needed" or "wanted". Of course, this is one of the things that makes languages such an Numerical Python, Matlab, IDL, etc. very nice for protyping and investigating. In many cases this decision will be adequate or acceptable. In some, quite simply, it will not be. At this point the programmer has to have a good means of managing this decision himself. If memory is not a constraint, I can think of very few situations (none, actually) where I would choose to go with something other than the Numerical Python default of double. In general, that is what you get when creating python arrays unless you make some effort to obtain some other type. However, in some important (read "cases that affect the author") situations memory is a very critical constraint. Typically, in Numerical Python I use 4-byte floats. In fact, one of the reasons I use Numerical Python is because I don't *need* doubles and matlab for example is really only setup to work gracefully with doubles. I do *need* to conserve memory as I deal with very large data sets. It seems the question we are discussing is not really "what *should* be done in terms of casting?" but "what provides good enough decisions much of the time *and* a gracefull way to manage the decisions when "good enough" no longer applies to you?" Currently, this is not a trivial thing to manage. Reading in a 100 MB data set and multiplying by the python scalar 2 produces a 200 MB data set. I manage this by wrapping the 2 in an array. This happens, of course, all the time. Having to do this once is not a big deal, but doing everywhere in code that uses floats makes for cluttered code -- not something which I expect to have to write in an otherwise very elegant and concise language. Also, I often find myself trudging through code looking for the subtlety that converted my floats to doubles, doubled my memory usage and then caused subsequent float only routines to error out. To those who are constrained to use floats this is awkward and time consuming. To those who are not I would say -- use doubles. The flag that causes an array to be a "space saving array" seems to be a temporary fix (that doesn't mean it was a bad idea -- just that it feels messy and effectively adds complexity that shouldn't be there). It also, mearly postpones the problem as I understand it -- what happens when I multiply two space saving arrays? 
We simply will never get away from situations where we have to manage the interaction ourselves and so we should be careful not to make that management so awkward (and currently I think it is awkward) that the floats, bytes, shorts, etc. become marginalized in their utility. My suggestion is to go with the rule that a simple hirearchy (in which downcasting is the rule) longs integers shorts bytes cardinals booleans doubles complex doubles <--- default floats complex floats for the most part makes good decisions: Principally because people who are not constrained to conserve memory will use the larger, default types all the time and not wince. They don't *need* floats or bytes. If anyone gives them a float a simple astype('d') or astype('D') to make sure it becomes a double lets them go on their way. Types like integers and bytes are effectively treated as being precise. If you are constrained to conserve memory by staying with floats or bytes instead of just reading things in from disk and making them doubles it will not be so awkward to manage the types in large programs. If I use someones code and they have a scalar anywhere in it at some point, even if I (or they) cast the output, memory usage swells at least for intermediate calculations. Effectively, python *has* 4-byte floats but programming with them is awkward. This means, of course, that multiplying a float array by a double array produces a float. Multiplying a double array by anything above it produces a double. etc. For my work, if I have a float anywhere in the calculation I don't believe precision beyond that in the output so getting a float back is reasonable. I know that some operations produce "more precision" and so I would cast the array if I needed to take advantage of that. Perhaps the downcasting is *not* the way to go. However, I definately think the current awkwardness should be eliminated. I hope my comments will not be percieved as being critical of the original language designers. I find python to be very useful or I wouldn't have bothered to make the comments at all. -Todd Pitts From beausol at exch.hpl.hp.com Wed Feb 9 14:16:58 2000 From: beausol at exch.hpl.hp.com (Beausoleil, Raymond) Date: Wed, 9 Feb 2000 11:16:58 -0800 Subject: [Numpy-discussion] RE: [Matrix-SIG] An Experiment in code-cleanup. Message-ID: <34E36C05935CD311AE5000A0C9B6B0BF07D16F@hplex3.hpl.hp.com> From: Konrad Hinsen [mailto:hinsen at cnrs-orleans.fr] > > (1) The extent of the support for LAPACK. Do we want to stick > > with LAPACK Lite? > > There has been a full LAPACK interface for a long while, of which > LAPACK Lite is just the subset that is needed for supporting the > high-level routines in the module LinearAlgebra. I seem to have lost > the URL to the full version, but it's on my disk, so I can put it > onto my FTP server if there is a need. Yes, I'd like to get a copy! You can simply e-mail it to me, if you'd prefer. > > (2) The storage format. If we've still got row-ordered matrices > > under the hood, and we want to use native LAPACK libraries that > > were compiled using column-major format, then we'll have to be > > careful to set all of the flags correctly. This isn't going to > > be a big deal, _unless_ NumPy will support more of LAPACK when a > > native library is available. Then, of course, there ... > > The low-level interface routines don't take care of this. It's the > high-level Python code (module LinearAlgebra) that sets the > transposition argument correctly. That looks like a good compromise > to me. 
I'll have to look at this more carefully. Due to my relative lack of Python experience, I hacked the C code so that Fortran routines could be called instead, producing the expected results. > > (3) Through the judicious use of header files with compiler- > > dependent flags, we could accommodate the various naming > > conventions used when the FORTRAN libraries were compiled (e.g., > > sgetrf_ or SGETRF). > > That's already done! Where? Even in the latest f2c'd source code that I downloaded from SourceForge, I see all names written using the lower-case-trailing-underscore convention (e.g., dgeqrf_). The Intel MKL was compiled from Fortran source using the upper-case-no-underscore convention (e.g., DGEQRF). If I replace dgeqrf_ with DGEQRF in dlapack_lite.c (and a few other tweaks), then the subsequent link with the IMKL succeeds. ============================ Ray Beausoleil Hewlett-Packard Laboratories mailto:beausol at hpl.hp.com Vox: 425-883-6648 Fax: 425-883-2535 HP Telnet: 957-4951 ============================ From godzilla at netmeg.net Wed Feb 9 15:47:34 2000 From: godzilla at netmeg.net (Les Schaffer) Date: Wed, 9 Feb 2000 15:47:34 -0500 (EST) Subject: [Numpy-discussion] digest Message-ID: <14497.53862.853166.521584@gargle.gargle.HOWL> just switched over from matrix-sig to numpy-discussion. in the process i changed to the digest version and got my first issue. is it possible to distribute the digests properly formatted as multipart/digests as per rfc822 and company? having such a formatted digest makes it very easy when using an emailer like VM in emacs: VM automatically displays the digest as a virtual folder, allowing one to browse all the posts in a given digest very quickly and easily. don't know whether the other (lacklusters) emailers out there will handle it so nicely, but i don't think the extra required markers will interfere with your reading of the digests at all. highly recommended. i'd be glad to work with whoever has control over this to ensure that the proper markers get placed into the digests. les schaffer From hinsen at cnrs-orleans.fr Wed Feb 9 15:58:47 2000 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Wed, 9 Feb 2000 21:58:47 +0100 Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup. In-Reply-To: <34E36C05935CD311AE5000A0C9B6B0BF07D16F@hplex3.hpl.hp.com> (beausol@exch.hpl.hp.com) References: <34E36C05935CD311AE5000A0C9B6B0BF07D16F@hplex3.hpl.hp.com> Message-ID: <200002092058.VAA10798@chinon.cnrs-orleans.fr> > > onto my FTP server if there is a need. > > Yes, I'd like to get a copy! You can simply e-mail it to me, if you'd > prefer. OK, coming soon... > I'll have to look at this more carefully. Due to my relative lack of Python > experience, I hacked the C code so that Fortran routines could be called > instead, producing the expected results. That's fine, you can simply replace the f2c-generated code by Fortran-compiled code, as long as the calling conventions are the same. I have used optimized BLAS as well on some machines. > Where? Even in the latest f2c'd source code that I downloaded from > SourceForge, I see all names written using the > lower-case-trailing-underscore convention (e.g., dgeqrf_). The Intel MKL was Sure, f2c generates the underscores. But the LAPACK interface code (the one I'll send you, and also LAPACK Lite) supports both conventions, controlled by the preprocessor symbol NO_APPEND_FORTRAN (maybe not the most obvious name). 
On the other hand, there is no support for uppercase names; that convention is not used in the Unix world. But I suppose it could be added by machine transformation of the code. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From da at ski.org Wed Feb 9 16:23:22 2000 From: da at ski.org (David Ascher) Date: Wed, 9 Feb 2000 13:23:22 -0800 Subject: [Numpy-discussion] digest References: <14497.53862.853166.521584@gargle.gargle.HOWL> Message-ID: <00cd01bf7343$ea55ae30$0100000a@ski.org> > just switched over from matrix-sig to numpy-discussion. in the process > i changed to the digest version and got my first issue. > > is it possible to distribute the digests properly formatted as > multipart/digests as per rfc822 and company? Did you try to edit your configuration on the mailman control panel? There is a choice between MIME and plain-text digests. --david ascher From skaller at maxtal.com.au Wed Feb 9 17:04:13 2000 From: skaller at maxtal.com.au (skaller) Date: Thu, 10 Feb 2000 09:04:13 +1100 Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup. References: <200002081956.UAA03241@chinon.cnrs-orleans.fr> <38A19201.8A43EC01@maxtal.com.au> <200002091717.SAA10604@chinon.cnrs-orleans.fr> Message-ID: <38A1E45D.6E0EF317@maxtal.com.au> Konrad Hinsen wrote: > > > silently). Consider a function > > > > k0 = 100 > > k = 99 > > while k < k0: > > .. > > k0 = k > > k = ... > > > > which refines a calculation until the measure k stops decreasing. > > This algorithm may terminate when k is a float, but _fail_ when > > k is a double -- the extra precision may cause the algorithm > > I'd call this a buggy implementation. Convergence criteria should be > explicit and not rely on the internal representation of data > types. > If you care at all about portability, you shouldn't even think about > this. But sometimes you DON'T care about portability. Sometimes, you want the best result the architecture can support, and so you need to perform a portable computation of an architecture dependent value. -- John (Max) Skaller, mailto:skaller at maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia voice: 61-2-9660-0850 homepage: http://www.maxtal.com.au/~skaller download: ftp://ftp.cs.usyd.edu/au/jskaller From da at ski.org Wed Feb 9 18:50:03 2000 From: da at ski.org (David Ascher) Date: Wed, 9 Feb 2000 15:50:03 -0800 Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup. References: <14496.42942.5355.849670@brant.geog.ubc.ca> <03fe01bf72cd$c040d640$5063cb0a@amullhau> Message-ID: <037d01bf7358$630b8980$0100000a@ski.org> > it as an example for people who want to learn stuff about mmap. As it > stands, there was some similar code I was able to produce at some point. I > forget who here has a copy, maybe Konrad, maybe David Ascher. > > Later, > Andrew Mullhaupt I did have some of that code, but it was almost 3 years ago and five computers ago. In other words, it's *somewhere*. I'll start a grep, but don't hold your breath... 
--da From da at ski.org Thu Feb 10 01:52:21 2000 From: da at ski.org (David Ascher) Date: Wed, 9 Feb 2000 22:52:21 -0800 Subject: [Numpy-discussion] Binary distribution available References: <14496.42942.5355.849670@brant.geog.ubc.ca> <03fe01bf72cd$c040d640$5063cb0a@amullhau> Message-ID: <051f01bf7393$61bff4e0$0100000a@ski.org> With Travis' wise advice, I appear to have succeeded in putting forth a binary installation of Numerical-15.2. Due to a bug in distutils, this is an 'install in place' package, instead of a 'run python setup.py install' package. So, unzip the file in your main Python tree, and it should 'work'. Let me (and Paul and Travis) know if it doesn't. Download is available from the main page (http://sourceforge.net/project/?group_id=1369 look for [zip]) or from http://download.sourceforge.net/numpy/python-numpy-15.2.zip --david ascher From gvwilson at nevex.com Thu Feb 10 13:28:51 2000 From: gvwilson at nevex.com (gvwilson at nevex.com) Date: Thu, 10 Feb 2000 13:28:51 -0500 (EST) Subject: [Numpy-discussion] re: scientific Python publishing venue Message-ID: Hi, folks. A former colleague of mine is now editing a magazine devoted to scientific computing, and is looking for articles. If you're doing something scientific with Python, and want to tell the world about it, please give me a shout, and I'll forward more information. Greg Wilson http://www.software-carpentry.com From archiver at db.geocrawler.com Thu Feb 17 12:34:11 2000 From: archiver at db.geocrawler.com (andrew x swan) Date: Thu, 17 Feb 2000 09:34:11 -0800 Subject: [Numpy-discussion] more speed? Message-ID: <200002171734.JAA08011@www.geocrawler.com> This message was sent from Geocrawler.com by "andrew x swan" Be sure to reply to that address. hi - i've only just started using python and numpy... the program i wrote below runs much more slowly than a fortran equivalent. ie. on a dataset where the order of the matrix is (3325,3325), python took this long: 362.25user 0.74system 6:09.78elapsed 98%CPU and fortran took this long: 2.68user 1.12system 0:03.89elapsed 97%CPU is this because the element by element calculations involved are contained in python for loops? thanks #!/usr/bin/python from Numeric import * def nrm(pedigree): n_animals = len(pedigree) + 1 nrm = zeros((n_animals,n_animals),Float) for i in xrange(1,n_animals): isire = pedigree[i-1][1] idam = pedigree[i-1][2] nrm[i,i] = 1.0 + 0.5 * nrm[isire,idam] for j in xrange(i+1,n_animals): jsire = pedigree[j-1][1] jdam = pedigree[j-1][2] nrm[j,i] = 0.5 * (nrm[jsire,i] + nrm[jdam,i]) nrm[i,j] = nrm[j,i] return nrm if __name__ == '__main__': test_ped = [(1,0,0),(2,0,0),(3,1,0),(4,1,2), (5,3,4),(6,1,4),(7,5,6)] a = nrm(test_ped) print a Geocrawler.com - The Knowledge Archive From da at ski.org Thu Feb 17 18:25:57 2000 From: da at ski.org (David Ascher) Date: Thu, 17 Feb 2000 15:25:57 -0800 Subject: [Numpy-discussion] more speed? References: <200002171734.JAA08011@www.geocrawler.com> Message-ID: <04d501bf799e$5a82abd0$0100000a@ski.org> From: andrew x swan > python took this long: > > 362.25user 0.74system 6:09.78elapsed 98%CPU > > and fortran took this long: > > 2.68user 1.12system 0:03.89elapsed 97%CPU > > is this because the element by element > calculations involved are contained in python for > loops? yes. --david ascher From syrus at long.ucsd.edu Thu Feb 17 18:27:29 2000 From: syrus at long.ucsd.edu (Syrus Nemat-Nasser) Date: Thu, 17 Feb 2000 15:27:29 -0800 (PST) Subject: [Numpy-discussion] more speed? 
In-Reply-To: <200002171734.JAA08011@www.geocrawler.com> Message-ID: On Thu, 17 Feb 2000, andrew x swan wrote: > is this because the element by element > calculations involved are contained in python for > loops? Hi Andrew! I've only just begun using Numeric Python, but I'm a long-time user of GNU Octave and a sporadic user of MatLab. In general, for loops kill the execution speed of interpretive environments like Numpy and Octave. The high-speed comes when one uses vector operations such as Matrix multiplication. If you can vectorize your code, meaning replace all the loops with matrix operations, you should see equivalent speed to Fortran for large data sets. As far as I know, you will never see an interpreted language match a compiled one in the execution of for loops. Thanks. Syrus. -- -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- Syrus Nemat-Nasser UCSD Physics Dept. From peter at eexpc.eee.nott.ac.uk Fri Feb 18 12:05:06 2000 From: peter at eexpc.eee.nott.ac.uk (Peter Chang) Date: Fri, 18 Feb 2000 17:05:06 +0000 (GMT) Subject: [Numpy-discussion] numpy documentation - alternative format? Message-ID: Hi there, I've just started to use python and numpy and want to print out the numpy document but the PDF file has a strange aspect ratio which makes it hard to print it as 2up on A4 paper. (I've tried hacking about with the postscript generated by xpdf but it seems that there is no global setting for page size!) Could the authors please provide alternative formats for the doc, eg. as postscript files sized for A4 and letter so that people can print them out easier? Thanks Peter From roitblat at hawaii.edu Fri Feb 18 12:14:22 2000 From: roitblat at hawaii.edu (Herbert L. Roitblat) Date: Fri, 18 Feb 2000 07:14:22 -1000 Subject: [Numpy-discussion] numpy documentation - alternative format? Message-ID: <03fd01bf7a33$9b046320$8fd6afcf@0gl1u.pixi.com> Adobe Acrobat has a shrink to fit option in their print menu. I'm not sure if it comes with their free-reader. Try printing as a 1up. It seems a small adaptation. HLR -----Original Message----- From: Peter Chang To: Numpy-discussion at lists.sourceforge.net Date: Friday, February 18, 2000 7:09 AM Subject: [Numpy-discussion] numpy documentation - alternative format? > >Hi there, > >I've just started to use python and numpy and want to print out the numpy >document but the PDF file has a strange aspect ratio which makes it hard >to print it as 2up on A4 paper. (I've tried hacking about with the >postscript generated by xpdf but it seems that there is no global setting >for page size!) > >Could the authors please provide alternative formats for the doc, eg. >as postscript files sized for A4 and letter so that people can print them >out easier? > >Thanks > Peter > > > >_______________________________________________ >Numpy-discussion mailing list >Numpy-discussion at lists.sourceforge.net >http://lists.sourceforge.net/mailman/listinfo/numpy-discussion > From peter at eexpc.eee.nott.ac.uk Fri Feb 18 12:18:46 2000 From: peter at eexpc.eee.nott.ac.uk (Peter Chang) Date: Fri, 18 Feb 2000 17:18:46 +0000 (GMT) Subject: [Numpy-discussion] numpy documentation - alternative format? In-Reply-To: <03fd01bf7a33$9b046320$8fd6afcf@0gl1u.pixi.com> Message-ID: On Fri, 18 Feb 2000, Herbert L. Roitblat wrote: > Adobe Acrobat has a shrink to fit option in their print menu. I'm not sure > if it comes with their free-reader. Is it available for Linux? I'll check it out... > Try printing as a 1up. It seems a small adaptation. I'm trying to save dead trees, i.e. 
print out 40 odd pages instead of 90 odd. Peter From sanner at scripps.edu Sat Feb 19 22:50:16 2000 From: sanner at scripps.edu (Michel Sanner) Date: Sat, 19 Feb 2000 19:50:16 -0800 Subject: [Numpy-discussion] Numeric Python under IRIX646 Message-ID: <1000219195017.ZM77150@noah.scripps.edu> Hi There, I just tried to add SGI running IRIX6.5 to the collection of Unix boxes I will support and I ran into the following problem: If I compile Python -O2 loading the Numeric extensions dumps the core, if I compile Python -g it works just fine and this regardless if Numeric is compile -g or -O2. After I re-compiled Objects/complexobject.o using -g (everything else being compiled -O2) I got it to work ... did anyone else out there see this kind of behavior ? I also post this to psa-members just in case this might be Python related -Michel -- ----------------------------------------------------------------------- >>>>>>>>>> AREA CODE CHANGE <<<<<<<<< we are now 858 !!!!!!! Michel F. Sanner Ph.D. The Scripps Research Institute Assistant Professor Department of Molecular Biology 10550 North Torrey Pines Road Tel. (858) 784-2341 La Jolla, CA 92037 Fax. (858) 784-2860 sanner at scripps.edu http://www.scripps.edu/sanner ----------------------------------------------------------------------- From mitch.chapman at mciworld.com Mon Feb 21 13:01:59 2000 From: mitch.chapman at mciworld.com (Mitch Chapman) Date: Mon, 21 Feb 2000 11:01:59 -0700 Subject: [Numpy-discussion] Re: [PSA MEMBERS] Numeric Python under IRIX646 In-Reply-To: <1000219195017.ZM77150@noah.scripps.edu> References: <1000219195017.ZM77150@noah.scripps.edu> Message-ID: <00022111060701.00593@mchapmanpc> On Sat, 19 Feb 2000, Michel Sanner wrote: > Hi There, > > I just tried to add SGI running IRIX6.5 to the collection of Unix boxes I will > support and I ran into the following problem: > > If I compile Python -O2 loading the Numeric extensions dumps the core, > if I compile Python -g it works just fine and this regardless if Numeric is > compile -g or -O2. > > After I re-compiled Objects/complexobject.o using -g (everything else being > compiled -O2) I got it to work ... > > did anyone else out there see this kind of behavior ? I saw exactly this behavior just last Friday afternoon. After all of Python was recompiled with -g the bus error went away. Thanks for pointing out that only complexobject needs to be compiled with -g. It didn't occur to me to try this, despite the location of the bus error, because it was possible to exercise complex objects interactively w. no problems. BTW I don't know whether you were compiling N32 or N64. In our case N32 created the bus error. -- Mitch Chapman mitch.chapman at mciworld.com From hinsen at cnrs-orleans.fr Fri Feb 25 07:26:58 2000 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Fri, 25 Feb 2000 13:26:58 +0100 Subject: [Numpy-discussion] Re: [Matrix-SIG] Numeric Array: adding a 0-D array to a cell in a 2-D array In-Reply-To: <019901bf7e93$8771d5e0$8fd6afcf@0gl1u.pixi.com> (roitblat@hawaii.edu) References: <019901bf7e93$8771d5e0$8fd6afcf@0gl1u.pixi.com> Message-ID: <200002251226.NAA14777@chinon.cnrs-orleans.fr> > We get the type error from trying to set the matrix element with a matrix > element (apparently). In the old version (1.9) on our NT box, > temp=a[kwd,kwd] results in temp being an int type. How can we either cast > the temp to an int or enable what we really want, which is to add an int to > a[kwd,kwd], as in a[kwd,kwd] = a[kwd,kwd] + jwd ? > > Do we have a bad version of Numeric? 
Maybe an experimental version. If you check the archives of this mailing list, you can find a recent discussion about proposed modifications. One of them was to eliminate the automatic conversion of rank-0 arrays to scalars, in order to prevent type promotion. Perhaps this proposal was implemented in the version you have. Note to the NumPy maintainers: please announce all new releases on this list, mentioning changes, especially those that affect backward compatibility. As a maintainer of code that makes heavy use of NumPy, I keep getting questions and bug reports caused by some new NumPy release that I haven't even heard of. A recent example is the change of location of the header files; C modules using arrays now have to include Numeric/arrayobject.h instead of simply arrayobject.h. I can understand this change (although I am not sure it's important enough to break compatibility), but I'd have preferred to learn about it directly and as early as possible. It's really no fun working through a 2 KB bug report sent by someone with zero knowledge of C. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From Oliphant.Travis at mayo.edu Fri Feb 25 15:23:01 2000 From: Oliphant.Travis at mayo.edu (Travis Oliphant) Date: Fri, 25 Feb 2000 14:23:01 -0600 (CST) Subject: [Numpy-discussion] Array-casting problem. Message-ID: Hi Herb, It has taken awhile for me to respond to this, but your problem's here illustrate exactly the kinds of difficulties one encounters with the current NumPy coercion rules: You do not have a bad version of Numeric. The behavior you describe is exactly what "should" happen though it needs to be fixed. I'll trace for you exactly what is going on as it could be illustrative to others: >>> a = zeros((5,5),'b') # You've just created a 5x5 byte array that follows "normal" coercion # rules filled with zeros. >>> a[3,3] = 8 # This line copies the rank-0 array of type 'b' created from the Python # Integer 8 (by a direct coercion in C) into element (3,3) of matrix a >>> temp = a[3,3] # This selects out the rank-0 array of typecode 'b' at position (3,3). As # of 15.2 this is nolonger changed to a scalar. Note that rank-0 arrays # act alot like scalars, but because there is not a one-to-one # correspondence between the Python Scalars and rank-0 arrays, this is not # automatically converted to a Python scalar (this is a change in 15.2) >>> temp = temp + 3 # This is the problem line for you right here. Something is wrong though, # since it should not be, a problem. # You are adding a rank-0 array of typecode 'b' to a Python Integer which # is interpreted by Numeric as a rank-0 array of typecode 'l'. The result # should be a Python Integer. For some reason this is returning an array # of typecode 'i' (which does not get automatically converted to a Python # scalar). >>> a[3,3] = temp # This would work fine if temp were the Python scalar it should be. # Right now, assignment doesn't let you assign an array of a "larger" type # to elements of a smaller type (except for Python scalars). Since temp # is (incorrectly I think) a type 'i' rank-0 array, it does not let you # make the assignment. 
At any rate it is inconsistent to let you assign # Python scalars but not rank-0 arrays of arbitrary precision, this should # be fixed. It is also a problem that temp + 3 returns an array of # typecode 'i'. I will look into fixing the above problems this example points out. Of course, it could also be fixed by having long integers lower in the coercion tree than byte arrays. Thanks for the feedback, Travis Oliphant From Oliphant.Travis at mayo.edu Fri Feb 25 15:57:34 2000 From: Oliphant.Travis at mayo.edu (Travis Oliphant) Date: Fri, 25 Feb 2000 14:57:34 -0600 (CST) Subject: [Numpy-discussion] Casting problems with new version of NumPy. Message-ID: The code sent by Herbert Roitblat pointed out some inconsistencies in the current NumPy, that I've fixed with two small changes: 1) Long's can no longer be safely cast to Int's (this is not safe on 64-bit machines anyway) -- this makes Numeric more consistent with how it interprets Python integers. 2) Automatic casting will be done when using rank-0 arrays to set elements of a Numeric array to be consistent with the behavior for Python scalars. The changes are in CVS right now, but are simple to change back if there is a problem. -Travis From collins at rushe.aero.org Mon Feb 28 13:17:38 2000 From: collins at rushe.aero.org (JEFFERY COLLINS) Date: Mon, 28 Feb 2000 10:17:38 -0800 Subject: [Numpy-discussion] Matrix.py problem Message-ID: <200002281817.KAA04027@rushe.aero.org> I installed the Numpy 15.2 and got the following error during the import of Matrix. Apparently, the version number is no longer embedded in the module doc string following the # sign. >>> import Matrix Traceback (innermost last): File "", line 1, in ? File "/usr/local/lib/python1.5/site-packages/Numeric/Matrix.py", line 5, in ? __version__ = int(__id__[string.index(__id__, '#')+1:-1]) File "/usr/local/lib/python1.5/string.py", line 138, in index return _apply(s.index, args) ValueError: substring not found in string.index Jeff From vanandel at atd.ucar.edu Wed Feb 2 13:35:34 2000 From: vanandel at atd.ucar.edu (Joe Van Andel) Date: Wed, 02 Feb 2000 11:35:34 -0700 Subject: [Numpy-discussion] single precision routines in NumPy? Message-ID: <389878F6.B7F2DAF@atd.ucar.edu> I would like a single precision version of 'interp' in the Numeric Core. (I want such a routine since I'm operating on huge single precision arrays, that I don't want promoted to double precision.) I've written such a routine, but Paul Dubois and I are discussing the best way of integrating it into the core. One solution is to simply add a new function 'interpf' to arrayfnsmodule.c . Another solution is to add a typecode=Float option to interp. Any opinions on how this single precision version be handled? -- Joe VanAndel National Center for Atmospheric Research http://www.atd.ucar.edu/~vanandel/ Internet: vanandel at ucar.edu From tla at research.nj.nec.com Thu Feb 3 16:57:41 2000 From: tla at research.nj.nec.com (Tom Adelman) Date: Thu, 03 Feb 2000 16:57:41 -0500 Subject: [Numpy-discussion] newbie: PyArray_Check difficulties Message-ID: <3.0.1.32.20000203165741.00958d00@zingo.nj.nec.com> I'm having a problem with PyArray_Check. If I just call PyArray_Check(args) I don't have a problem, but if I try to assign the result to anything, etc., it crashes (due to acces violation). So, for example the code at the end of this note doesn't work, yet I know an array is being passed and I can, for example, calculate its trace correctly if I type cast it as a PyArrayObject*. 
Also, a more general question: is this the recommended way to input numpy arrays when using swig, or do most people find it easier to use more elaborate typemaps, or something else? Finally, I apologize if this is the wrong forum to post this question. Please let me know. Thanks, Tom Method from C++ class: PyObject * Test01::trace(PyObject * args) { if (!(PyArray_Check(args))) { // <- crashes here PyErr_SetString(PyExc_ValueError, "must use NumPy array"); return NULL; } return NULL; } Swig file: (where typemaps are the ones included with most recent swig) /* TMatrix.i */ %module Ptest %include "typemaps.i" %{ #include "Test01.h" %} class Test01 { public: PyObject * trace(PyObject *INPUT); Test01(); virtual ~Test01(); }; Python code: import Ptest t = Ptest.Test01() import Numeric a = Numeric.arange(1.1, 2.7, .1) b = Numeric.reshape(a, (4,4)) x = t.trace(b) From Oliphant.Travis at mayo.edu Fri Feb 4 15:49:34 2000 From: Oliphant.Travis at mayo.edu (Travis Oliphant) Date: Fri, 4 Feb 2000 14:49:34 -0600 (CST) Subject: [Numpy-discussion] Re: Numpy-discussion digest, Vol 1 #7 - 1 msg In-Reply-To: <200002042005.MAA15424@lists.sourceforge.net> Message-ID: > I'm having a problem with PyArray_Check. If I just call > PyArray_Check(args) I don't have a problem, but if I try to assign the > result to anything, etc., it crashes (due to acces violation). So, for > example the code at the end of this note doesn't work, yet I know an array > is being passed and I can, for example, calculate its trace correctly if I > type cast it as a PyArrayObject*. > > Also, a more general question: is this the recommended way to input numpy > arrays when using swig, or do most people find it easier to use more > elaborate typemaps, or something else? I have some experience with SWIG but it is not my favorite method to use Numerical Python with C, since you have so little control over how things get allocated. Your problem is probably due to the fact that you do not run import_array() in the module header. There is a typemap in SWIG that let's you put commands to run at module initialization. Try this in your *.i file. %init %{ import_array(); %} This may help. Best, Travis From Oliphant.Travis at mayo.edu Mon Feb 7 19:08:43 2000 From: Oliphant.Travis at mayo.edu (Travis Oliphant) Date: Mon, 7 Feb 2000 18:08:43 -0600 (CST) Subject: [Numpy-discussion] An Experiment in code-cleanup. Message-ID: I wanted to let users of the community know (so they can help if they want, or offer criticism or comments) that over the next several months I will be experimenting with a branch of the main Numerical source tree and endeavoring to "clean-up" the code for Numerical Python. I have in mind a few (in my opinion minor) alterations to the current code-base which necessitates a branch. Guido has made some good suggestions for improving the code base, and both David Ascher and Paul Dubois have expressed concerns over the current state of the source code and given suggestions as to how to improve it. That said, I should emphasize that my work is not authorized, or endorsed, by any of the people mentioned above. It is simply my little experiment. My intent is not to re-create Numerical Python --- I like most of the current functionality --- but to merely, clean-up the code, comment it, and change the underlying structure just a bit and add some features I want. 
One goal I have is to create something that can go into Python 1.7 at some future point, so this incarnation of Numerical Python may not be completely C-source compatible with current Numerical Python (but it will be close). This means C-extensions that access the underlying structure of the current arrayobject may need some alterations to use this experimental branch if it every becomes useful. I don't know how long this will take me. I'm not promising anything. The purpose of this announcement is just to invite interested parties into the discussion. These are the (somewhat negotiable) directions I will be pursuing. 1) Still written in C but heavily (in my opinion) commented. 2) Addition of bit-types and unsigned integer types. 3) Facility for memory-mapped dataspace in arrays. 4) Slices become copies with the addition of methods for current strict referencing behavior. 5) Handling of sliceobjects which consist of sequences of indices (so that setting and getting elements of arrays using their index is possible). 6) Rank-0 arrays will not be autoconverted to Python scalars, but will still behave as Python scalars whenever Python allows general scalar-like objects in it's operations. Methods will allow the user-controlled conversion to the Python scalars. 7) Addition of attributes so that different users can configure aspects of the math behavior, to their hearts content. If their is anyone interested in helping in this "unofficial branch work" let me know and we'll see about setting up someplace to work. Be warned, however, that I like actual code or code-templates more than just great ideas (truly great ideas are never turned away however ;-) ) If something I do benefits the current NumPy source in a non-invasive, backwards compatible way, I will try to place it in the current CVS tree, but that won't be a priority, as my time does have limitations, and I'm scratching my own itch at this point. Best regards, Travis Oliphant From dubois1 at llnl.gov Mon Feb 7 19:22:45 2000 From: dubois1 at llnl.gov (Paul F. Dubois) Date: Mon, 7 Feb 2000 16:22:45 -0800 Subject: [Numpy-discussion] RE: [Matrix-SIG] An Experiment in code-cleanup. In-Reply-To: Message-ID: Travis says that I don't necessarily endorse his goals but in fact I do, strongly. If I understand right he intends to make a CVS branch for this experiment and that is fine with me. The only goal I didn't quite understand was: Addition of attributes so that different users can configure aspects of the math behavior, to their hearts content. In a world of reusable components the situation is complicated. I would not like to support a dot-product routine, for example, if the user could turn off any double precision behind my back. My needs for precision are local to my algorithm. From archiver at db.geocrawler.com Tue Feb 8 10:52:47 2000 From: archiver at db.geocrawler.com (John Travers) Date: Tue, 8 Feb 2000 07:52:47 -0800 Subject: [Numpy-discussion] Re: A proposal for dot product Message-ID: <200002081552.HAA10267@www.geocrawler.com> This message was sent from Geocrawler.com by "John Travers" Be sure to reply to that address. If the above was implemented, I would be very happy indeed. As a maths student, I use NumPy a lot. And get infuriated with the current implementation. John. Geocrawler.com - The Knowledge Archive From hinsen at cnrs-orleans.fr Tue Feb 8 12:12:56 2000 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Tue, 8 Feb 2000 18:12:56 +0100 Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup. 
In-Reply-To: (message from Travis Oliphant on Mon, 7 Feb 2000 18:08:43 -0600 (CST)) References: Message-ID: <200002081712.SAA03158@chinon.cnrs-orleans.fr> > 3) Facility for memory-mapped dataspace in arrays. I'd really like to have that... > 4) Slices become copies with the addition of methods for current strict > referencing behavior. This will break a lot of code, and in a way that will be difficult to debug. In fact, this is the only point you mention which would be reason enough for me not to use your modified version; going through all of my code to check what effect this might have sounds like a nightmare. I see the point of having a copying version as well, but why not implement the copying behaviour as methods and leave indexing as it is? > 5) Handling of sliceobjects which consist of sequences of indices (so that > setting and getting elements of arrays using their index is possible). Sounds good as well... > 6) Rank-0 arrays will not be autoconverted to Python scalars, but will > still behave as Python scalars whenever Python allows general scalar-like > objects in it's operations. Methods will allow the > user-controlled conversion to the Python scalars. I suspect that full behaviour-compatibility with scalars is impossible, but I am willing to be proven wrong. For example, Python scalars are immutable, arrays aren't. This also means that rank-0 arrays can't be used as keys in dictionaries. How do you plan to implement mixed arithmetic with scalars? If the return value is a rank-0 array, then a single library returning a rank-0 array somewhere could mess up a program well enough that debugging becomes a nightmare. > 7) Addition of attributes so that different users can configure aspects of > the math behavior, to their hearts content. You mean global attributes? That could be the end of universally usable library modules, supposing that people actually use them. > If their is anyone interested in helping in this "unofficial branch > work" let me know and we'll see about setting up someplace to work. Be I don't have much time at the moment, but I could still help out with testing etc. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From hinsen at dirac.cnrs-orleans.fr Tue Feb 8 12:13:20 2000 From: hinsen at dirac.cnrs-orleans.fr (hinsen at dirac.cnrs-orleans.fr) Date: Tue, 8 Feb 2000 18:13:20 +0100 Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup. In-Reply-To: (message from Travis Oliphant on Mon, 7 Feb 2000 18:08:43 -0600 (CST)) References: Message-ID: <200002081713.SAA03161@chinon.cnrs-orleans.fr> > 3) Facility for memory-mapped dataspace in arrays. I'd really like to have that... > 4) Slices become copies with the addition of methods for current strict > referencing behavior. This will break a lot of code, and in a way that will be difficult to debug. In fact, this is the only point you mention which would be reason enough for me not to use your modified version; going through all of my code to check what effect this might have sounds like a nightmare. I see the point of having a copying version as well, but why not implement the copying behaviour as methods and leave indexing as it is? 
> 5) Handling of sliceobjects which consist of sequences of indices (so that > setting and getting elements of arrays using their index is possible). Sounds good as well... > 6) Rank-0 arrays will not be autoconverted to Python scalars, but will > still behave as Python scalars whenever Python allows general scalar-like > objects in it's operations. Methods will allow the > user-controlled conversion to the Python scalars. I suspect that full behaviour-compatibility with scalars is impossible, but I am willing to be proven wrong. For example, Python scalars are immutable, arrays aren't. This also means that rank-0 arrays can't be used as keys in dictionaries. How do you plan to implement mixed arithmetic with scalars? If the return value is a rank-0 array, then a single library returning a rank-0 array somewhere could mess up a program well enough that debugging becomes a nightmare. > 7) Addition of attributes so that different users can configure aspects of > the math behavior, to their hearts content. You mean global attributes? That could be the end of universally usable library modules, supposing that people actually use them. > If their is anyone interested in helping in this "unofficial branch > work" let me know and we'll see about setting up someplace to work. Be I don't have much time at the moment, but I could still help out with testing etc. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From Oliphant.Travis at mayo.edu Tue Feb 8 12:38:26 2000 From: Oliphant.Travis at mayo.edu (Travis Oliphant) Date: Tue, 8 Feb 2000 11:38:26 -0600 (CST) Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup. In-Reply-To: <200002081712.SAA03158@chinon.cnrs-orleans.fr> Message-ID: > > 3) Facility for memory-mapped dataspace in arrays. > > I'd really like to have that... This is pretty easy to add but it does require some changes to the underlying structure, So you can expect it. > > > 4) Slices become copies with the addition of methods for current strict > > referencing behavior. > > This will break a lot of code, and in a way that will be difficult to > debug. In fact, this is the only point you mention which would be > reason enough for me not to use your modified version; going through > all of my code to check what effect this might have sounds like a > nightmare. I know this will be a sticky point. I'm not sure what to do exactly, but the current behavior and implementation makes the semantics for slicing an array using a sequence problematic since I don't see a way to represent a reference to a sequence of indices in the underlying structure of an array. So such slices would have to be copies and not references, which makes for an inconsistent code. > > I see the point of having a copying version as well, but why not > implement the copying behaviour as methods and leave indexing as it > is? I want to agree with you, but I think we may need to change the behavior eventually so when is it going to happen? > > > 5) Handling of sliceobjects which consist of sequences of indices (so that > > setting and getting elements of arrays using their index is possible). > > Sounds good as well... 
This facility is already embedded in the underlying structure. My plan is to go with the original idea that Jim Hugunin and Chris Chase had for slice objects. The sliceobject in python is already general enough for this to work. > > > 6) Rank-0 arrays will not be autoconverted to Python scalars, but will > > still behave as Python scalars whenever Python allows general scalar-like > > objects in it's operations. Methods will allow the > > user-controlled conversion to the Python scalars. > > I suspect that full behaviour-compatibility with scalars is > impossible, but I am willing to be proven wrong. For example, Python > scalars are immutable, arrays aren't. This also means that rank-0 > arrays can't be used as keys in dictionaries. > > How do you plan to implement mixed arithmetic with scalars? If the > return value is a rank-0 array, then a single library returning > a rank-0 array somewhere could mess up a program well enough that > debugging becomes a nightmare. > Mixed arithmetic in general is another sticky point. I went back and read the discussion of this point which occured 1995-1996. It was very interesting reading and a lot of points were made. Now we have several years of experience and we should apply what we've learned (of course we've all learned different things :-) ). Konrad, you had a lot to say on this point 4 years ago. I've had a long discussion with a colleague who is starting to "get in" to Numerical Python and he has really been annoyed with the current mixed arithmetic rules. The seem to try to outguess the user. The spacesaving concept helps, but it still seem's like a hack to me. I know there are several opinions, so I'll offer mine. We need simple rules that are easy to teach a newcomer. Right now the rule is farily simple in that coercion always proceeds up. But, mixed arithmetic with a float and a double does not produce something with double precision -- yet that's our rule. I think any automatic conversion should go the other way. Konrad, 4 years ago, you talked about unexpected losses of precision if this were allowed to happen, but I couldn't understand how. To me, it is unexpected to have double precision arrays which are really only carrying single-precision results. My idea of the coercion hierchy is shown below with conversion always happening down when called for. The Python scalars get mapped to the "largest precision" in their category and then normal coercions rules take place. The casual user will never use single precision arrays and so will not even notice they are there unless they request them. If they do request them, they don't want them suddenly changing precision. That is my take anyway. Boolean Character Unsigned long int short Signed long int short Real /* long double */ double float Complex /* __complex__ long double */ __complex__ double __complex__ float Object > > 7) Addition of attributes so that different users can configure aspects of > > the math behavior, to their hearts content. > > You mean global attributes? That could be the end of universally > usable library modules, supposing that people actually use them. I thought I did, but I've changed my mind after reading the discussion in 1995. I don't like global attributes either, so I'm not going there. > > > If their is anyone interested in helping in this "unofficial branch > > work" let me know and we'll see about setting up someplace to work. Be > > I don't have much time at the moment, but I could still help out with > testing etc. 
Konrad, you were very instrumental in getting NumPy off the ground in the first place and I will always appreciate your input. From pauldubois at home.com Tue Feb 8 12:56:11 2000 From: pauldubois at home.com (Paul F. Dubois) Date: Tue, 8 Feb 2000 09:56:11 -0800 Subject: [Numpy-discussion] precision isn't just precision In-Reply-To: Message-ID: Before we all rattle on too long about precision, I'd like to point out that selecting a precision actually carries two consequences in the context of computer languages:

1. Expected: The number of digits of accuracy in the representation of a floating point number.
2. Unexpected: The range of numbers that can be represented by this type.

Thus, to a scientist it is perfectly logical that if d is a double and f is a single, d * f has only single precision validity. Unfortunately, in a computer, if you hold this answer in a single then it may fail if the contents of d include numbers outside the single range, even if f is 1.0. Thus the rules in C and Fortran that coercion is UP had to do as much with range as precision. From pearu at ioc.ee Tue Feb 8 14:46:16 2000 From: pearu at ioc.ee (Pearu Peterson) Date: Tue, 8 Feb 2000 21:46:16 +0200 (EET) Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup. In-Reply-To: Message-ID: On Tue, 8 Feb 2000, Travis Oliphant wrote: > I know there are several opinions, so I'll offer mine. We need > simple rules that are easy to teach a newcomer. Right now the rule is > fairly simple in that coercion always proceeds up. But, mixed arithmetic > with a float and a double does not produce something with double > precision -- yet that's our rule. I think any automatic conversion should > go the other way. Remark: If you are consistent, then you say here that mixed arithmetic with an int and a float/double produces an int?! Right? (I hope that I am wrong.)

> Boolean
> Character
> Unsigned
>     long
>     int
>     short
> Signed
>     long
>     int
>     short

How about `/* long long */'? Is this left out intentionally?

> Real
>     /* long double */
>     double
>     float

Travis, while you are doing revision on NumPy, could you also estimate the degree of difficulty of introducing column-major order arrays? Pearu From hinsen at cnrs-orleans.fr Tue Feb 8 14:56:21 2000 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Tue, 8 Feb 2000 20:56:21 +0100 Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup. In-Reply-To: (message from Travis Oliphant on Tue, 8 Feb 2000 11:38:26 -0600 (CST)) References: Message-ID: <200002081956.UAA03241@chinon.cnrs-orleans.fr> > I know this will be a sticky point. I'm not sure what to do exactly, but > the current behavior and implementation makes the semantics for slicing an > array using a sequence problematic since I don't see a way to represent a You are right there. But is it really necessary to extend the meaning of slices? Of course everyone wants the functionality of indexing with a sequence, but I'd be perfectly happy to have it implemented as a method. Indexing would remain as it is (by reference), and a new method would provide copying behaviour for element extraction and also permit more generalized sequence indices. In addition to backwards compatibility, there is another argument for keeping indexing behaviour as it is: compatibility with other Python sequence types. If you have a list of lists, which in many ways behaves like a 2D array, and you extract the third element (which is thus a list), then this data is shared with the full nested list.
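The sharing Konrad refers to is easy to see with plain nested lists (a short session, no Numeric involved):

>>> x = [[1, 2], [3, 4], [5, 6]]
>>> third = x[2]      # extract the third element, a list
>>> third[0] = 99     # mutate it...
>>> x                 # ...and the change shows through the nested list
[[1, 2], [3, 4], [99, 6]]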
> > How do you plan to implement mixed arithmetic with scalars? If the > > return value is a rank-0 array, then a single library returning > > a rank-0 array somewhere could mess up a program well enough that > > debugging becomes a nightmare. > > > > Mixed arithmetic in general is another sticky point. I went back and read > the discussion of this point which occurred in 1995-1996. It was very What I meant was not mixed-precision arithmetic, but arithmetic in which one operand is a scalar and the other one a rank-0 array. Which reminds me: rank-0 arrays are also incompatible with the nested-list view of arrays. The elements of a list of numbers are numbers, not number-like sequence objects. But back to precision, which is also a popular subject: > discussion with a colleague who is starting to "get in" to Numerical > Python and he has really been annoyed with the current mixed arithmetic > rules. They seem to try to outguess the user. The spacesaving concept > helps, but it still seems like a hack to me. I wouldn't say that the current system tries to outguess the user. It simply gives precision a higher priority than memory space. That might not coincide with what a particular user wants, but it is consistent and easy to understand. > I know there are several opinions, so I'll offer mine. We need > simple rules that are easy to teach a newcomer. Right now the rule is > fairly simple in that coercion always proceeds up. But, mixed arithmetic Like in Python (for scalars), C, Fortran, and all other languages that I can think of. > Konrad, 4 years ago you talked about unexpected losses of precision if > this were allowed to happen, but I couldn't understand how. To me, it is > unexpected to have double precision arrays which are really only > carrying single-precision results. My idea of the coercion hierarchy is I think this is a confusion of two different meanings of "precision". In numerical algorithms, precision refers to the deviation between an ideal and a real numerical value. In programming languages, it refers to the *maximum* precision that can be stored in a given data type (and is in fact often combined with a difference in range). The upcasting rule thus ensures that

1) No precision is lost accidentally. If you multiply a float by a double, the float might contain the exact number 2, and thus have infinite precision. The language can't know this, so it acts conservatively and chooses the "bigger" type.

2) No overflow occurs unless it is unavoidable (the range problem).

> The casual user will never use single precision arrays and so will not > even notice they are there unless they request them. If they do request There are many ways in which single-precision arrays can creep into a program without a user's attention. Suppose you send me some data in a pickled array, which happens to be single-precision. Or I call a library routine that does some internal calculation on huge data arrays, which it keeps at single precision, and (intentionally or by error) returns a single-precision result. I think your "active flag" solution is a rather good solution to the casting problem, because it gives access to a different behaviour in a very explicit way. So unless future experience points out problems, I'd propose to keep it. Konrad.
-- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From Barrett at stsci.edu Tue Feb 8 15:10:39 2000 From: Barrett at stsci.edu (Paul Barrett) Date: Tue, 8 Feb 2000 15:10:39 -0500 (EST) Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup. In-Reply-To: References: <14496.16890.698835.619131@nem-srvr.stsci.edu> Message-ID: <14496.26037.829754.450187@nem-srvr.stsci.edu> Travis Oliphant writes: > > > > 1) The re-use of temporary arrays -- to conserve memory. > > Please elaborate about this request. When Python evaluates the expression:

>>> Y = B*X + A

where A, B, X, and Y are all arrays, B*X creates a temporary array, T. A new array, Y, will be created to hold the result of T + A, and T will be deleted. If T and Y have the same shape and typecode, then instead of creating Y, T can be re-used to conserve memory.
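For concreteness, this is what that reuse looks like when done by hand. The sketch assumes the optional output-array argument of Numeric's ufuncs; the automatic version Paul asks for would do this bookkeeping behind the scenes:

>>> from Numeric import array, multiply, add
>>> A = array([1., 2.]); B = array([3., 4.]); X = array([5., 6.])
>>> T = multiply(B, X)    # the temporary B*X
>>> Y = add(T, A, T)      # reuse T to hold T + A instead of allocating Y
>>> Y
array([ 16., 26.])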
> > > > 2) A copy-on-write option -- to enhance performance. > > > > I need more explanation of this as well. This would be an advanced feature of arrays that use memory-mapping or access their arrays from disk. It is similar to the secondary cache of a CPU. The data is held in memory until a write request is made. > > > > 3) The initialization of arrays by default -- to help novices. > > What kind of initialization are you talking about (we have zeros and ones and random already). For mixed-type (or object) arrays containing strings, zeros() and ones() would be confusing. Therefore by default, integer and floating types are initialized to 0 and string types to ' ', and the option would be available to not initialize the array, for performance. > > > > 4) The creation of a standard API -- which I guess is assumed, if it > > is to be part of the Python standard distribution. > > Any suggestions as to what needs to be changed in the already somewhat standard API. No, not exactly. But the last time I looked, I thought some improvements could be made to it. > > > > 5) The inclusion of IEEE support. > > This was supposed to be there from the beginning, but it didn't get > finished. Jim's original idea was to have two math modules, one which > checked and gave errors for 1/0 and another that returned IEEE inf for > 1/0. > > The current umath does both with different types, which is annoying. When I last spoke to Jim about this at IPC6, I was under the impression that IEEE support was not fully implemented and much work still needed to be done. Has this situation changed since then? > > > > And > > > > 6) Enhanced support for mixed-types or objects. > > > > This last issue is very important to me and the astronomical community, > > since we routinely store data as (multi-dimensional) arrays of fixed > > length records or C-structures. A current deficiency of NumPy is that > > the object typecode does not work with the fromstring() method, so > > importing arrays of records from a binary file is just not possible. > > I've been developing my own C-extension type to handle this situation > > and have come to realize that my record type is really just a > > generalization of NumPy's types. > > > I would like to see the code for your generalized type, which would help me > see if there were some relatively painless way the two could be merged. recordmodule.c is part of my PyFITS module for dealing with FITS files. You can find it here: ftp://ra.stsci.edu/pub/barrett/PyFITS_0.3.tgz I use NumPy to access fixed-type arrays and the record type for accessing mixed-type arrays. A common example is accessing the second element of a mixed-type (i.e., an object) from the entire array. This returns a record type with a single element, which is equivalent to a NumPy array of fixed type. Therefore users expect this object to be a NumPy array, and it isn't. They have to convert it to one. > > two C-extension types merged. I think this enhancement can be done > > with minimal change to the current NumPy behavior and minor changes to > > the typecode system. > > If you already see how to do it, then great. Note that NumPy already has some support for an Object type. It has been proposed that it be removed, because it is not well supported and hence few people use it. I have the contrary opinion and feel we should enhance the Object type and make it much more usable. If you don't need it, then you don't have to use it. This enhancement really shouldn't get in the way of those who only use fixed-type arrays. So what changes to NumPy are needed?

1) Instead of a typecode (or in addition to the typecode for backward compatibility), I suggest an optional format keyword, which can be used to specify the mixed-type or object format. Namely, format = 'i, f, s10', where 'i' is an integer type, 'f' a floating point type, and s10 is a string of 10 characters.

2) Array access will be the same as it is now. For example:

   # Create a 10x10 mixed-type array.
   A = array((10, 10), format = 'i, f, 10s')
   # Create a 10x10 fixed-type array.
   B = array((10, 10), typecode = 'i')
   # Print a 5x5 subarray of mixed-type.
   print A[:5,:5]
   # Print a 5x5 subarray of fixed-type.
   print B[:5,:5]
   # Or
   # (Note that the 3rd index is optional for fixed-type arrays; it
   # always defaults to 0.)
   print B[:5,:5,0]
   # Print the second element of the mixed-type of the entire array.
   # Note that this is now an array of fixed-type.
   print A[:,:,1]

The major thorn that I see at this point is how to reconcile the behavior of numbers and strings during operations. But I don't see this as an intractable problem. I actually believe this enhancement will encourage us to create a better and more generic multi-dimensional array module by concentrating on the behavioral aspects of this extension type. Note that J, which NumPy is based upon, allows such mixed-types. -- Dr. Paul Barrett Space Telescope Science Institute Phone: 410-516-6714 DESD/DPT FAX: 410-516-8615 Baltimore, MD 21218 From pauldubois at home.com Tue Feb 8 15:16:55 2000 From: pauldubois at home.com (Paul F. Dubois) Date: Tue, 8 Feb 2000 12:16:55 -0800 Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup. In-Reply-To: <200002081956.UAA03241@chinon.cnrs-orleans.fr> Message-ID: Konrad wrote: > > In addition to backwards compatibility, there is another argument for > keeping indexing behaviour as it is: compatibility with other Python > sequence types. I claim the current Numeric is INconsistent with other Python sequence types:

>>> x = [1, 2, 3, 4, 5]
>>> y = x[2:5]
>>> x
[1, 2, 3, 4, 5]
>>> y
[3, 4, 5]
>>> y[1] = 7
>>> y
[3, 7, 5]
>>> x
[1, 2, 3, 4, 5]

So, y is a copy of x[2:5], not a reference.
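And the contrasting Numeric session, for completeness, where a slice is a reference into the original data (then-current Numeric behaviour):

>>> from Numeric import array
>>> a = array([1, 2, 3, 4, 5])
>>> b = a[2:5]
>>> b[1] = 7     # writing through the slice...
>>> a            # ...alters the original array
array([1, 2, 3, 7, 5])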
From DavidA at ActiveState.com Tue Feb 8 15:30:23 2000 From: DavidA at ActiveState.com (David Ascher) Date: Tue, 8 Feb 2000 12:30:23 -0800 Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup. In-Reply-To: <14496.26037.829754.450187@nem-srvr.stsci.edu> Message-ID: <000101bf7273$531e3f80$c355cfc0@ski.org> > So what changes to NumPy are needed? > > 1) Instead of a typecode (or in addition to the typecode for backward > compatibility), I suggest an optional format keyword, which can be > used to specify the mixed-type or object format. Namely, format = > 'i, f, s10', where 'i' is an integer type, 'f' a floating point > type, and s10 is a string of 10 characters. I'd suggest going all the way and making it a real object, not just a string. That object can then have useful attributes, like size in bytes, maxval, minval, some indication of precision, etc. Logically, itemsize should be an attribute of the numeric type of an array, not of the array itself. --david ascher
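A rough sketch of such a type object; the class name and attribute set below are invented for illustration and go no further than the attributes David lists:

# Hypothetical sketch only -- not an API from this thread.
class NumericType:
    def __init__(self, name, itemsize, minval, maxval):
        self.name = name            # e.g. 'Float32'
        self.itemsize = itemsize    # size in bytes
        self.minval = minval        # smallest representable value
        self.maxval = maxval        # largest representable value

Float32 = NumericType('Float32', 4, -3.402823e38, 3.402823e38)
print Float32.itemsize              # 4 -- a property of the type, not of an array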
From beausol at exch.hpl.hp.com Tue Feb 8 16:31:30 2000 From: beausol at exch.hpl.hp.com (Beausoleil, Raymond) Date: Tue, 8 Feb 2000 13:31:30 -0800 Subject: [Numpy-discussion] RE: [Matrix-SIG] An Experiment in code-cleanup. Message-ID: <34E36C05935CD311AE5000A0C9B6B0BF07D16D@hplex3.hpl.hp.com> I've been reading the posts on this topic with considerable interest. For a moment, I want to emphasize the "code-cleanup" angle more literally than the functionality mods suggested so far. Several months ago, I hacked my personal copy of the NumPy distribution so that I could use the Intel Math Kernel Library for Windows. The IMKL is

(1) freely available from Intel at http://developer.intel.com/vtune/perflibst/mkl/index.htm;
(2) basically BLAS and LAPACK, with an FFT or two thrown in for good measure;
(3) optimized for the different x86 processors (e.g., generic x86, Pentium II & III);
(4) configured to use 1, 2, or 4 processors; and
(5) configured to use multithreading.

It is an impressive, fast implementation. I'm sure there are similar native libraries available on other platforms. Probably due to my inexperience with both Python and NumPy, it took me a couple of days to successfully tear out the f2c'd stuff and get the IMKL linking correctly. The parts I've used work fine, but there are probably other features that I haven't tested yet that still aren't up to snuff. In any case, the resulting code wasn't very pretty. As long as the NumPy code is going to be commented and cleaned up, I'd be glad to help make sure that the process of using a native BLAS/LAPACK distribution (which was probably compiled using Fortran storage and naming conventions) is more straightforward. Among the more tedious issues to consider are:

(1) The extent of the support for LAPACK. Do we want to stick with LAPACK Lite?
(2) The storage format. If we've still got row-ordered matrices under the hood, and we want to use native LAPACK libraries that were compiled using column-major format, then we'll have to be careful to set all of the flags correctly. This isn't going to be a big deal, _unless_ NumPy will support more of LAPACK when a native library is available. Then, of course, there are the special cases: the IMKL has both a C and a Fortran interface to the BLAS.
(3) Through the judicious use of header files with compiler-dependent flags, we could accommodate the various naming conventions used when the FORTRAN libraries were compiled (e.g., sgetrf_ or SGETRF).

The primary output of this effort would be an expansion of the "Compilation Notes" subsection of Section 15 of the NumPy documentation, and some header files that made the recompilation easier than it is now. Regards, Ray ============================ Ray Beausoleil Hewlett-Packard Laboratories mailto:beausol at hpl.hp.com Vox: 425-883-6648 Fax: 425-883-2535 HP Telnet: 957-4951 ============================ From Oliphant.Travis at mayo.edu Tue Feb 8 16:32:57 2000 From: Oliphant.Travis at mayo.edu (Travis Oliphant) Date: Tue, 8 Feb 2000 15:32:57 -0600 (CST) Subject: [Numpy-discussion] Come take an informal survey. In-Reply-To: <200002082004.MAA26529@lists.sourceforge.net> Message-ID: In an effort to get data about what users' attitudes are toward Numerical Python, I'm conducting a survey at sourceforge.net. If you would like to participate in the survey, please go to http://www.sourceforge.net, log in with your sourceforge id, and go to the numpy page: http://sourceforge.net/project/?group_id=1369 In the Public Survey section there is a short survey you can fill out. Thank you, Travis Oliphant NumPy Developer From phil at geog.ubc.ca Tue Feb 8 18:33:18 2000 From: phil at geog.ubc.ca (Phil Austin) Date: Tue, 8 Feb 2000 15:33:18 -0800 (PST) Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup. In-Reply-To: References: Message-ID: <14496.42942.5355.849670@brant.geog.ubc.ca> Travis Oliphant writes: > > 3) Facility for memory-mapped dataspace in arrays. > For the NumPy users who are as ignorant about mmap, msync, and madvise as I am, I've put a couple of documents on my web site:

1) http://www.geog.ubc.ca/~phil/mmap/mmap.pdf A pdf version of Kevin Sheehan's paper: "Why aren't you using mmap yet?" (19-page Frame postscript original, page order back to front). He gives a good discussion of the SVR4 VM model, with some mmap examples in C.

2) http://www.geog.ubc.ca/~phil/mmap/threads.html An archived email exchange (initially on the F90 mailing list) between Kevin (who is an independent Solaris consultant) and Brian Sumner (SGI) about the pros and cons of using mmap.

Executive summary:

i) mmap on Solaris can be a very big win (see bottom of http://www.geog.ubc.ca/~phil/mmap/msg00003.html) when used in combination with WILLNEED/WONTNEED madvise calls to guide the page prefetching.

ii) IRIX and some other Unices (Linux 2.2 in particular) haven't implemented madvise, and naive use of mmap without madvise can produce lots of page faulting and much slower io than, say, asynchronous io calls on IRIX. (http://www.geog.ubc.ca/~phil/mmap/msg00009.html)

So I'd love to see mmap in Numpy, but we may need to produce a tutorial outlining the tradeoffs, and giving some examples of madvise/msync/mmap used together (with a few benchmarks). Any mmap module would need to include member functions that call madvise/msync for the mmapped array (but these may be no-ops on several popular OSes.) Regards, Phil From jrwebb at goodnet.com Tue Feb 8 01:03:42 2000 From: jrwebb at goodnet.com (James R. Webb) Date: Mon, 7 Feb 2000 23:03:42 -0700 Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup. References: <34E36C05935CD311AE5000A0C9B6B0BF07D16D@hplex3.hpl.hp.com> Message-ID: <001801bf71fa$41f681a0$01f936d1@janus> There is now a Linux native BLAS available through links at http://www.cs.utk.edu/~ghenry/distrib/ courtesy of the ASCI Option Red Project. There is also ATLAS (http://www.netlib.org/atlas/). Either library seems to link into NumPy without a hitch.
From amullhau at zen-pharaohs.com Wed Feb 9 01:51:09 2000 From: amullhau at zen-pharaohs.com (Andrew P. Mullhaupt) Date: Wed, 9 Feb 2000 01:51:09 -0500 Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup. References: <200002081956.UAA03241@chinon.cnrs-orleans.fr> Message-ID: <03f401bf72ca$0e0608e0$5063cb0a@amullhau> > I'd be perfectly happy to have it implemented as a
> method. Indexing would remain as it is (by reference), and a new > method would provide copying behaviour for element extraction and also > permit more generalized sequence indices. I think I can live with that, as long as it _syntactically_ looks like indexing. This is one case where the syntax is more important than functionality. There are things you want to index with indices, etc., and the composition with parenthesis-like (Dyck language) syntax has proved to be one of the few readable ways to do it. > In addition to backwards compatibility, there is another argument for > keeping indexing behaviour as it is: compatibility with other Python > sequence types. If you have a list of lists, which in many ways > behaves like a 2D array, and extract the third element (which is thus > a list), then this data is shared with the full nested list. _Avoiding_ data sharing will eventually be more important than supporting data sharing, since memory continues to get cheaper but memory bandwidth and latency do not improve at the same rate. Locality of reference is hard to control when there is a lot of default data sharing, and performance suffers, yet it becomes important on more and more scales as memory systems become more and more hierarchical. Ultimately, the _semantics_ we like will be implemented efficiently by emulating references and copies in code which copies and references as it sees fit and keeps track of which copies are "really" references and which references are really "copies". I've thought this through for the "everything gets copied" languages and it isn't too mentally distressing - you simply reference count fake copies. The "everything is a reference" languages are less clean, but the database people have confronted that problem. > Which reminds me: rank-0 arrays are also incompatible with the > nested-list view of arrays. There are ways out of that trap. Most post-ISO APLs provide examples of how to cope. > > I know there are several opinions, so I'll offer mine. We need > > simple rules that are easy to teach a newcomer. Right now the rule is > > fairly simple in that coercion always proceeds up. But, mixed arithmetic > Like in Python (for scalars), C, Fortran, and all other languages that > I can think of. And that is not a bad thing. But which way is "up"? (See example below.) > > Konrad, 4 years ago you talked about unexpected losses of precision if > > this were allowed to happen, but I couldn't understand how. To me, it is > > unexpected to have double precision arrays which are really only > > carrying single-precision results. Most people always hate, and only sometimes detect, when that happens. It specifically contravenes the Geneva conventions on programming mental hygiene. > The upcasting rule thus ensures that > > 1) No precision is lost accidentally. More or less. More precisely, it depends on what you call an accident. What happens when you add the IEEE single precision floating point value 1.0 to the 32-bit integer 2^30? A _lot_ of people don't expect to get the IEEE single precision floating point value 2.0^30, but that is what happens in some languages. Is that an "upcast"? Would the 32-bit integer 2^30 make more sense? Now what about the case where the 32-bit integer is signed and adding one to it will "wrap around" if the value remains an integer? Because these two examples might make double precision or a wider integer (if available) seem the correct answer, suppose it's only one element of a gigantic array?
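Andrew's 2^30 case takes only a couple of lines to reproduce, assuming 'f' arrays are IEEE singles with 24 bits of mantissa (the 1.0 is wrapped in an array so the sum stays single precision):

>>> from Numeric import array
>>> a = array([2**30], 'f')
>>> b = a + array(1.0, 'f')    # stay in single precision
>>> b[0] == 2.0**30            # the added 1.0 vanished in rounding
1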
Let's now talk about complex values.... There are plenty of rough edges like this when you mix numerical types. It's guaranteed that everybody's ox will get gored somewhere. > 2) No overflow occurs unless it is unavoidable (the range problem). > > > The casual user will never use single precision arrays and so will not > > even notice they are there unless they request them. If they do request > There are many ways in which single-precision arrays can creep into a > program without a user's attention. Absolutely. > Suppose you send me some data in a > pickled array, which happens to be single-precision. Or I call a > library routine that does some internal calculation on huge data > arrays, which it keeps at single precision, and (intentionally or by > error) returns a single-precision result. And the worst one is when the accuracy of the result is single precision, but the _type_ of the result is double precision. There is a function in S-plus which does this (without documenting it, of course) and man, was that a pain in the neck to sort out. Today, I just found another bug in one of the S-plus functions - turns out that if you hand a complex triangular matrix and a real right hand side to the triangular solver (backsubstitution), it doesn't cast the right hand side to complex, and uses whatever values are subsequent in memory to the right hand side as if they were part of the vector. Obviously, when testing the function, they didn't try this mixed type case. But interpreters are really convenient for writing code so that you _don't_ have to think about types all the time and do your own casting. Which is why stubbing your head on an unexpected cast is so unlooked for. > I think your "active flag" solution is a rather good solution to the > casting problem, because it gives access to a different behaviour in a > very explicit way. So unless future experience points out problems, > I'd propose to keep it. Is there a simple way to ensure that no active arrays are ever activated at any time when I use Numerical Python? Later, Andrew Mullhaupt From amullhau at zen-pharaohs.com Wed Feb 9 02:17:39 2000 From: amullhau at zen-pharaohs.com (Andrew P. Mullhaupt) Date: Wed, 9 Feb 2000 02:17:39 -0500 Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup. References: <14496.42942.5355.849670@brant.geog.ubc.ca> Message-ID: <03fe01bf72cd$c040d640$5063cb0a@amullhau> > Travis Oliphant writes: > > > > 3) Facility for memory-mapped dataspace in arrays. > > For the NumPy users who are as ignorant about mmap, msync, > and madvise as I am, I've put a couple of documents on > my web site: I have Kevin's "Why Aren't You Using mmap() Yet?" on my site. Kevin is working on a new (11th anniversary edition? 1xth anniversary edition?). By the way, Uresh Vahalia's book on Unix Internals is a very good idea for anyone not yet familiar with modern operating systems, especially Unices. Kevin is extremely knowledgeable on this subject, and several others. > Executive summary: > > i) mmap on Solaris can be a very big win Orders of magnitude. > (see bottom of > http://www.geog.ubc.ca/~phil/mmap/msg00003.html) when > used in combination with WILLNEED/WONTNEED madvise calls to > guide the page prefetching. And with the newer versions of Solaris, madvise() is a good way to go. madvise is _not_ SVR4 (not in SVID3) but it _is_ in the OSF/1 AES, which means it is _not_ vendor specific. But the standard part of madvise is that it is a "hint".
However, everything madvise actually _does_ when you hint the kernel is usually specific to particular versions of an operating system. There are tricks to get around madvise not doing everything you want (WONTNEED didn't work in Solaris for a long time. Kevin found a trick that worked really well instead. Kevin knows people at Sun, since he was one of the very earliest employees there, and so the trick Kevin used to suggest has now been found to be the implementation of WONTNEED in Solaris.) And that trick is well worth understanding. It happens that msync() is a good call to know. It has an undocumented behavior on Solaris: when you msync a memory region with MS_INVALIDATE | MS_ASYNC, the dirty pages are queued for writing and backing store is available immediately, or, if dirty, as soon as written out. This means that the pager doesn't have to run at all to scavenge the pages. Linux didn't do this last time I looked. I suggested it to the kernel guys and the idea got some positive response, but I don't know if they did it. > ii) IRIX and some other Unices (Linux 2.2 in particular), haven't > implemented madvise, and naive use of mmap without madvise can produce > lots of page faulting and much slower io than, say, asynchronous io > calls on IRIX. (http://www.geog.ubc.ca/~phil/mmap/msg00009.html) IRIX has an awful implementation of mmap. And SGI people go around badmouthing mmap; not that they don't have cause, but they are usually very surprised to see how big the win is with a good implementation. Of course, the msync() trick doesn't work on IRIX last I looked, which leads to the SGI people believing that mmap() is brain damaged because it runs the pager into the ground. It's a point of view that is bound to come up. HP/UX was really whacked last time I looked. They had a version (10) which supported the full mmap() on one series of workstations (700, 7000, I forget, let's say 7e+?) and didn't support it except in the non-useful SVR3.2 way on another series of workstations (8e+?). The reason was that the 8e+? workstations were multiprocessor and they hadn't figured out how to get the newer kernel flying on the multiprocessors. I know Konrad had HP systems at one point; maybe he has the scoop on those. > So I'd love to see mmap in Numpy, but we may need to produce a > tutorial outlining the tradeoffs, and giving some examples of > madvise/msync/mmap used together (with a few benchmarks). Any mmap > module would need to include member functions that call madvise/msync > for the mmapped array (but these may be no-ops on several popular OSes.) I don't know if you want a separate module; maybe what you want is the normal allocation of memory for all Numerical Python objects to be handled in a way that makes sense for each operating system. The approach I took when I was writing portable code for this sort of thing was to write a wrapper for the memory operation semantics and then implement the operations as a small library that would be OS specific, although not _that_ specific. It was possible to write single-source code for SVID3 and OSF/AES1 systems with sparing use of conditional defines. Unfortunately, that code is the intellectual property of another firm, or else I'd donate it as an example for people who want to learn stuff about mmap. As it stands, there was some similar code I was able to produce at some point. I forget who here has a copy, maybe Konrad, maybe David Ascher. Later, Andrew Mullhaupt
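For readers who want to experiment, a minimal sketch of the mapping side as seen from Python. It uses the mmap module, which was not yet in the standard library when this thread was written; madvise has no Python-level binding in this sketch, so only mapping and an msync-style flush are shown, and 'data.bin' is a stand-in file name:

import mmap

f = open('data.bin', 'r+b')
m = mmap.mmap(f.fileno(), 0)   # length 0 maps the whole file
first = m[0]                   # reading a byte pages it in lazily
m[0] = first                   # dirty the first page...
m.flush()                      # ...and msync() it back to the file
m.close()
f.close()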
From skaller at maxtal.com.au Wed Feb 9 11:12:49 2000 From: skaller at maxtal.com.au (skaller) Date: Thu, 10 Feb 2000 03:12:49 +1100 Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup. References: <200002081956.UAA03241@chinon.cnrs-orleans.fr> Message-ID: <38A19201.8A43EC01@maxtal.com.au> Konrad Hinsen wrote: > But back to precision, which is also a popular subject: but one which even numerical programmers don't seem to understand ... > The upcasting rule thus ensures that > > 1) No precision is lost accidentally. If you multiply a float by > a double, the float might contain the exact number 2, and thus > have infinite precision. The language can't know this, so it > acts conservatively and chooses the "bigger" type. > > 2) No overflow occurs unless it is unavoidable (the range problem). .. which is all wrong. It is NOT safe to convert floating point from a lower to a higher number of bits. ALL such conversions should be removed for this reason: any conversions should have to be explicit. The reason is that whether a conversion to a larger number is safe or not is context dependent (and so it should NEVER be done silently). Consider a function

k0 = 100
k = 99
while k < k0:
    ..
    k0 = k
    k = ...

which refines a calculation until the measure k stops decreasing. This algorithm may terminate when k is a float, but _fail_ when k is a double -- the extra precision may cause the algorithm to perform many useless iterations, in which the precision of the result is in fact _lost_ due to rounding error. What is happening is that the real comparison is probably:

k - k0 < epsilon

where epsilon was 0.0 in floating point, and thus omitted. My point is that throwing away information is what numerical programming is all about. Numerical programmers need to know how big numbers are, and how much significance they have, and optimise calculations accordingly -- sometimes by _using_ the precision of the working types to advantage. To put this another way, it is generally bad to keep more digits (bits) of precision than you actually have: it can be misleading. So a language should not assume that it is OK to add more precision. It may not be. -- John (Max) Skaller, mailto:skaller at maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia voice: 61-2-9660-0850 homepage: http://www.maxtal.com.au/~skaller download: ftp://ftp.cs.usyd.edu/au/jskaller From gpk at bell-labs.com Wed Feb 9 11:23:47 2000 From: gpk at bell-labs.com (Greg Kochanski) Date: Wed, 09 Feb 2000 11:23:47 -0500 Subject: [Numpy-discussion] Re: Numpy-discussion digest, Vol 1 #10 - 10 msgs References: <200002091617.IAA28931@lists.sourceforge.net> Message-ID: <38A19493.6E2394E6@bell-labs.com> > From: "Andrew P. Mullhaupt" > > The upcasting rule thus ensures that > > > > 1) No precision is lost accidentally. > > More or less. > > More precisely, it depends on what you call an accident. What happens when > you add the IEEE single precision floating point value 1.0 to the 32-bit > integer 2^30? A _lot_ of people don't expect to get the IEEE single > precision floating point value 2.0^30, but that is what happens in some > languages. Is that an "upcast"? Would the 32-bit integer 2^30 make more > sense? Now what about the case where the 32-bit integer is signed and adding > one to it will "wrap around" if the value remains an integer? Because these > two examples might make double precision or a wider integer (if available) > seem the correct answer, suppose it's only one element of a gigantic array?
> Let's now talk about complex values.... > It's most important that the rules be simple, and (preferably) close to common languages. I'd suggest C. In my book, anyone who carelessly mixes floats and ints deserves whatever punishment the language metes out. I've done numeric work in languages where casting was by request _only_ (e.g., Limbo, for Inferno), and I found, to my surprise, that automatic type casting is only a mild convenience. Writing code with manual typecasting is surprisingly easy. Since automatic typecasting only buys a small improvement in ease of use, I'd want to be extremely sure that it doesn't cause many problems. It's very easy to write some complicated set of rules that wastes more time (in the form of unexpected, untraceable bugs) than it saves. By the way, automatic downcasting has hidden problems if Python is ever set to trap underflow errors. I had a program that would randomly crash every 10th (or so) time I ran it with a large dataset (1000x1000 linear algebra). After days of hair-pulling, I found that the matrix was being converted from double to float at one step, and about 1 in 10,000,000 of the entries was too small to represent as a single precision number. That very rare event would underflow, be trapped, and crash the program with a floating point exception. From hinsen at cnrs-orleans.fr Wed Feb 9 12:17:30 2000 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Wed, 9 Feb 2000 18:17:30 +0100 Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup. In-Reply-To: <38A19201.8A43EC01@maxtal.com.au> (message from skaller on Thu, 10 Feb 2000 03:12:49 +1100) References: <200002081956.UAA03241@chinon.cnrs-orleans.fr> <38A19201.8A43EC01@maxtal.com.au> Message-ID: <200002091717.SAA10604@chinon.cnrs-orleans.fr> > silently). Consider a function
>
> k0 = 100
> k = 99
> while k < k0:
>     ..
>     k0 = k
>     k = ...
>
> which refines a calculation until the measure k stops decreasing. > This algorithm may terminate when k is a float, but _fail_ when > k is a double -- the extra precision may cause the algorithm I'd call this a buggy implementation. Convergence criteria should be explicit and not rely on the internal representation of data types. Neither Python nor C guarantees you any absolute bounds for precision and value range, and even languages that do (such as Fortran 9x) only promise to give you a data type that is *at least* as big as your specification. > programming is all about. Numerical programmers need to know > how big numbers are, and how much significance they have, > and optimise calculations accordingly -- sometimes by _using_ > the precision of the working types to advantage. If you care at all about portability, you shouldn't even think about this. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From amullhau at zen-pharaohs.com Wed Feb 9 12:21:37 2000 From: amullhau at zen-pharaohs.com (Andrew P. Mullhaupt) Date: Wed, 9 Feb 2000 12:21:37 -0500 Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup.
References: <200002081956.UAA03241@chinon.cnrs-orleans.fr> <38A19201.8A43EC01@maxtal.com.au> Message-ID: <054301bf7322$23cbbbe0$5063cb0a@amullhau> > Konrad Hinsen wrote: > > > But back to precision, which is also a popular subject: > > but one which even numerical programmers don't seem to > understand ... Some do, some don't. > It is NOT safe to convert floating point from a lower to a higher > number > of bits. It is usually safe. Extremely safe. Safe enough that code in which it is _not_ safe is badly designed. > ALL such conversions should be removed for this reason: any > conversions should have to be explicit. I really hope not. A generic function with six different arguments becomes an interesting object in a language without automatic conversions. Usually, a little table-driven piece of code has to cast the arguments into conformance, and then multiple versions of similar code are applied. > which refines a calculation until the measure k stops decreasing. > This algorithm may terminate when k is a float, but _fail_ when > k is a double -- the extra precision may cause the algorithm > to perform many useless iterations, in which the precision > of the result is in fact _lost_ due to rounding error. This is a classic bad programming practice and _it_ is what should be eliminated. It is a good (and required, if you work for me) practice that:

1. All iterations should have termination conditions which are correct; that is, prevent extra iterations. This is typically precision sensitive. But that is simply something that has to be taken into account when writing the termination condition.

2. All iterations should be protected against an unexpectedly large number of iterates taking place.

There are examples of iterations which are intrinsically stable in lower precision and not in higher precision (Brun's algorithm), but those are quite rare in practice. (Note that the Ferguson-Forcade algorithm, as implemented by Lenstra, Odlyzko, and others, has completely supplanted any need to use Brun's algorithm as well.) When an algorithm converges because of lack of precision, it is because the rounding error regularizes the problem. This is normally referred to in the trade as "idiot regularization". It is, in my experience, invariably better to actually choose a regularization that is specific to the computation than to rely on rounding effects which might be different from machine to machine. In particular, your type of example is in for serious programmer enjoyment hours on Intel or AMD machines, which have 80-bit wide registers for all the floating point arithmetic. Supporting needless machine dependency is not something to argue for, either, since the Cray-style floating point arithmetic has a bad error model. Even Cray has been beaten into submission on this, finally releasing IEEE compliant processors, but only just recently. > to put this another way, it is generally bad to keep more digits (bits) > of precision than you actually have I couldn't agree less. The exponential function and inner product accumulation are famous examples of why extra bits are important in intermediate computations. It's almost impossible to have an accurate exponential function without using extra precision - which is one reason why so many machines have extra bits in their FPUs and there is an IEEE "extended" precision type. The storage history effects which result from temporarily increased precision are well understood and mild, in that they violate no common error models used in numerical analysis.
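To make the two practices concrete, here is the k/k0 refinement loop from earlier in the thread with both safeguards added. This is only a sketch: the Newton step for sqrt(2) stands in for the real refinement, and the tolerance value is arbitrary.

eps = 1e-10                   # explicit convergence tolerance
maxit = 100                   # guard against runaway iteration
k0, k = 100.0, 99.0
i = 0
while k0 - k > eps and i < maxit:
    k0 = k
    k = 0.5 * (k + 2.0 / k)   # illustrative refinement step
    i = i + 1
print k, i                    # converges to sqrt(2) well before maxit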
And for those few cases where testing for equality is needed for debugging purposes, many systems permit you to impose truncation and eliminate storage history issues. Later, Andrew Mullhaupt From hinsen at cnrs-orleans.fr Wed Feb 9 12:31:00 2000 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Wed, 9 Feb 2000 18:31:00 +0100 Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup. In-Reply-To: References: Message-ID: <200002091731.SAA10614@chinon.cnrs-orleans.fr> > > In addition to backwards compatibility, there is another argument for > > keeping indexing behaviour as it is: compatibility with other Python > > sequence types. > > I claim the current Numeric is INconsistent with other Python sequence > types:
>
> >>> x = [1, 2, 3, 4, 5]
> >>> y = x[2:5]
> >>> x
> [1, 2, 3, 4, 5]
> >>> y
> [3, 4, 5]
> >>> y[1] = 7
> >>> y
> [3, 7, 5]
> >>> x
> [1, 2, 3, 4, 5]
>
> So, y is a copy of x[2:5], not a reference. Good point. So we can't be consistent with all properties of other Python sequence types. Which reminds me of a very different compatibility problem in NumPy that can (and should) be removed: the rules for integer division and remainders for negative arguments are not the same. NumPy inherits the C rules, Python has its own. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From gpk at bell-labs.com Wed Feb 9 13:25:00 2000 From: gpk at bell-labs.com (Greg Kochanski) Date: Wed, 09 Feb 2000 13:25:00 -0500 Subject: [Numpy-discussion] Re: [Matrix-SIG] Re: Matrix-SIG digest, Vol 1 #364 - 9 msgs References: <20000209064911.9404C1CD3C@dinsdale.python.org> <38A19236.51D1879F@bell-labs.com> <053b01bf731f$30118c20$5063cb0a@amullhau> Message-ID: <38A1B0FC.168FDCB5@bell-labs.com> "Andrew P. Mullhaupt" wrote: > > It's most important that the rules be simple, and (preferably) close > > to common languages. I'd suggest C. > > That is a good example of a language which has a pretty weird history on > this particular matter. True. The only real advantage of C is that so many people are used to it. Don't forget the human element. FORTRAN would also be a reasonable choice. There's a big cost to learning a new language; if it gets too big, people simply won't use Python. > > Since automatic typecasting only buys a small improvement > > in ease of use, I'd want to be extremely sure that it doesn't cause > > many problems. > > Au contraire. It is a huge win. Try writing a "generic" function with six > arguments which can sensibly be integers, or single or double precision > variables. If you have to test variables to see what they are, then you have > to essentially write a table-driven typecaster. If, as in Fortran, you have > to write different functions for different argument types, then you have the > dangerous programming practice of having several different pieces of code > which do essentially the same computation. While that's nice to say, it doesn't really translate completely to practice. A lot of functions don't make sense with arbitrary objects; and some require subtle changes. For instance, who wants a matrix inversion function that operates on integers, using integer division inside?
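The division mismatch is easy to demonstrate; the Numeric lines below assume the C-style truncation Konrad describes above, while plain Python floors:

>>> -7 / 2                  # Python floors toward negative infinity
-4
>>> -7 % 2
1
>>> from Numeric import array
>>> (array([-7]) / 2)[0]    # Numeric inherits C truncation toward zero
-3
>>> (array([-7]) % 2)[0]
-1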
Lots of functions have DOUBLE_EPS or FLOAT_EPS embedded inside them. One has to change the small number when you change the data type. I'll grant you that running things with both doubles and floats is often useful. I'd be happy with automatic upcasting among them. I'd be moderately happy with upcasting among the integers. I really don't see any crying need to mix integers with floating point numbers. I'd like some examples to make me believe that mixing ints and floats is a 'huge win'. From hinsen at cnrs-orleans.fr Wed Feb 9 13:39:46 2000 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Wed, 9 Feb 2000 19:39:46 +0100 Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup. In-Reply-To: <34E36C05935CD311AE5000A0C9B6B0BF07D16D@hplex3.hpl.hp.com> (beausol@exch.hpl.hp.com) References: <34E36C05935CD311AE5000A0C9B6B0BF07D16D@hplex3.hpl.hp.com> Message-ID: <200002091839.TAA10737@chinon.cnrs-orleans.fr> > (1) The extent of the support for LAPACK. Do we want to stick with LAPACK > Lite? There has been a full LAPACK interface for a long while, of which LAPACK Lite is just the subset that is needed for supporting the high-level routines in the module LinearAlgebra. I seem to have lost the URL to the full version, but it's on my disk, so I can put it onto my FTP server if there is a need. > (2) The storage format. If we've still got row-ordered matrices under the > hood, and we want to use native LAPACK libraries that were compiled using > column-major format, then we'll have to be careful to set all of the flags > correctly. This isn't going to be a big deal, _unless_ NumPy will support > more of LAPACK when a native library is available. Then, of course, there The low-level interface routines don't take care of this. It's the high-level Python code (module LinearAlgebra) that sets the transposition argument correctly. That looks like a good compromise to me. > (3) Through the judicious use of header files with compiler-dependent flags, > we could accommodate the various naming conventions used when the FORTRAN > libraries were compiled (e.g., sgetrf_ or SGETRF). That's already done! Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From pitts.todd at mayo.edu Wed Feb 9 14:05:02 2000 From: pitts.todd at mayo.edu (Pitts, Todd A., Ph.D.) Date: Wed, 9 Feb 2000 13:05:02 -0600 (CST) Subject: [Numpy-discussion] Upcasting Message-ID: Here are my two cents' worth on the subject... Most of what has been said in this thread (at least what I have read) I find very valuable. Apparently, many people have been thinking about the subject. I view this problem as inherent in a language without an Lvalue (like C has) that allows a very explicit and clear definition, from the programmer's point of view, as to the size of the container you are going to put things in. The language in many cases simply returns an object to you and has made some decision as to what you "needed" or "wanted". Of course, this is one of the things that makes languages such as Numerical Python, Matlab, IDL, etc. very nice for prototyping and investigating. In many cases this decision will be adequate or acceptable. In some, quite simply, it will not be.
At this point the programmer has to have a good means of managing this decision himself. If memory is not a constraint, I can think of very few situations (none, actually) where I would choose to go with something other than the Numerical Python default of double. In general, that is what you get when creating python arrays unless you make some effort to obtain some other type. However, in some important (read "cases that affect the author") situations memory is a very critical constraint. Typically, in Numerical Python I use 4-byte floats. In fact, one of the reasons I use Numerical Python is because I don't *need* doubles, and Matlab, for example, is really only set up to work gracefully with doubles. I do *need* to conserve memory as I deal with very large data sets. It seems the question we are discussing is not really "what *should* be done in terms of casting?" but "what provides good enough decisions much of the time *and* a graceful way to manage the decisions when 'good enough' no longer applies to you?" Currently, this is not a trivial thing to manage. Reading in a 100 MB data set and multiplying by the Python scalar 2 produces a 200 MB data set. I manage this by wrapping the 2 in an array. This happens, of course, all the time. Having to do this once is not a big deal, but doing it everywhere in code that uses floats makes for cluttered code -- not something which I expect to have to write in an otherwise very elegant and concise language. Also, I often find myself trudging through code looking for the subtlety that converted my floats to doubles, doubled my memory usage and then caused subsequent float-only routines to error out. To those who are constrained to use floats this is awkward and time consuming. To those who are not I would say -- use doubles. The flag that causes an array to be a "space saving array" seems to be a temporary fix (that doesn't mean it was a bad idea -- just that it feels messy and effectively adds complexity that shouldn't be there). It also merely postpones the problem as I understand it -- what happens when I multiply two space saving arrays? We simply will never get away from situations where we have to manage the interaction ourselves, and so we should be careful not to make that management so awkward (and currently I think it is awkward) that the floats, bytes, shorts, etc. become marginalized in their utility.
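For concreteness, a short session showing both the promotion Todd describes and the wrapping trick, assuming the coercion rules discussed in this thread (a bare Python scalar counts as double precision):

>>> from Numeric import array, zeros
>>> a = zeros((3,), 'f')                 # single precision
>>> (a * 2.0).typecode()                 # a bare Python scalar promotes...
'd'
>>> (a * array(2.0, 'f')).typecode()     # ...wrapping it in an array does not
'f'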
My suggestion is to go with the rule that a simple hierarchy (in which downcasting is the rule)

longs
integers
shorts
bytes
cardinals
booleans
doubles
complex doubles   <--- default
floats
complex floats

for the most part makes good decisions: principally because people who are not constrained to conserve memory will use the larger, default types all the time and not wince. They don't *need* floats or bytes. If anyone gives them a float, a simple astype('d') or astype('D') to make sure it becomes a double lets them go on their way. Types like integers and bytes are effectively treated as being precise. If you are constrained to conserve memory by staying with floats or bytes instead of just reading things in from disk and making them doubles, it will not be so awkward to manage the types in large programs. If I use someone's code and they have a scalar anywhere in it at some point, even if I (or they) cast the output, memory usage swells at least for intermediate calculations. Effectively, Python *has* 4-byte floats but programming with them is awkward. This means, of course, that multiplying a float array by a double array produces a float. Multiplying a double array by anything above it produces a double, etc. For my work, if I have a float anywhere in the calculation I don't believe precision beyond that in the output, so getting a float back is reasonable. I know that some operations produce "more precision" and so I would cast the array if I needed to take advantage of that. Perhaps the downcasting is *not* the way to go. However, I definitely think the current awkwardness should be eliminated. I hope my comments will not be perceived as being critical of the original language designers. I find python to be very useful or I wouldn't have bothered to make the comments at all. -Todd Pitts From beausol at exch.hpl.hp.com Wed Feb 9 14:16:58 2000 From: beausol at exch.hpl.hp.com (Beausoleil, Raymond) Date: Wed, 9 Feb 2000 11:16:58 -0800 Subject: [Numpy-discussion] RE: [Matrix-SIG] An Experiment in code-cleanup. Message-ID: <34E36C05935CD311AE5000A0C9B6B0BF07D16F@hplex3.hpl.hp.com> From: Konrad Hinsen [mailto:hinsen at cnrs-orleans.fr] > > (1) The extent of the support for LAPACK. Do we want to stick > > with LAPACK Lite? > > There has been a full LAPACK interface for a long while, of which > LAPACK Lite is just the subset that is needed for supporting the > high-level routines in the module LinearAlgebra. I seem to have lost > the URL to the full version, but it's on my disk, so I can put it > onto my FTP server if there is a need. Yes, I'd like to get a copy! You can simply e-mail it to me, if you'd prefer. > > (2) The storage format. If we've still got row-ordered matrices > > under the hood, and we want to use native LAPACK libraries that > > were compiled using column-major format, then we'll have to be > > careful to set all of the flags correctly. This isn't going to > > be a big deal, _unless_ NumPy will support more of LAPACK when a > > native library is available. Then, of course, there ... > > The low-level interface routines don't take care of this. It's the > high-level Python code (module LinearAlgebra) that sets the > transposition argument correctly. That looks like a good compromise > to me. I'll have to look at this more carefully. Due to my relative lack of Python experience, I hacked the C code so that Fortran routines could be called instead, producing the expected results. > > (3) Through the judicious use of header files with compiler- > > dependent flags, we could accommodate the various naming > > conventions used when the FORTRAN libraries were compiled (e.g., > > sgetrf_ or SGETRF). > > That's already done! Where? Even in the latest f2c'd source code that I downloaded from SourceForge, I see all names written using the lower-case-trailing-underscore convention (e.g., dgeqrf_). The Intel MKL was compiled from Fortran source using the upper-case-no-underscore convention (e.g., DGEQRF). If I replace dgeqrf_ with DGEQRF in dlapack_lite.c (and a few other tweaks), then the subsequent link with the IMKL succeeds. ============================ Ray Beausoleil Hewlett-Packard Laboratories mailto:beausol at hpl.hp.com Vox: 425-883-6648 Fax: 425-883-2535 HP Telnet: 957-4951 ============================ From godzilla at netmeg.net Wed Feb 9 15:47:34 2000 From: godzilla at netmeg.net (Les Schaffer) Date: Wed, 9 Feb 2000 15:47:34 -0500 (EST) Subject: [Numpy-discussion] digest Message-ID: <14497.53862.853166.521584@gargle.gargle.HOWL> just switched over from matrix-sig to numpy-discussion. in the process i changed to the digest version and got my first issue.
is it possible to distribute the digests properly formatted as multipart/digests as per rfc822 and company?

having such a formatted digest makes it very easy when using an emailer like VM in emacs: VM automatically displays the digest as a virtual folder, allowing one to browse all the posts in a given digest very quickly and easily.

don't know whether the other (lackluster) emailers out there will handle it so nicely, but i don't think the extra required markers will interfere with your reading of the digests at all. highly recommended.

i'd be glad to work with whoever has control over this to ensure that the proper markers get placed into the digests.

les schaffer

From hinsen at cnrs-orleans.fr Wed Feb 9 15:58:47 2000
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Wed, 9 Feb 2000 21:58:47 +0100
Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup.
In-Reply-To: <34E36C05935CD311AE5000A0C9B6B0BF07D16F@hplex3.hpl.hp.com> (beausol@exch.hpl.hp.com)
References: <34E36C05935CD311AE5000A0C9B6B0BF07D16F@hplex3.hpl.hp.com>
Message-ID: <200002092058.VAA10798@chinon.cnrs-orleans.fr>

> > onto my FTP server if there is a need.
>
> Yes, I'd like to get a copy! You can simply e-mail it to me, if you'd
> prefer.

OK, coming soon...

> I'll have to look at this more carefully. Due to my relative lack of Python
> experience, I hacked the C code so that Fortran routines could be called
> instead, producing the expected results.

That's fine, you can simply replace the f2c-generated code by Fortran-compiled code, as long as the calling conventions are the same. I have used optimized BLAS as well on some machines.

> Where? Even in the latest f2c'd source code that I downloaded from
> SourceForge, I see all names written using the
> lower-case-trailing-underscore convention (e.g., dgeqrf_). The Intel MKL was

Sure, f2c generates the underscores. But the LAPACK interface code (the one I'll send you, and also LAPACK Lite) supports both conventions, controlled by the preprocessor symbol NO_APPEND_FORTRAN (maybe not the most obvious name). On the other hand, there is no support for uppercase names; that convention is not used in the Unix world. But I suppose it could be added by machine transformation of the code.

Konrad.
--
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen at cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69
Rue Charles Sadron                       | Fax: +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------

From da at ski.org Wed Feb 9 16:23:22 2000
From: da at ski.org (David Ascher)
Date: Wed, 9 Feb 2000 13:23:22 -0800
Subject: [Numpy-discussion] digest
References: <14497.53862.853166.521584@gargle.gargle.HOWL>
Message-ID: <00cd01bf7343$ea55ae30$0100000a@ski.org>

> just switched over from matrix-sig to numpy-discussion. in the process
> i changed to the digest version and got my first issue.
>
> is it possible to distribute the digests properly formatted as
> multipart/digests as per rfc822 and company?

Did you try to edit your configuration on the Mailman control panel? There is a choice between MIME and plain-text digests.

--david ascher

From skaller at maxtal.com.au Wed Feb 9 17:04:13 2000
From: skaller at maxtal.com.au (skaller)
Date: Thu, 10 Feb 2000 09:04:13 +1100
Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup.
References: <200002081956.UAA03241@chinon.cnrs-orleans.fr> <38A19201.8A43EC01@maxtal.com.au> <200002091717.SAA10604@chinon.cnrs-orleans.fr>
Message-ID: <38A1E45D.6E0EF317@maxtal.com.au>

Konrad Hinsen wrote:
>
> > silently). Consider a function
> >
> > k0 = 100
> > k = 99
> > while k < k0:
> >     ..
> >     k0 = k
> >     k = ...
> >
> > which refines a calculation until the measure k stops decreasing.
> > This algorithm may terminate when k is a float, but _fail_ when
> > k is a double -- the extra precision may cause the algorithm
>
> I'd call this a buggy implementation. Convergence criteria should be
> explicit and not rely on the internal representation of data types.
> If you care at all about portability, you shouldn't even think about
> this.

But sometimes you DON'T care about portability. Sometimes, you want the best result the architecture can support, and so you need to perform a portable computation of an architecture dependent value.

--
John (Max) Skaller, mailto:skaller at maxtal.com.au
10/1 Toxteth Rd Glebe NSW 2037 Australia
voice: 61-2-9660-0850
homepage: http://www.maxtal.com.au/~skaller
download: ftp://ftp.cs.usyd.edu/au/jskaller

From da at ski.org Wed Feb 9 18:50:03 2000
From: da at ski.org (David Ascher)
Date: Wed, 9 Feb 2000 15:50:03 -0800
Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup.
References: <14496.42942.5355.849670@brant.geog.ubc.ca> <03fe01bf72cd$c040d640$5063cb0a@amullhau>
Message-ID: <037d01bf7358$630b8980$0100000a@ski.org>

> it as an example for people who want to learn stuff about mmap. As it
> stands, there was some similar code I was able to produce at some point. I
> forget who here has a copy, maybe Konrad, maybe David Ascher.
>
> Later,
> Andrew Mullhaupt

I did have some of that code, but it was almost 3 years ago and five computers ago. In other words, it's *somewhere*. I'll start a grep, but don't hold your breath...

--da

From da at ski.org Thu Feb 10 01:52:21 2000
From: da at ski.org (David Ascher)
Date: Wed, 9 Feb 2000 22:52:21 -0800
Subject: [Numpy-discussion] Binary distribution available
References: <14496.42942.5355.849670@brant.geog.ubc.ca> <03fe01bf72cd$c040d640$5063cb0a@amullhau>
Message-ID: <051f01bf7393$61bff4e0$0100000a@ski.org>

With Travis' wise advice, I appear to have succeeded in putting forth a binary installation of Numerical-15.2. Due to a bug in distutils, this is an 'install in place' package, instead of a 'run python setup.py install' package. So, unzip the file in your main Python tree, and it should 'work'. Let me (and Paul and Travis) know if it doesn't.

Download is available from the main page (http://sourceforge.net/project/?group_id=1369 look for [zip]) or from http://download.sourceforge.net/numpy/python-numpy-15.2.zip

--david ascher

From gvwilson at nevex.com Thu Feb 10 13:28:51 2000
From: gvwilson at nevex.com (gvwilson at nevex.com)
Date: Thu, 10 Feb 2000 13:28:51 -0500 (EST)
Subject: [Numpy-discussion] re: scientific Python publishing venue
Message-ID:

Hi, folks. A former colleague of mine is now editing a magazine devoted to scientific computing, and is looking for articles. If you're doing something scientific with Python, and want to tell the world about it, please give me a shout, and I'll forward more information.

Greg Wilson
http://www.software-carpentry.com

From archiver at db.geocrawler.com Thu Feb 17 12:34:11 2000
From: archiver at db.geocrawler.com (andrew x swan)
Date: Thu, 17 Feb 2000 09:34:11 -0800
Subject: [Numpy-discussion] more speed?
Message-ID: <200002171734.JAA08011@www.geocrawler.com>

This message was sent from Geocrawler.com by "andrew x swan".
Be sure to reply to that address.

hi - i've only just started using python and numpy... the program i wrote below runs much more slowly than a fortran equivalent. ie. on a dataset where the order of the matrix is (3325,3325), python took this long:

362.25user 0.74system 6:09.78elapsed 98%CPU

and fortran took this long:

2.68user 1.12system 0:03.89elapsed 97%CPU

is this because the element by element calculations involved are contained in python for loops?

thanks

#!/usr/bin/python
from Numeric import *

def nrm(pedigree):
    # build the relationship matrix from a pedigree of
    # (animal, sire, dam) triples; 0 marks an unknown parent
    n_animals = len(pedigree) + 1
    nrm = zeros((n_animals, n_animals), Float)
    for i in xrange(1, n_animals):
        isire = pedigree[i-1][1]
        idam = pedigree[i-1][2]
        nrm[i,i] = 1.0 + 0.5 * nrm[isire,idam]
        for j in xrange(i+1, n_animals):
            jsire = pedigree[j-1][1]
            jdam = pedigree[j-1][2]
            nrm[j,i] = 0.5 * (nrm[jsire,i] + nrm[jdam,i])
            nrm[i,j] = nrm[j,i]
    return nrm

if __name__ == '__main__':
    test_ped = [(1,0,0),(2,0,0),(3,1,0),(4,1,2),
                (5,3,4),(6,1,4),(7,5,6)]
    a = nrm(test_ped)
    print a

Geocrawler.com - The Knowledge Archive

From da at ski.org Thu Feb 17 18:25:57 2000
From: da at ski.org (David Ascher)
Date: Thu, 17 Feb 2000 15:25:57 -0800
Subject: [Numpy-discussion] more speed?
References: <200002171734.JAA08011@www.geocrawler.com>
Message-ID: <04d501bf799e$5a82abd0$0100000a@ski.org>

From: andrew x swan
> python took this long:
>
> 362.25user 0.74system 6:09.78elapsed 98%CPU
>
> and fortran took this long:
>
> 2.68user 1.12system 0:03.89elapsed 97%CPU
>
> is this because the element by element
> calculations involved are contained in python for
> loops?

yes.

--david ascher

From syrus at long.ucsd.edu Thu Feb 17 18:27:29 2000
From: syrus at long.ucsd.edu (Syrus Nemat-Nasser)
Date: Thu, 17 Feb 2000 15:27:29 -0800 (PST)
Subject: [Numpy-discussion] more speed?
In-Reply-To: <200002171734.JAA08011@www.geocrawler.com>
Message-ID:

On Thu, 17 Feb 2000, andrew x swan wrote:
> is this because the element by element
> calculations involved are contained in python for
> loops?

Hi Andrew! I've only just begun using Numeric Python, but I'm a long-time user of GNU Octave and a sporadic user of MatLab. In general, for loops kill the execution speed of interpreted environments like Numpy and Octave. The high speed comes when one uses vector operations such as matrix multiplication. If you can vectorize your code, meaning replace all the loops with matrix operations, you should see speed equivalent to Fortran for large data sets. As far as I know, you will never see an interpreted language match a compiled one in the execution of for loops.

Thanks. Syrus.

--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Syrus Nemat-Nasser    UCSD Physics Dept.
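Syrus's advice can be made concrete with a toy comparison. The sketch below uses made-up operands and does not attempt to vectorize the pedigree code above (its later rows depend on earlier ones, which makes that loop genuinely hard to vectorize); it only shows the general pattern of trading an element-by-element Python loop for whole-array operations:

    from Numeric import arange, zeros, Float

    a = arange(100000) * 1.0    # made-up operands, promoted to doubles
    b = arange(100000) * 2.0

    # loop form: one trip through the interpreter per element
    c = zeros(a.shape, Float)
    for i in xrange(len(a)):
        c[i] = a[i] + 2.0 * b[i]

    # vectorized form: the same arithmetic carried out in C
    c = a + 2.0 * b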
From peter at eexpc.eee.nott.ac.uk Fri Feb 18 12:05:06 2000
From: peter at eexpc.eee.nott.ac.uk (Peter Chang)
Date: Fri, 18 Feb 2000 17:05:06 +0000 (GMT)
Subject: [Numpy-discussion] numpy documentation - alternative format?
Message-ID:

Hi there,

I've just started to use python and numpy and want to print out the numpy document but the PDF file has a strange aspect ratio which makes it hard to print it as 2up on A4 paper. (I've tried hacking about with the postscript generated by xpdf but it seems that there is no global setting for page size!)

Could the authors please provide alternative formats for the doc, e.g. as postscript files sized for A4 and letter so that people can print them out more easily?

Thanks
Peter

From roitblat at hawaii.edu Fri Feb 18 12:14:22 2000
From: roitblat at hawaii.edu (Herbert L. Roitblat)
Date: Fri, 18 Feb 2000 07:14:22 -1000
Subject: [Numpy-discussion] numpy documentation - alternative format?
Message-ID: <03fd01bf7a33$9b046320$8fd6afcf@0gl1u.pixi.com>

Adobe Acrobat has a shrink-to-fit option in its print menu. I'm not sure if it comes with their free reader. Try printing as a 1up. It seems a small adaptation.

HLR

-----Original Message-----
From: Peter Chang
To: Numpy-discussion at lists.sourceforge.net
Date: Friday, February 18, 2000 7:09 AM
Subject: [Numpy-discussion] numpy documentation - alternative format?

>Hi there,
>
>I've just started to use python and numpy and want to print out the numpy
>document but the PDF file has a strange aspect ratio which makes it hard
>to print it as 2up on A4 paper. (I've tried hacking about with the
>postscript generated by xpdf but it seems that there is no global setting
>for page size!)
>
>Could the authors please provide alternative formats for the doc, e.g.
>as postscript files sized for A4 and letter so that people can print them
>out more easily?
>
>Thanks
> Peter
>
>_______________________________________________
>Numpy-discussion mailing list
>Numpy-discussion at lists.sourceforge.net
>http://lists.sourceforge.net/mailman/listinfo/numpy-discussion
>

From peter at eexpc.eee.nott.ac.uk Fri Feb 18 12:18:46 2000
From: peter at eexpc.eee.nott.ac.uk (Peter Chang)
Date: Fri, 18 Feb 2000 17:18:46 +0000 (GMT)
Subject: [Numpy-discussion] numpy documentation - alternative format?
In-Reply-To: <03fd01bf7a33$9b046320$8fd6afcf@0gl1u.pixi.com>
Message-ID:

On Fri, 18 Feb 2000, Herbert L. Roitblat wrote:
> Adobe Acrobat has a shrink-to-fit option in its print menu. I'm not sure
> if it comes with their free reader.

Is it available for Linux? I'll check it out...

> Try printing as a 1up. It seems a small adaptation.

I'm trying to save dead trees, i.e. print out 40-odd pages instead of 90-odd.

Peter

From sanner at scripps.edu Sat Feb 19 22:50:16 2000
From: sanner at scripps.edu (Michel Sanner)
Date: Sat, 19 Feb 2000 19:50:16 -0800
Subject: [Numpy-discussion] Numeric Python under IRIX646
Message-ID: <1000219195017.ZM77150@noah.scripps.edu>

Hi There,

I just tried to add SGI running IRIX6.5 to the collection of Unix boxes I will support, and I ran into the following problem:

If I compile Python -O2, loading the Numeric extensions dumps core; if I compile Python -g, it works just fine, and this regardless of whether Numeric is compiled -g or -O2.

After I re-compiled Objects/complexobject.o using -g (everything else being compiled -O2) I got it to work... did anyone else out there see this kind of behavior?

I also post this to psa-members just in case this might be Python-related.

-Michel

--
-----------------------------------------------------------------------
>>>>>>>>>> AREA CODE CHANGE <<<<<<<<< we are now 858 !!!!!!!

Michel F. Sanner Ph.D.                   The Scripps Research Institute
Assistant Professor                      Department of Molecular Biology
                                         10550 North Torrey Pines Road
Tel. (858) 784-2341                      La Jolla, CA 92037
Fax. (858) 784-2860

sanner at scripps.edu
http://www.scripps.edu/sanner
-----------------------------------------------------------------------

From mitch.chapman at mciworld.com Mon Feb 21 13:01:59 2000
From: mitch.chapman at mciworld.com (Mitch Chapman)
Date: Mon, 21 Feb 2000 11:01:59 -0700
Subject: [Numpy-discussion] Re: [PSA MEMBERS] Numeric Python under IRIX646
In-Reply-To: <1000219195017.ZM77150@noah.scripps.edu>
References: <1000219195017.ZM77150@noah.scripps.edu>
Message-ID: <00022111060701.00593@mchapmanpc>

On Sat, 19 Feb 2000, Michel Sanner wrote:
> Hi There,
>
> I just tried to add SGI running IRIX6.5 to the collection of Unix boxes I
> will support, and I ran into the following problem:
>
> If I compile Python -O2, loading the Numeric extensions dumps core;
> if I compile Python -g, it works just fine, and this regardless of whether
> Numeric is compiled -g or -O2.
>
> After I re-compiled Objects/complexobject.o using -g (everything else being
> compiled -O2) I got it to work...
>
> did anyone else out there see this kind of behavior?

I saw exactly this behavior just last Friday afternoon. After all of Python was recompiled with -g the bus error went away. Thanks for pointing out that only complexobject needs to be compiled with -g. It didn't occur to me to try this, despite the location of the bus error, because it was possible to exercise complex objects interactively with no problems.

BTW I don't know whether you were compiling N32 or N64. In our case N32 created the bus error.

--
Mitch Chapman
mitch.chapman at mciworld.com

From hinsen at cnrs-orleans.fr Fri Feb 25 07:26:58 2000
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Fri, 25 Feb 2000 13:26:58 +0100
Subject: [Numpy-discussion] Re: [Matrix-SIG] Numeric Array: adding a 0-D array to a cell in a 2-D array
In-Reply-To: <019901bf7e93$8771d5e0$8fd6afcf@0gl1u.pixi.com> (roitblat@hawaii.edu)
References: <019901bf7e93$8771d5e0$8fd6afcf@0gl1u.pixi.com>
Message-ID: <200002251226.NAA14777@chinon.cnrs-orleans.fr>

> We get the type error from trying to set the matrix element with a matrix
> element (apparently). In the old version (1.9) on our NT box,
> temp=a[kwd,kwd] results in temp being an int type. How can we either cast
> the temp to an int or enable what we really want, which is to add an int to
> a[kwd,kwd], as in a[kwd,kwd] = a[kwd,kwd] + jwd ?
>
> Do we have a bad version of Numeric?

Maybe an experimental version. If you check the archives of this mailing list, you can find a recent discussion about proposed modifications. One of them was to eliminate the automatic conversion of rank-0 arrays to scalars, in order to prevent type promotion. Perhaps this proposal was implemented in the version you have.

Note to the NumPy maintainers: please announce all new releases on this list, mentioning changes, especially those that affect backward compatibility. As a maintainer of code that makes heavy use of NumPy, I keep getting questions and bug reports caused by some new NumPy release that I haven't even heard of. A recent example is the change of location of the header files; C modules using arrays now have to include Numeric/arrayobject.h instead of simply arrayobject.h. I can understand this change (although I am not sure it's important enough to break compatibility), but I'd have preferred to learn about it directly and as early as possible. It's really no fun working through a 2 KB bug report sent by someone with zero knowledge of C.

Konrad.
--
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen at cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69
Rue Charles Sadron                       | Fax: +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------

From Oliphant.Travis at mayo.edu Fri Feb 25 15:23:01 2000
From: Oliphant.Travis at mayo.edu (Travis Oliphant)
Date: Fri, 25 Feb 2000 14:23:01 -0600 (CST)
Subject: [Numpy-discussion] Array-casting problem.
Message-ID:

Hi Herb,

It has taken a while for me to respond to this, but your problems here illustrate exactly the kinds of difficulties one encounters with the current NumPy coercion rules: you do not have a bad version of Numeric. The behavior you describe is exactly what "should" happen, though it needs to be fixed. I'll trace for you exactly what is going on, as it could be illustrative to others:

>>> a = zeros((5,5),'b')

# You've just created a 5x5 byte array that follows "normal" coercion
# rules, filled with zeros.

>>> a[3,3] = 8

# This line copies the rank-0 array of type 'b' created from the Python
# integer 8 (by a direct coercion in C) into element (3,3) of matrix a.

>>> temp = a[3,3]

# This selects out the rank-0 array of typecode 'b' at position (3,3).
# As of 15.2 this is no longer changed to a scalar. Note that rank-0
# arrays act a lot like scalars, but because there is not a one-to-one
# correspondence between the Python scalars and rank-0 arrays, this is
# not automatically converted to a Python scalar (this is a change in
# 15.2).

>>> temp = temp + 3

# This is the problem line for you right here. Something is wrong,
# though, since it should not be a problem.
# You are adding a rank-0 array of typecode 'b' to a Python integer,
# which is interpreted by Numeric as a rank-0 array of typecode 'l'.
# The result should be a Python integer. For some reason this is
# returning an array of typecode 'i' (which does not get automatically
# converted to a Python scalar).

>>> a[3,3] = temp

# This would work fine if temp were the Python scalar it should be.
# Right now, assignment doesn't let you assign an array of a "larger"
# type to elements of a smaller type (except for Python scalars). Since
# temp is (incorrectly, I think) a type 'i' rank-0 array, it does not
# let you make the assignment. At any rate, it is inconsistent to let
# you assign Python scalars but not rank-0 arrays of arbitrary
# precision; this should be fixed. It is also a problem that temp + 3
# returns an array of typecode 'i'.

I will look into fixing the above problems this example points out. Of course, it could also be fixed by having long integers lower in the coercion tree than byte arrays.

Thanks for the feedback,

Travis Oliphant

From Oliphant.Travis at mayo.edu Fri Feb 25 15:57:34 2000
From: Oliphant.Travis at mayo.edu (Travis Oliphant)
Date: Fri, 25 Feb 2000 14:57:34 -0600 (CST)
Subject: [Numpy-discussion] Casting problems with new version of NumPy.
Message-ID:

The code sent by Herbert Roitblat pointed out some inconsistencies in the current NumPy that I've fixed with two small changes:

1) Longs can no longer be safely cast to Ints (this is not safe on
   64-bit machines anyway) -- this makes Numeric more consistent with
   how it interprets Python integers.

2) Automatic casting will be done when using rank-0 arrays to set
   elements of a Numeric array, to be consistent with the behavior for
   Python scalars.
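For illustration, here is the kind of session these two changes are meant to make work (a hypothetical sketch of the intended behavior, not output captured from the patched build):

>>> from Numeric import zeros
>>> a = zeros((5,5), 'b')
>>> temp = a[3,3] + 3   # rank-0 'b' array plus a Python integer
>>> a[3,3] = temp       # rank-0 result is now cast down to 'b' on assignment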
The changes are in CVS right now, but are simple to change back if there is a problem.

-Travis

From collins at rushe.aero.org Mon Feb 28 13:17:38 2000
From: collins at rushe.aero.org (JEFFERY COLLINS)
Date: Mon, 28 Feb 2000 10:17:38 -0800
Subject: [Numpy-discussion] Matrix.py problem
Message-ID: <200002281817.KAA04027@rushe.aero.org>

I installed Numpy 15.2 and got the following error during the import of Matrix. Apparently, the version number is no longer embedded in the module doc string following the # sign.

>>> import Matrix
Traceback (innermost last):
  File "<stdin>", line 1, in ?
  File "/usr/local/lib/python1.5/site-packages/Numeric/Matrix.py", line 5, in ?
    __version__ = int(__id__[string.index(__id__, '#')+1:-1])
  File "/usr/local/lib/python1.5/string.py", line 138, in index
    return _apply(s.index, args)
ValueError: substring not found in string.index

Jeff
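The failing line in Matrix.py could be guarded so that a doc string without the '#' marker no longer breaks the import. A minimal sketch, assuming it is acceptable for __version__ to be None in that case:

import string

try:
    __version__ = int(__id__[string.index(__id__, '#')+1:-1])
except ValueError:
    # the doc string no longer carries a trailing '#<version>' marker
    __version__ = None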