[SciPy-dev] Module Submission: Orthogonal Distance Regression

José Fonseca j_r_fonseca at yahoo.co.uk
Wed Nov 20 07:25:43 EST 2002


On Wed, Nov 20, 2002 at 12:58:44AM -0800, Robert Kern wrote:
> On Tue, Nov 19, 2002 at 06:25:45PM +0200, Pearu Peterson wrote:
> >  - odrpack wraper is handwritten and the code base is quite large. This
> >    may cause some maintainance issues.
> >  - Using the wrapper is sensitive to Fortran/C multi-dimensional array
> >    issue (the different data ordering stuff).
> > 
> > I think that the last one is the most serious one. Probably the
> > easiest way to solve this is to use f2py generated wrappers. This would
> > also reduce the maintainance issues.
> 
> There are a couple of reasons why I didn't use f2py. I do have an f2py
> version that mostly works, however,
> 
[...]
>
>  * With my usage of the module, I prefer using arrays already in FORTRAN
>    order. N=number of observations, M=number of dimensions per
>    observation in the input variable. ODRPACK expects (N,M) FORTRAN
>    arrays which are (M,N) Python arrays. These get passed to the model
>    functions. For the most part, one wants to implement y=f(x) without
>    explicit looping over each observation; I find it convenient to use
>    x[i] rather than x[:,i]. I have to put explicit transpose's to get
>    the f2py'd code to accept the arrays I'm passing around. I think
>    there's some combination of options to get f2py to do what I want,
>    but I haven't found it yet.
> 
>    I could also be wrongheaded about my preferences. I don't think
>    "N-first" indexing works well with Python's slicing abilities.

I felt exactly the same when generating a python binding to ARPACK (A
large scale eigenvalue solver, available at
http://www.netlib.org/arpack), which I'm still finishing.

I started by trying both pyfortran and f2py, but both failed to run the
most simple example. ARPACK uses a "reverse communication" scheme where
a subroutine is repeatedly called and updates all its arguments in each
iteration and ask the caller to do specific test.

I could avoid all this by making a more "user friendly" wrapper to
ARPACK, which would just take one or two matrices and spits the
eigenvalues/eigenvectors, but that would take away the possibility to do
alot of optimizations (in-place operations, matrices shapes and
properties, etc). So I've decided to make the bindings by hand. And a
"user-friendly" wrapper still can be easily made in python, which do all
sort of transposing/casting - for large scale problems, you don't want
those to happen.

By using some carefully coded C macros, all I need is to do

	CHECK_ARRAY_2D(v, PyArray_FLOAT, == ncv, >= n);
	
to check if the array is a 2D contiguous, check the dimensions, and
output a error message if not. I'm still finishing it, but you can
take a peek at http://jrfonseca.dyndns.org/work/phd/python/modules/ .

BTW, this and other things (like Python never copying a object but give
a reference), gave a 10x boost to my original Matlab FE code, when
implemented in Numeric Python.

José Fonseca
__________________________________________________
Do You Yahoo!?
Everything you'll ever need on one web page
from News and Sport to Email and Music Charts
http://uk.my.yahoo.com



More information about the SciPy-Dev mailing list