[SciPy-Dev] f2py, the fortran integer type, and npy_intp

Kurt Smith kwmsmith at gmail.com
Sat Jul 10 16:59:15 EDT 2010


On Sat, Jul 10, 2010 at 2:32 PM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
>
>
> On Sat, Jul 10, 2010 at 12:30 PM, Kurt Smith <kwmsmith at gmail.com> wrote:
>>
>> On Sat, Jul 10, 2010 at 12:31 PM, Charles R Harris
>> <charlesr.harris at gmail.com> wrote:
>> > I note that some C programs try to call f2py-generated fortran
>> > interfaces using the npy_intp type for the array dimension. There are
>> > two problems here: first, the prototype for the generated wrapper uses
>> > int, causing compiler warnings on a 64-bit OS; and second, the fortran
>> > subroutines themselves use integer types. So, some questions. What
>> > precision are fortran integers? Should we be using npy_intp for array
>> > dimensions at all? Is there any transparent way to have 64-bit support?
>>
>> I can comment on some of the above (not the f2py-specific bits, though).
>>
>> 1) "What precision are fortran integers?"
>>
>> According to a somewhat authoritative interlanguage programming site
>> [0], a fortran 'integer' corresponds to a C 'int'.  This can be relied
>> upon, but other fortran type declarations cannot (e.g. 'integer*8' !=
>> a C 'long' on all platforms, with all compilers).  If sizeof(npy_intp)
>> == sizeof(int), then everything will be fine, but it's a near
>> certainty that there are platforms where this doesn't hold, which may
>> lead to problems if array sizes are large.
>>
>> [0] http://www.math.utah.edu/software/c-with-fortran.html
>>
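
To make the size question concrete, here is a quick C check (a sketch
only; intptr_t stands in for numpy's npy_intp, which is pointer-sized,
and real code would include the numpy headers and use npy_intp itself):

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        /* intptr_t is used as a stand-in for npy_intp here. */
        printf("sizeof(npy_intp-equivalent) = %zu\n", sizeof(intptr_t));
        printf("sizeof(int)                 = %zu\n", sizeof(int));
        /* On a typical 64-bit system this prints 8 and 4: a fortran
         * default 'integer' (a C 'int') cannot hold every value of an
         * npy_intp array dimension. */
        return 0;
    }
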
>
> Since 'int' is 32 bits with gcc and msvc regardless of whether the operating
> system is 32 or 64 bits, we are going to have a problem with all the fortran
> code in scipy that uses integer types for array indexing. That is the case
> for all the code in fitpack, for instance. The only way I see to get around
> that is to have two fortran libraries, one modified for 64 bits and another
> for 32 bits. How does your cythonized fortran deal with this?

There's a compatibility wrapper written in fortran.  All index
arguments are passed to the wrapper as npy_intp's, and there's a check
in the wrapper to ensure the sizes line up before calling the wrapped
subroutine/function.
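
Sketched in C rather than the generated fortran, and with made-up names
(intp_t stands in for npy_intp, and wrapped_subroutine for the original
routine), the check looks roughly like this:

    #include <limits.h>
    #include <stdint.h>

    /* Stand-in for npy_intp (pointer-sized signed integer). */
    typedef intptr_t intp_t;

    /* Stand-in for the original fortran routine, which takes a
     * default-size 'integer' (i.e. C 'int') dimension argument. */
    static void wrapped_subroutine(int *n, double *x)
    {
        (void)n; (void)x;
        /* ... would operate on x[0 .. *n-1] ... */
    }

    /* The wrapper accepts a pointer-sized dimension, checks that it
     * fits in a default fortran integer, and only then calls through. */
    int wrapper_subroutine(intp_t n, double *x)
    {
        int n32;
        if (n < 0 || n > (intp_t)INT_MAX)
            return -1;   /* dimension too large for a default integer */
        n32 = (int)n;
        wrapped_subroutine(&n32, x);
        return 0;
    }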

If the user wants to pass an array that requires a dimension >= 2**31
elements and the code has 'integer' dimensions, then the user should
use compile flags to modify the default integer size (e.g. gfortran's
-fdefault-integer-8).

Kurt

>
>>
>> 2) "Should we be using npy_intp for array dimensions at all?"
>>
>> To interoperate with PEP-3118 buffers, the short answer is "yes."
>>
>
> Well, yes. The question is what to do with all that fortran code that
> doesn't support it.
>
>>
>> As I'm sure you know, the PyArrayObject struct uses npy_intp's for the
>> dimensions and the strides.  Further, the PEP-3118 buffer protocol
>> calls for Py_ssize_t (which maps to npy_intp inside numpy) for the
>> "shape", "stride" and "suboffset" arrays inside the Py_buffer struct.
>> As I understand it, this is primarily required for when "suboffset" is
>> >=0 in a specific dimension, since it indicates the buffer is arranged
>> in a non-contiguous-memory layout and pointer arithmetic is required
>> to access the array elements.
>>
>> Here's the relevant PEP-3118 section:
>>
>> """
>> suboffsets
>>
>>    address of a Py_ssize_t * variable that will be filled with a
>> pointer to an array of Py_ssize_t of length *ndims. If these suboffset
>> numbers are >=0, then the value stored along the indicated dimension
>> is a pointer and the suboffset value dictates how many bytes to add to
>> the pointer after de-referencing. A suboffset value that is negative
>> indicates that no de-referencing should occur (striding in a
>> contiguous memory block). If all suboffsets are negative (i.e. no
>> de-referencing is needed), then this must be NULL (the default value).
>> If this is not requested by the caller (PyBUF_INDIRECT is not set),
>> then this should be set to NULL or a PyExc_BufferError raised if this
>> is not possible.
>>
>>    For clarity, here is a function that returns a pointer to the
>> element in an N-D array pointed to by an N-dimensional index when there
>> are both non-NULL strides and suboffsets:
>>
>>    void *get_item_pointer(int ndim, void *buf, Py_ssize_t *strides,
>>                           Py_ssize_t *suboffsets, Py_ssize_t *indices) {
>>        char *pointer = (char*)buf;
>>        int i;
>>        for (i = 0; i < ndim; i++) {
>>            pointer += strides[i] * indices[i];
>>            if (suboffsets[i] >= 0) {
>>                pointer = *((char**)pointer) + suboffsets[i];
>>            }
>>        }
>>        return (void*)pointer;
>>    }
>>
>>    Notice the suboffset is added "after" the dereferencing occurs.
>> Thus slicing in the ith dimension would add to the suboffsets in the
>> (i-1)st dimension. Slicing in the first dimension would change the
>> location of the starting pointer directly (i.e. buf would be
>> modified).
>> """
>>
>> Hope this gives you some info to go on.
>>
>
> Chuck
>
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
>
>


