[Numpy-discussion] size_t or npy_intp?

Dag Sverre Seljebotn dagss at student.matnat.uio.no
Tue Jul 27 11:10:05 EDT 2010


Francesc Alted wrote:
> On Tuesday 27 July 2010 15:20:47, Charles R Harris wrote:
>   
>> On Tue, Jul 27, 2010 at 7:08 AM, Francesc Alted <faltet at pytables.org> wrote:
>>     
>>> Hi,
>>>
>>> I'm a bit confused about which datatype I should use when referring to
>>> NumPy ndarray lengths.  On one hand, I'd use `size_t`, which is the
>>> canonical way to refer to lengths of memory blocks.  On the other hand,
>>> `npy_intp` seems to be the standard data type used in NumPy for this.
>>>       
>> They have different ranges: npy_intp is signed (and in later versions of
>> Python is the same as Py_ssize_t), while size_t is unsigned. It would be
>> a bad idea to mix the two.
>>     
>
> Agreed that mixing the two is a bad idea.  So I suppose you are 
> suggesting that I use `npy_intp`.  But then, I'd say that `size_t`, 
> being unsigned, is a better fit for describing a memory length.
>
> Mmh, I'll stick with `size_t` for the time being (unless anyone else can 
> convince me that this is really a big mistake ;-)
>   
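
A minimal plain-C illustration of the mixing trap referred to above:
comparing a signed index with an unsigned length silently converts the
signed value to unsigned, so -1 turns into a huge positive number. The
types and names here are illustrative stand-ins, not NumPy API:

    #include <stdio.h>
    #include <stddef.h>

    int main(void)
    {
        ptrdiff_t i = -1;   /* a signed index, standing in for npy_intp */
        size_t    n = 10;   /* an unsigned length, as size_t would be */

        /* What you meant: a signed comparison. */
        if (i < (ptrdiff_t)n)
            printf("signed:   -1 < 10 holds, as expected\n");

        /* What `i < n` actually does: i is converted to size_t first. */
        if ((size_t)i < n)
            printf("never reached\n");
        else
            printf("unsigned: (size_t)-1 == %zu, which is not < 10\n",
                   (size_t)i);
        return 0;
    }
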
Well, Python has reasons for using Py_ssize_t (= ssize_t where 
available) internally for everything that has to do with indexing. (E.g. 
it wants to use the same type for the strides, which can be negative.)
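
To make the strides point concrete, here is a self-contained sketch in
plain C (no NumPy headers), with ptrdiff_t standing in for
npy_intp/Py_ssize_t: a reversed view steps through memory with a
negative stride, which only a signed index type can express:

    #include <stdio.h>
    #include <stddef.h>

    int main(void)
    {
        double data[5] = {0.0, 1.0, 2.0, 3.0, 4.0};

        /* A reversed "view": start at the last element, step backwards. */
        double    *base   = &data[4];
        ptrdiff_t  stride = -1;   /* needs a signed type */
        ptrdiff_t  n      = 5;

        for (ptrdiff_t i = 0; i < n; i++)
            printf("%g ", base[i * stride]);
        printf("\n");             /* prints: 4 3 2 1 0 */

        /* Were the index type unsigned, i * stride would wrap around to
           a huge positive offset instead of stepping backwards. */
        return 0;
    }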

You just can't pass an index that doesn't fit in ssize_t to any Python 
API. You're free to use size_t in your own code, but if you actually 
use the extra bit, the moment the value hits Python it will overflow and 
you'll get garbage...so you need to check every time you hit any Python 
layer, rather than only at the entry points of your own code.
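
What that check looks like, roughly: before handing a size_t length to
any Py_ssize_t-based API, verify that it fits. A self-contained sketch,
with POSIX SSIZE_MAX standing in for Python's PY_SSIZE_T_MAX and
`to_signed_index` being a hypothetical helper, not an existing API:

    #include <stdio.h>
    #include <limits.h>      /* SSIZE_MAX (POSIX) */
    #include <sys/types.h>   /* ssize_t (POSIX) */

    /* Hypothetical helper: convert a size_t length to a signed index,
       failing loudly instead of overflowing silently. */
    static int to_signed_index(size_t n, ssize_t *out)
    {
        if (n > (size_t)SSIZE_MAX)   /* the "extra bit" doesn't fit */
            return -1;
        *out = (ssize_t)n;
        return 0;
    }

    int main(void)
    {
        ssize_t idx;

        if (to_signed_index((size_t)SSIZE_MAX + 1, &idx) != 0)
            printf("too large for a signed index, refuse early\n");
        if (to_signed_index(42, &idx) == 0)
            printf("fits: %zd\n", idx);
        return 0;
    }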

Your choice though.

Dag Sverre



