[Numpy-discussion] change default integer from int32 to int64 on win64?

Wed Jul 23 16:17:01 EDT 2014

On Wed, 2014-07-23 at 22:06 +0200, Sebastian Berg wrote:
> On Wed, 2014-07-23 at 21:50 +0200, Julian Taylor wrote:
> > On 23.07.2014 20:54, Robert Kern wrote:
> > > On Wed, Jul 23, 2014 at 6:19 PM, Julian Taylor
> > > <jtaylor.debian at googlemail.com> wrote:
> > >> hi,
> > >> it recently came to my attention that the default integer type in numpy
> > >> on windows 64 bit is a 32 bit integer [0].
> > >> This seems like quite a serious problem, as it means you can't use any
> > >> integers created from python integers < 32 bit to index arrays larger
> > >> than 2GB.
> > >> For example, np.product(array.shape), which will never overflow on linux
> > >> and mac, can overflow on win64.
> > > 
> > > Currently, on win64, we use Python long integer objects for `.shape`
> > > and related attributes. I wonder if we could return numpy int64
> > > scalars instead. Then np.product() (or anything else that consumes
> > > these via np.asarray()) would infer the correct dtype for the result.
> > 
> > this might be a less invasive alternative that might solve a lot of the
> > incompatibilities, but it would probably also change np.arange(5) and
> > similar functions to int64 which might change the dtype of a lot of
> > arrays. The difference from just changing it everywhere might not be so
> > large anymore.
> > 
> 
> Aren't most such functions already using intp? Just guessing, but:
> 
> In [16]: np.arange(30, dtype=np.long).dtype.num
> Out[16]: 9
> 
> In [17]: np.arange(30, dtype=np.intp).dtype.num
> Out[17]: 7
> 
> In [18]: np.arange(30).dtype.num
> Out[18]: 7
> 

Oops, never mind that stuff, probably not... np.int_ is 7 too, this is
just how intp is chosen.
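
A minimal sketch of a less ambiguous check, assuming only that type
number 7 is C long and 9 is C long long (the NPY_LONG and NPY_LONGLONG
values); the `describe` helper is made up for illustration. The result
depends on the platform it runs on, so no output is shown:

```python
import numpy as np


def describe(dtype_like):
    """Return (type number, item size in bytes) for anything np.dtype accepts."""
    dt = np.dtype(dtype_like)
    return dt.num, dt.itemsize


# 'l' and 'q' are the character codes for C long and C long long.
candidates = [
    ("C long", "l"),
    ("C long long", "q"),
    ("np.intp", np.intp),
    ("np.int_", np.int_),
    ("np.arange(30)", np.arange(30).dtype),
]
report = {name: describe(d) for name, d in candidates}

for name, (num, size) in report.items():
    print(f"{name:>14}: num={num} itemsize={size}")
```

If np.intp and np.int_ report the same num, dtype.num alone cannot tell
which of the two a function used; comparing itemsize against C long
shows whether the default integer is narrower than a pointer.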

> Frankly, I am not sure what needs to change at all, except the normal
> array creation and the sum promotion rule. I am probably naive here, but
> what is the ABI change that is necessary for that?
> 
> I guess the problem you see is breaking code doing np.array([1,2,3]) and
> then assuming in C that it is a long array?
> 
> - Sebastian
> 
> > > 
> > >> I think this is a very dangerous platform difference and a quite large
> > >> inconvenience for win64 users so I think it would be good to fix this.
> > >> This would be a very large change of API and probably also ABI.
> > > 
> > > Yes. Not only would it be a very large change from the status quo, I
> > > think it introduces *much greater* platform difference than what we
> > > have currently. The assumption that the default integer object
> > > corresponds to the platform C long, whatever that is, is pretty
> > > heavily ingrained.
> > 
> > This should only be a concern for the ABI, which can be solved by simply
> > recompiling.
> > In comparison, the API being different on win64 from all other
> > platforms is something that needs source level changes.
> > 
> > > 
> > >> But as we also never officially released win64 binaries, we could change
> > >> it for from-source compilations and give win64 binary distributors the
> > >> option to keep the old ABI/API at their discretion.
> > > 
> > > That option would make the problem worse, not better.
> > > 
> > 
> > Maybe, I'm not familiar with the numpy win64 distribution landscape.
> > Is it not like linux, where you have one distributor per workstation
> > setup that can update all its packages to a new ABI in one go?
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> > 
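
For reference, the overflow from the original post can be reproduced on
any platform by forcing a 32-bit accumulator. This is a rough sketch
with a made-up shape of (65536, 65536), using np.prod (np.product was an
alias for it at the time of this thread). Passing dtype= explicitly
bypasses the upcast to the default integer that the sum/prod promotion
rule would otherwise apply:

```python
import numpy as np

# Hypothetical shape describing 2**32 elements; no such array is allocated.
shape = np.array((65536, 65536), dtype=np.int32)

# Accumulating in int32 mimics a 32-bit default integer: 2**32 wraps to 0.
size_32 = np.prod(shape, dtype=np.int32)

# Accumulating in int64 gives the true element count.
size_64 = np.prod(shape, dtype=np.int64)

print(int(size_32), int(size_64))  # 0 4294967296
```

The wraparound is silent, so code that derives sizes or offsets from
default-integer arithmetic has no warning to catch.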