[Numpy-discussion] Integers to integer powers, let's make a decision

Nathaniel Smith njs at pobox.com
Mon Jun 20 18:15:00 EDT 2016


On Mon, Jun 20, 2016 at 3:09 PM,  <josef.pktd at gmail.com> wrote:
> On Mon, Jun 20, 2016 at 4:31 PM, Alan Isaac <alan.isaac at gmail.com> wrote:
>> On 6/13/2016 1:54 PM, Marten van Kerkwijk wrote:
>>>
>>> 1. What in principle is the best return type for int ** int (which
>>> I think Josef most aptly rephrased as whether `**` should be
>>> thought of as a float operator, like `/` in python3 and `sqrt` etc.);
>>
>>
>>
>> Perhaps the question is somewhat different.  Maybe it is: what type
>> should a user expect when the exponent is a Python int?  The obvious
>> choices seem to be an object array of Python ints, or an array of
>> floats.  So far, nobody has proposed the former, and concerns have
>> been expressed about the latter.  More important, either would break
>> the rule that the scalar type is not important in array operations,
>> which seems like a good general rule (useful and easy to remember).
>>
>> How much commitment is there to such a rule?  E.g.,
>> np.int64(2**7)*np.arange(5,dtype=np.int8)
>> violates this.  One thing that has come out of this
>> discussion for me is that the actual rules in play are
>> hard to keep track of.  Are they all written down in
>> one place?
>>
>> I suspect there is general support for the idea that if someone
>> explicitly specifies the same dtype for the base and the
>> exponent then the result should also have that dtype.
>> I think this is already true for array exponentiation
>> and for scalar exponentiation.
>>
>> One other thing that a user might expect, I believe, is that
>> any type promotion rules for scalars and arrays will be the same.
>> This is not currently the case, and that feels like an
>> inconsistency.  But is it an inconsistency?  If the rule is that
>> the array type dominates the scalar type, that may
>> be understandable, but then it should be a firm rule.
>> In this case, an exponent that is a Python int should not
>> affect the dtype of the (array) result.
>>
>> In sum, as a user, I've come around to Chuck's original proposal:
>> integers raised to negative integer powers raise an error.
>> My reason for coming around is that I believe it meshes
>> well with a general rule that in binary operations the
>> scalar dtypes should not influence the dtype of an array result.
>> Otoh, it is unclear to me how much commitment there is to that rule.
>>
>> Thanks in advance to anyone who can help me understand better
>> the issues in play.
>
> The main thing I get out of the discussion in this thread is that this
> is way too complicated.
>
> which ints do I have?
>
> Is it a Python int or one of the many numpy int types? Two different
> (u)int types? Or is one a scalar, so maybe it shouldn't count?
>
>
> The scalar's kind dominates here:
>
>>>> (np.ones(5, np.int8) *1.0).dtype
> dtype('float64')
>
> otherwise a huge amount of code that uses the *1. trick would break
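
To make Alan's same-dtype claim concrete, here's a quick sketch (these
are the results I'd expect; exact behavior can vary across numpy
versions):

>>> import numpy as np
>>> a = np.arange(1, 5, dtype=np.int64)
>>> (a ** np.int64(2)).dtype          # same explicit dtype in, same out
dtype('int64')
>>> type(np.int64(3) ** np.int64(2))  # scalar exponentiation too
<class 'numpy.int64'>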

I *think* the documented rule is that scalar *kind* matters (so we pay
attention to its being a float), but scalar *type* doesn't (we ignore
whether it's float64 versus float32), and scalar *value* doesn't (we
ignore whether it's 1.0 or 2.0**53). Obviously even this is not 100%
true, but I think it was the original intent.
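
For instance (a sketch of how I read that rule; exact outputs may vary
by numpy version, and the thread has already surfaced exceptions):

>>> a = np.ones(5, np.int8)
>>> (a * 2).dtype        # int kind matches the array's, dtype kept
dtype('int8')
>>> (a * 1.0).dtype      # float kind forces a float result
dtype('float64')
>>> (a * 2.0**53).dtype  # scalar value ignored: same result as 1.0
dtype('float64')

Alan's np.int64(2**7) * np.arange(5, dtype=np.int8) example is one
place where this reading breaks down.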

My suspicion is that a better rule would be: *Python* types (int,
float, bool) are treated as having an unspecified width, but all numpy
types/dtypes are treated the same regardless of whether they're
scalars or not. So np.int8(2) * 2 would return an int8, while
np.int8(2) * np.int64(2) would return an int64. But this is totally
separate from the issues around **, and would require a longer
discussion and a larger overhaul of the typing system.
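
As a sketch of what I mean (purely hypothetical code -- the function
name and the cross-kind fallback are my own guesses, not part of any
existing numpy behavior):

import numpy as np

def proposed_result_dtype(a, b):
    """Hypothetical promotion under the proposed rule (illustration only)."""
    def np_dtype(x):
        # dtype of a numpy operand, whether 0-d scalar or array
        return np.dtype(type(x)) if isinstance(x, np.generic) else x.dtype
    def is_py(x):
        # plain Python bool/int/float (np.float64 subclasses float, so
        # numpy scalars must be excluded explicitly)
        return isinstance(x, (bool, int, float)) and not isinstance(x, np.generic)

    if is_py(a) and is_py(b):
        raise TypeError("no numpy operand; plain Python rules apply")
    if is_py(a) or is_py(b):
        py, other = (a, b) if is_py(a) else (b, a)
        dt = np_dtype(other)
        # same kind: the Python scalar's "unspecified width" defers entirely
        if isinstance(py, bool) or (isinstance(py, int) and dt.kind in "iu"):
            return dt
        if isinstance(py, float) and dt.kind in "fc":
            return dt
        # cross-kind, e.g. int8 array * 2.0: some float has to win; falling
        # back to the default float64 is my guess, not part of the proposal
        return np.promote_types(dt, np.float64) if isinstance(py, float) else dt
    # two numpy operands: ordinary dtype promotion, scalar or not
    return np.promote_types(np_dtype(a), np_dtype(b))

With that toy rule:

>>> proposed_result_dtype(np.int8(2), 2)
dtype('int8')
>>> proposed_result_dtype(np.int8(2), np.int64(2))
dtype('int64')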

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


