[Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

Thu Jan 17 20:18:14 EST 2013

Hi,

On Fri, Jan 18, 2013 at 1:04 AM, Chris Barker - NOAA Federal
<chris.barker at noaa.gov> wrote:
> On Thu, Jan 17, 2013 at 6:26 AM, Matthew Brett <matthew.brett at gmail.com> wrote:
>
>> I am starting to wonder if we should aim for making
>>
>> * scalar and array casting rules the same;
>> * Python int / float scalars become int32 / 64 or float64;
>
> aren't they already? I'm not sure what you are proposing.

Sorry - yes, that is what they are already; this sentence refers back
to an earlier suggestion of mine on the thread, which I am discarding.

>> This has the benefit of being very easy to understand and explain.  It
>> makes dtypes predictable in the sense they don't depend on value.
>
> That is key -- I don't think casting should ever depend on value.
>
>> Those wanting to maintain - say - float32 will need to cast scalars to float32.
>>
>> Maybe the use-cases motivating the scalar casting rules - maintaining
>> float32 precision in particular - can be dealt with by careful casting
>> of scalars, throwing the burden onto the memory-conscious to maintain
>> their dtypes.
>
> IIRC this is how it worked "back in the day" (the Numeric day?) -- and
> I'm pretty sure that in the long run it worked out badly. The core
> problem is that there are only Python literals for a couple of types, and
> it was oh so easy to do things like:
>
> my_array = np.zeros(shape, dtype=np.float32)
>
> another_array = my_array * 4.0
>
> and you'd suddenly get a float64 array. (Of course, we already know
> all that.) I suppose this has the upside of being safe, and having
> scalar and array casting rules be the same is of course appealing, but
> you use a particular size dtype for a reason, and it's a real pain to
> maintain it.

Yes, I do understand that.  The difference - as I understand it - is
that back in the day, Numeric did not have the float32 etc. scalars,
so you could not do:

another_array = my_array * np.float32(4.0)

(please someone correct me if I'm wrong).
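
As a minimal sketch of that explicit cast, here it is spelled out in full.
The shape is an arbitrary value picked only for illustration; the array
names follow the example quoted above:

```python
import numpy as np

# Hypothetical shape, chosen only for illustration.
shape = (3,)
my_array = np.zeros(shape, dtype=np.float32)

# Multiplying by a NumPy float32 scalar keeps the result in float32,
# because both operands already have the same dtype.
another_array = my_array * np.float32(4.0)
print(another_array.dtype)
```

The cost is that every scalar in an expression has to be wrapped by hand,
which is the burden on the memory-conscious that I mentioned.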

> Casual users will use the defaults that match the Python types anyway.

I think what we are reading in this thread is that even experienced
numpy users can find the scalar casting rules surprising, and that's a
real problem, it seems to me.

The person with a massive float32 array certainly should have the
ability to control upcasting, but I think the default should be the
least surprising thing, and that, it seems to me, is for the casting
rules to be the same for arrays and scalars - in the very long term.
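
To make the array / scalar difference concrete, here is a rough sketch
that combines a float32 array first with a float64 array and then with a
plain Python float.  The names `big`, `array_result`, `scalar_result` and
`controlled` are made up for the example, and the length 4 is arbitrary:

```python
import numpy as np

# Stand-in for the "massive" float32 array; length 4 is arbitrary.
big = np.ones(4, dtype=np.float32)

# Array with array: the float64 operand promotes the result to float64.
array_result = big * np.full(4, 4.0, dtype=np.float64)

# Array with a Python float scalar: the float32 dtype of the array is kept.
scalar_result = big * 4.0

# Explicit control over upcasting: cast the other operand down first.
controlled = big * np.full(4, 4.0, dtype=np.float64).astype(np.float32)

print(array_result.dtype, scalar_result.dtype, controlled.dtype)
```

The same value, 4.0, gives a float64 result when it arrives in an array
and a float32 result when it arrives as a Python scalar, and that is the
asymmetry I am arguing about.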

Cheers,

Matthew