[Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

Matthew Brett matthew.brett at gmail.com
Mon Jan 7 17:26:08 EST 2013


Hi,

On Mon, Jan 7, 2013 at 10:03 PM, Andrew Collette
<andrew.collette at gmail.com> wrote:
> Hi Matthew,
>
>> Ah - well - I only meant that raising an error in the example would be
>> no more surprising than raising an error at the python prompt.  Do you
>> agree with that?  I mean, if the user knew that:
>>
>>>>> np.array([1], dtype=np.int8) + 128
>>
>> would raise an error, they'd probably expect your offset routine to do the same.
>
> I think they would be surprised in both cases, considering this works fine:
>
> np.array([1], dtype=np.int8) + np.array([128])
>
>> I agree it kind of feels funny, but that's why I wanted to ask you for
>> some silly but specific example where the funniness would be more
>> apparent.
>
> Here are a couple of examples I slapped together, specifically
> highlighting the value of the present (or similar) upcasting behavior.
>  Granted, they are contrived and can all be fixed by conditional code,
> but this is my best effort at illustrating the "real-world" problems
> people may run into.
>
> Note that there is no easy way for the user to force upcasting to
> avoid the error, unless e.g. an "upcast" keyword were added to these
> functions, or code added to inspect the data dtype and use numpy.add
> to simulate the current behavior.
>
> def map_heights(self, dataset_name, heightmap):
>     """ Correct altitudes by adding a custom heightmap
>
>     dataset_name: Name of HDF5 dataset containing altitude data
>     heightmap:  Corrections in meters.  Must match shape of the dataset (or be a scalar).
>     """
>     # TODO: scattered reports of errors when a constant heightmap value is used
>
>     return self.f[dataset_name][...] + heightmap
>
> def perform_analysis(self, dataset_name, kernel_offset=128):
>     """ Apply Frobnication analysis, using optional linear offset
>
>     dataset_name: Name of dataset in file
>     kernel_offset:  Optional sequencing parameter.  Must be a power of 2 and at least 16 (default 128)
>     """
>     # TODO: people report certain files frobnicate fine in IDL but not in Python...
>
>     import frob
>     data = self.f[dataset_name][...]
>     try:
>         return frob.frobnicate(data + kernel_offset)
>     except ValueError:
>         raise AnalysisFailed("Invalid input data")
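[A sketch of the `numpy.add` workaround mentioned above - the data values are illustrative, but forcing the output dtype through a ufunc is real numpy API:]

```python
import numpy as np

data = np.array([1, 2, 3], dtype=np.int8)

# Passing an explicit dtype to np.add forces the upcast, independent of
# whether the scalar happens to fit in int8 or of any version-specific
# value-based promotion rules:
offset = np.add(data, 128, dtype=np.int16)
# offset is array([129, 130, 131], dtype=int16)
```

This is the kind of conditional code the functions above would need to be robust against either proposed behavior.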

Thanks - I know it seems silly - but it is helpful.

There are two separate issues though:

1) Is the upcasting behavior of 1.6 better than the overflow behavior of 1.5?
2) If the upcasting of 1.6 is bad, is it better to raise an error or
silently overflow, as in 1.5?

Taking 2) first, in this example:

>     return self.f[dataset_name][...] + heightmap

assuming it is not going to upcast, would you rather it overflow than
raise an error?  Why?  The second seems more explicit and sensible to
me.
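For concreteness, here is what the two alternatives look like when each is forced explicitly (a sketch - the values are chosen just to make the overflow visible):

```python
import numpy as np

heights = np.array([100, 120], dtype=np.int8)

# 1.5-style silent overflow: keep the computation in int8 and let it wrap
wrapped = np.add(heights, 50, dtype=np.int8)   # -> [-106, -86]

# explicit upcast, which the caller can always request themselves
safe = heights.astype(np.int16) + 50           # -> [150, 170]
```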

For 1) - of course the upcasting in 1.6 is only going to work some of
the time.  For example:

In [2]: np.array([127], dtype=np.int8) * 1000
Out[2]: array([-4072], dtype=int16)

So - you'll get something, but there's a reasonable chance you won't
get what you were expecting.  Of course that is true for 1.5 as well,
but at least the rule there is simpler and so easier - in my opinion -
to think about.
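[The -4072 above is just two's-complement wraparound, which can be checked by hand with a small helper - not part of numpy, purely illustrative:]

```python
def wrap_twos_complement(value, bits):
    """Reduce an integer into the two's-complement range of `bits` bits."""
    m = 1 << bits
    return (value + (m >> 1)) % m - (m >> 1)

# 1.6 upcasts int8 * scalar only to int16, which still overflows here:
wrap_twos_complement(127 * 1000, 16)   # -> -4072

# the 1.5-style int8 overflow from the earlier 1 + 128 example:
wrap_twos_complement(1 + 128, 8)       # -> -127
```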

Best,

Matthew


