[Cython] Safer default exception handling with return type annotations?

Wed Sep 6 03:06:39 EDT 2017

Robert Bradshaw wrote on 06.09.2017 at 08:28:
> On Tue, Sep 5, 2017 at 10:44 PM, Stefan Behnel wrote:
>> Robert Bradshaw wrote on 06.09.2017 at 07:21:
>>> I'm not a huge fan of behaving differently depending on what syntax
>>> was used to annotate the return type--I'd rather they be 100% aliases
>>> of each other.
>>
>> Regarding this bit - I already chose to implement some differences for
>> annotation typing. Mainly, if you say
>>
>>     def f(x: int) -> float:
>>         return x
>>
>> then the (plain "def") function will actually be typed as "double
>> f(object)", assuming that you probably meant the Python types and not the
>> C types. If you want the C types "int" and "float", you have to use either
>> of these:
>>
>>     def f1(x: cython.int) -> cython.float:
>>         return x
>>
>>     cpdef float f2(int x):
>>         return x
>>
>> That is because the main use case of signature annotations is Python code
>> compatibility, so I tried to change the semantics as little as possible
>> from what the code would be expected to do in Python.
> 
> What about
> 
> def f(x: float) -> int:
>   return x * 2
> 
> would that throw an error if x was, say, a str?

It would raise an exception on input, but there would not currently be an
error on return.

> I think float -> c double but int -> python object will be surprising.

I agree. There are two reasons: we don't currently use the int/long Python
types anywhere in Cython (which could obviously be changed), and in Python
2, this would exclude "long" objects, which is most likely not intended.
So, "int" would probably best refer to "int object or long object" in
Python 2, which definitely complicates things.

Besides, how many functions can really deal with both "int" and "str" input
completely? Your example above is actually a good one, because it would
currently return a float object, not "int". But, since it would do the same
thing when run in Python, that's probably acceptable. It would need an
explicit conversion in both cases, in which case Cython wouldn't need to
enforce the type anymore.
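
To illustrate the return side, here is a minimal plain-Python sketch (no
Cython involved; the name "f_converted" is made up for this example). It
shows that the "-> int" annotation is not enforced when the code runs in
Python, and that an explicit conversion makes the result match it.

```python
def f(x: float) -> int:
    # The annotation says "int", but nothing converts or checks the result.
    return x * 2


def f_converted(x: float) -> int:
    # Explicit conversion: the result is an int whether or not the
    # return annotation is enforced by anyone.
    return int(x * 2)


print(type(f(1.5)).__name__)            # float
print(type(f_converted(1.5)).__name__)  # int
```

With the conversion in place, the function behaves the same in Python and
in compiled code, so there is nothing left for a return type check to catch.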

If "int" is used as input type (or variable declaration), then enforcing it
somehow on assignment would be much more relevant.

Maybe we should reconsider this whole business when we drop support for
Python 2.7. ;)

> I also worry a bit
> about x: float being enforced but x: List[float] not being so.

Interpreting "List[float]" as Cython type "list" would definitely be nice,
but note that it would disallow subtypes. In Python, it does not.

I think the right way to deal with that, eventually, will be optionally
allowing subtypes also in Cython, and handling the distinction more on a
case-by-case basis. Then you could declare a variable as "list" or
"List[Any]", and enforce the exact type in the first case but not in the
second.
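
The difference between the two checks can be sketched in plain Python. This
only illustrates the intended semantics and is not how Cython implements
them; "MyList", "accepts_exact_list" and "accepts_list_or_subtype" are
hypothetical names used for the example.

```python
class MyList(list):
    """A list subtype, which Python's List[...] hints allow."""


def accepts_exact_list(obj):
    # Roughly the check implied by declaring a variable as the
    # Cython type "list": only the exact type passes.
    return type(obj) is list


def accepts_list_or_subtype(obj):
    # The check implied by a "List[Any]" hint: subtypes pass as well.
    return isinstance(obj, list)


print(accepts_exact_list([1, 2]))           # True
print(accepts_exact_list(MyList()))         # False
print(accepts_list_or_subtype(MyList()))    # True
```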

And we should definitely use "List[itemtype]" hints also in type inference
for loops and indexing at some point.

>> I think this type interpretation is a reasonable, use case driven
>> difference to make. Hence my question whether we should extend this to the
>> exception declaration.
> 
> I suppose you've already made a case for deviating...
> 
> I guess I think it'd be nice to change the default universally, but
> that's perhaps a bigger conversation.

I think so, too. For this specific case, we can change the default without
breaking backwards compatibility.

I also added a new decorator "@exceptval(x, check=False)" for pure mode. If
used without an exception value, as "@exceptval()" or, more readably,
"@exceptval(check=False)", users can still get the "write unraisable but
don't propagate" behaviour, if they really need it.
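
A rough sketch of how the decorator might be used in pure mode follows. The
function names are made up, and the fallback stand-ins are only an
assumption to keep the snippet runnable where Cython is not installed. In
uncompiled Python the decorators do nothing, so exceptions propagate as
usual; the comments describe the intended compiled behaviour.

```python
try:
    import cython
    cfunc, exceptval, c_int = cython.cfunc, cython.exceptval, cython.int
except (ImportError, AttributeError):
    # Stand-ins (assumption) so the sketch runs without Cython.
    def cfunc(func):
        return func

    def exceptval(value=None, check=True):
        return lambda func: func

    c_int = int


@cfunc
@exceptval(-1, check=False)  # like "except -1": -1 always signals an error
def half(n: c_int) -> c_int:
    if n < 0:
        raise ValueError("n must not be negative")
    return n // 2            # never -1 for valid input


@cfunc
@exceptval(check=False)      # no exception value: when compiled, errors
def unchecked_half(n: c_int) -> c_int:  # are written out, not propagated
    return n // 2


print(half(8))               # 4
```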

Stefan