[SciPy-Dev] Static Typing

Ralf Gommers ralf.gommers at gmail.com
Fri Jul 2 03:19:00 EDT 2021


On Wed, Jun 30, 2021 at 10:37 PM Serge Guelton <
serge.guelton at telecom-bretagne.eu> wrote:

> On Wed, Jun 30, 2021 at 08:14:55PM +0200, Ralf Gommers wrote:
> >
> >
> > On Wed, Jun 30, 2021 at 8:07 PM Stefan van der Walt
> > <stefanv at berkeley.edu> wrote:
> >
> >     On Wed, Jun 30, 2021, at 09:03, Evgeni Burovski wrote:
> >     > ISTM it's important that annotations are optional in the sense
> >     > that we do not explicitly require that new code is typed. If
> >     > someone is willing to add them, great (and if someone is willing
> >     > to review a typing PR, even better :-)). But this should be
> >     > possible to do in a follow-up PR, not as a requirement for an
> >     > enhancement PR.
> >
> >     I agree, especially given that the typing notation is still
> >     changing.  For example, they're currently working out a shorthand
> >     for typing function definitions (and I'm sure other simplifications
> >     are in the pipeline too).
>
> I think it's worth noting that some numpy interfaces are inherently
> incompatible with fine-grained static typing. One simple example would be
>
> def foo(x : ndarray[int, :, :], strict: bool):
>     return np.mean(x, keepdims=strict)
>
> What should be the return type of `foo`? We can't tell precisely, because
> it depends on the runtime value of strict. We're left with something along
> the lines of "this returns an array of the same dimension or a scalar of
> the same dtype"
>

The "boolean keyword to control return type or shape" is (unfortunately) so
common that there's a specific way to deal with this, using @overload:

  @overload
  def foo(x: ndarray[int, :, :], strict: Literal[True]) -> ndarray[int, :, :]: ...
  @overload
  def foo(x: ndarray[int, :, :], strict: Literal[False]) -> ndarray[int, :]: ...

  # The fallback: if a user passes `strict='this-is-true'` then we have
  # to guess (unless we raise an exception)

  def foo(x: ndarray[int, :, :], strict: bool) -> ndarray[int, :]: ...

See https://mypy.readthedocs.io/en/stable/literal_types.html. So the idea
is to treat `True` and `False` as distinct types. If you build, e.g., a
compiler for Python code, you would do the same. This is fairly painful and
ugly, but doable. There are other behaviors and functions in numpy code that
are harder to deal with: things like value-based casting, output shapes that
depend on (array) input data, and returning scalars instead of 0-D arrays.
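
To make the last two cases concrete, here is a small sketch (the arrays
are made up for illustration). It shows an output shape that depends on
the input values, and a reduction that hands back a NumPy scalar rather
than a 0-D array:

```python
import numpy as np

# Two inputs with the same type, dtype and shape...
a = np.array([0, 3, 0, 5])
b = np.array([1, 2, 3, 4])

# ...but the output shape depends on the *values*, which a static type
# checker cannot see.
idx_a = np.flatnonzero(a)  # two nonzero entries -> shape (2,)
idx_b = np.flatnonzero(b)  # four nonzero entries -> shape (4,)

# A full reduction returns a NumPy scalar, not a 0-D ndarray.
total = np.sum(b)

# Wrapping the scalar explicitly gives an actual 0-D array.
zero_d = np.asarray(total)
```

Neither difference can be expressed with a `Literal` overload, because
nothing in the call signature distinguishes the cases.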

I totally agree that boolean keywords are best avoided, but at least there
is a solution if they do happen.
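
For reference, a minimal runnable sketch of the @overload approach.
Assumptions: the shape-parameterized `ndarray[int, :, :]` notation used
in this thread is notional, so plain `np.ndarray` stands in for it here.
Because `np.mean` with no `axis` argument reduces over all axes, the
`Literal[False]` case is annotated as a scalar in this sketch.

```python
from typing import Literal, Union, overload

import numpy as np


@overload
def foo(x: np.ndarray, strict: Literal[True]) -> np.ndarray: ...
@overload
def foo(x: np.ndarray, strict: Literal[False]) -> np.floating: ...
@overload
def foo(x: np.ndarray, strict: bool) -> Union[np.ndarray, np.floating]: ...


def foo(x, strict):
    # keepdims=True keeps every reduced axis with length 1;
    # keepdims=False with no axis collapses everything to a scalar.
    return np.mean(x, keepdims=strict)


a = np.arange(6).reshape(2, 3)
kept = foo(a, True)      # a checker selects the Literal[True] overload
reduced = foo(a, False)  # a checker selects the Literal[False] overload
```

The third overload is the fallback for callers that pass a plain `bool`
variable, where the checker cannot know which branch applies and can
only offer the union.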

> I don't know how much this dynamicity leaks into the scipy interface, but
> it does look like a difficult problem to solve.
>

SciPy is just as bad as NumPy in this respect. For example, scipy.stats
does this a lot:

  if y.ndim == 0:
      y = y[()]  # return a float rather than an array here

  return y

Type checkers will complain loudly about this kind of thing, so having a
type checker in CI warns you about this being a bad pattern. On the other
hand, to add correct type annotations to old code that's already like that,
you have to jump through a lot of hoops.
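
A small sketch of what that pattern does to an annotation. The function
name `as_float_or_array` is hypothetical and not taken from scipy.stats;
it only reproduces the `y[()]` idiom:

```python
from typing import Union

import numpy as np


def as_float_or_array(x) -> Union[np.ndarray, np.floating]:
    # Hypothetical helper, only here to reproduce the y[()] idiom.
    y = np.asarray(x, dtype=float)
    if y.ndim == 0:
        # Indexing a 0-D array with an empty tuple extracts the scalar,
        # so `y` changes type from ndarray to a NumPy float here.
        y = y[()]
    return y


scalar_result = as_float_or_array(3.0)
array_result = as_float_or_array([1.0, 2.0])
```

The honest return annotation is a union, which pushes the `isinstance`
check onto every caller. The rebinding of `y` to a different type is
also the sort of thing a checker may flag.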

Cheers,
Ralf