[scikit-learn] Is there any official position on PEP484/mypy?

Daniel Moisset dmoisset at machinalis.com
Tue Aug 2 09:34:17 EDT 2016


A couple of things I forgot to mention:

* One relevant consequence is that, to add annotations to the code,
scikit-learn would need to depend on the "typing"[1] module, which provides
some of the basic names imported and used in annotations. It's a stdlib
module in Python 3.5, and the PyPI package backports it to Python 2.7 and
newer (I'm not sure how it works with Python 2.6, which might be an issue).
* As an example of the kind of bugs that mypy can find, someone here
already found a documentation bug in the sklearn.svm.SVC() initializer: the
"kernel" parameter is described as "string"[2], when it's actually a
"string or callable" (which can be read in the "small print" description of
the argument). That kind of slip would be prevented automatically if the
type were declared as an annotation, with mypy running on the CI. The
signature of the callable would also be clear directly from the annotation,
instead of requiring a look at additional documentation on kernel functions
or digging into the source.
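
To make the SVC point concrete, here is a minimal sketch of what such an
annotation could look like. SVCSketch, Matrix and KernelFunc are hypothetical
names used only for illustration (this is not scikit-learn's actual code), and
plain lists stand in for NumPy arrays to keep the example self-contained. It
uses the Python 3 annotation syntax; a Python 2 compatible codebase would
need the comment-based form instead.

```python
from typing import Callable, List, Union

# Stand-in for a 2-D array type; real scikit-learn code would use NumPy arrays.
Matrix = List[List[float]]
# Assumed kernel signature: two sample matrices in, their Gram matrix out.
KernelFunc = Callable[[Matrix, Matrix], Matrix]


class SVCSketch:
    """Hypothetical, stripped-down stand-in for sklearn.svm.SVC."""

    def __init__(self, kernel: Union[str, KernelFunc] = "rbf") -> None:
        # The annotation states "string or callable" where a checker can see it.
        self.kernel = kernel


def linear_kernel(X: Matrix, Y: Matrix) -> Matrix:
    """Plain dot-product kernel, written without NumPy."""
    return [[sum(a * b for a, b in zip(x, y)) for y in Y] for x in X]


by_name = SVCSketch(kernel="linear")
by_callable = SVCSketch(kernel=linear_kernel)
gram = linear_kernel([[1.0, 2.0]], [[3.0, 4.0], [0.0, 1.0]])
```

With an annotation like this, a call such as SVCSketch(kernel=3) would be
reported by a type checker, and the expected shape of a custom kernel is
visible from KernelFunc alone.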

[1] https://pypi.python.org/pypi/typing
[2]
http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html#sklearn.svm.SVC


On Mon, Aug 1, 2016 at 5:15 PM, Daniel Moisset <dmoisset at machinalis.com>
wrote:

> On Fri, Jul 29, 2016 at 8:57 PM, Gael Varoquaux <
> gael.varoquaux at normalesup.org> wrote:
>
>>
>> Can you summarize once again in very simple terms what would be the big
>> benefits?
>>
>
> Benefits for regular scikit-learn users
>
> 1. Reliable information on method signatures in a standardized way
> ("reliable" in the sense of "automatically verified")
> 2. Better integration with tools supporting PEP-484 (editors,
> documentation tools). This is a small set now, but I expect it to grow
> (and it's also a chicken-and-egg problem: support has to start somewhere)
>
> Benefits for scikit-learn users also using mypy and/or PEP-484 (probably
> not a large set, but I know a few people :) )
>
> 0. Same as the rest of the users
> 1. Early detection of errors in their own code while writing code based on
> SKL
> 2. Making their own code more readable/explicit by annotating functions
> that receive/return SKL types (and verifying those annotations)
>
> Benefits for scikit-learn developers
>
> 1. Some extra checks that changes keep internal consistency
> 2. (Future) possible simplification of typing information in docstrings,
> which would become redundant (this would require updating doc
> generators)
>
> Regarding the cost of contributing, a scenario where you get a CI error
> due to mypy would arise because:
>
> * the change in the code somehow changed the existing accepted/returned
> types, which is a change in the API and should actually be verified
> * the change in the code extended the signature of an existing function
> (what Andreas mentioned); in this situation it's similar to a PR that adds
> an argument and doesn't update the docstring (only that this is
> automatically caught).
>
> WRT the second issue, the error here might be confusing when using the
> "one line" syntax because arguments may "misalign" with their types in the
> signature comment. The multiline version (or the python3-only form) is
> safer in that sense (in fact, adding an argument there will not produce a
> CI problem because it's unannotated and assumed to be "any type").
>
> Adding new modules/methods without annotations wouldn't produce an
> error, just an incompleteness in the annotations
>
> A possible source of problems like the one you mention is that the
> implementation of the annotated methods will be checked, and sometimes
> you'll get a warning about a local variable if mypy can't infer its type
> (it happens sometimes when assigning an empty list to a local, where mypy
> knows that it's a list but doesn't know the element type). But in that case
> I think the message you get is very obvious.
>
> --
> Daniel F. Moisset - UK Country Manager
> www.machinalis.com
> Skype: @dmoisset
>
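
For reference, the "one line" and multiline comment syntaxes mentioned in the
quoted text, and the empty-list case, can be sketched as follows. This
follows the type-comment forms described in PEP 484; the function names and
parameters are hypothetical and not taken from scikit-learn.

```python
from typing import List, Optional


def count_samples_one_line(X, y, sample_weight=None):
    # type: (List[List[float]], List[int], Optional[List[float]]) -> int
    # One-line form: types are matched to arguments by position, so adding a
    # new argument without updating this comment misaligns the signature.
    return len(X)


def count_samples_multiline(X,  # type: List[List[float]]
                            y,  # type: List[int]
                            sample_weight=None,  # type: Optional[List[float]]
                            verbose=False
                            ):
    # type: (...) -> int
    # Multiline form: each argument carries its own type. The newly added
    # "verbose" has no comment, so it is simply treated as "any type".
    return len(X)


def row_sums(X):
    # type: (List[List[float]]) -> List[float]
    # An empty list gives the checker no element type to infer from, so the
    # local is annotated explicitly.
    sums = []  # type: List[float]
    for row in X:
        sums.append(sum(row))
    return sums
```

At runtime the comments are ignored, so both forms behave identically; only
the checker reads them.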



-- 
Daniel F. Moisset - UK Country Manager
www.machinalis.com
Skype: @dmoisset