[Pandas-dev] Open questions regarding docstrings

Tom Augspurger tom.augspurger88 at gmail.com
Mon Mar 5 17:58:21 EST 2018


Agreed with Chris about 3.

In the same vein, about 4 and 6, I could see more precision in the
docstrings as an aid to adopting function annotations and mypy in the
future.
Is List[int] too ugly / unusual for readers? Case in point, one of your
examples from 6, a Python list, isn't array-like (in the sense that
is_array_like(List) is False).
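
A rough illustration of that point, assuming (as I understand the pandas
check) that array-like means list-like plus a `dtype` attribute. Both
`looks_array_like` and `FakeArray` are stand-ins written for this sketch,
not pandas API:

```python
from collections.abc import Iterable


def looks_array_like(obj):
    # Stand-in check: iterable, not a plain string, and carries a dtype.
    is_list_like = isinstance(obj, Iterable) and not isinstance(obj, (str, bytes))
    return is_list_like and hasattr(obj, "dtype")


class FakeArray(list):
    # Minimal object with a dtype attribute, standing in for an ndarray
    # or a Series.
    dtype = "int64"


print(looks_array_like([1, 2, 3]))             # plain list: no dtype
print(looks_array_like(FakeArray([1, 2, 3])))  # list-like with a dtype
```

A plain list fails the check only because it has no `dtype`, which is why
"array-like" in a docstring can mislead if readers take it literally.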

Documenting exactly what we mean by array-like is probably not something
we're ready for, but I'd like to hear what others think about adopting
mypy's spelling of types where it's not too burdensome.
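
To make the comparison concrete, here is a small sketch with the same
parameters spelled both ways: mypy-style in the signature, docstring-style
in the Parameters section. The function `take_rows` and its parameters are
made up for illustration; only the `typing` names are real.

```python
from typing import Dict, List, Optional, get_type_hints


def take_rows(indices: List[int],
              labels: Optional[Dict[str, int]] = None) -> List[int]:
    """
    Return the requested positions, shifted by the sum of the label offsets.

    Parameters
    ----------
    indices : list of int
        Positions to keep. In mypy's spelling: ``List[int]``.
    labels : dict of {str: int}, optional
        Offsets to add up. In mypy's spelling:
        ``Optional[Dict[str, int]]``.

    Returns
    -------
    list of int
    """
    offset = sum(labels.values()) if labels else 0
    return [i + offset for i in indices]


# The annotations are available at runtime, so a checker (or a doc tool)
# could compare them against the docstring.
hints = get_type_hints(take_rows)
print(hints["indices"])
print(take_rows([1, 2], {"a": 1, "b": 2}))
```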

Tom



On Mon, Mar 5, 2018 at 2:07 PM, Chris Bartak <cbartak at gmail.com> wrote:

> Hi Marc,
>
> Thanks for pulling out this list.  The only one of these that seems
> potentially objectionable to me is #3 - it does seem like we're pretty
> inconsistent on this currently, but in my opinion it'd be better to side
> with `str` - matching the actual python type, numpy, mypy annotations, etc?
>
> On Mon, Mar 5, 2018 at 2:55 PM, Marc Garcia <garcia.marc at gmail.com> wrote:
>
>> Hi there,
>>
>> There are a few things regarding the docstrings that are still open to
>> discussion, in many cases because the numpy convention (or numpy doc
>> examples) is different from the unwritten convention used in most pandas
>> docstrings.
>>
>> You've probably seen the discussion on GitHub, but I list them here, with
>> the proposed decision (mainly keep the pandas way). If anyone disagrees on
>> any point, please let us know, so we can change the documentation for the
>> sprint, and do it in the desired way.
>>
>> 1) Starting the docstring just after the opening triple quotes, or in the
>> next line. In pandas it's more common to do it in the next line, so we'll
>> keep it this way.
>>
>> 2) For parameters, showing the default value after the type, or after the
>> description. Numpy does not consider it necessary to specify defaults, and
>> where they are given it recommends placing them after the description. The
>> proposal (mainly by Joris) is to always include them, right after the type,
>> as that makes them easier to see.
>>
>> 3) For parameters expecting a string, the numpy convention examples use
>> `str`; the proposal is to use `string` instead.
>>
>> 4) For complex types like dicts, I think there is some consensus that it is
>> easier to understand the types if using brackets (e.g. "dict of {str: int}"
>> over "dict of str: int"). And same for tuples (e.g. "tuple of (int, str,
>> int)" over "tuple of int, str, int"). For lists and sets, the type is
>> simpler (e.g. "list of int" or "set of str"). I propose to use the brackets
>> for dict and tuple, and not for list and set, and use `str` over `string`
>> if part of a complex type.
>>
>> 5) For cases where a parameter is optional, that is, it defaults to None
>> and None means the value is not required. (As I understand it, in a case
>> like `fillna(value=None)`, `value` would not be optional, since it is the
>> value used to replace `NaN`.) For optional parameters, the proposal is to
>> write the type as something like "int or float, optional" over "int, float
>> or None (default None)".
>>
>> 6) When the parameter expects something in the form of a Python list, a
>> numpy array, a pandas Series... document it as "array-like" over other
>> options like "iterable" or "numpy.array, Series or list".
>>
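
For reference, the six points above applied to one made-up function might
look roughly like this. The function `summarize` and all of its parameters
are hypothetical, not pandas API, and item 3 is written as `str` here
(Chris's suggestion) rather than the proposed `string`.

```python
def summarize(values, weights=None, round_to=2, label="total",
              bounds=(0, 100)):
    """
    Sum the values, optionally weighted, and clip the result.

    Parameters
    ----------
    values : array-like
        Numbers to add up (item 6).
    weights : dict of {int: float}, optional
        Weight per position; positions not listed count as 1.0
        (items 4 and 5).
    round_to : int, default 2
        Decimal places to keep (item 2: default right after the type).
    label : str, default 'total'
        Key used in the result (item 3).
    bounds : tuple of (int, int), default (0, 100)
        Lower and upper limit for the result (item 4).

    Returns
    -------
    dict of {str: float}
        The clipped, rounded sum stored under `label`.
    """
    weights = weights or {}
    total = sum(v * weights.get(i, 1.0) for i, v in enumerate(values))
    low, high = bounds
    clipped = float(min(max(total, low), high))
    return {label: round(clipped, round_to)}


print(summarize([1, 2, 3]))
print(summarize([1, 2, 3], weights={0: 2.0}))
```

The summary line sits on the line after the opening triple quotes, as in
item 1.
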
>>
>> Thanks!
>>
>> _______________________________________________
>> Pandas-dev mailing list
>> Pandas-dev at python.org
>> https://mail.python.org/mailman/listinfo/pandas-dev
>>
>>
>