[Pandas-dev] Open questions regarding docstrings

Joris Van den Bossche jorisvandenbossche at gmail.com
Mon Mar 5 18:05:13 EST 2018


Yes, thanks for the overview.

Regarding the type descriptions, as a reference, an overview of all
currently used type descriptions can be seen here:
https://github.com/pandas-dev/pandas/pull/19704#issuecomment-369405611
From that you can see that many things are now rather inconsistent (str
vs string, optional vs default None, ...), in most cases with both variants
used about equally often. So we should make choices! :)
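
To give an idea of how such an overview can be tallied, here is a rough
sketch. It is not the script behind the linked comment: the function name
`count_type_descriptions` and the regular expression are my own assumptions,
and it only recognises plain "name : type" lines.

```python
import re
from collections import Counter

# Assumed pattern: a numpydoc-style parameter line such as
# "name : str, optional". Description lines that happen to contain
# " : " would be miscounted, so treat the result as approximate.
PARAM_LINE = re.compile(r"^[ \t]*\w+ : (?P<type>.+)$", re.MULTILINE)


def count_type_descriptions(docstrings):
    """
    Count how often each type description appears in some docstrings.

    Parameters
    ----------
    docstrings : list of str
        Raw docstring texts to scan.

    Returns
    -------
    collections.Counter
        Number of occurrences of each type description.
    """
    counts = Counter()
    for doc in docstrings:
        for match in PARAM_LINE.finditer(doc):
            counts[match.group("type").strip()] += 1
    return counts
```

Calling `count_type_descriptions(docs).most_common()` on a list of docstrings
then shows side by side how often e.g. "str" and "string" are used.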

For str vs string: I *think* "string" can be more readable and
understandable for newcomers (not sure how well known the str type is for
this user group). But of course, if taking "string" rather than "str", we
should maybe also look at "int" vs "integer", "bool" vs "boolean", etc.
I can live with either decision.
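
To make the proposals quoted below more concrete, this is roughly what a
docstring following them could look like. The function `scale_values` and
all its parameters are made up for illustration, it is not pandas API. I
left out a plain string parameter, since "str" vs "string" is still open.

```python
def scale_values(values, factor=2, limits=None, names=None):
    """
    Multiply each value by a factor and label the results.

    Parameters
    ----------
    values : array-like
        Numbers to scale (for example a list, a numpy array or a Series).
    factor : int or float, default 2
        Number each value is multiplied by.
    limits : tuple of (int, int), optional
        Lower and upper bound used to clip the scaled values. No clipping
        is done if omitted.
    names : list of str, optional
        Label for each value. Positions ("0", "1", ...) are used if omitted.

    Returns
    -------
    dict of {str: float}
        Scaled value for each label.
    """
    # Hypothetical implementation, only here so the example is runnable.
    scaled = [float(value) * factor for value in values]
    if limits is not None:
        lower, upper = limits
        scaled = [float(min(max(value, lower), upper)) for value in scaled]
    if names is None:
        names = [str(position) for position in range(len(scaled))]
    return dict(zip(names, scaled))
```

The summary starts on the line after the opening quotes (point 1), the
default comes right after the type (point 2), dict and tuple use brackets
while list does not (point 4), "optional" marks parameters that default to
None (point 5), and "array-like" covers list, array and Series (point 6).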

Joris


2018-03-05 23:07 GMT+01:00 Chris Bartak <cbartak at gmail.com>:

> Hi Marc,
>
> Thanks for pulling out this list.  The only one of these that seems
> potentially objectionable to me is #3 - it does seem like we're pretty
> inconsistent on this currently, but in my opinion it'd be better to side
> with `str` - matching the actual Python type, numpy, mypy annotations, etc?
>
> On Mon, Mar 5, 2018 at 2:55 PM, Marc Garcia <garcia.marc at gmail.com> wrote:
>
>> Hi there,
>>
>> There are a few things regarding the docstrings that are still open to
>> discussion, in many cases because the numpy convention (or the numpy doc
>> examples) differs from the unwritten convention used in most pandas
>> docstrings.
>>
>> You have probably seen the discussion on GitHub, but I list them here, with
>> the proposed decision (mainly keep the pandas way). If anyone disagrees on
>> any point, please let us know, so we can change the documentation for the
>> sprint and do it in the desired way.
>>
>> 1) Starting the docstring just after the opening triple quotes, or on the
>> next line. In pandas it's more common to start on the next line, so we'll
>> keep it this way.
>>
>> 2) For parameters, showing the default value after the type, or after the
>> description. Numpy does not consider it necessary to specify defaults, and
>> where they are given, its recommended place is after the description. The
>> proposal (mainly by Joris) is to always include them, right after the type,
>> as they are easier to see there.
>>
>> 3) For parameters expecting a string, the numpy convention examples use
>> `str`; the proposal is to use `string` instead.
>>
>> 4) For complex types like dicts, I think there is some consensus that it is
>> easier to understand the types when using brackets (e.g. "dict of {str: int}"
>> over "dict of str: int"). The same goes for tuples (e.g. "tuple of (int, str,
>> int)" over "tuple of int, str, int"). For lists and sets, the type is
>> simpler (e.g. "list of int" or "set of str"). I propose to use the brackets
>> for dict and tuple, and not for list and set, and to use `str` over `string`
>> when it is part of a complex type.
>>
>> 5) For cases where a parameter is optional, that is, it has a None value by
>> default, meaning the value is not required. (As I understand it, in a case
>> like `fillna(value=None)`, `value` would not be optional, as it is the
>> value used to replace `NaN`.) In this case, the proposal is to use as the
>> type something like "int or float, optional" over "int, float or None
>> (default None)".
>>
>> 6) When the parameter expects something in the form of a Python list, a
>> numpy array, a pandas Series..., document it as "array-like" over other
>> options like "iterable" or "numpy.array, Series or list".
>>
>>
>> Thanks!
>>
>> _______________________________________________
>> Pandas-dev mailing list
>> Pandas-dev at python.org
>> https://mail.python.org/mailman/listinfo/pandas-dev
>>
>>
>