[Pandas-dev] Proposal to change the default number of rows for DataFrame display (lower max_rows)
Joris Van den Bossche
jorisvandenbossche at gmail.com
Fri Dec 8 09:54:10 EST 2017
*[Note for those reading this on the pydata mailing list: please reply to
pandas-dev at python.org to keep the discussion centralised there]*
Hi all,
I am reposting Clemens's mail below, but with a slightly changed focus,
as I think the main discussion point is the number of rows.
The proposal in https://github.com/pandas-dev/pandas/pull/17023 is to lower
the default number of rows shown when displaying a Series or DataFrame from
60 to 20.
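
For anyone who wants a feel for the difference without checking out the PR,
here is a minimal sketch. It passes max_rows directly to to_string instead of
changing the global option, so it does not depend on session settings; the
100-row frame and the helper name printed_lines are made up for illustration
and are not part of the PR.

```python
import numpy as np
import pandas as pd

# Hypothetical 100-row frame, longer than both the current and proposed limit.
df = pd.DataFrame(np.arange(300).reshape(100, 3), columns=["a", "b", "c"])


def printed_lines(frame, max_rows):
    """Return the number of text lines to_string emits for a given max_rows."""
    return len(frame.to_string(max_rows=max_rows).splitlines())


lines_current = printed_lines(df, 60)   # current default from the mail
lines_proposed = printed_lines(df, 20)  # proposed default from the mail

print("max_rows=60:", lines_current, "lines")
print("max_rows=20:", lines_proposed, "lines")
```

Comparing the two counts against the height of your own terminal shows whether
the truncated output fits on one screen.
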
Thoughts on that?
Best,
Joris
2017-11-28 11:57 GMT+01:00 Clemens Brunner <clemens.brunner at gmail.com>:
> Hello!
>
> We're currently discussing a change in how data frames are displayed by
> default in https://github.com/pandas-dev/pandas/pull/17023. There are two
> proposed changes:
>
> (1) Set pd.options.display.max_columns=0 (previously this was set to 20).
> (2) Set pd.options.display.max_rows=20 (previously this was set to 60).
>
> Change (1) means that the number of printed columns is adapted to fit
> within the width of the terminal. If there are too many columns, an
> ellipsis is shown to indicate collapsed columns in the middle of the data
> frame. This doesn't work if Python is run as a Jupyter kernel (e.g. in a
> Jupyter notebook or in IPython QtConsole), in which case the maximum
> number of columns remains 20.
>
> Example:
> ========
> import pandas as pd
> import numpy as np
> pd.DataFrame(np.random.rand(5, 10))
>
> Output before (in a terminal with 100 chars width):
> ---------------------------------------------------
>           0         1         2         3         4         5         6  \
> 0  0.643979  0.690414  0.018603  0.991478  0.707534  0.376765  0.670848
> 1  0.547836  0.810972  0.054448  0.415112  0.268120  0.904528  0.839258
> 2  0.582256  0.732149  0.284208  0.405197  0.213591  0.715367  0.150106
> 3  0.197348  0.317159  0.051669  0.738405  0.821046  0.179270  0.245793
> 4  0.483466  0.583330  0.999213  0.882883  0.315169  0.045712  0.897048
>
>           7         8         9
> 0  0.891467  0.494220  0.713369
> 1  0.601304  0.449880  0.266205
> 2  0.113262  0.360580  0.238833
> 3  0.798063  0.077769  0.471169
> 4  0.262779  0.530565  0.992084
>
> Output after:
> -------------
>           0         1         2         3  ...         6         7         8         9
> 0  0.673621  0.211505  0.943201  0.946548  ...  0.900453  0.612182  0.861933  0.710967
> 1  0.670855  0.834449  0.796273  0.785976  ...  0.609954  0.686663  0.684582  0.837505
> 2  0.544736  0.814827  0.352893  0.459556  ...  0.650993  0.735943  0.279110  0.840203
> 3  0.440125  0.554323  0.745462  0.940896  ...  0.544576  0.224175  0.852603  0.509837
> 4  0.225551  0.791834  0.476059  0.321857  ...  0.391165  0.423213  0.290683  0.954423
>
> [5 rows x 10 columns]
>
>
> Change (2) means that fewer rows are displayed before auto-hiding takes
> place. I find that 60 rows almost always causes the terminal to scroll
> (most terminals have between 25 and 40 rows), so reducing the value to 20
> increases the chance that a data frame can be viewed on one terminal
> page. I'm not including a before/after output since it should be easy to
> imagine how this change affects the output.
>
> Both changes would make Pandas behave similarly to R's Tidyverse (which I
> really like), but this should not be the main reason why these changes are
> a good idea. I mainly like them because these settings make (large) data
> frames much nicer to look at.
>
> Note that these changes affect the default values. Of course, users are
> free to change them back in their active Python session.
>
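
As a rough sketch of what changing the values back could look like, using the
existing pandas options API (get_option, set_option, reset_option and
option_context). The values 60 and 20 are the previous defaults quoted above;
the variable names are only for illustration.

```python
import pandas as pd

# Whatever default the installed pandas version ships with.
default_rows = pd.get_option("display.max_rows")

# Session-wide: go back to the previous defaults mentioned in the mail.
pd.set_option("display.max_rows", 60)
pd.set_option("display.max_columns", 20)
rows_after_set = pd.get_option("display.max_rows")

# Undo the session-wide change and return to the library defaults.
pd.reset_option("display.max_rows")
pd.reset_option("display.max_columns")
rows_after_reset = pd.get_option("display.max_rows")

# Temporary: the override only applies inside the with block.
with pd.option_context("display.max_rows", 60, "display.max_columns", 20):
    rows_inside = pd.get_option("display.max_rows")
rows_outside = pd.get_option("display.max_rows")
```

The option_context form is handy when only a single large frame needs the
longer output, since the defaults come back automatically afterwards.
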
> Comments on both proposed changes are highly welcome (either here on the
> mailing list or at https://github.com/pandas-dev/pandas/pull/17023).
>
> Clemens