From jorisvandenbossche at gmail.com Fri Apr 17 12:07:41 2015
From: jorisvandenbossche at gmail.com (Joris Van den Bossche)
Date: Fri, 17 Apr 2015 12:07:41 +0200
Subject: [Pandas-dev] Upcoming Index repr changes
Message-ID:

Hi all,

We have a PR pending to unify the string representation of the different
Index objects: https://github.com/pydata/pandas/pull/9901

The most important changes:

- We propose to reduce the default number of values shown from 100 to 10
  (an option controllable as pd.options.display.max_seq_items).
- The datetime-like indices (DatetimeIndex, TimedeltaIndex, PeriodIndex)
  were always somewhat different and get a new repr that is now more
  consistent with the repr of other Index types like Int64Index. This is
  the biggest change.

So for e.g. Int64Index not much changes (only 'name' is now also shown, and
the number of shown values has changed), but for DatetimeIndex the change
is larger.

*But we would like to get some feedback on this!*

Do you like the changes? For DatetimeIndex? For the number of shown values?
Would you want different behaviour for repr() and str()?

Some examples of the changes with the current state of the PR are shown
below:

Previous Behavior

In [1]: pd.get_option('max_seq_items')
Out[1]: 100

In [2]: pd.Index(range(4), name='foo')
Out[2]: Int64Index([0, 1, 2, 3], dtype='int64')

In [3]: pd.Index(range(104), name='foo')
Out[3]: Int64Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, ...], dtype='int64')

In [4]: pd.date_range('20130101', periods=4, name='foo', tz='US/Eastern')
Out[4]:

[2013-01-01 00:00:00-05:00, ..., 2013-01-04 00:00:00-05:00]
Length: 4, Freq: D, Timezone: US/Eastern

In [5]: pd.date_range('20130101', periods=104, name='foo', tz='US/Eastern')
Out[5]:

[2013-01-01 00:00:00-05:00, ..., 2013-04-14 00:00:00-04:00]
Length: 104, Freq: D, Timezone: US/Eastern

New Behavior

In [1]: pd.get_option('max_seq_items')
Out[1]: 10

In [9]: pd.Index(range(4), name='foo')
Out[9]: Int64Index([0, 1, 2, 3], dtype='int64', name=u'foo')

In [10]: pd.Index(range(104), name='foo')
Out[10]: Int64Index([0, 1, ..., 102, 103], dtype='int64', name=u'foo', length=104)

In [11]: pd.date_range('20130101', periods=4, name='foo', tz='US/Eastern')
Out[11]: DatetimeIndex(['2013-01-01 00:00:00-05:00', '2013-01-02 00:00:00-05:00', '2013-01-03 00:00:00-05:00', '2013-01-04 00:00:00-05:00'], dtype='datetime64[ns]', name=u'foo', freq='D', tz='US/Eastern')

In [12]: pd.date_range('20130101', periods=104, name='foo', tz='US/Eastern')
Out[12]: DatetimeIndex(['2013-01-01 00:00:00-05:00', '2013-01-02 00:00:00-05:00', ..., '2013-04-13 00:00:00-04:00', '2013-04-14 00:00:00-04:00'], dtype='datetime64[ns]', name=u'foo', length=104, freq='D', tz='US/Eastern')

From jeffreback at gmail.com Fri Apr 17 19:09:18 2015
From: jeffreback at gmail.com (Jeff Reback)
Date: Fri, 17 Apr 2015 13:09:18 -0400
Subject: [Pandas-dev] sinhrks (Masaaki Horikoshi)
Message-ID:

All,

@sinhrks (Masaaki Horikoshi) has accepted the invitation to be a
*pandas core dev!*

Thanks for all of the effort you have put in.

Let's keep it up to get even more people into pandas contributions.

This project is quite popular, judging from the reaction at PyCon, and is
used in a lot of different fields / ways.

Jeff

From wesmckinn at gmail.com Sat Apr 18 00:27:44 2015
From: wesmckinn at gmail.com (Wes McKinney)
Date: Fri, 17 Apr 2015 15:27:44 -0700
Subject: [Pandas-dev] sinhrks (Masaaki Horikoshi)
In-Reply-To:
References:
Message-ID:

Great news -- @sinhrks, welcome to the team, and thanks for all the
hard work on the project!

cheers,
Wes

From shoyer at gmail.com Sat Apr 18 00:33:59 2015
From: shoyer at gmail.com (Stephan Hoyer)
Date: Fri, 17 Apr 2015 15:33:59 -0700
Subject: [Pandas-dev] sinhrks (Masaaki Horikoshi)
In-Reply-To:
References:
Message-ID:

Congratulations Masaaki! We're really happy to have you on board :).

From jorisvandenbossche at gmail.com Tue Apr 21 02:59:33 2015
From: jorisvandenbossche at gmail.com (Joris Van den Bossche)
Date: Tue, 21 Apr 2015 02:59:33 +0200
Subject: [Pandas-dev] [pydata] Re: Upcoming Index repr changes
In-Reply-To: <6ab996ef-e112-4f52-9d58-60947629011f@googlegroups.com>
References: <3cb616db-fbad-44a9-970d-93c7dda3d42d@googlegroups.com>
 <6ab996ef-e112-4f52-9d58-60947629011f@googlegroups.com>
Message-ID:

I like John's suggestion to have something more like the output of numpy
arrays.
For example, the proposed repr:

In [12]: pd.date_range('20130101',periods=104,name='foo',tz='US/Eastern')
Out[12]: DatetimeIndex(['2013-01-01 00:00:00-05:00', '2013-01-02 00:00:00-05:00', ..., '2013-04-13 00:00:00-04:00', '2013-04-14 00:00:00-04:00'], dtype='datetime64[ns]', name=u'foo', length=104, freq='D', tz='US/Eastern')

would then be something like this:

In [12]: pd.date_range('20130101',periods=104,name='foo',tz='US/Eastern')
Out[12]: DatetimeIndex(['2013-01-01 00:00:00-05:00', '2013-01-02 00:00:00-05:00',
                        ...,
                        '2013-04-13 00:00:00-04:00', '2013-04-14 00:00:00-04:00'],
                       dtype='datetime64[ns]', name=u'foo', length=104, freq='D', tz='US/Eastern')

From jeffreback at gmail.com Tue Apr 21 02:53:17 2015
From: jeffreback at gmail.com (Jeff)
Date: Mon, 20 Apr 2015 17:53:17 -0700 (PDT)
Subject: [Pandas-dev] Upcoming Index repr changes
In-Reply-To: <3cb616db-fbad-44a9-970d-93c7dda3d42d@googlegroups.com>
References: <3cb616db-fbad-44a9-970d-93c7dda3d42d@googlegroups.com>
Message-ID: <6ab996ef-e112-4f52-9d58-60947629011f@googlegroups.com>

John, you are quoting the current implementation (which is shown first);
the new one looks like this:

In [11]: pd.date_range('20130101',periods=4,name='foo',tz='US/Eastern')
Out[11]: DatetimeIndex(['2013-01-01 00:00:00-05:00', '2013-01-02 00:00:00-05:00', '2013-01-03 00:00:00-05:00', '2013-01-04 00:00:00-05:00'], dtype='datetime64[ns]', name=u'foo', freq='D', tz='US/Eastern')

In [12]: pd.date_range('20130101',periods=104,name='foo',tz='US/Eastern')
Out[12]: DatetimeIndex(['2013-01-01 00:00:00-05:00', '2013-01-02 00:00:00-05:00', ..., '2013-04-13 00:00:00-04:00', '2013-04-14 00:00:00-04:00'], dtype='datetime64[ns]', name=u'foo', length=104, freq='D', tz='US/Eastern')

Lorenzo, to answer your question, MultiIndexes are unchanged (and
CategoricalIndex is new). We *could* make them a single line, but it would
be pretty crowded.

Note that MultiIndex and CategoricalIndex have multi-line reprs and do not
truncate sequences (of e.g. labels); this is consistent with previous
versions.
(easy to change this though)

In [1]: MultiIndex.from_product([list('abcdefg'),range(10)],names=['first','second'])
Out[1]:
MultiIndex(levels=[[u'a', u'b', u'c', u'd', u'e', u'f', u'g'], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]],
           labels=[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]],
           names=[u'first', u'second'])

In [4]: pd.CategoricalIndex(np.random.randint(0,5,size=100),name='foo')
Out[4]:
CategoricalIndex([3, 0, 0, 3, 1, 3, 0, 4, 2, 3, 0, 4, 0, 1, 2, 0, 4, 1, 4, 2, 3, 1, 0, 4, 4, 3, 0, 3, 0, 1, 2, 3, 3, 1, 1, 0, 0, 4, 4, 1, 1, 3, 1, 1, 4, 4, 3, 0, 0, 0, 4, 4, 0, 1, 3, 1, 2, 0, 3, 1, 2, 2, 2, 1, 1, 4, 1, 0, 4, 3, 3, 0, 0, 0, 4, 4, 1, 4, 2, 2, 1, 4, 0, 0, 0, 4, 3, 0, 4, 0, 0, 0, 3, 3, 1, 2, 2, 3, 4, 1],
                 categories=[0, 1, 2, 3, 4],
                 ordered=False,
                 name=u'foo',
                 dtype='category')

From eiler13 at gmail.com Mon Apr 20 20:37:09 2015
From: eiler13 at gmail.com (John E)
Date: Tue, 21 Apr 2015 00:37:09 -0000
Subject: [Pandas-dev] Upcoming Index repr changes
In-Reply-To:
References:
Message-ID: <3cb616db-fbad-44a9-970d-93c7dda3d42d@googlegroups.com>

This is probably not the sort of comment you're looking for, but I'd like
to see more of a table-style output. I can just put a 'values' at the end
to get the more numpy-like output (which is easier to read IMO), but it
won't stop at 10 or 100 unless I tell it to. Nevertheless, I think it's
much easier to read this:

pd.date_range('20130101', periods=104, name='foo', tz='US/Eastern').values
Out[442]:
array(['2013-01-01T00:00:00.000000000-0500',
       '2013-01-02T00:00:00.000000000-0500',
       '2013-01-03T00:00:00.000000000-0500',
       '2013-01-04T00:00:00.000000000-0500',
       '2013-01-05T00:00:00.000000000-0500',

than this:

pd.date_range('20130101', periods=104, name='foo', tz='US/Eastern')
Out[443]:

[2013-01-01 00:00:00-05:00, ..., 2013-04-14 00:00:00-04:00]
Length: 104, Freq: D, Timezone: US/Eastern

From lorenzo.deleo at gmail.com Mon Apr 20 17:13:33 2015
From: lorenzo.deleo at gmail.com (Lorenzo De Leo)
Date: Mon, 20 Apr 2015 21:13:33 -0000
Subject: [Pandas-dev] Upcoming Index repr changes
In-Reply-To:
References:
Message-ID:

I like the changes you propose; the new version is much more readable.
I used to be wary of calling df.index because it can be slow and the
output is a bit messy, and I'm usually too lazy to select just a slice of
it, so having something like this done by default is a welcome change.

Just a question: does it also apply to MultiIndexes?

Cheers!
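
The truncation seen in the "New Behavior" examples of this thread can be
sketched in plain Python. This is only an illustration and not the code in
the PR: the function name sketch_index_repr and its edge_items parameter
are hypothetical, and the rule of keeping two values from each end is
merely read off the example output (0, 1, ..., 102, 103), since the thread
does not state the actual rule. The dtype, freq and tz attributes are left
out to keep the sketch short.

```python
def sketch_index_repr(values, name=None, cls_name="Index",
                      max_seq_items=10, edge_items=2):
    """Build a one-line, possibly truncated repr string for a sequence.

    Hypothetical helper for illustration only; it does not reproduce
    pandas internals. Sequences longer than max_seq_items are shortened
    to edge_items values from each end with '...' in between, and a
    length attribute is appended so the size remains visible.
    """
    values = list(values)
    truncated = len(values) > max_seq_items

    if truncated:
        head = [repr(v) for v in values[:edge_items]]
        tail = [repr(v) for v in values[-edge_items:]]
        items = head + ["..."] + tail
    else:
        items = [repr(v) for v in values]

    # Attributes follow the values, as in the examples in the thread.
    attrs = []
    if name is not None:
        attrs.append("name=%r" % name)
    if truncated:
        attrs.append("length=%d" % len(values))

    parts = ["[" + ", ".join(items) + "]"] + attrs
    return "%s(%s)" % (cls_name, ", ".join(parts))


short = sketch_index_repr(range(4), name="foo")
long_ = sketch_index_repr(range(104), name="foo")
print(short)
print(long_)
```

With max_seq_items raised to at least the sequence length, the helper shows
every value and drops the length attribute, which mirrors the trade-off
discussed above between a compact repr and seeing all values.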