[Pandas-dev] Upcoming Index repr changes
Jeff
jeffreback at gmail.com
Mon Apr 20 20:53:17 EDT 2015
John, you are quoting the current implementation (which is shown first); the
new repr looks like this:
In [11]: pd.date_range('20130101',periods=4,name='foo',tz='US/Eastern')
Out[11]: DatetimeIndex(['2013-01-01 00:00:00-05:00', '2013-01-02 00:00:00-05:00', '2013-01-03 00:00:00-05:00', '2013-01-04 00:00:00-05:00'], dtype='datetime64[ns]', name=u'foo', freq='D', tz='US/Eastern')
In [12]: pd.date_range('20130101',periods=104,name='foo',tz='US/Eastern')
Out[12]: DatetimeIndex(['2013-01-01 00:00:00-05:00', '2013-01-02 00:00:00-05:00', ..., '2013-04-13 00:00:00-04:00', '2013-04-14 00:00:00-04:00'], dtype='datetime64[ns]', name=u'foo', length=104, freq='D', tz='US/Eastern')
Lorenzo, to answer your question, MultiIndexes are unchanged (and
CategoricalIndex is new). We *could* make them a single line, but that would
be pretty crowded.
Note that MultiIndex and CategoricalIndex have multi-line reprs and do not
truncate sequences (e.g. of labels); this is consistent with previous
versions (though it would be easy to change).
In [1]: MultiIndex.from_product([list('abcdefg'),range(10)],names=['first','second'])
Out[1]:
MultiIndex(levels=[[u'a', u'b', u'c', u'd', u'e', u'f', u'g'], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]],
labels=[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]],
names=[u'first', u'second'])
In [4]: pd.CategoricalIndex(np.random.randint(0,5,size=100),name='foo')
Out[4]:
CategoricalIndex([3, 0, 0, 3, 1, 3, 0, 4, 2, 3, 0, 4, 0, 1, 2, 0, 4, 1, 4, 2, 3, 1, 0, 4, 4, 3, 0, 3, 0, 1, 2, 3, 3, 1, 1, 0, 0, 4, 4, 1, 1, 3, 1, 1, 4, 4, 3, 0, 0, 0, 4, 4, 0, 1, 3, 1, 2, 0, 3, 1, 2, 2, 2, 1, 1, 4, 1, 0, 4, 3, 3, 0, 0, 0, 4, 4, 1, 4, 2, 2, 1, 4, 0, 0, 0, 4, 3, 0, 4, 0, 0, 0, 3, 3, 1, 2, 2, 3, 4, 1],
categories=[0, 1, 2, 3, 4],
ordered=False,
name=u'foo',
dtype='category')
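
For anyone who wants to experiment with how many values are shown, here is a
minimal sketch, assuming the display.max_seq_items option described in the
announcement quoted below. The index contents and variable names are only
illustrative, and the exact repr text depends on the pandas version, so no
output is shown.

```python
import pandas as pd

# Illustrative index; built from a list so it holds 104 explicit values.
idx = pd.Index(list(range(104)), name='foo')

# Remember the session-wide setting so we can confirm it is restored.
limit_before = pd.get_option('display.max_seq_items')

# Temporarily lower the limit; the previous value is restored on exit.
with pd.option_context('display.max_seq_items', 10):
    limit_inside = pd.get_option('display.max_seq_items')
    short_repr = repr(idx)

print(short_repr)
```

Using option_context keeps the change local to the with block;
pd.set_option('display.max_seq_items', 10) would change it for the whole
session instead.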
On Monday, April 20, 2015 at 8:37:01 PM UTC-4, John E wrote:
>
> This is probably not the sort of comment you're looking for, but I'd like
> to see more of a table-style output. I can just put a 'values' at the end
> to get the more numpy like output (which is easier to read IMO), but it
> won't stop at 10 or 100 unless I tell it to. Nevertheless, I think it's
> much easier to read this:
>
> pd.date_range('20130101', periods=104, name='foo', tz='US/Eastern').values
> Out[442]:
> array(['2013-01-01T00:00:00.000000000-0500',
> '2013-01-02T00:00:00.000000000-0500',
> '2013-01-03T00:00:00.000000000-0500',
> '2013-01-04T00:00:00.000000000-0500',
> '2013-01-05T00:00:00.000000000-0500',
>
> than this:
>
> pd.date_range('20130101', periods=104, name='foo', tz='US/Eastern')
> Out[443]:
> <class 'pandas.tseries.index.DatetimeIndex'>
> [2013-01-01 00:00:00-05:00, ..., 2013-04-14 00:00:00-04:00]
> Length: 104, Freq: D, Timezone: US/Eastern
>
>
> On Friday, April 17, 2015 at 6:07:44 AM UTC-4, Joris Van den Bossche wrote:
>>
>> Hi all,
>>
>> We have a PR pending to unify the string representation of the different
>> Index objects: https://github.com/pydata/pandas/pull/9901
>>
>> What are the most important changes:
>>
>> - We propose to reduce the default number of values shown from 100 to
>> 10 (an option controllable as pd.options.display.max_seq_items).
>> - The datetime-like indices (DatetimeIndex, TimedeltaIndex,
>> PeriodIndex) were always somewhat different and get a new repr that is now
>> more consistent with how it is for other Index types like Int64Index. This
>> is the biggest change.
>>
>> So for eg Int64Index not much changes (only 'name' is now also shown, and
>> the number of shown values has changed), but for DatetimeIndex the change
>> is larger.
>>
>> *But we would like to get some feedback on this!*
>>
>> Do you like the changes? For DatetimeIndex? For the number of shown
>> values?
>> Would you want different behaviour for repr() and str()?
>>
>> Some examples of the changes with the current state of the PR are shown
>> below:
>>
>> Previous Behavior
>>
>> In [1]: pd.get_option('max_seq_items')
>> Out[1]: 100
>>
>> In [2]: pd.Index(range(4), name='foo')
>> Out[2]: Int64Index([0, 1, 2, 3], dtype='int64')
>>
>> In [3]: pd.Index(range(104), name='foo')
>> Out[3]: Int64Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
>> 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
>> 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53,
>> 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72,
>> 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91,
>> 92, 93, 94, 95, 96, 97, 98, 99, ...], dtype='int64')
>>
>> In [4]: pd.date_range('20130101', periods=4, name='foo', tz='US/Eastern')
>> Out[4]:
>> <class 'pandas.tseries.index.DatetimeIndex'>
>> [2013-01-01 00:00:00-05:00, ..., 2013-01-04 00:00:00-05:00]
>> Length: 4, Freq: D, Timezone: US/Eastern
>>
>> In [5]: pd.date_range('20130101', periods=104, name='foo',
>> tz='US/Eastern')
>> Out[5]:
>> <class 'pandas.tseries.index.DatetimeIndex'>
>> [2013-01-01 00:00:00-05:00, ..., 2013-04-14 00:00:00-04:00]
>> Length: 104, Freq: D, Timezone: US/Eastern
>>
>> New Behavior
>>
>> In [1]: pd.get_option('max_seq_items')
>> Out[1]: 10
>>
>> In [9]: pd.Index(range(4), name='foo')
>> Out[9]: Int64Index([0, 1, 2, 3], dtype='int64', name=u'foo')
>>
>> In [10]: pd.Index(range(104), name='foo')
>> Out[10]: Int64Index([0, 1, ..., 102, 103], dtype='int64', name=u'foo',
>> length=104)
>>
>> In [11]: pd.date_range('20130101', periods=4, name='foo', tz='US/Eastern')
>> Out[11]: DatetimeIndex(['2013-01-01 00:00:00-05:00', '2013-01-02
>> 00:00:00-05:00', '2013-01-03 00:00:00-05:00', '2013-01-04 00:00:00-05:00'],
>> dtype='datetime64[ns]', name=u'foo', freq='D', tz='US/Eastern')
>>
>> In [12]: pd.date_range('20130101', periods=104, name='foo',
>> tz='US/Eastern')
>> Out[12]: DatetimeIndex(['2013-01-01 00:00:00-05:00', '2013-01-02
>> 00:00:00-05:00', ..., '2013-04-13 00:00:00-04:00', '2013-04-14
>> 00:00:00-04:00'], dtype='datetime64[ns]', name=u'foo', length=104,
>> freq='D', tz='US/Eastern')
>>
>>