From jeffreback at gmail.com  Mon May 11 17:42:11 2015
From: jeffreback at gmail.com (Jeff Reback)
Date: Mon, 11 May 2015 11:42:11 -0400
Subject: [Pandas-dev] ANN: pandas 0.16.1 released
Message-ID: <CAHMnJKjcOsM41+1_bE1UNR5WJ-8rHqJMEEudOZeF216yULXqCQ@mail.gmail.com>

Hello,

We are proud to announce v0.16.1 of pandas, a minor release from 0.16.0.

This release includes a small number of API changes, several new features,
enhancements, and performance improvements along with a large number of bug
fixes.

This release covers 7 weeks of development, with 222 commits by 57 authors
addressing 85 issues.

We recommend that all users upgrade to this version.

*What is it:*

*pandas* is a Python package providing fast, flexible, and expressive data
structures designed to make working with "relational" or "labeled" data both
easy and intuitive. It aims to be the fundamental high-level building block
for doing practical, real-world data analysis in Python. Additionally, it has
the broader goal of becoming the most powerful and flexible open source data
analysis / manipulation tool available in any language.

Highlights of this release include:

   - Support for *CategoricalIndex*, a category based index, see here
   <http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#whatsnew-0161-enhancements-categoricalindex>
   - New section on how-to-contribute to *pandas*, see here
   <http://pandas.pydata.org/pandas-docs/stable/contributing.html>
   - Revised "Merge, join, and concatenate" documentation, including
   graphical examples to make it easier to understand each operation, see
   here <http://pandas.pydata.org/pandas-docs/stable/merging.html>
   - New method *sample* for drawing random samples from Series, DataFrames
   and Panels. See here
   <http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#whatsnew-0161-enhancements-sample>
   - The default *Index* printing has changed to a more uniform format, see
   here
   <http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#whatsnew-0161-index-repr>
   - *BusinessHour* datetime-offset is now supported, see here
   <http://pandas.pydata.org/pandas-docs/stable/timeseries.html#business-hour>
   - Further enhancement to the *.str* accessor to make string operations
   easier, see here
   <http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#whatsnew-0161-enhancements-string>
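
A few of the highlighted features can be tried out as follows. This is only a
minimal sketch, assuming pandas 0.16.1 or later; the data and variable names
are made up for illustration and are not taken from the release notes.

```python
import pandas as pd

# sample: draw 3 random rows; random_state makes the draw reproducible
df = pd.DataFrame({"a": range(10), "b": list("abcdefghij")})
subset = df.sample(n=3, random_state=0)

# CategoricalIndex: an index restricted to a fixed set of categories
cidx = pd.CategoricalIndex(list("xyxz"), categories=list("xyz"), name="foo")

# BusinessHour: default business hours are 09:00-17:00 on weekdays, so one
# business hour after Friday 16:30 lands on the following Monday at 09:30
ts = pd.Timestamp("2015-05-08 16:30") + pd.offsets.BusinessHour()

# .str accessor: vectorized string methods on a Series
caps = pd.Series(["pandas", "numpy"]).str.capitalize()

print(subset)
print(cidx)
print(ts)
print(caps.tolist())
```

The exact printed layout depends on the pandas version in use.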


See the Whatsnew in v0.16.1
<http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#v0-16-1-may-11-2015>

Documentation:
http://pandas.pydata.org/pandas-docs/stable/

Source tarballs and Windows binaries are available on PyPI:
https://pypi.python.org/pypi/pandas

Windows binaries are courtesy of Christoph Gohlke and are built on NumPy 1.8.
Mac OS X wheels are courtesy of Matthew Brett.

Please report any issues here:
https://github.com/pydata/pandas/issues


Thanks

The Pandas Development Team


Contributors to the 0.16.1 release

   - Alfonso MHC
   - Andy Hayden
   - Artemy Kolchinsky
   - Chris Gilmer
   - Chris Grinolds
   - Dan Birken
   - David BROCHART
   - David Hirschfeld
   - David Stephens
   - Dr. Leo
   - Evan Wright
   - Frans van Dunné
   - Hatem Nassrat
   - Henning Sperr
   - Hugo Herter
   - Jan Schulz
   - Jeff Blackburne
   - Jeff Reback
   - Jim Crist
   - Jonas Abernot
   - Joris Van den Bossche
   - Kerby Shedden
   - Leo Razoumov
   - Manuel Riel
   - Mortada Mehyar
   - Nick Burns
   - Nick Eubank
   - Olivier Grisel
   - Phillip Cloud
   - Pietro Battiston
   - Roy Hyunjin Han
   - Sam Zhang
   - Scott Sanderson
   - Stephan Hoyer
   - Tiago Antao
   - Tom Ajamian
   - Tom Augspurger
   - Tomaz Berisa
   - Vikram Shirgur
   - Vladimir Filimonov
   - William Hogman
   - Yasin A
   - Younggun Kim
   - behzad nouri
   - dsm054
   - floydsoft
   - flying-sheep
   - gfr
   - jnmclarty
   - jreback
   - ksanghai
   - lucas
   - mschmohl
   - ptype
   - rockg
   - scls19fr
   - sinhrks

From jorisvandenbossche at gmail.com  Fri May 22 01:31:32 2015
From: jorisvandenbossche at gmail.com (Joris Van den Bossche)
Date: Fri, 22 May 2015 01:31:32 +0200
Subject: [Pandas-dev] [pydata] Re: Upcoming Index repr changes
In-Reply-To: <CALQtMBYgJumi=Jv6DFjOOyFnkvfcU1ffrk=M_pN4m_hsepWEWw@mail.gmail.com>
References: <CALQtMBaj320BaSMfh4N6YeUJqP9s6=y7jSnwzJnFAwq36d=YsA@mail.gmail.com>
 <3cb616db-fbad-44a9-970d-93c7dda3d42d@googlegroups.com>
 <6ab996ef-e112-4f52-9d58-60947629011f@googlegroups.com>
 <CALQtMBYgJumi=Jv6DFjOOyFnkvfcU1ffrk=M_pN4m_hsepWEWw@mail.gmail.com>
Message-ID: <CALQtMBZ6bCUUz-twhDgrZzeNY-_Ytrt+k72qNdmx-7+1fjykLQ@mail.gmail.com>

Following up on this discussion: as you may have seen, the changes were
released in 0.16.1 (see the whatsnew docs:
http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#index-representation
).
In the end, we followed John's suggestion and went for a more numpy-style
output.

There will probably still be some quirks or things to improve; you can report
them in this follow-up issue: https://github.com/pydata/pandas/issues/10095
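
To see the truncated display for yourself, here is a minimal sketch. It
assumes a reasonably recent pandas; the exact layout of the output differs
between versions, and the index below is made up for illustration.

```python
import pandas as pd

# Build from a list so the result is a plain integer index
idx = pd.Index(list(range(104)), name="foo")

# Temporarily limit the number of items shown in the repr
with pd.option_context("display.max_seq_items", 10):
    short = repr(idx)

print(short)
```

The option returns to its previous value when the with block exits, so
other output is not affected.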

Joris

2015-04-21 2:59 GMT+02:00 Joris Van den Bossche <
jorisvandenbossche at gmail.com>:

> I like the suggestion of John to have something more like the output of
> numpy arrays.
>
> For example, the proposed repr:
>
> In [12]: pd.date_range('20130101',periods=104,name='foo',tz='US/Eastern')
> Out[12]: DatetimeIndex(['2013-01-01 00:00:00-05:00', '2013-01-02
> 00:00:00-05:00', ..., '2013-04-13 00:00:00-04:00', '2013-04-14
> 00:00:00-04:00'], dtype='datetime64[ns]', name=u'foo', length=104,
> freq='D', tz='US/Eastern')
>
> would then be something like this:
>
> In [12]: pd.date_range('20130101',periods=104,name='foo',tz='US/Eastern')
> Out[12]:
> DatetimeIndex(['2013-01-01 00:00:00-05:00', '2013-01-02 00:00:00-05:00',
> ...,
>                '2013-04-13 00:00:00-04:00', '2013-04-14 00:00:00-04:00'],
>               dtype='datetime64[ns]', name=u'foo', length=104, freq='D',
> tz='US/Eastern')
>
>
> 2015-04-21 2:53 GMT+02:00 Jeff <jeffreback at gmail.com>:
>
>>
>> John, you are quoting the current impl (which is first), the new is like
>> this:
>>
>> In [11]: pd.date_range('20130101',periods=4,name='foo',tz='US/Eastern')
>> Out[11]: DatetimeIndex(['2013-01-01 00:00:00-05:00', '2013-01-02 00:00:00-05:00', '2013-01-03 00:00:00-05:00', '2013-01-04 00:00:00-05:00'], dtype='datetime64[ns]', name=u'foo', freq='D', tz='US/Eastern')
>>
>> In [12]: pd.date_range('20130101',periods=104,name='foo',tz='US/Eastern')
>> Out[12]: DatetimeIndex(['2013-01-01 00:00:00-05:00', '2013-01-02 00:00:00-05:00', ..., '2013-04-13 00:00:00-04:00', '2013-04-14 00:00:00-04:00'], dtype='datetime64[ns]', name=u'foo', length=104, freq='D', tz='US/Eastern')
>>
>> Lorenzo, to answer your question, MultiIndexes are unchanged (and
>> CategoricalIndex are new). We *could* make them a single line but that
>> would be pretty crowded.
>>
>> Note that MultiIndex and CategoricalIndex have a multi-line repr and do not
>> truncate sequences (of e.g. labels); this is consistent with previous
>> versions (though easy to change).
>>
>> In [1]: MultiIndex.from_product([list('abcdefg'),range(10)],names=['first','second'])
>> Out[1]:
>> MultiIndex(levels=[[u'a', u'b', u'c', u'd', u'e', u'f', u'g'], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]],
>>            labels=[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]],
>>            names=[u'first', u'second'])
>>
>> In [4]: pd.CategoricalIndex(np.random.randint(0,5,size=100),name='foo')
>> Out[4]:
>> CategoricalIndex([3, 0, 0, 3, 1, 3, 0, 4, 2, 3, 0, 4, 0, 1, 2, 0, 4, 1, 4, 2, 3, 1, 0, 4, 4, 3, 0, 3, 0, 1, 2, 3, 3, 1, 1, 0, 0, 4, 4, 1, 1, 3, 1, 1, 4, 4, 3, 0, 0, 0, 4, 4, 0, 1, 3, 1, 2, 0, 3, 1, 2, 2, 2, 1, 1, 4, 1, 0, 4, 3, 3, 0, 0, 0, 4, 4, 1, 4, 2, 2, 1, 4, 0, 0, 0, 4, 3, 0, 4, 0, 0, 0, 3, 3, 1, 2, 2, 3, 4, 1],
>>                  categories=[0, 1, 2, 3, 4],
>>                  ordered=False,
>>                  name=u'foo',
>>                  dtype='category')
>>
>>
>>
>>
>>
>> On Monday, April 20, 2015 at 8:37:01 PM UTC-4, John E wrote:
>>>
>>> This is probably not the sort of comment you're looking for, but I'd
>>> like to see more of a table-style output.  I can just put a 'values' at the
>>> end to get the more numpy like output (which is easier to read IMO), but it
>>> won't stop at 10 or 100 unless I tell it to.  Nevertheless, I think it's
>>> much easier to read this:
>>>
>>> pd.date_range('20130101', periods=104, name='foo',
>>> tz='US/Eastern').values
>>> Out[442]:
>>> array(['2013-01-01T00:00:00.000000000-0500',
>>>        '2013-01-02T00:00:00.000000000-0500',
>>>        '2013-01-03T00:00:00.000000000-0500',
>>>        '2013-01-04T00:00:00.000000000-0500',
>>>        '2013-01-05T00:00:00.000000000-0500',
>>>
>>> than this:
>>>
>>> pd.date_range('20130101', periods=104, name='foo', tz='US/Eastern')
>>> Out[443]:
>>> <class 'pandas.tseries.index.DatetimeIndex'>
>>> [2013-01-01 00:00:00-05:00, ..., 2013-04-14 00:00:00-04:00]
>>> Length: 104, Freq: D, Timezone: US/Eastern
>>>
>>>
>>> On Friday, April 17, 2015 at 6:07:44 AM UTC-4, Joris Van den Bossche
>>> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> We have a PR pending to unify the string representation of the
>>>> different Index objects: https://github.com/pydata/pandas/pull/9901
>>>>
>>>> What are the most important changes:
>>>>
>>>>    - We propose to reduce the default number of values shown from 100
>>>>    to 10 (an option controllable as pd.options.display.max_seq_items).
>>>>    - The datetime-like indices (DatetimeIndex, TimedeltaIndex,
>>>>    PeriodIndex) were always somewhat different and get a new repr that is now
>>>>    more consistent with how it is for other Index types like Int64Index. This
>>>>    is the biggest change.
>>>>
>>>> So for e.g. Int64Index not much changes (only 'name' is now also shown,
>>>> and the number of shown values has changed), but for DatetimeIndex the
>>>> change is larger.
>>>>
>>>> *But we would like to get some feedback on this!*
>>>>
>>>> Do you like the changes? For DatetimeIndex? For the number of shown
>>>> values?
>>>> Would you want different behaviour for repr() and str()?
>>>>
>>>> Some examples of the changes with the current state of the PR are shown
>>>> below:
>>>>
>>>> Previous Behavior
>>>>
>>>> In [1]: pd.get_option('max_seq_items')
>>>> Out[1]: 100
>>>>
>>>> In [2]: pd.Index(range(4), name='foo')
>>>> Out[2]: Int64Index([0, 1, 2, 3], dtype='int64')
>>>>
>>>> In [3]: pd.Index(range(104), name='foo')
>>>> Out[3]: Int64Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14,
>>>> 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
>>>> 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52,
>>>> 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71,
>>>> 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90,
>>>> 91, 92, 93, 94, 95, 96, 97, 98, 99, ...], dtype='int64')
>>>>
>>>> In [4]: pd.date_range('20130101', periods=4, name='foo',
>>>> tz='US/Eastern')
>>>> Out[4]:
>>>> <class 'pandas.tseries.index.DatetimeIndex'>
>>>> [2013-01-01 00:00:00-05:00, ..., 2013-01-04 00:00:00-05:00]
>>>> Length: 4, Freq: D, Timezone: US/Eastern
>>>>
>>>> In [5]: pd.date_range('20130101', periods=104, name='foo',
>>>> tz='US/Eastern')
>>>> Out[5]:
>>>> <class 'pandas.tseries.index.DatetimeIndex'>
>>>> [2013-01-01 00:00:00-05:00, ..., 2013-04-14 00:00:00-04:00]
>>>> Length: 104, Freq: D, Timezone: US/Eastern
>>>>
>>>> New Behavior
>>>>
>>>> In [1]: pd.get_option('max_seq_items')
>>>> Out[1]: 10
>>>>
>>>> In [9]: pd.Index(range(4), name='foo')
>>>> Out[9]: Int64Index([0, 1, 2, 3], dtype='int64', name=u'foo')
>>>>
>>>> In [10]: pd.Index(range(104), name='foo')
>>>> Out[10]: Int64Index([0, 1, ..., 102, 103], dtype='int64', name=u'foo',
>>>> length=104)
>>>>
>>>> In [11]: pd.date_range('20130101', periods=4, name='foo',
>>>> tz='US/Eastern')
>>>> Out[11]: DatetimeIndex(['2013-01-01 00:00:00-05:00', '2013-01-02
>>>> 00:00:00-05:00', '2013-01-03 00:00:00-05:00', '2013-01-04 00:00:00-05:00'],
>>>> dtype='datetime64[ns]', name=u'foo', freq='D', tz='US/Eastern')
>>>>
>>>> In [12]: pd.date_range('20130101', periods=104, name='foo',
>>>> tz='US/Eastern')
>>>> Out[12]: DatetimeIndex(['2013-01-01 00:00:00-05:00', '2013-01-02
>>>> 00:00:00-05:00', ..., '2013-04-13 00:00:00-04:00', '2013-04-14
>>>> 00:00:00-04:00'], dtype='datetime64[ns]', name=u'foo', length=104,
>>>> freq='D', tz='US/Eastern')
>>>>

From jorisvandenbossche at gmail.com  Fri May 29 22:36:48 2015
From: jorisvandenbossche at gmail.com (Joris Van den Bossche)
Date: Fri, 29 May 2015 22:36:48 +0200
Subject: [Pandas-dev] Pandas development meeting: tuesday June 2 at 17:00 UTC
Message-ID: <CALQtMBZU8RaSPUHFPKpUsa_rxct_9eiOMi2GU=dvz2WodmG5eg@mail.gmail.com>

Hi all,

We are planning the next online Pandas Development Meeting for this coming
Tuesday, June 2nd, at 17:00 UTC (which corresponds to 19:00 CEST in most of
Europe, and to 13:00 EDT / 10:00 PDT on the east and west coasts of North
America).
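
The time zone conversions above can be double-checked with the Python
standard library. This is just a sketch: the zone names are illustrative
picks for the regions mentioned, and zoneinfo requires Python 3.9 or later
with time zone data available on the system.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Meeting time as announced, in UTC
meeting = datetime(2015, 6, 2, 17, 0, tzinfo=timezone.utc)

# Illustrative zones for Europe and the two North American coasts
zones = ["Europe/Brussels", "America/New_York", "America/Los_Angeles"]
local = {zone: meeting.astimezone(ZoneInfo(zone)) for zone in zones}

# Print the weekday and UTC time, then the local time in each zone
print(meeting.strftime("%A %d %B %Y %H:%M %Z"))
for zone, when in local.items():
    print(zone, when.strftime("%H:%M %Z"))
```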

Some first provisional topics to discuss are listed here:
https://docs.google.com/document/d/1tGbTiYORHiSPgVMXawiweGJlBw5dOkVJLY-licoBmBU/edit#heading=h.fwo3xcbnullz
If you think of other points, or have remarks on them, feel free to add
them to the Google doc.

If you are interested in joining (and you don't need to be a core developer
of pandas for that!), let us know and we will make sure to invite you to the
Google Hangout.

Regards,
Joris