[Pandas-dev] Pandas development hangout - Thursday September 27 at 17:00 UTC

Joris Van den Bossche jorisvandenbossche at gmail.com
Fri Sep 28 04:31:43 EDT 2018


Thanks for the notes!

2018-09-28 0:14 GMT+02:00 Brock Mendel <jbrockmendel at gmail.com>:

> I've written up some thoughts on the discussion (and things I couldn't
> communicate because of audio trouble)
>
> I) DatetimeArray/TimedeltaArray/PeriodArray Status
>     The constructors are unfinished.  The DatetimeIndex constructors have
>     comments suggesting they be simplified.  When writing the Array
>     constructors I only ported the parts I thought non-controversial.
>

Personally, I don't think we should copy over the constructors of the index
classes to the arrays. E.g. the DatetimeIndex constructor is indeed overly
complicated, partly trying to do what date_range and to_datetime already
do.
I would keep the Array constructors very simple (what we typically call
_simple_new), and have other constructor methods/functions for those
specific cases.
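
To make that split concrete, here is a minimal sketch in plain Python. The
class name `SimpleDatetimeArray` and the `from_strings` helper are
hypothetical and not existing pandas API; the only point is that the
constructor does no parsing or inference, and anything fancier lives in a
separate, explicitly named constructor.

```python
from datetime import datetime


class SimpleDatetimeArray:
    """Hypothetical array wrapper for illustration; not the pandas class."""

    def __init__(self, values):
        # The constructor only accepts data that is already in final form.
        values = list(values)
        if not all(isinstance(v, datetime) for v in values):
            raise TypeError("values must all be datetime objects")
        self._data = values

    @classmethod
    def from_strings(cls, strings, fmt="%Y-%m-%d"):
        # Parsing is kept out of __init__ and lives in its own constructor.
        return cls([datetime.strptime(s, fmt) for s in strings])

    def __len__(self):
        return len(self._data)


arr = SimpleDatetimeArray.from_strings(["2018-09-27", "2018-09-28"])
```

Passing strings straight to `SimpleDatetimeArray(...)` raises a TypeError
in this sketch, which is the behaviour I would want from a simple
constructor.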


>
>     The tests are near nonexistent.  The game-plan has been to get the
>     arithmetic tests finished, then add DatetimeArray etc to the
>     parameterizations.  This has been slowed down by the fact that there
>     are more arithmetic inconsistencies in DataFrame than expected.
>
>
> II) ExtensionArray
>     A) Allow 2D?
>         i) AFAICT none of the currently-implemented EA code actually
>            depends on the 1D restriction
>         ii) A bunch of fragile Block/BlockManager/DataFrame code has to do
>             gymnastics to deal with 1D-only cases.  Allowing reshape
>             to (N, 1) would make a lot of that unnecessary.
>

It certainly creates complexity to have both 1D (non-consolidatable) and 2D
blocks. But personally, I also find the 2D layout (which on top of that is
transposed relative to what you would expect) very confusing to work with.
A "one column = one 1D array" model seems more attractive to me.
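
A rough illustration of the two layouts, using plain Python lists rather
than real pandas internals (the variable names are made up for this
sketch, and the block layout shown is my reading of how consolidated
blocks store their values):

```python
# The table as a user sees it: 3 rows, 2 columns.
rows = [(1, 10), (2, 20), (3, 30)]
columns = ["a", "b"]

# 2D block layout: one inner list per column, so the stored shape is
# (n_columns, n_rows), the transpose of the user-facing shape.
block_values = [list(col) for col in zip(*rows)]

# "One column = one 1D array": a plain mapping from name to 1D data.
column_store = dict(zip(columns, block_values))
```

In the second form there is nothing to transpose: looking up a column
gives back exactly the 1D data that was stored for it.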


>         iii) If we intend for EA to be useful in the broader ecosystem
>              (e.g. xarray), it needs to be pretty much a drop-in
>              replacement for ndarray.
>

This is a good reason (and it would be interesting to hear from that
community), and we might need to be careful here (as we already have places
where we deviate from numpy semantics in EA).
But even then, when an EA supports 2D, we could still store it in a
DataFrame as a 1D array.


>
>     B) Constructors and Composition vs Inheritance
>         i) `Index` subclasses have `_simple_new` and `Index.__new__` can be
>            used to dispatch to the appropriate Index subclass.
>
>            Similarly, `Block` subclasses have `make_block` and
>            `internals.blocks.make_block` can be used to dispatch to the
>            appropriate `Block` subclass.
>

This is only for the base Index class, no? (Only the base class can return
any kind of Index subclass.) So I don't think this aspect necessarily needs
to influence how the actual subclasses are created.
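
A small sketch of what I mean, with hypothetical classes (`BaseIndex`,
`IntIndex`, `StrIndex` are invented here and only mimic the idea, not the
real `Index.__new__` logic): the inference happens only when the base
class is called, while a subclass called directly is constructed as-is.

```python
class BaseIndex:
    """Hypothetical base class that dispatches to a subclass in __new__."""

    def __new__(cls, data):
        data = list(data)
        # Only the base class infers which subclass to return.
        if cls is BaseIndex:
            if all(isinstance(x, int) for x in data):
                cls = IntIndex
            elif all(isinstance(x, str) for x in data):
                cls = StrIndex
        obj = object.__new__(cls)
        obj._data = data
        return obj


class IntIndex(BaseIndex):
    pass


class StrIndex(BaseIndex):
    pass


inferred = BaseIndex([1, 2, 3])   # dispatches to IntIndex
explicit = StrIndex([1, 2, 3])    # no inference, stays a StrIndex
```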


>
>         ii) Consider the following:
>             - Change `make_block` to follow `_simple_new` semantics/naming,
>                have `Block.__new__` behave analogously to `Index.__new__`
>             - Implement `_simple_new` on pandas' EA subclasses, with a
>               similar `EArray.__new__` dispatch
>
>             - Define something like
>
>               @property
>               def _base_constructor(self):
>                   return (Index|Block|EArray)
>
>             - De-duplicate a whole mess of code
>
>         iii) This would pretty well lock us in to using inheritance
>

This might de-duplicate some code, but IMO at the cost of increased
complexity. Having Index and Array, which are two different objects with
different semantics, share their actual implementation through inheritance
(instead of sharing via composition) will make things more complex, I think.
Personally, I have the feeling that composition will give us a simpler
model to reason about. Dispatching to the underlying EA methods can
introduce some code overhead, but that could be automated if needed.
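
As a sketch of how that dispatching could be automated (all names here,
`ArrayLike`, `IndexLike` and the `delegate` decorator, are hypothetical and
only stand in for an EA and an Index that wraps it):

```python
class ArrayLike:
    """Hypothetical stand-in for an ExtensionArray."""

    def __init__(self, values):
        self._values = list(values)

    def take(self, indices):
        return ArrayLike(self._values[i] for i in indices)

    def unique(self):
        # dict.fromkeys keeps the first occurrence of each value, in order.
        return ArrayLike(dict.fromkeys(self._values))


def _make_delegate(name):
    def method(self, *args, **kwargs):
        # Forward the call to the wrapped array and re-wrap the result.
        result = getattr(self._data, name)(*args, **kwargs)
        return type(self)(result, name=self.name)

    method.__name__ = name
    return method


def delegate(names):
    """Class decorator that adds forwarding methods for the given names."""

    def decorator(cls):
        for name in names:
            setattr(cls, name, _make_delegate(name))
        return cls

    return decorator


@delegate(["take", "unique"])
class IndexLike:
    """Hypothetical Index that holds an array instead of inheriting."""

    def __init__(self, data, name=None):
        self._data = data
        self.name = name


idx = IndexLike(ArrayLike([3, 1, 3]), name="x")
uniques = idx.unique()
```

The wrapper class itself stays tiny, and the list of forwarded method names
is the only thing that has to be maintained.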

Joris



>
>
> On Thu, Sep 27, 2018 at 12:34 PM Joris Van den Bossche <
> jorisvandenbossche at gmail.com> wrote:
>
>> Sorry for the trouble. Annoying now, but of course also a good sign that
>> so many of us are joining. In any case, we need to think about it in
>> advance next time!
>>
>> Notes are still in the same document:
>> https://docs.google.com/document/d/1tGbTiYORHiSPgVMXawiweGJlBw5dOkVJLY-licoBmBU/edit#
>>
>> 2018-09-27 19:37 GMT+02:00 Marc Garcia <garcia.marc at gmail.com>:
>>
>>> Hey Wes,
>>>
>>> we're moving here: anaconda.webex.com/join/taugspurger
>>>
>>> On Thu, Sep 27, 2018 at 6:29 PM Wes McKinney <wesmckinn at gmail.com>
>>> wrote:
>>>
>>>> I came late to the call, but it was full. Can someone with a company
>>>> that uses Google Meet host the next hangout? I think those are not
>>>> size limited
>>>> On Tue, Sep 25, 2018 at 2:38 AM Joris Van den Bossche
>>>> <jorisvandenbossche at gmail.com> wrote:
>>>> >
>>>> > Correction to my previous mail: the title was correct about 17:00
>>>> > UTC, but of course this corresponds then to 10:00 Pacific / 13:00
>>>> > Eastern / 18:00 UTC+1 / 19:00 CEST (Europe).
>>>> >
>>>> > Joris
>>>> >
>>>> >
>>>> > 2018-09-25 0:27 GMT+02:00 Joris Van den Bossche <
>>>> jorisvandenbossche at gmail.com>:
>>>> >>
>>>> >> Hi all,
>>>> >>
>>>> >> We're having a dev chat coming Thursday (September 27) at 9:00
>>>> >> Eastern / 14:00 UTC+1 / 15:00 CEST (Europe).
>>>> >> All are welcome to attend.
>>>> >>
>>>> >> Hangout:
>>>> >> https://hangouts.google.com/hangouts/_/calendar/am9yaXN2YW5kZW5ib3NzY2hlQGdtYWlsLmNvbQ.4mvdhb4jukib8ei4nsi4vn04do?ijlm=1537828013406&authuser=0
>>>> >>
>>>> >> Calendar invite:
>>>> >> https://calendar.google.com/event?action=TEMPLATE&tmeid=NG12ZGhiNGp1a2liOGVpNG5zaTR2bjA0ZG8gam9yaXN2YW5kZW5ib3NzY2hlQG0&tmsrc=jorisvandenbossche%40gmail.com
>>>> >>
>>>> >> Agenda/Minutes:
>>>> >> https://docs.google.com/document/d/1tGbTiYORHiSPgVMXawiweGJlBw5dOkVJLY-licoBmBU/edit?usp=sharing
>>>> >>
>>>> >> Joris
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >
>>>> > _______________________________________________
>>>> > Pandas-dev mailing list
>>>> > Pandas-dev at python.org
>>>> > https://mail.python.org/mailman/listinfo/pandas-dev