[Pandas-dev] Datetime (with timezone?) as extension array?

Pietro Battiston me at pietrobattiston.it
Tue Aug 14 12:21:29 EDT 2018


On Tue, 14/08/2018 at 08:32 -0700, Brock Mendel wrote:
> `DatetimeArray` is close to ready if you want to bring it over the
> finish line.  Pretty much all that has to be done is having
> `DatetimeArrayMixin` subclass `ExtensionArray` (and, uh, implement
> the relevant EA methods).  If no one else picks this up, my current
> plan is to do this _after_ updating all of the relevant arithmetic
> tests to test DatetimeArrayMixin.
> 
> > The unclear part is what `Series[datetime_with_tz].values` should
> > be.
> 
> I thought the conclusion was that `.values` should be non-lossy, in
> which case it would have to be the EA.  My preference would be for
> the EA to be returned for non-tz datetime64[ns] Series too.

Thanks for the clarifying comments.

I just wanted to stress that my concern is not just about the
(problematic) issue of whether ``.values`` should drop the tz, but
first and foremost that

pd.Series([pd.Timestamp('2018-10-10', tz='utc')])._values

returns a (Datetime)Index.
I think it is uncontroversial that this is wrong (right?), and
decoupling the datetime storage from the index interface should not per
se be a source of compatibility problems (and is, as far as I
understand, a required step towards using DatetimeArray - and removing
some hacks in the codebase).
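
For reference, a minimal sketch of how the behaviour above can be inspected. It assumes only that pandas is installed; ``_values`` is a private attribute, so which class gets printed depends on the pandas version in use (a (Datetime)Index on the version discussed in this thread, possibly an array class on others). The variable names are purely illustrative.

```python
import pandas as pd

# Same one-element, tz-aware Series as in the snippet above.
s = pd.Series([pd.Timestamp('2018-10-10', tz='utc')])

# Private accessor: the object pandas uses internally for this Series.
# Its class is version-dependent, so it is printed rather than assumed.
internal = s._values
print(type(internal).__name__)

# True means the Series storage is still coupled to the Index interface.
print(isinstance(internal, pd.DatetimeIndex))

# Public accessor, for comparison: the class and dtype show whether
# the timezone information survived.
public = s.values
print(type(public).__name__, public.dtype)
```

If the second line printed is ``True``, the datetime storage is still a DatetimeIndex, which is the coupling described above.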

... but maybe there is no issue for this just because it is a natural
part of the migration to DatetimeArray?

Pietro

