[Pandas-dev] datetimes

Wes McKinney wesmckinn at gmail.com
Fri May 10 01:32:37 CEST 2013


On Tue, Apr 30, 2013 at 1:42 PM, Jeff Reback <jeffreback at gmail.com> wrote:
> Currently we allow ONLY datetime64[ns] as an internal representation (and
> analogously timedelta64[ns] for timedeltas).
>
> There are several issues where things like this are done:
>
> a) Series([np.datetime(2013,1,1),np.datetime(2013,1,2)],dtype='M8[ms]')
>
> b) Series([datetime(2013,1,1),datetime(2013,1,2)],dtype='M8[D]')
>
>
> In a) the np.datetimes are [us] by default, so we need to convert to
> M8[ns]. OK, we can do that to keep the internal rep, but what about the
> dtype specified? Is this effectively an astype? Or is it conceptually
> just a display thing, e.g. the user wants to view the data as [ms]
> rather than [ns]?
>
> several options to think about:
>
> 1) completely ignore the passed dtype and do some conversions on
> np.datetime64 (which we already do) to guarantee
> M8[ns] internally (we do this now, but bork on a passed dtype that is not
> M8[ns] when the data is M8)
> 2) keep the passed dtype (or the inferred dtype) internally, effectively
> making datetimes a suite of M8[ms,D,s,ns......]
> 3) keep data as M8[ns] internally and provide an asfreq which works kind of
> like the PeriodIndex method, which can provide a DatetimeIndex I guess of
> the requested frequency? But then I keep thinking, is there any actual
> difference between
> 20130101 15:00:01.12345 in [ms] or [ns]? (right now, no)
>
> Any thoughts.... I know I am rambling a bit, but I'm confused over what is
> even necessary here...
>
>
> Jeff

Little slow getting back. I'm pretty unhappy with how things turned
out in NumPy-- I guess it's my fault for not speaking up when the work
was being done in 2010 and 2011, but back then no one in the
Scientific Python establishment took pandas very seriously.

My thinking has always been that we should have either:

a) a single timestamp and timedelta data type, and different lower
frequencies (annual, monthly, etc.) can be handled by the period data
type. This is the approach taken by pandas right now.

b) A timestamp with parametric units. This is the approach taken in
NumPy, but with essentially no APIs to help you with that.
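
As a rough illustration of what "parametric units" means in (b), here is a
minimal NumPy-only sketch. The variable names are made up for this example,
and the timestamp is the one from Jeff's question. It suggests that the unit
is more than a display setting: casting to a coarser unit discards the
digits below that unit.

```python
import numpy as np

# The same wall-clock time stored at nanosecond resolution...
stamp_ns = np.datetime64("2013-01-01T15:00:01.12345", "ns")

# ...and cast down to milliseconds. The cast truncates, so the
# trailing ".00045" seconds are lost.
stamp_ms = stamp_ns.astype("datetime64[ms]")

# Comparing across units promotes both sides to the finer unit, so the
# truncated value no longer matches the original.
same_instant = bool(stamp_ms == stamp_ns)

# A value that fits in whole milliseconds survives the trip to [ns] intact.
whole_ms = np.datetime64("2013-01-01T15:00:01.123", "ms")
round_trip_equal = bool(whole_ms == whole_ms.astype("datetime64[ns]"))

print(stamp_ns.dtype, stamp_ms.dtype)
print(same_instant, round_trip_equal)
```

So [ms] and [ns] hold the same instant only when the value has no
sub-millisecond part.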

I'm fine with always yielding datetime64[ns] out of whatever
datetime64 dtype is passed. The NumPy data type system is just an ugly
implementation detail at this point, especially in this area.
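
A minimal sketch of "always yield datetime64[ns]", using plain NumPy only.
The helper name to_ns is hypothetical and is not an existing pandas or NumPy
function; it only illustrates the normalization step, not how pandas
implements it.

```python
import numpy as np

def to_ns(values):
    """Hypothetical helper: coerce any datetime64 array to datetime64[ns]."""
    arr = np.asarray(values)
    if arr.dtype.kind != "M":  # "M" is NumPy's kind code for datetime64
        raise TypeError("expected a datetime64 array, got %s" % arr.dtype)
    # Going from a coarser unit to [ns] keeps the same instants.
    return arr.astype("datetime64[ns]")

# The same two dates, passed in with two different units.
days = np.array(["2013-01-01", "2013-01-02"], dtype="M8[D]")
millis = np.array(["2013-01-01", "2013-01-02"], dtype="M8[ms]")

days_ns = to_ns(days)
millis_ns = to_ns(millis)

print(days_ns.dtype, millis_ns.dtype)
print((days_ns == millis_ns).all())
```

Whatever unit comes in, the caller sees a single dtype, and the passed unit
is treated as an input detail rather than something to preserve.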

- Wes

