Improving datetime

Nicholas F. Fabry nick.fabry at coredump.us
Wed Mar 19 21:11:14 EDT 2008


			
On Mar 19, 2008, at 18:32, Steven D'Aprano wrote:

> On Wed, 19 Mar 2008 17:40:39 -0400, Nicholas F. Fabry wrote:
>
>> To summarize my proposal VERY briefly:
>>
>>
>> - Make aware datetime objects display in local time, but calculate/
>> compare in UTC.
>
> Your proposal is ambiguous. What does that mean? Can you give an  
> example?
>
>

That's why I said it was the VERY brief version - here is a longer  
version.

I would like a datetime to display local times, since if I create it  
with a specific tzinfo timezone, that implies I'm interested in what  
clocks in that time zone say.

For example:

 >>> from datetime import datetime
 >>> from dateutil.tz import gettz
 >>> NYC = gettz('America/New_York')
 >>> UTC = gettz('UTC')
 >>> wake_up_time = datetime(2008,3,8,15,30,0,tzinfo=NYC)
 >>> print wake_up_time
2008-03-08 15:30:00-05:00
 >>> print wake_up_time.hour
15

This is the behavior I want - if I create a datetime in a specific  
timezone, I wish to know the member information (date, hour, second,  
etc.) in that timezone.

However, when I calculate with it, I can get wrong answers.

 >>> a_later_time = datetime(2008,3,9,15,30,0,tzinfo=NYC)
 >>> print a_later_time - wake_up_time
1 day, 0:00:00

This is incorrect, because a_later_time is in daylight saving time,
but the earlier time is in standard time - the gap is actually 23
hours.

 >>> print a_later_time.astimezone(UTC) - wake_up_time.astimezone(UTC)
23:00:00

The reason datetime performs this calculation incorrectly is  
documented - if the tzinfo members are the same object, it ignores  
them and makes the calculation as though the two datetimes were  
naive.  If the tzinfo objects are different, even if they correspond  
to the exact same ruleset, then datetime does the job correctly.

 >>> import copy
 >>> ALT_NYC = copy.deepcopy(NYC)
 >>> the_same_later_time = datetime(2008,3,9,15,30,0,tzinfo=ALT_NYC)
 >>> print the_same_later_time - wake_up_time
23:00:00

The result of the calculation should not depend on whether the
tzinfo objects are the same object, but on what their .utcoffset()
methods return.  Or, to summarize, ALL calculations with aware
datetime objects should first convert the datetime objects to UTC so
the calculations are done correctly.
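
A minimal sketch of that rule in present-day Python 3, for anyone who
wants to try it without dateutil.  It assumes the standard-library
zoneinfo module (Python 3.9+) and a system time zone database;
utc_difference is a hypothetical helper name, not part of datetime.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # standard library in Python 3.9+

NYC = ZoneInfo("America/New_York")


def utc_difference(later, earlier):
    """Subtract two aware datetimes after normalizing both to UTC."""
    return later.astimezone(timezone.utc) - earlier.astimezone(timezone.utc)


wake_up_time = datetime(2008, 3, 8, 15, 30, tzinfo=NYC)
a_later_time = datetime(2008, 3, 9, 15, 30, tzinfo=NYC)

# Same tzinfo object on both sides: plain subtraction ignores the offsets.
wall_clock_gap = a_later_time - wake_up_time

# Normalized to UTC first: the hour skipped by daylight saving is counted.
elapsed_gap = utc_difference(a_later_time, wake_up_time)

print(wall_clock_gap)  # 1 day, 0:00:00
print(elapsed_gap)     # 23:00:00
```

The helper gives the same answer whether or not the two datetimes
share a tzinfo object, which is the behavior argued for above.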

Creating every datetime in UTC instead, aside from making for ugly
code, forces a conversion back to local time, with more code, any
time you want to display date and time information in the local time
zone.  This is not a good, Pythonic solution.


>
>
>> - Raise exceptions when an illegal or ambiguous datetime is  
>> instantiated.
>
> You mean like they already do?
>
>>>> datetime.datetime(2008, 03, 35)  # 35th of March
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
> ValueError: day is out of range for month
>
>

However, the following does NOT raise an exception, and it should.

 >>> a_time = datetime(2008, 3, 9, 2, 30, 0, tzinfo=NYC)
 >>> print a_time.hour, a_time.minute
2 30

The problem is that there IS no local wallclock time of 0230 on March
9, 2008 in New York.  At 0200, local clocks jump to 0300 and proceed
from there.  A local wallclock time of 0230 on March 9, 2008 in New
York is meaningless - it should not occur.  Therefore, if a programmer
is (for example) parsing local times (say, out of a schedule book),
and an illegal time like this appears, no exception is raised
immediately to alert him/her to the faulty source data.  Instead, the
program *assumes* a UTC offset for the illegal local time and sends it
merrily on its way... making it much harder, when errors crop up
later, to figure out their source.
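
One way such a check could look, sketched in present-day Python 3.
It assumes the standard-library zoneinfo module (Python 3.9+) and a
time zone database in which New York's clocks spring forward from
02:00 to 03:00 on March 9, 2008.  local_time_exists and
checked_datetime are hypothetical helper names, not part of datetime
or dateutil.  The idea is to round-trip the wall-clock time through
UTC: a time that falls in the skipped hour does not come back
unchanged.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library in Python 3.9+

NYC = ZoneInfo("America/New_York")


def local_time_exists(naive, tz):
    """Return True if the naive wall-clock time actually occurs in tz."""
    aware = naive.replace(tzinfo=tz)
    round_trip = aware.astimezone(timezone.utc).astimezone(tz)
    # A skipped wall-clock time maps to a different wall-clock time.
    return round_trip.replace(tzinfo=None) == naive


def checked_datetime(naive, tz):
    """Attach tz to naive, raising ValueError for nonexistent times."""
    if not local_time_exists(naive, tz):
        raise ValueError("%s does not occur in %s" % (naive, tz))
    return naive.replace(tzinfo=tz)


print(local_time_exists(datetime(2008, 3, 9, 2, 30), NYC))  # False
print(local_time_exists(datetime(2008, 3, 9, 3, 30), NYC))  # True
```

With this in place, a parser could call checked_datetime on every
value it reads and fail at the point where the bad data enters.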



>
>
> As for ambiguous, how can datetime arguments be ambiguous?
>
>    Help on class datetime in module datetime:
>
>    class datetime(date)
>     |  datetime(year, month, day[, hour[, minute[, second[,
>                 microsecond[,tzinfo]]]]])
>
>
> What possible ambiguity is there?
>

Continuing on from before:

 >>> b_time = datetime(2008, 11, 2, 1, 30, 0, tzinfo=NYC)
 >>> print b_time
2008-11-02 01:30:00-05:00

There is a little problem with this, though.  At 2008 Nov 2 at 0200  
Eastern Daylight Time, the clocks swing back one hour, to 0100, and  
essentially repeat the hour between 0100 and 0200 again.  So - is  
b_time 0130 Eastern STANDARD Time, or is it 0130 Eastern DAYLIGHT  
Time?  Simply providing a tzinfo class does not resolve this  
ambiguity, so it is again *assumed* that you 'meant' standard time.   
That may or may not be true, depending on the source data and other  
external factors.
A good datetime library (or tzinfo library) should alert you to this
fact.  Isn't Explicit better than Implicit?
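
A sketch of how the ambiguity can be detected and stated explicitly in
present-day Python 3.  It assumes the fold attribute added by PEP 495
(Python 3.6+) and the standard-library zoneinfo module (Python 3.9+);
is_ambiguous is a hypothetical helper name.  Note that these newer
versions default to fold=0, the first (daylight) pass through the
repeated hour, rather than the standard-time reading described above.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library in Python 3.9+

NYC = ZoneInfo("America/New_York")


def is_ambiguous(naive, tz):
    """Return True if the naive wall-clock time occurs twice in tz."""
    first = naive.replace(tzinfo=tz, fold=0)
    second = naive.replace(tzinfo=tz, fold=1)
    # In a repeated hour the first pass has the larger UTC offset
    # (daylight time) and the second pass the smaller one (standard).
    return first.utcoffset() > second.utcoffset()


b_naive = datetime(2008, 11, 2, 1, 30)
print(is_ambiguous(b_naive, NYC))                        # True
print(is_ambiguous(datetime(2008, 11, 2, 3, 30), NYC))   # False

# Both readings of the same wall-clock time, chosen explicitly.
as_edt = b_naive.replace(tzinfo=NYC, fold=0)
as_est = b_naive.replace(tzinfo=NYC, fold=1)
print(as_edt.astimezone(timezone.utc))  # 2008-11-02 05:30:00+00:00
print(as_est.astimezone(timezone.utc))  # 2008-11-02 06:30:00+00:00
```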

A further problem is that it is now impossible to specify, using
tzinfo=NYC, a time corresponding to 2008 Nov 2, 0530 UTC, or in other
words, 2008 Nov 2 0130 EDT.  If you try it directly, as above, you get
2008 Nov 2 0130 EST, which is 0630 UTC.  If you create the time 2008
Nov 2 0530 UTC, and then use .astimezone(NYC), you will get the same
result... 2008 Nov 2 0130 EST, which is not the EDT time we wanted,
nor is it equal to 0530 UTC.

 >>> c_time = datetime(2008,11,2,5,30,0,tzinfo=UTC)
 >>> print c_time.astimezone(NYC)
2008-11-02 01:30:00-05:00
 >>> print c_time.astimezone(NYC).astimezone(UTC) == c_time
False

These problems all stem from the fact that local times are used for  
internal storage, when in certain circumstances they do not uniquely  
specify a real time.

This could be fixed by making each datetime object really have TWO
datetime objects within it - one that corresponds to UTC (so it is
unambiguous what real time it refers to, and so calculations can
speedily access UTC time), and one that corresponds to local wallclock
time (so it is known what the local wallclocks are displaying, which
allows efficient member extraction and does not break existing code).
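
To make the idea concrete, here is a rough sketch in present-day
Python 3.  It is only an illustration under stated assumptions, not a
proposed implementation: DualDatetime and from_utc are hypothetical
names, and the standard-library zoneinfo module (Python 3.9+) stands
in for the tzinfo provider.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library in Python 3.9+


@dataclass(frozen=True)
class DualDatetime:
    utc: datetime    # unambiguous instant, used for arithmetic/comparison
    local: datetime  # wall-clock view, used for display and member access

    @classmethod
    def from_utc(cls, utc, tz):
        """Build both views from an aware datetime and a target zone."""
        utc = utc.astimezone(timezone.utc)
        return cls(utc=utc, local=utc.astimezone(tz))

    def __sub__(self, other):
        # Calculations always go through the UTC member.
        return self.utc - other.utc

    def __lt__(self, other):
        return self.utc < other.utc

    @property
    def hour(self):
        # Member extraction reads the local wall-clock member.
        return self.local.hour

    def __str__(self):
        return self.local.isoformat(sep=" ")


NYC = ZoneInfo("America/New_York")
c_time = DualDatetime.from_utc(datetime(2008, 11, 2, 5, 30, tzinfo=timezone.utc), NYC)
d_time = DualDatetime.from_utc(datetime(2008, 11, 2, 6, 30, tzinfo=timezone.utc), NYC)

print(c_time)           # 2008-11-02 01:30:00-04:00
print(d_time)           # 2008-11-02 01:30:00-05:00
print(d_time - c_time)  # 1:00:00
```

Both objects show 0130 on the wall clock, yet they remain distinct
instants that compare and subtract correctly.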


This is the moderately short and casual version of it.

Nick





P.S. The code in full form, to follow along better:

from datetime import datetime
from dateutil.tz import gettz
NYC = gettz('America/New_York')
UTC = gettz('UTC')
wake_up_time = datetime(2008,3,8,15,30,0,tzinfo=NYC)
print wake_up_time
#2008-03-08 15:30:00-05:00
print wake_up_time.hour
#15
a_later_time = datetime(2008,3,9,15,30,0,tzinfo=NYC)
print a_later_time - wake_up_time
#1 day, 0:00:00
print a_later_time.astimezone(UTC) - wake_up_time.astimezone(UTC)
#23:00:00
import copy
ALT_NYC = copy.deepcopy(NYC)
the_same_later_time = datetime(2008,3,9,15,30,0,tzinfo=ALT_NYC)
print the_same_later_time - wake_up_time
#23:00:00
a_time = datetime(2008, 3, 9, 2, 30, 0, tzinfo=NYC)
print a_time.hour, a_time.minute
#2 30
b_time = datetime(2008, 11, 2, 1, 30, 0, tzinfo=NYC)
print b_time
#2008-11-02 01:30:00-05:00
c_time = datetime(2008,11,2,5,30,0,tzinfo=UTC)
print c_time.astimezone(NYC)
#2008-11-02 01:30:00-05:00
print c_time.astimezone(NYC).astimezone(UTC) == c_time
#False













