[New-bugs-announce] [issue19475] Inconsistency between datetime's str()/isoformat() and its strptime() method

Fri Nov 1 18:05:21 CET 2013

New submission from Skip Montanaro:

I have a CSV file. Here are a few rows:

"2013-10-30 14:26:46.000528","1.36097023829"
"2013-10-30 14:26:46.999755","1.36097023829"
"2013-10-30 14:26:47.999308","1.36097023829"
"2013-10-30 14:26:49.002472","1.36097023829"
"2013-10-30 14:26:50","1.36097023829"
"2013-10-30 14:26:51.000549","1.36097023829"
"2013-10-30 14:26:51.999315","1.36097023829"
"2013-10-30 14:26:52.999703","1.36097023829"
"2013-10-30 14:26:53.999640","1.36097023829"
"2013-10-30 14:26:54.999139","1.36097023829"

I want to parse the strings in the first column as timestamps. I can, and often do, use dateutil.parser.parse(), but in situations like this where all the timestamps are of the same format, it can be incredibly slow. OTOH, there is no single format I can pass to datetime.datetime.strptime() that will parse all the above timestamps. Using "%Y-%m-%d %H:%M:%S" I get errors about the leftover microseconds. Using "%Y-%m-%d %H:%M:%S.%f" I get errors when I try to parse a timestamp which doesn't have microseconds.
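
For illustration, a minimal sketch of a reader-side workaround that tries both formats in turn. The helper name parse_timestamp and the FORMATS tuple are hypothetical, not part of datetime; only datetime.strptime() itself is a real API here.

```python
from datetime import datetime

# Candidate formats, most specific first. Both names below are
# hypothetical helpers for this sketch, not part of the datetime module.
FORMATS = ("%Y-%m-%d %H:%M:%S.%f", "%Y-%m-%d %H:%M:%S")


def parse_timestamp(text):
    """Parse a str(datetime) value with or without microseconds."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            # This format did not match; try the next one.
            continue
    raise ValueError("unrecognized timestamp: %r" % (text,))
```

This handles every row shown above, but at the cost of a raised and swallowed exception for each timestamp that lacks microseconds.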

Alas, it is datetime itself which is to blame for this problem. The above timestamps were all printed from an earlier Python program which just dumps the str() of a datetime object to its output CSV file. Consider:

>>> import dateutil.parser
>>> dt = dateutil.parser.parse("2013-10-30 14:26:50")
>>> print dt
2013-10-30 14:26:50
>>> dt2 = dateutil.parser.parse("2013-10-30 14:26:51.000549")
>>> print dt2
2013-10-30 14:26:51.000549

The same holds for isoformat():

>>> print dt.isoformat()
2013-10-30T14:26:50
>>> print dt2.isoformat()
2013-10-30T14:26:51.000549
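
The round-trip failure can be reproduced without dateutil. The following is only a sketch; round_trips is a hypothetical helper name, and FMT is the microsecond-bearing format discussed above.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

whole = datetime(2013, 10, 30, 14, 26, 50)      # microsecond == 0
frac = datetime(2013, 10, 30, 14, 26, 51, 549)  # microsecond == 549


def round_trips(dt, fmt=FMT):
    """Return True if strptime(str(dt), fmt) reproduces dt."""
    try:
        return datetime.strptime(str(dt), fmt) == dt
    except ValueError:
        # str(dt) omitted the fractional part, so fmt does not match.
        return False
```

Under these assumptions, round_trips(frac) succeeds while round_trips(whole) does not, because str() drops the fractional part when the microsecond field is zero.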

Whatever happened to "be strict in what you send, but generous in what you receive"? If strptime() is going to complain the way it does, then str() should always generate a full timestamp, including microseconds. The above is from a Python 2.7 session, but I also confirmed that Python 3.3 behaves the same.

I've checked 2.7 and 3.3 in the Versions list, but I don't think it can be fixed there. Can the __str__ and isoformat methods of datetime (and time) objects be modified for 3.4 to always include the microseconds? Alternatively, can the %S format character be modified to consume optional decimal point and microseconds? I rate this as "easy" considering the easiest fix is to modify __str__ and isoformat, which seems unchallenging.
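
Until something along those lines changes, a writer-side workaround is to avoid str() and format explicitly, since strftime() with %f always emits six digits. This is a sketch only; format_timestamp and OUTPUT_FORMAT are hypothetical names chosen for the example.

```python
from datetime import datetime

# Hypothetical constant for this sketch: a format that always
# includes the microsecond field.
OUTPUT_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def format_timestamp(dt):
    """Format dt with microseconds, even when they are zero."""
    return dt.strftime(OUTPUT_FORMAT)


def parse_timestamp(text):
    """Parse a string produced by format_timestamp()."""
    return datetime.strptime(text, OUTPUT_FORMAT)
```

With output written this way, a single strptime() format is enough to read every row back.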

----------
components: Extension Modules
keywords: easy
messages: 201917
nosy: skip.montanaro
priority: normal
severity: normal
status: open
title: Inconsistency between datetime's str()/isoformat() and its strptime() method
type: behavior
versions: Python 2.7, Python 3.3, Python 3.4

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue19475>
_______________________________________