[Pandas-dev] Proposal to change the default of to_datetime in case of errors from 'ignore' to 'raise'

John E eiler13 at gmail.com
Thu Jul 23 11:22:28 EDT 2015


Well, only yesterday I was complaining on GitHub about the silent default 
of read_csv converting 'NA' to NaN.   ;-)  So I have to agree with Lorenzo 
that this is a good change.  It also seems more consistent with pandas' 
overall behavior.

FWIW, Stata's default with these sorts of operations is always to tell you 
how many values were changed, which is often very helpful.  E.g. if Stata 
tells you zero values were changed, this is a big clue you screwed up.  
Often this is more verbose than desired, but it's also easy to change that.

So, I'm definitely fine with just making it an error, but a possible middle 
ground would be a short report like:  "20 values changed, 5 values not 
changed".
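
To make that middle ground concrete, here is a rough sketch of what such a 
report could look like.  The helper name to_datetime_with_report is made up 
for illustration (it is not a pandas function), and it assumes 
errors='coerce' is available to turn unparseable values into NaT:

```python
import pandas as pd


def to_datetime_with_report(values):
    """Hypothetical helper: parse dates and report how many values parsed.

    Not part of pandas; only a sketch of the Stata-style report.
    """
    original = pd.Series(values)
    # errors='coerce' turns unparseable values into NaT instead of raising
    parsed = pd.to_datetime(original, errors="coerce")
    # A value counts as "not changed" if it was present but failed to parse
    n_failed = int((parsed.isna() & original.notna()).sum())
    n_changed = int(parsed.notna().sum())
    print(f"{n_changed} values changed, {n_failed} values not changed")
    return parsed, n_changed, n_failed


parsed, n_changed, n_failed = to_datetime_with_report(
    ["2014-01-30", "2014-30-30", "2015-07-23"]
)
```

A report of zero values changed would be the clue that the input or the 
format is not what you expected.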


On Thursday, July 23, 2015 at 9:51:11 AM UTC-4, Lorenzo De Leo wrote:
>
> Personally I'm very much in favor of this change. I don't like silent 
> defaults ;)
>
> L
>
>
> On Wednesday, July 22, 2015 at 4:59:31 PM UTC+2, Joris Van den Bossche 
> wrote:
>>
>> Hi all,
>>
>> On github there is a proposal to change the default behaviour of 
>> to_datetime in case of a parsing error from 'ignore' (leaving the values 
>> untouched) to 'raise' (raise an error).
>>
>> As a small example, the current behaviour:
>>
>> In [5]: pd.to_datetime('2014-30-30', errors='ignore')   # the default now
>> Out[5]: '2014-30-30'
>>
>> In [6]: pd.to_datetime('2014-30-30', errors='raise')
>> ...
>> ValueError: month must be in 1..12
>>
>>
>> So the proposal would be to change the default to the second case, 
>> raising an error.
>>
>> Note that raising is already the default behaviour when you provide your 
>> own format (in which case the value of the errors keyword is in fact ignored):
>>
>> In [7]: pd.to_datetime('2014-30-30', format='%Y-%m-%d')
>> ...
>> ValueError: time data '2014-30-30' does not match format '%Y-%m-%d'
>>
>>
>> *Are there any objections to this change?*
>> *Are there people relying on the fact that, by default, to_datetime 
>> returns the exact original value if parsing does not succeed?*
>>
>> Best regards,
>> Joris
>>
>