[Python-Dev] iso8601 parsing

Wes Turner wes.turner at gmail.com
Thu Oct 26 16:26:03 EDT 2017


On Thursday, October 26, 2017, Chris Barker <chris.barker at noaa.gov> wrote:

> On Wed, Oct 25, 2017 at 7:37 PM, Wes Turner <wes.turner at gmail.com> wrote:
>
>>> ISO 8601 supports offsets, but not time zones -- presumably the __str__
>>> supports the full datetime tzinfo somehow. Which may be why .isoformat()
>>> exists.
>>>
>>
>> ISO8601 does support timezones.
>> https://en.wikipedia.org/wiki/ISO_8601#Time_zone_designators
>>
>
> No, it doesn't -- it may call them "timezones", but it only supports
> offsets -- that is, an offset of -5 could be US Eastern Standard Time or
> US Central Daylight Time (or I got that backwards :-)  )
>
> The point is that an offset is really easy, and timezones (with Daylight
> savings and all that) are a frickin' nightmare, but ARE supported by
> datetime
>
> Note that the vocabulary is not precise here, as I see this in the Python
> docs:
>
> *class* datetime.timezone
>
> A class that implements the tzinfo
> <https://docs.python.org/3/library/datetime.html#datetime.tzinfo> abstract
> base class as a fixed offset from the UTC.
>
> So THAT is supported by iso8601 and, indeed, maps naturally to it.
>

Got it, thanks.
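
To make the offset-versus-zone distinction concrete, here is a minimal
sketch using only datetime.timezone. The labels "EST" and "CDT" are just
names I picked for illustration; both are attached to the same fixed
-05:00 offset, and the ISO 8601 output cannot tell them apart:

```python
from datetime import datetime, timedelta, timezone

# Two fixed-offset tzinfo objects: same offset, different (illustrative) labels.
est = timezone(timedelta(hours=-5), "EST")
cdt = timezone(timedelta(hours=-5), "CDT")

a = datetime(2017, 10, 26, 16, 26, 3, tzinfo=est)
b = datetime(2017, 10, 26, 16, 26, 3, tzinfo=cdt)

print(a.tzname(), b.tzname())   # EST CDT
print(a.isoformat())            # 2017-10-26T16:26:03-05:00
print(b.isoformat())            # 2017-10-26T16:26:03-05:00
```

The label survives on the Python object, but only the offset makes it
into the string.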


>
> Which means we can round trip iso8601 datetimes nicely, but can't round
> trip a datetime with a "full featured" tzinfo attached.
>

Because an iso8601 string only persists the offset.
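
A rough sketch of what gets lost, as I understand it. ToyCentral is a
made-up, deliberately crude tzinfo (summer is "April through October"),
not a real zone definition, and I parse with strptime only because that
is what exists today:

```python
from datetime import datetime, timedelta, tzinfo

class ToyCentral(tzinfo):
    """Hypothetical zone: UTC-5 from April through October, UTC-6 otherwise."""

    def _is_summer(self, dt):
        return dt is not None and 4 <= dt.month <= 10

    def utcoffset(self, dt):
        return timedelta(hours=-5 if self._is_summer(dt) else -6)

    def dst(self, dt):
        return timedelta(hours=1 if self._is_summer(dt) else 0)

    def tzname(self, dt):
        return "CDT" if self._is_summer(dt) else "CST"

summer = datetime(2017, 7, 1, 12, 0, tzinfo=ToyCentral())
text = summer.isoformat()            # '2017-07-01T12:00:00-05:00'

# Older strptime wants "-0500" for %z, so drop the colon from "-05:00".
parsed = datetime.strptime(text[:-3] + text[-2:], "%Y-%m-%dT%H:%M:%S%z")

# Same instant after the round trip ...
print(parsed == summer)                                   # True
# ... but the DST rule is gone: move both to December and they disagree.
print(summer.replace(month=12).utcoffset())   # -1 day, 18:00:00 (UTC-6)
print(parsed.replace(month=12).utcoffset())   # -1 day, 19:00:00 (still UTC-5)
```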


>
> I don't think this really has any impact on the proposal, though: it's
> clear what to do when parsing an ISO datetime.
>
>> I might be wrong, but I think many of the third party libraries listed
>> here default to either UTC or timezone-naive datetimes:
>> https://github.com/vinta/awesome-python/blob/master/README.md#date-and-time
>>
>
> This is a key point that I hope is obvious:
>
>

> If an ISO string has NO offset or timezone indicator, then a naive
> datetime should be created.
>


>
> (I say, I "hope" it's obvious, because the numpy datetime64 implementation
> initially (and for years) would apply the machine local timezone to a bare
> iso string -- which was a f-ing nightmare!)
>

astropy.time.Time supports numpy.
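
On the naive-by-default rule quoted above, here is a sketch of the
behavior I understand is being asked for. parse_iso is a hypothetical
helper written for this mail, not a stdlib API, and it only handles
YYYY-MM-DDTHH:MM:SS with an optional "Z" or +HH:MM / -HH:MM suffix:

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical, deliberately narrow pattern; real ISO 8601 allows much more.
_ISO = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})"
    r"(Z|[+-]\d{2}:\d{2})?"
)

def parse_iso(text):
    match = _ISO.fullmatch(text)
    if match is None:
        raise ValueError("not a supported ISO 8601 datetime: %r" % (text,))
    fields = [int(part) for part in match.groups()[:6]]
    suffix = match.group(7)
    if suffix is None:
        tz = None                    # no designator: stay naive, never guess
    elif suffix == "Z":
        tz = timezone.utc
    else:
        sign = -1 if suffix[0] == "-" else 1
        hours, minutes = int(suffix[1:3]), int(suffix[4:6])
        tz = timezone(sign * timedelta(hours=hours, minutes=minutes))
    return datetime(*fields, tzinfo=tz)

print(parse_iso("2017-10-26T16:26:03").tzinfo)        # None
print(parse_iso("2017-10-26T16:26:03-04:00").tzinfo)  # UTC-04:00
```

The point is the first branch: a bare string never picks up the machine's
local timezone. An empty string raises ValueError in this sketch.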


>
>
>> Ctrl-F for 'tzinfo=' in the docs really doesn't explain how to just do it
>> with my local time.
>>
>> Here's an example with a *custom* GMT1 tzinfo subclass:
>> https://docs.python.org/3/library/datetime.html#datetime.time.tzname
>>
>
> Here it is:
>
> from datetime import timedelta, tzinfo
>
> class GMT1(tzinfo):
>     def utcoffset(self, dt):
>         return timedelta(hours=1)
>     def dst(self, dt):
>         return timedelta(0)
>     def tzname(self, dt):
>         return "Europe/Prague"
>
> I hope Prague doesn't do DST, or that would be just wrong ...
>

Pendulum seems to have a faster timezone lookup than pytz:

https://pendulum.eustace.io/blog/a-faster-alternative-to-pyz.html

Both pendulum and pytz are in conda-forge (the new basis for the anaconda
distribution).
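
Coming back to the GMT1 example quoted above: Prague does observe summer
time (CEST, UTC+2), so the docs' class is only right for half the year.
A quick check, reusing the class as quoted and picking a July date
myself for illustration:

```python
from datetime import datetime, timedelta, tzinfo

class GMT1(tzinfo):
    """Fixed UTC+1 tzinfo, copied from the datetime docs example."""

    def utcoffset(self, dt):
        return timedelta(hours=1)

    def dst(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return "Europe/Prague"

july = datetime(2017, 7, 1, 12, 0, tzinfo=GMT1())

# The class always reports +01:00, even for a summer date when Prague
# itself is on UTC+2; a zone database would be needed to get that right.
print(july.isoformat())   # 2017-07-01T12:00:00+01:00
print(july.tzname())      # Europe/Prague
print(july.dst())         # 0:00:00
```

That is the kind of lookup pytz and pendulum exist to handle.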


>
>> What would you call the str argument? Does it accept strptime args or only
>> ISO8601?
>>
>
> I think Fred answered this, but ISO 8601 only. We already have strptime if
> you need to parse anything else.
>
>> Would all of that string parsing logic be a performance regression from
>> the current constructor? Does it accept None or empty string?
>>
>
> I suppose you need to do a little type checking first, so a tiny one.
>
> Though maybe just catching an Exception, so really tiny.
>
> The current constructor only takes numbers, so yes the string parsing
> version would be slower, but only if you use it...
>
>> Deserializing dates from JSON (without #JSONLD and xsd:dateTime (ISO8601))
>> types is nasty, regardless (try/except, *custom* schema awareness). And
>> pickle is dangerous.
>>
>> AFAIU, we should not ever eval(repr(dt: datetime)).
>>
>
> Why not? Isn't that what __repr__ is supposed to do?
>

repr() of a cyclical dict shows {...} for the recursive reference, so I'm
assuming that repr only MAY be eval'able.
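
A small illustration of both halves, on data I constructed myself (so
nothing untrusted is being eval'd here): the datetime repr does round
trip, while the cyclic dict repr evaluates to something else entirely,
because {...} parses as a set containing Ellipsis.

```python
import datetime

dt = datetime.datetime(2017, 10, 26, 16, 26, 3, tzinfo=datetime.timezone.utc)
text = repr(dt)
print(text)
# datetime.datetime(2017, 10, 26, 16, 26, 3, tzinfo=datetime.timezone.utc)

# Works, but only because the name "datetime" is bound to the module here.
print(eval(text) == dt)        # True

cyclic = {}
cyclic["self"] = cyclic
print(repr(cyclic))            # {'self': {...}}

# This evals without error, yet the result is not the original dict.
rebuilt = eval(repr(cyclic))
print(rebuilt)                 # {'self': {Ellipsis}}
```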


>
> Or do you mean not that it shouldn't work, but that we shouldn't do it?
>

That: we shouldn't ever eval untrusted data / code.
(That's why we need package hashes, signatures, and TUF).
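
For the JSON case, this is roughly what I mean by try/except plus custom
schema awareness: parse, don't eval. DATETIME_KEYS and the "created" key
are hypothetical stand-ins for whatever schema knowledge the application
has, and the format string only covers naive second-resolution values:

```python
import json
from datetime import datetime

# Hypothetical schema knowledge: which keys are expected to hold datetimes.
DATETIME_KEYS = {"created"}

def decode_datetimes(obj):
    """object_hook that converts known keys and leaves everything else alone."""
    for key in DATETIME_KEYS & obj.keys():
        try:
            obj[key] = datetime.strptime(obj[key], "%Y-%m-%dT%H:%M:%S")
        except (TypeError, ValueError):
            pass  # not a string, or not in the expected format: keep as-is
    return obj

doc = json.loads(
    '{"created": "2017-10-26T16:26:03", "note": "hi"}',
    object_hook=decode_datetimes,
)
print(doc["created"])   # 2017-10-26 16:26:03
print(doc["note"])      # hi
```

Nothing in the input is ever executed; a bad value just stays a string.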


>
> -CHB
>
>
>
> --
>
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
>
> Chris.Barker at noaa.gov
>