[issue35723] Add "time zone index" cache to datetime objects

STINNER Victor report at bugs.python.org
Mon Jan 14 07:57:46 EST 2019


STINNER Victor <vstinner at redhat.com> added the comment:

I dislike adding a public API for an optimization. Would it be possible to make it private instead? Would that make sense? tzidx => _tzidx.

> One other thing I might mention here is that I did explore the idea of storing this cache on the tzinfo implementation itself, but it is problematic for a number of reasons:
>
> 1. It would either need to use some sort of expiring cache (lru, ttl) or require a great deal of memory, greatly reducing the utility - the proposed implementation requires no additional memory.

In the tests of your PR, the tzinfo allocates memory for its cache:

            offsets = [timedelta(hours=0), timedelta(hours=1)]
            names = ['+00:00', '+01:00']
            dsts = [timedelta(hours=0), timedelta(hours=1)]

This memory isn't free. I don't see how using an index completely prevents the need to allocate memory for a cache.
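
For what it's worth, here is a minimal sketch of the pattern I understand from that test (the IndexedTZ class, its _index_for() helper and the April-October rule are all made up, not the proposed patch): the tzinfo keeps per-transition data in parallel lists, so the only per-datetime state worth caching is a small integer index into those lists.

    from datetime import datetime, timedelta, tzinfo

    # Parallel lists, as in the quoted test.
    _OFFSETS = [timedelta(hours=0), timedelta(hours=1)]
    _NAMES = ['+00:00', '+01:00']
    _DSTS = [timedelta(hours=0), timedelta(hours=1)]

    class IndexedTZ(tzinfo):
        def _index_for(self, dt):
            # Made-up rule standing in for a real transition lookup:
            # pretend the DST entry (index 1) applies from April to October.
            return 1 if 4 <= dt.month <= 10 else 0

        def utcoffset(self, dt):
            return _OFFSETS[self._index_for(dt)]

        def tzname(self, dt):
            return _NAMES[self._index_for(dt)]

        def dst(self, dt):
            return _DSTS[self._index_for(dt)]

    dt = datetime(2019, 1, 14, tzinfo=IndexedTZ())
    print(dt.utcoffset(), dt.tzname(), dt.dst())   # 0:00:00 +00:00 0:00:00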

Somehow, we need a way to clear the cache and to decide on a caching policy. The simplest policy is to have no limit. re.compile() uses a cache of 512 entries; functools.lru_cache uses a default limit of 128 entries.

Instead of adding a new API, would it be possible to reuse functools.lru_cache somehow?
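
For example, a minimal sketch of what I have in mind (CachedOffsetTZ, _offset_for() and its offset rule are made up, not an existing API): keying the cache on a plain tuple of integers avoids hashing the aware datetime itself, which is exactly the problem you describe below.

    from datetime import datetime, timedelta, tzinfo
    from functools import lru_cache

    @lru_cache(maxsize=128)
    def _offset_for(year, month, day, hour, minute):
        # Made-up rule standing in for a real (expensive) transition lookup.
        return timedelta(hours=1) if 4 <= month <= 10 else timedelta(hours=0)

    class CachedOffsetTZ(tzinfo):
        def utcoffset(self, dt):
            # Key on plain integers: hashing an aware datetime would call
            # utcoffset() again.
            return _offset_for(dt.year, dt.month, dt.day, dt.hour, dt.minute)

        def dst(self, dt):
            return timedelta(0)

        def tzname(self, dt):
            return 'CACHED'

    dt = datetime(2019, 1, 14, 7, 57, tzinfo=CachedOffsetTZ())
    print(dt.utcoffset())               # 0:00:00
    print(_offset_for.cache_info())     # hits=0, misses=1, maxsize=128, ...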

> 2. Because the implementation of datetime.__hash__ invokes utcoffset(), it is impossible to implement utcoffset in terms of a dictionary of tz-aware datetimes. This means that you need to construct a new, naive datetime, which is a fairly slow operation and really puts a damper in the utility of the cache.

For special local timezones, would it be possible to explicitly exclude them, and restrict the cache to simple timezones (fixed offset)?
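
For example (is_fixed_offset() is only an illustration, not an existing helper), the cache could be limited to datetime.timezone instances, which are documented to represent fixed offsets from UTC:

    from datetime import timedelta, timezone, tzinfo

    def is_fixed_offset(tz):
        # datetime.timezone instances are fixed offsets from UTC, so their
        # utcoffset() does not depend on the datetime argument.
        return isinstance(tz, timezone)

    class SomeLocalZone(tzinfo):        # stand-in for a zone with DST rules
        def utcoffset(self, dt):
            return timedelta(hours=1)

    print(is_fixed_offset(timezone.utc))                  # True
    print(is_fixed_offset(timezone(timedelta(hours=1))))  # True
    print(is_fixed_offset(SomeLocalZone()))               # False -> excluded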

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue35723>
_______________________________________

