memory consumption

Chris Angelico rosuav at gmail.com
Thu Apr 1 08:26:34 EDT 2021


On Thu, Apr 1, 2021 at 10:56 PM Alexey <zen.supagood at gmail.com> wrote:
>
> Found it. As I said before, the problem was lurking in the cache.
> A few days ago I read about circular references and things like that and
> I thought to myself that it might be the case. To build the cache I was
> using lots of 'setdefault' calls chained together:
>
> self.__cache.setdefault(cluster_name, {}).setdefault(database_name, {})...
>
> and instead of writing long lines I decided to split it up to increase
> readability:
>
> cluster = self.__cache.setdefault(cluster_name, {})
> database = cluster.setdefault(database_name, {})
> ...
> and I guess that was the problem.
>
> First thing I did was to rewrite this back to single line.

If the cache is always and only used in this way, it might be cleaner
to use a defaultdict(dict) instead of the setdefault calls. Or, since
this appears to be a two-level cache:

self.__cache = defaultdict(lambda: defaultdict(dict))

and then you can simply reference
self.__cache[cluster_name][database_name] to read or update the cache.
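Here's a minimal sketch of what I mean. The Inventory class, its method
names, and the sample keys are all made up for illustration, and I've used
a single underscore (_cache) to sidestep name mangling; the only thing
taken from your code is the cluster/database nesting.

```python
from collections import defaultdict


class Inventory:
    """Hypothetical owner of a two-level cache: cluster -> database -> dict."""

    def __init__(self):
        # Missing clusters and databases are created on first access.
        self._cache = defaultdict(lambda: defaultdict(dict))

    def remember(self, cluster_name, database_name, key, value):
        self._cache[cluster_name][database_name][key] = value

    def lookup(self, cluster_name, database_name):
        return self._cache[cluster_name][database_name]


inv = Inventory()
inv.remember("c1", "db1", "tables", 12)
print(inv.lookup("c1", "db1"))    # {'tables': 12}

# Caveat: merely reading a missing key creates an empty entry for it.
inv.lookup("c1", "db2")
print(sorted(inv._cache["c1"]))   # ['db1', 'db2']
```

If you need to test for presence without creating entries, use "in" or
.get() on the relevant level instead of indexing.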

> And it helped.
> In the morning I tried a different approach and decided to clear the
> cache a different way. So instead of doing self.__cache.clear(),
> self.__cache = None or even 'del self.__cache' I did:
>
> for item in list(self.__cache.keys()):
>         del self.__cache[item]
>
> and again the effect was positive. As a result I decided to rewrite all
> the methods that build, update and get from the cache without
> 'setdefault' and to use a "for loop" instead of dict.clear().

That seems very strange. Why should this be more effective than
self.__cache.clear()? I don't get it.

Having that be more efficient than either self.__cache=None or del
self.__cache (which are equivalent as far as releasing the dict goes), I
can understand. But better than clearing the dict? Seems very odd.
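For what it's worth, here's a small experiment showing why I'd expect no
difference. Payload and build() are hypothetical stand-ins, not your code,
and the values here deliberately have no reference cycles; weak references
are used only to observe whether each value is still alive afterwards.

```python
import gc
import weakref


class Payload:
    """Hypothetical stand-in for a cached value (no back-references)."""


def build():
    cache = {"a": Payload(), "b": Payload()}
    refs = [weakref.ref(value) for value in cache.values()]
    return cache, refs


# Variant 1: dict.clear()
cache1, refs1 = build()
cache1.clear()

# Variant 2: delete the keys one at a time
cache2, refs2 = build()
for key in list(cache2.keys()):
    del cache2[key]

gc.collect()  # so the result doesn't depend on reference counting alone

cleared_alive = [ref() is not None for ref in refs1]
looped_alive = [ref() is not None for ref in refs2]
print(cleared_alive, looped_alive)  # [False, False] [False, False]
```

Both variants leave an empty dict and release every value, so if the loop
really does behave differently for you, I'd suspect something else changed
at the same time.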

Ideally, though, you'd want to NOT have those reference loops. I
presume the database objects need to have a reference to whatever
'self' is, but perhaps the cache can be done externally to the object,
which would make all the references one-way instead of circular. But
that's something only you can investigate.
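To illustrate the difference, here's a rough sketch. Every name in it
(Database, CachingOwner, PlainOwner, external_cache) is hypothetical, and
the "freed immediately" results rely on CPython's reference counting, so
treat it as a demonstration rather than a recipe.

```python
import gc
import weakref


class Database:
    """Hypothetical cached object that keeps a reference to its owner."""

    def __init__(self, owner):
        self.owner = owner


class CachingOwner:
    """Owner holding its own cache: owner -> cache -> Database -> owner."""

    def __init__(self):
        self.cache = {"db1": Database(self)}


class PlainOwner:
    """Owner that knows nothing about the cache."""


gc.disable()  # keep the cycle collector out of the way for the demo
try:
    looped = CachingOwner()
    looped_ref = weakref.ref(looped)
    del looped
    # The cycle keeps the object alive after the last outside reference.
    cycle_survives = looped_ref() is not None

    gc.collect()  # an explicit collection still works while gc is disabled
    cycle_collected = looped_ref() is None

    plain = PlainOwner()
    plain_ref = weakref.ref(plain)
    external_cache = {"db1": Database(plain)}  # references run one way only
    external_cache.clear()
    del plain
    # No cycle, so reference counting alone frees the owner.
    one_way_freed = plain_ref() is None
finally:
    gc.enable()

print(cycle_survives, cycle_collected, one_way_freed)
```

The catch with the external version is that the cache keeps the owner
alive for as long as the entry exists, so something has to remove entries
explicitly.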

ChrisA

