Pickle caching objects?

Mon Dec 16 02:56:26 EST 2019

José María Mateos <chema at rinzewind.org> writes:
> I just asked this question on the IRC channel but didn't manage to get
> a response, though some people replied with suggestions that expanded
> this question a bit.
>
> I have a program that has to read some pickle files, perform some
> operations on them, and then return. The pickle objects I am reading
> all have the same structure, which consists of a single list with two
> elements: the first one is a long list, the second one is a numpy
> object.
>
> I found out that, after calling that function, the memory taken by the
> Python executable (monitored using htop -- the entire thing runs on
> Python 3.6 on an Ubuntu 16.04, pretty standard conda installation with
> a few packages installed directly using `conda install`) increases in
> proportion to the size of the pickle object being read. My intuition
> is that that memory should be free upon exiting.
>
> Does pickle keep a cache of objects in memory after they have been
> returned?

"pickle.load" does not itself cache objects in memory.
However, Python may cache some objects (usually small ones, such as
small integers).
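
A small sketch of the second point, assuming CPython (the small-integer
cache is an implementation detail, not a language guarantee, and the
variable names are only illustrative): unpickling a small integer twice
hands back the same cached object, while a large integer is built anew
on every load.

```python
import pickle

# Assumption: CPython, which keeps a cache of small integer objects.
# Other implementations (or future versions) may behave differently.
small_a = pickle.loads(pickle.dumps(100))
small_b = pickle.loads(pickle.dumps(100))

# A value well outside the small-integer range is not cached.
large_a = pickle.loads(pickle.dumps(10**20))
large_b = pickle.loads(pickle.dumps(10**20))

small_shared = small_a is small_b   # same cached object
large_shared = large_a is large_b   # two distinct objects

print("small ints share one object:", small_shared)
print("large ints share one object:", large_shared)
```

The cache here belongs to the interpreter, not to pickle, and it only
ever holds a handful of tiny objects, so it cannot explain memory growth
proportional to the size of the file.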

Note also that Python's memory management is quite elaborate:
not every memory block is obtained directly from, or released
immediately to, the operating system. Python has its own memory
management data structures (to bridge the gap between the fine-grained
memory requirements of a Python application and the rather coarse memory
management services the operating system provides out of the box).
This means that a memory block freed by the Python application
is usually not returned to the operating system but kept by Python's
memory management to be reused later. As a consequence, operating system
tools for memory monitoring usually cannot tell how much memory
the application is really using.
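
To see what Python itself considers allocated, rather than what htop
reports, the standard "tracemalloc" module can help. The following is
only a minimal sketch: the function name, the payload shape (a long list
plus a bytes object standing in for the numpy object) and the sizes are
assumptions made for illustration, not your actual data.

```python
import pickle
import tracemalloc


def measure_load_and_release(n_items=200_000):
    """Return traced bytes before loading, after loading, after deleting."""
    # Hypothetical stand-in for the pickle file: a two-element list.
    payload = pickle.dumps([[float(i) for i in range(n_items)], b"x" * 1000])

    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()

    obj = pickle.loads(payload)
    loaded, _ = tracemalloc.get_traced_memory()

    # Dropping the last reference frees the objects right away in CPython.
    del obj
    released, _ = tracemalloc.get_traced_memory()

    tracemalloc.stop()
    return before, loaded, released


before, loaded, released = measure_load_and_release()
print("allocated by the load:", loaded - before, "bytes")
print("still held after del: ", released - before, "bytes")
```

If the second figure drops back close to zero while htop still shows a
large resident size, the memory has been freed from Python's point of
view and is simply being kept by the allocator for later reuse.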