Question about garbage collection

Chris Angelico rosuav at gmail.com
Tue Jan 16 07:15:32 EST 2024


On Tue, 16 Jan 2024 at 23:08, Frank Millman via Python-list
<python-list at python.org> wrote:
>
> On 2024-01-15 3:51 PM, Frank Millman via Python-list wrote:
> > Hi all
> >
> > I have read that one should not have to worry about garbage collection
> > in modern versions of Python - it 'just works'.
> >
> > I don't want to rely on that. My app is a long-running server, with
> > multiple clients logging on, doing stuff, and logging off. They can
> > create many objects, some of them long-lasting. I want to be sure that
> > all objects created are gc'd when the session ends.
> >
>
> I did not explain myself very well. Sorry about that.
>
> My problem is that my app is quite complex, and it is easy to leave a
> reference dangling somewhere which prevents an object from being gc'd.
>
> This can create (at least) two problems. The obvious one is a memory
> leak. The second is that I sometimes need to keep a reference from a
> transient object to a more permanent structure in my app. To save myself
> the extra step of removing all these references when the transient
> object is deleted, I make them weak references. This works, unless the
> transient object is kept alive by mistake and the weak ref is never removed.
>
> I feel it is important to find these dangling references and fix them,
> rather than wait for problems to appear in production. The only method I
> can come up with is to use the 'delwatcher' class that I used in my toy
> program in my original post.
>
> I am surprised that this issue does not crop up more often. Does nobody
> else have these problems?
>

It really depends on how big those dangling objects are. My personal
habit is to not worry about a few loose objects, by virtue of ensuring
that everything either has its reference loops deliberately broken at
some point in time, or by keeping things small.

An example of deliberately breaking a refloop would be when I track
websockets. Usually I'll tag the socket object itself with some kind
of back-reference to my own state, but I also need to be able to
iterate over all of my own state objects (let's say they're
dictionaries for simplicity) and send a message to each socket. So
there'll be a reference loop between the socket and the state. But at
some point, I will be notified that the socket has been disconnected,
and that's when I go to its state object and wipe out its
back-reference. It can then be disposed of promptly, since there's no
loop.

It takes a bit of care, but in general, large state objects won't have
these kinds of loops, and dangling references haven't caused me any
sort of major issues in production.

Where do you tend to "leave a reference dangling somewhere"? How is
this occurring? Is it a result of an incomplete transaction (like an
HTTP request that never finishes), or a regular part of the operation
of the server?

ChrisA


More information about the Python-list mailing list