[Python-ideas] Multi-core reference count garbage collection

Jonathan Fine jfine2358 at gmail.com
Sat Jul 21 03:54:35 EDT 2018


Hi Steve

Thank you for your message. I think my response below allows us to go move
forward.

WHAT'S THE PROBLEM
You asked:
> What problem are you trying to solve?
> Its okay if there is no immediate problem, that you're just exploring
> alternative garbage collection strategies. Or if you're trying to remove
> the GIL.

I'm exploring. I hope we'll find something that helps CPython run faster on
multi-core machines, at least for some users. I hope this helps you
understand where I'm coming from.

Please, in this thread, can we confine ourselves to getting a better
understanding of multi-core reference count garbage collection. That is for
me already hard enough, in isolation, without mixing in other issues.

RACE CONDITION
You wrote
> multithreaded reference counting can suffer from race conditions
> when incrementing or decrementing
In support you referenced (the very useful)
>
http://flyingfrogblog.blogspot.com/2013/09/how-do-reference-counting-and-tracing.html
The relevant statement on that page is
> multithreaded reference counting is non-deterministic because increments
and decrements race.

According to https://en.wikipedia.org/wiki/Race_condition
---
A race condition or race hazard is the behavior of an electronics,
software, or other system where the output is dependent on the sequence or
timing of other uncontrollable events. It becomes a bug when events do not
happen in the order the programmer intended.
---

I like this, but I would change the last sentence to read
> It becomes a bug when the outputs are incorrect, for example are in a
wrong order.
Sometimes, eg iterating over a set, there is no intended order.

DEALING WITH THE RACE CONDITION
There is a race condition (or hazard) in buffered reference counting
garbage collection. You wrote that this "would be a bad thing". I hope
that, at the expense of some complexity in the code, the potential for a
bug arising from race hazard can be avoided. Certainly, a bug in garbage
collection is a bad thing.

I was aware of this race hazard when I started this discussion thread. In
my first post of yesterday I made "a start on the hard part" of "doing the
right thing". I also then said
> The remainder of the hard part is to get hold of adjusted reference
> counts, without having to lock the worker threads.

GOING FORWARD
I hope in the next day or two to make a start on the remainder of the hard
part. Your link to flyingfrog is valuable. I hope you'll contribute again.

Finally, you might help you to know that I'm trying in my exploration to
use top-down or stepwise design.

-- 
Jonathan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20180721/4b5a2bf6/attachment.html>


More information about the Python-ideas mailing list