[Python-ideas] Simpler thread synchronization using "Sticky Condition"

Richard Whitehead richard.whitehead at ieee.org
Tue Mar 26 12:50:20 EDT 2019


Nathaniel,

Thanks very much for taking the time to comment.

Clearing the event after waiting for it will introduce a race condition: the 
sender may go around its loop again and set the event after we have woken 
but before we've cleared it, and that wake-up is then lost. As you said, this 
stuff is tricky! The only safe way is to make the wait-and-clear atomic, 
which can be done with a lock; and this comes essentially back to what I'm 
proposing.
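
For what it's worth, the essence of what I'm proposing looks roughly like 
this (a rough sketch with illustrative names, not our actual library code):

import threading

class StickyCondition:
    def __init__(self):
        self._cond = threading.Condition()
        self._flag = False

    def notify(self):
        with self._cond:
            self._flag = True          # the notification "sticks"
            self._cond.notify()

    def wait(self):
        with self._cond:
            # wait-and-clear is atomic under the lock, so a notify()
            # between waking and clearing cannot be lost
            while not self._flag:
                self._cond.wait()
            self._flag = False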

I realise this is not a fundamental new primitive - if it were, I wouldn't be 
able to build it in pure Python - but I've found it extremely useful in our 
generic threading and processing library.

You're right about what you say regarding queues; I didn't want to go into 
the full details of the multi-threading and multi-processing situation at 
hand, but I will say that we have a pipeline of tasks that can run as either 
threads or processes, and we want to make it easy to construct this 
pipeline, "wiring" it as necessary; combining command queues with data 
queues just becomes a real mess.

Richard



-----Original Message----- 
From: Nathaniel Smith
Sent: Tuesday, March 26, 2019 4:24 PM
To: richard.whitehead at ieee.org
Cc: Python-Ideas
Subject: Re: [Python-ideas] Simpler thread synchronization using "Sticky 
Condition"

These kinds of low-level synchronization primitives are notoriously
tricky, yeah, and I'm all in favor of having better higher-level
tools. But I'm not sure that AutoResetEvent adds enough to be worth
it.

AFAICT, you can get this behavior with an Event just fine – using your
pseudocode:

def sender():
     while alive():
           wait_for_my_data_from_hardware()
           send_data_to_receiver()
           auto_event.set()

def receiver():
     while alive():
           auto_event.wait()
           auto_event.clear()   # <-- this line added
           receive_all_data_from_sender()
           process_data()

It's true that if we use a regular Event then the .clear() doesn't
happen atomically with the wakeup, but that doesn't matter. If new data
arrives and auto_event.set() is called, then there are two cases:

1) the new data arrives early enough to be seen by the current call to
receive_all_data_from_sender(): this is fine, the new data will be
processed in this call
2) the new data arrives too late to be seen by the current call to
receive_all_data_from_sender(): that means the new data arrived after
the call to receive_all_data_from_sender() started, which means it
arrived after auto_event.clear(), which means that the call to
auto_event.set() will successfully re-arm the event and another call
to receive_all_data_from_sender() will happen immediately

That said, this is still very tricky. It requires careful analysis,
and it's not very general (for example, if we want to support multiple
receivers then we need to throw out the whole approach and do
something entirely different). In Trio we've actually discussed
removing Event.clear(), since it's so difficult to use correctly:
https://github.com/python-trio/trio/issues/637

You said your original problem is that you have multiple event
sources, and the receiver needs to listen to all of them. And based on
your approach, I guess you only have one receiver, and that it's OK to
couple all the event sources directly to this receiver (i.e., you're
OK with passing them all a Condition object to use).

Under these circumstances, wouldn't it make more sense to use a single
Queue, pass it to all the sources, and have them each do
queue.put((source_id, event))? That's simple to implement, hard to
mess up, and can easily be extended to multiple receivers.
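
Concretely, something like this sketch (reusing the placeholder names from 
the pseudocode above; everything here is illustrative):

import queue

command_queue = queue.Queue()

def source(source_id):
    while alive():
        event = wait_for_my_data_from_hardware()
        command_queue.put((source_id, event))   # tag each item with its source

def receiver():
    while alive():
        source_id, event = command_queue.get()  # blocks until any source has data
        process_data(source_id, event)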

If you want to further decouple the sources from the receiver, then
one approach would be to have each source expose its own Queue
independently, and then define some kind of 'select' operation (like
in Golang/CSP/concurrent ML) to let the receiver read from multiple
Queues simultaneously. This is non-trivial to do, but in return you
get a very general and powerful construct. There's some more links and
discussion here: https://github.com/python-trio/trio/issues/242
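
As a very rough illustration of the idea (nothing like a production 
implementation), you can fake a 'select' over stdlib queues by having a 
helper thread forward each source queue into one merged queue:

import threading
import queue

def select(*source_queues):
    merged = queue.Queue()

    def forward(idx, q):
        while True:
            merged.put((idx, q.get()))   # tag items with the source index

    for idx, q in enumerate(source_queues):
        threading.Thread(target=forward, args=(idx, q), daemon=True).start()

    # the receiver then just does merged.get() to wait on all sources at once
    return merged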

> Regarding the link you sent, I don't entirely agree with the opinion 
> expressed: if you try to use a Semaphore for this purpose you will soon 
> find that it is "the wrong way round", it is intended to protect resources 
> from multiple accesses, not to synchronize those multiple accesses

Semaphores are extremely generic primitives – there are a lot of
different ways to use them. I think the blog post is correct that an
AutoResetEvent is equivalent to a semaphore whose value is clamped so
that it can't exceed 1. Your 'auto_event.set()' would be implemented
as 'sem.release()', and 'auto_event.wait()' would be 'sem.acquire()'.
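
To make the equivalence concrete, here's one way to spell it out with the 
stdlib (a sketch only; the class name is just the one used in this thread, 
not anything that exists today):

import threading

class AutoResetEvent:
    def __init__(self):
        # BoundedSemaphore(1) starts at 1 with a ceiling of 1;
        # acquire once so the event starts out "unset" (value 0)
        self._sem = threading.BoundedSemaphore(1)
        self._sem.acquire()

    def set(self):
        try:
            self._sem.release()    # 0 -> 1: arm the event
        except ValueError:
            pass                   # already armed: value stays clamped at 1

    def wait(self):
        self._sem.acquire()        # block until set, then auto-reset to 0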

I guess technically the semantics might be slightly different when
there are multiple waiters: the semaphore wakes up exactly one waiter,
while I'm not sure what your AutoResetEvent would do. But I can't see
any way to use AutoResetEvent reliably with multiple waiters anyway.


-n

--
Nathaniel J. Smith -- https://vorpus.org 


