Read-Write Lock vs primitive Lock()

Tue Dec 30 03:16:14 EST 2008

> >> Note that the thread acquires the lock ONCE, repeats several thousand
> >> times an assignment to a *local* variable called GLOBAL_VAR (!), finally
> >> releases the lock and exits. As every thread does the same, they just  
> >> run
> >> one after another, they never have a significant overlap.

Ok. Forget about the GLOBAL_VAR. See the below code:

import threading
import locks
import time

GLOBAL_VAR = []
#GLOBAL_LOCK = locks.ReadWriteLock()
GLOBAL_LOCK = threading.Lock()
#GLOBAL_LOCK = mylock()
GLOBAL_LOOP_COUNT = 10000000
GLOBAL_READER_COUNT = 100
GLOBAL_WRITER_COUNT = 1
global GLOBAL_LEN

class wthread(threading.Thread):
    def run(self):
            try:
                GLOBAL_LOCK.acquireWrite()
                #GLOBAL_LOCK.acquire_write()
                #GLOBAL_LOCK.acquire()
                for i in range(GLOBAL_LOOP_COUNT):
                    pass
            finally:
                #GLOBAL_LOCK.release_write()
                GLOBAL_LOCK.release()

class rthread(threading.Thread):
    def run(self):
            try:
                GLOBAL_LOCK.acquireRead()
                #GLOBAL_LOCK.acquire_read()
                print 'ENTER:'+str(threading.currentThread())
                #GLOBAL_LOCK.acquire()
                print 'ACQUIRE:'+str(threading.currentThread())
                for i in range(GLOBAL_LOOP_COUNT):
                    pass
            finally:
                #GLOBAL_LOCK.release_read()
                GLOBAL_LOCK.release()
                print 'RELEASE:'+str(threading.currentThread())

# module executed?
if __name__ == "__main__":
    starttime = time.clock()
    threads = []
    for i in range(GLOBAL_READER_COUNT):
        rt = rthread()
        threads.append(rt)
    for i in range(GLOBAL_WRITER_COUNT):
        wt = wthread()
        threads.append(wt)

    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    print "All operations took " + str(time.clock() - starttime) + "
msecs"

As GLOBAL_LOOP_COUNT is 10000000, now this is making a bottleneck on
the readers. I had assumed that as everythread is given only 100
bytecodes to execute, that it will be enough to have a 10000 value for
this number to let other rthread start() and try to acquire the lock.
But, Python undocumentedly, does not grab the GIL easily if a Lock is
held. This is strange for me.

> > If I put the for loop outside, in that case, readers will not overlap
> > at all, and you would be amazed by the numbers for that test. They
> > indicate primitive lock is faster than read-write lock, as it requires
> > the lock, executes only one bytecode operation and releases the lock.
> > So, in order to create a bottlenneck on readers, we need to somehow do
> > not release the lock immediately.
>
> Again, what is your specific use case? why do you think readers will see a  
> bottleneck?
>
> > With that, primitive locks perform 10 times better than Read-Write
> > lock. See above.
>
> So what? You don't like such results? The read-write lock in the recipe  
> you posted is rather complex, and isn't obvious whether it is correct or  
> not, and the author doesn't prove it nor provide a reference. I'd stick to  
> the standard Lock object unless I have specific needs.
> Note that threading.Lock objects are implemented in C and don't have  
> significant overhead (other than the wait itself); even threading.RLock  
> objects are a lot slower. So depending on what you really have to do, the  
> overhead of using a class like ReadWriteLock may be worse than using a  
> standard Lock.

That may be the point. That is why I am trying to test this. When will
ReadWrite() lock permforms better over the primitive lock. By the way,
for the reference, the code in the recipe is used in CherryPy and
based on the Tanenbaum's book Operating Systems Design.

> >> Hmmm, it's a reader but attempts to modify the value?
> >> You don't have to protect a read operation on a global variable - so a
> >> lock isn't required here.
>
> > This is just for testing. Suppose that I am actually reading the
> > value. I don't understand why a lock is not required? Are you saying
> > lock is not necesary beacuse GLOBAL_VALUE is an immutable object, if
> > then, suppose it is not. This is just a test. Suppose GLOBAl_VAR is a
> > list and we are calling append() on it which is nt an atomic
> > operation.
>
> It doesn't matter whether it is mutable or immutable. Remember that, in  
> Python, a global variable is just a *name* that refers to an object; you  
> either get the old object referred, or the new one. If it is a list and it  
> is being modified "in place", you always get the same list -- its contents  
> may differ from one instant to another so iterating over it isn't safe.
> That's why I insist on knowing your use case: you may me doing things more  
> complicated than they should.

I understand your point completely, but let's change anything inside
the loop. Just assume that we a thread that is supposed to read
something, and a thread that is supposed to write. If I create
thousands of readers, and if the operation is enormously huge
calculation(as Python does not grab GIL easily inside a Lock), then
this will create a bottlencek on readers. But, with ReadWrite Lock
this problem *SHOULD* be fixed and in my tests I see it is fixed, when
I increase the time spent in the loop in huge amounts.

>
> Each thread in your original code were like this:
>
> begin thread; wait for lock; execute something locally; release lock; exit  
> thread;
>
> There was no shared state between threads. Even if you fix them to share  
> some variable, they were a single, sequential, piece of code (the inner  
> loop is irrelevant here). Once a thread gets the lock, it just runs to its  
> end. No one waits more than once for the lock. So all of them run  
> sequentially, in arbitrary order, but just one after the other. They never  
> overlap.

You are correct about this. But this is an undocumented thing. I had
assumed that Python will release GIL for other threads after 100
bytecodes are executed by default. However, as I stated, when a Lock()
is held this is changing. I think this is because to avoid simple MT
problems for new programmers. Not sure. I will read threads.c for
information.

Thanks.