Fwd: I/O bound threads get no chance to run with small CPU bound threads with new GIL

Souvik Ghosh souvikghosh872 at gmail.com
Sat Dec 11 07:09:12 EST 2021


*Resending this message after subscribing to python-list*

---------- Forwarded message ---------
From: Souvik Ghosh <souvikghosh872 at gmail.com>
Date: Sat, Dec 11, 2021 at 5:10 PM
Subject: I/O bound threads get no chance to run with small CPU bound
threads with new GIL
To: <python-list at python.org>


Hello PSF,

I'm Souvik Ghosh from India. I've been coding in Python for almost 5
years now, and I love Python and care about it very much.

The issue is stated below.

In his talk at PyCon 2010 in Atlanta, Georgia, David Beazley
demonstrated the new GIL running CPU bound and I/O bound threads
together.

He said in the talk that a thread which is forced to give up the GIL
by the 5 ms timeout (a CPU bound thread) gets lower priority, and a
thread which releases the GIL within 5 ms (an I/O bound thread) gets
higher priority.
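
The 5 ms figure corresponds to the interpreter's thread switch
interval, which can be read at runtime. A minimal sketch, assuming
CPython 3.2 or later; the helper name switch_interval_ms is made up
for illustration and is not part of any library:

```python
import sys


def switch_interval_ms():
    """Return the interpreter's thread switch interval in milliseconds."""
    # sys.getswitchinterval() reports the value in seconds.
    return sys.getswitchinterval() * 1000.0


if __name__ == "__main__":
    print(switch_interval_ms())
```

The value printed is whatever the running interpreter is configured
with; the documented default is 5 ms.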

What happens in the following code is this: if I set
args=(10000000,) (seven zeros after the 1), the I/O bound thread gets
to run and return, but only because the CPU bound thread takes a long
time to execute. If I decrease it to args=(1000,), the I/O bound
thread gets no chance to reacquire the GIL in the meantime, even
though sys.getswitchinterval() is 5 ms (the default). With
args=(10000,), where the I/O bound thread doesn't reacquire the GIL,
running only the CPU bound thread takes 0.42777760000035414 seconds.
That means there are about 0.42777760000035414/0.005 = 85 (approx.)
ticks at which the priority between the two threads could be set. If
the I/O bound thread had higher priority during that time, it should
have returned its value within those ticks. But it didn't happen.
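
The tick estimate above can be reproduced with a short sketch. It
times the CPU bound loop on its own and divides by the switch
interval. The helper name switch_intervals_elapsed is hypothetical,
and the measured time will differ from machine to machine:

```python
import sys
from timeit import default_timer as timer


def decrement(numbers):
    """CPU bound countdown: the same loop as in the programs, minus the queue check."""
    while numbers > 0:
        numbers -= 1


def switch_intervals_elapsed(duration, interval=None):
    """Return how many switch intervals fit into `duration` seconds."""
    if interval is None:
        interval = sys.getswitchinterval()
    return duration / interval


if __name__ == "__main__":
    # Time the CPU bound loop alone, with no second thread running.
    start = timer()
    decrement(100000)
    elapsed = timer() - start
    print(elapsed, switch_intervals_elapsed(elapsed))

    # The figure quoted in this message: 0.42777760000035414 s at 5 ms per interval.
    print(switch_intervals_elapsed(0.42777760000035414, 0.005))
```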
import threading
from queue import Queue
from timeit import default_timer as timer
import urllib.request


q = Queue()  # Queue technique to pass returns among threads while running


def decrement(numbers):  # CPU bound
    while numbers > 0:
        numbers -= 1
        if not q.empty():
            # I added this check here because this thread runs most of
            # the time, being mostly CPU bound
            print(numbers)
            print(q.get(block=False))
            # Time, since both threads started, at which the I/O bound
            # thread's value is received
            print(timer() - start)


def get_data():  # I/O bound

    with urllib.request.urlopen("https://www.google.com") as dt:
        q.put(dt.read(), block=False)


if __name__ == "__main__":
    start = timer()
    t1 = threading.Thread(target=get_data)
    # With this, the I/O bound thread responds and returns:
    # t2 = threading.Thread(target=decrement, args=(10000000,))
    # With this, the I/O bound thread doesn't respond at all:
    t2 = threading.Thread(target=decrement, args=(100000,))
    t1.start()
    t2.start()

    t1.join()
    t2.join()
    print(timer() - start)

Now look at the second program:

import threading
from queue import Queue
from timeit import default_timer as timer
import urllib.request
import sys


q = Queue()  # Queue technique to pass returns among threads while running


def decrement(numbers):  # CPU bound
    while numbers > 0:
        numbers -= 1
        if not q.empty():
            # I added this check here because this thread runs most of
            # the time, being mostly CPU bound
            print(numbers)
            print(q.get(block=False))
            # Time, since both threads started, at which the I/O bound
            # thread's value is received
            print(timer() - start)


def get_data():  # I/O bound

    with urllib.request.urlopen("https://www.google.com") as dt:
        q.put(dt.read(), block=False)


if __name__ == "__main__":
    sys.setswitchinterval(0.0000000000000000000000000001)
    start = timer()
    t1 = threading.Thread(target=get_data)
    # With this, the I/O bound thread responds:
    # t2 = threading.Thread(target=decrement, args=(1000000,))
    # With this, the I/O bound thread doesn't respond at all, even with a
    # thread switch interval of 0.0000000000000000000000000001 seconds:
    t2 = threading.Thread(target=decrement, args=(10000,))
    t1.start()
    t2.start()

    t1.join()
    t2.join()
    print(timer() - start)
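
For reproducing the measurement without a network connection, the
two programs above can be approximated by the sketch below. It is not
the original code: time.sleep stands in for the urlopen call, the
0.2 second delay is an arbitrary assumed value, and run_experiment is
a hypothetical helper name. It records when each thread finishes,
measured from the moment just before the threads are started.

```python
import threading
import time
from timeit import default_timer as timer


def run_experiment(count, io_delay):
    """Run a CPU bound countdown next to a simulated I/O bound thread.

    Returns (cpu_done, io_done): seconds after start at which each
    thread finished its work.
    """
    finished = {}
    start = timer()

    def decrement(numbers):  # CPU bound
        while numbers > 0:
            numbers -= 1
        finished["cpu"] = timer() - start

    def get_data():  # I/O bound (simulated)
        # Assumption: sleeping stands in for the network round trip.
        time.sleep(io_delay)
        finished["io"] = timer() - start

    t1 = threading.Thread(target=get_data)
    t2 = threading.Thread(target=decrement, args=(count,))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return finished["cpu"], finished["io"]


if __name__ == "__main__":
    for count in (10000, 10000000):
        cpu_done, io_done = run_experiment(count, io_delay=0.2)
        print(count, cpu_done, io_done)
```

Comparing the two timestamps for different values of count shows
whether the countdown was still running when the simulated response
arrived.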

Can't we have a better version of the GIL that gives I/O bound
threads higher priority overall, without degrading CPU bound threads,
and with better callback response? Or could we try to remove the GIL?

I submitted the issue here: https://bugs.python.org/issue46046

Thank you so much, and here's to a great future for Python!

