Is there a more efficient threading lock?

Chris Angelico rosuav at gmail.com
Sun Feb 26 00:50:38 EST 2023


On Sun, 26 Feb 2023 at 16:27, Dennis Lee Bieber <wlfraed at ix.netcom.com> wrote:
>
> On Sat, 25 Feb 2023 15:41:52 -0600, Skip Montanaro
> <skip.montanaro at gmail.com> declaimed the following:
>
>
> >concurrent.futures.ThreadPoolExecutor() with the default number of workers (
> >os.cpu_count() * 1.5, or 12 threads on my system) to process each month, so
> >12 active threads at a time. Given that the process is pretty much CPU
> >bound, maybe reducing the number of workers to the CPU count would make
>
>         Unless things have improved a lot over the years, the GIL still limits
> active threads to the equivalent of a single CPU. The OS may swap among
> which CPU as it schedules system processes, but only one thread will be
> running at any moment regardless of CPU count.

Specifically, a single CPU core *executing Python bytecode*. There are
quite a few libraries that release the GIL during computation. Here's
a script that's quite capable of saturating my CPU entirely - in fact,
typing this email is glitchy due to lack of resources:

import threading
import bcrypt

results = [0, 0]  # [mismatches, matches]

def thrd():
    # bcrypt releases the GIL while hashing, so these calls run in parallel
    for _ in range(10):
        ok = bcrypt.checkpw(
            b"password",
            b'$2b$15$DGDXMb2zvPotw1rHFouzyOVzSopiLIUSedO5DVGQ1GblAd6L6I8/6')
        results[ok] += 1

threads = [threading.Thread(target=thrd) for _ in range(100)]
for t in threads: t.start()
for t in threads: t.join()
print(results)

I have four cores, eight threads, and yeah, my CPU's not exactly the
latest and greatest (an i7 6700K - it was quite good some years ago,
but it's been outstripped since), but feel free to crank the numbers
if you want to.

I'm pretty sure bcrypt won't use more than one CPU core for a single
hashpw/checkpw call, but by releasing the GIL during the hard number
crunching, it allows easy parallelization. Same goes for numpy work,
or anything else that can be treated as a separate operation.
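You can see the same effect with nothing but the standard library:
CPython's hashlib releases the GIL while hashing buffers larger than
2047 bytes, so threaded hashing of big blobs can spread across cores.
A rough sketch (the exact timings will obviously vary by machine, and
on a single-core box the two runs will be about the same):

```python
import hashlib
import threading
import time

DATA = b"x" * (16 * 1024 * 1024)  # 16 MiB, well past hashlib's GIL-release threshold

def hash_many(n):
    # Each sha256 over a multi-megabyte buffer runs with the GIL released
    for _ in range(n):
        hashlib.sha256(DATA).digest()

def timed(nthreads, work_per_thread):
    threads = [threading.Thread(target=hash_many, args=(work_per_thread,))
               for _ in range(nthreads)]
    start = time.perf_counter()
    for t in threads: t.start()
    for t in threads: t.join()
    return time.perf_counter() - start

# Same total work (8 hashes), split across 1 vs 4 threads
serial = timed(1, 8)
parallel = timed(4, 2)
print(f"1 thread: {serial:.2f}s, 4 threads: {parallel:.2f}s")
```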

So it's more accurate to say that only one CPU core can be
*manipulating Python objects* at a time. Since it's hard to pin down
exactly what that means, it's easier to say that only one thread can
be executing Python bytecode; any call into a C library can be a
point where other threads take over (most notably any sort of I/O,
but also these kinds of heavy computations).
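The I/O case is easy to demonstrate: blocking calls like time.sleep()
drop the GIL while waiting, so eight threads each sleeping a quarter
of a second finish in roughly a quarter second of wall time, not two
seconds. A minimal sketch:

```python
import threading
import time

def wait_a_bit():
    # time.sleep() releases the GIL, so all threads wait concurrently
    time.sleep(0.25)

threads = [threading.Thread(target=wait_a_bit) for _ in range(8)]
start = time.perf_counter()
for t in threads: t.start()
for t in threads: t.join()
elapsed = time.perf_counter() - start
print(f"8 threads slept 0.25s each in {elapsed:.2f}s of wall time")
```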

As mentioned, GIL removal has been under discussion at times, most
recently (and currently) with PEP 703
https://peps.python.org/pep-0703/ - and the benefits in multithreaded
applications always have to be weighed against quite significant
single-threaded performance penalties. PEP 703's proposal looks to
have the least-bad performance measurements of any GILectomy I've
seen so far, showing about 10% worse performance on average (possibly
reducible to 5%). As it happens, a GIL just makes sense when you want
pure, raw single-threaded performance, and it's only certain
workloads that suffer under it.

ChrisA


More information about the Python-list mailing list