Is there a more efficient threading lock?

Mon Feb 27 01:26:59 EST 2023

https://stackoverflow.com/questions/69993959/python-threads-difference-for-3-10-and-others

https://github.com/python/cpython/commit/4958f5d69dd2bf86866c43491caf72f774ddec97

it's an implementation quirk. the interpreter currently only checks whether
it needs to release the GIL after the POP_JUMP_IF_FALSE, POP_JUMP_IF_TRUE,
JUMP_ABSOLUTE, CALL_METHOD, CALL_FUNCTION, CALL_FUNCTION_KW, and
CALL_FUNCTION_EX opcodes.
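
as a rough sketch, the helper below lists the instructions in a function
whose names are in the set above, so the possible release points don't have
to be marked by hand. CHECK_OPCODES and possible_switch_points are names made
up for this example, and the opcode names are the 3.10 ones from this
message; other versions name their opcodes differently, so the list may come
back empty there.

```python
import dis

# Opcode names copied from the paragraph above (CPython 3.10 naming).
# Assumption: this set is only as accurate as that description.
CHECK_OPCODES = {
    "POP_JUMP_IF_FALSE",
    "POP_JUMP_IF_TRUE",
    "JUMP_ABSOLUTE",
    "CALL_METHOD",
    "CALL_FUNCTION",
    "CALL_FUNCTION_KW",
    "CALL_FUNCTION_EX",
}


def possible_switch_points(func):
    """Return (offset, opname) for each instruction named in CHECK_OPCODES."""
    return [
        (ins.offset, ins.opname)
        for ins in dis.get_instructions(func)
        if ins.opname in CHECK_OPCODES
    ]
```

calling possible_switch_points( update_x_times ) on a 3.10 interpreter should
pick out the same CALL_FUNCTION and JUMP_ABSOLUTE instructions that are marked
by hand in the listing in this message.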

>>> import code
>>> import dis
>>> dis.dis( code.update_x_times )
 10           0 LOAD_GLOBAL              0 (range)
              2 LOAD_FAST                0 (xx)
              4 CALL_FUNCTION            1
##### GIL CAN RELEASE HERE #####
              6 GET_ITER
        >>    8 FOR_ITER                 6 (to 22)
             10 STORE_FAST               1 (_)
 12          12 LOAD_GLOBAL              1 (vv)
             14 LOAD_CONST               1 (1)
             16 INPLACE_ADD
             18 STORE_GLOBAL             1 (vv)
             20 JUMP_ABSOLUTE            4 (to 8)
##### GIL CAN RELEASE HERE (after JUMP_ABSOLUTE points the instruction
counter back to FOR_ITER, but before the interpreter actually jumps to
FOR_ITER again) #####
 10     >>   22 LOAD_CONST               0 (None)
             24 RETURN_VALUE
>>>

due to this, this section:
 12          12 LOAD_GLOBAL              1 (vv)
             14 LOAD_CONST               1 (1)
             16 INPLACE_ADD
             18 STORE_GLOBAL             1 (vv)

is effectively atomic on 3.10+ cpython interpreters, because none of the
opcodes in it trigger the GIL release check. this is neither portable nor
guaranteed to stay that way in future versions
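
if the count has to be right on every interpreter, the portable option is
still an explicit lock around the read-modify-write. a minimal sketch,
reusing the names from the test script quoted below; vv_lock and run are
made up for this example, and the counts are much smaller than in the
original test so it finishes quickly:

```python
import threading

UPDATES = 10_000  # far fewer than the original test, to keep the run short
THREADS = 8

vv = 0
vv_lock = threading.Lock()  # hypothetical name; guards every update of vv


def update_x_times(xx):
    global vv
    for _ in range(xx):
        # The load, add and store all happen while the lock is held,
        # so no other thread can interleave with them.
        with vv_lock:
            vv += 1


def run():
    """Reset the counter, run all threads to completion, return the total."""
    global vv
    vv = 0
    tts = [
        threading.Thread(target=update_x_times, args=(UPDATES,))
        for _ in range(THREADS)
    ]
    for tt in tts:
        tt.start()
    for tt in tts:
        tt.join()
    return vv
```

this costs a lock acquire and release per increment, but the total no longer
depends on where a given interpreter version chooses to check for a thread
switch.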

On Sun, Feb 26, 2023 at 10:19 PM Michael Speer <knomenet at gmail.com> wrote:

> I wanted to provide an example that your claimed atomicity is simply
> wrong, but I found there is something different in the 3.10+ cpython
> implementations.
>
> I've tested the code at the bottom of this message using a few docker
> python images, and it appears there is a difference starting in 3.10.0
>
> python3.8
> EXPECTED 2560000000
> ACTUAL   84533137
> python:3.9
> EXPECTED 2560000000
> ACTUAL   95311773
> python:3.10 (.8)
> EXPECTED 2560000000
> ACTUAL   2560000000
>
> just to see if there was a specific sub-version of 3.10 that added it
> python:3.10.0
> EXPECTED 2560000000
> ACTUAL   2560000000
>
> nope, from the start of 3.10 this is happening
>
> the only difference in the bytecode I see is 3.10 adds SETUP_LOOP and
> POP_BLOCK around the for loop
>
> I don't see anything different in the long c code that I would expect
> would cause this.
>
> AFAICT the inplace add is null for longs and so should revert to the
> long_add that always creates a new integer in x_add
>
> another test
> python:3.11
> EXPECTED 2560000000
> ACTUAL   2560000000
>
> I'm not sure where the difference is at the moment. I didn't see anything
> in the release notes given a quick glance.
>
> I do agree that you shouldn't depend on this unless you find a written
> guarantee of the behavior, as it is likely an implementation quirk of some
> kind
>
> --[code]--
>
> import threading
>
> UPDATES = 10000000
> THREADS = 256
>
> vv = 0
>
> def update_x_times( xx ):
>     for _ in range( xx ):
>         global vv
>         vv += 1
>
> def main():
>     tts = []
>     for _ in range( THREADS ):
>         tts.append( threading.Thread( target = update_x_times, args = (UPDATES,) ) )
>
>     for tt in tts:
>         tt.start()
>
>     for tt in tts:
>         tt.join()
>
>     print( 'EXPECTED', UPDATES * THREADS )
>     print( 'ACTUAL  ', vv )
>
> if __name__ == '__main__':
>     main()
>
> On Sun, Feb 26, 2023 at 6:35 PM Jon Ribbens via Python-list <
> python-list at python.org> wrote:
>
>> On 2023-02-26, Barry Scott <barry at barrys-emacs.org> wrote:
>> > On 25/02/2023 23:45, Jon Ribbens via Python-list wrote:
>> >> I think it is the case that x += 1 is atomic but foo.x += 1 is not.
>> >
>> > No that is not true, and has never been true.
>> >
>> >:>>> def x(a):
>> >:...    a += 1
>> >:...
>> >:>>>
>> >:>>> dis.dis(x)
>> >   1           0 RESUME                   0
>> >
>> >   2           2 LOAD_FAST                0 (a)
>> >               4 LOAD_CONST               1 (1)
>> >               6 BINARY_OP               13 (+=)
>> >              10 STORE_FAST               0 (a)
>> >              12 LOAD_CONST               0 (None)
>> >              14 RETURN_VALUE
>> >:>>>
>> >
>> > As you can see there are 4 byte code ops executed.
>> >
>> > Python's eval loop can switch to another thread between any of them.
>> >
>> > It is not true that the GIL provides atomic operations in python.
>>
>> That's oversimplifying to the point of falsehood (just as the opposite
>> would be too). And: see my other reply in this thread just now - if the
>> GIL isn't making "x += 1" atomic, something else is.