[issue45542] Using multiple comparison operators can cause performance issues

Mon Apr 4 23:14:58 EDT 2022

Dennis Sweeney <sweeney.dennis650 at gmail.com> added the comment:

For reference, chaining is about 1.18x slower in this microbenchmark on GCC:

./python -m pyperf timeit -s "x = 100" "if 10 < x < 30: print('no')" --duplicate=10
.....................
Mean +- std dev: 21.3 ns +- 0.2 ns
./python -m pyperf timeit -s "x = 100" "if 10 < x and x < 30: print('no')" --duplicate=10
.....................
Mean +- std dev: 18.0 ns +- 0.5 ns
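
To see where the difference might come from, here is a minimal sketch that compares the bytecode of the two spellings. The function names `chained`, `unchained` and the helper `opnames` are made up for this illustration, and the exact opcodes depend on the CPython version. The chained form keeps one copy of x on the stack and shuffles it, while the "and" form loads x a second time.

```python
import dis

def chained(x):
    # x is evaluated once; its value is duplicated on the stack
    return 10 < x < 30

def unchained(x):
    # x is loaded once per comparison
    return 10 < x and x < 30

def opnames(func):
    # Hypothetical helper: list the opcode names of a function
    return [ins.opname for ins in dis.get_instructions(func)]

chained_ops = opnames(chained)
unchained_ops = opnames(unchained)
print(chained_ops)
print(unchained_ops)
```

For a plain local integer the two functions return the same result; only the instruction sequences differ.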

For a related case, in GH-30970, the bytecode generated by "a, b = a0, b0" was changed.
   Before: [load_a0, load_b0, swap, store_a, store_b]
   After:  [load_a0, load_b0, store_b, store_a]
However, this was only changed when the stores were STORE_FASTs. STORE_GLOBAL/STORE_NAME/STORE_DEREF cases still have the SWAP.
In the STORE_GLOBAL cases, you can construct scenarios with custom __del__ methods where storing b and then a has different behavior than storing a and then b. No such cases can be constructed for STORE_FAST without resorting to frame hacking.
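
A rough sketch of one such scenario, assuming CPython's reference counting so that an object is finalized as soon as its global name is rebound. The class `Noisy`, the list `log` and the function `reassign` are hypothetical names for this illustration only. The order of entries in `log` reveals which global was stored first.

```python
log = []

class Noisy:
    def __init__(self, name):
        self.name = name

    def __del__(self):
        # Record the moment the object is finalized
        log.append(self.name)

def reassign():
    # "global" forces STORE_GLOBAL rather than STORE_FAST
    global a, b
    a = Noisy("old a")
    b = Noisy("old b")
    # Rebinding a global drops the last reference to its old value,
    # so the store order decides the order of the __del__ calls.
    a, b = Noisy("new a"), Noisy("new b")

reassign()
print(log)
```

With a stored before b, "old a" is logged before "old b"; reversing the stores would reverse the log. An implementation without reference counting may finalize the objects later or in another order.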

I wonder if the same argument applies here: maybe @akuvfx's PR could be altered to use LOAD_FAST twice for each variable *only* if everything in sight is the result of a LOAD_FAST or a LOAD_CONST. My example above uses a LOAD_DEREF, so its behavior could remain unchanged.
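
To make the LOAD_DEREF concern concrete, here is a hypothetical sketch (not necessarily the example referred to above; `demo` and `Sneaky` are invented names). The `__lt__` method rebinds the middle variable through a closure, so loading x a second time gives a different result than reusing the value loaded the first time.

```python
def demo():
    class Sneaky:
        def __lt__(self, other):
            nonlocal x
            x = 1000  # rebind the middle variable during the first comparison
            return True

    lo = Sneaky()

    x = 20
    # Chained: x is evaluated once, so the second test is 20 < 30
    chained = lo < x < 30

    x = 20
    # Unchained: x is reloaded after being rebound, so the test is 1000 < 30
    unchained = lo < x and x < 30

    return chained, unchained

print(demo())
```

Because x is a cell variable here, it is read with LOAD_DEREF, and another function can rebind it between the two loads. A plain STORE_FAST/LOAD_FAST local cannot be rebound this way without frame hacking.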

The argument that this would be within the language spec is maybe a little more dubious than in the "a, b = a0, b0" case though, since custom `__lt__` methods are somewhat better specified than custom `__del__` methods.

Thoughts?

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue45542>
_______________________________________