[issue45116] Performance regression 3.10b1: inlining issue in the big _PyEval_EvalFrameDefault() function with Visual Studio (MSC)

neonene report at bugs.python.org
Thu Apr 7 19:41:04 EDT 2022


neonene <nicesalmon at gmail.com> added the comment:

>What exactly does "pgo hard reject" mean?

As I understand it, "pgo hard reject" is a decision made by the PGOptimizer's heuristic, while a plain "reject" is related to the probe count (hot/cold).

  https://developercommunity.visualstudio.com/t/1531987#T-N1535774


The MSVC team has also replied and closed the issue. MSVC will not be fixed in the near future.

  https://developercommunity.visualstudio.com/t/1595341#T-N1695626

From the reply and my investigation, 3.11 would need the following:

1. Some call sites, such as calls through tp_* pointers, should not have their fast paths inlined into the eval switch-case, because they often conflict. Each pointer call needs to be wrapped in a function, or perhaps _PyEval_EvalFrameDefault needs to be enclosed in an "inline_depth(0)" pragma.

2. __assume(0) should be replaced with some other function wherever it appears inside the eval switch-case or in the inlined paths of callees. This is critical with PGO.

3. For inlining, use __forceinline, a macro, or a const function pointer.

   MSVC getting stuck can be avoided in many ways when force-inlining a ton of Py_DECREF()s in the eval loop, as long as tp_dealloc does not create an inlined call site:

     void
     _Py_Dealloc(PyObject *op)
     {
      ...
     #pragma inline_depth(0) // takes effect from here; PGO accepts only 0.
         (*dealloc)(op);     // conflicts when inlined.
     }
     #pragma inline_depth()  // can be reset only outside the function.

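Item 1 (wrapping each pointer call in a function) could look roughly like the sketch below. This is not CPython code: every sketch_* name is a hypothetical stand-in for the real PyObject / PyTypeObject machinery, and the SKETCH_NOINLINE macro is an assumption added only so the example also builds with GCC or Clang. The indirect call lives in a single non-inlined helper, so the force-inlined decref path never carries the tp_dealloc call site into the big function.

```c
/* Portability shim (assumption): __declspec(noinline) is the MSVC
   spelling; the GCC/Clang attribute is used everywhere else. */
#if defined(_MSC_VER)
#  define SKETCH_NOINLINE __declspec(noinline)
#else
#  define SKETCH_NOINLINE __attribute__((noinline))
#endif

/* Hypothetical stand-ins for PyObject and PyTypeObject. */
typedef struct sketch_object sketch_object;
typedef void (*sketch_destructor)(sketch_object *);

typedef struct {
    sketch_destructor tp_dealloc;
} sketch_type;

struct sketch_object {
    long refcnt;
    const sketch_type *type;
};

/* Counts how often the destructor ran, for demonstration only. */
static int dealloc_calls = 0;

static void
sketch_int_dealloc(sketch_object *op)
{
    (void)op;
    dealloc_calls++;
}

static const sketch_type sketch_int_type = { sketch_int_dealloc };

/* The call through the tp_* pointer is confined to this helper. */
static SKETCH_NOINLINE void
sketch_call_dealloc(sketch_object *op)
{
    op->type->tp_dealloc(op);
}

/* Cheap fast path that may be inlined into a large caller. */
static inline void
sketch_decref(sketch_object *op)
{
    if (--op->refcnt == 0) {
        sketch_call_dealloc(op);
    }
}

/* Drops two references and reports how many deallocations happened. */
static int
sketch_demo(void)
{
    sketch_object obj = { 2, &sketch_int_type };
    sketch_decref(&obj);   /* 2 -> 1, destructor not called */
    sketch_decref(&obj);   /* 1 -> 0, destructor called via the wrapper */
    return dealloc_calls;
}
```

Whether a wrapper like this is enough to stop the PGO rejects in the real eval loop would have to be checked against an actual PGO build.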


* Virtual Call Speculation:
  https://docs.microsoft.com/en-us/cpp/build/profile-guided-optimizations?view=msvc-170#optimizations-performed-by-pgo


* The profiler runs with the /GENPROFILE:PATH option, but for the big ceval function the optimizer merges the profiles into one, as in /GENPROFILE:NOPATH mode.
  https://docs.microsoft.com/en-us/cpp/build/reference/genprofile-fastgenprofile-generate-profiling-instrumented-build?view=msvc-170#arguments


* __assume(0) (Py_UNREACHABLE):
  https://devblogs.microsoft.com/cppblog/visual-studio-2017-throughput-improvements-and-advice/#remove-usages-of-__assume
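For item 2, one possible shape of "replace __assume(0) with a function" is sketched below. The names sketch_unreachable, SKETCH_UNREACHABLE and sketch_step are hypothetical, and this is not what CPython's Py_UNREACHABLE() actually expands to. It only illustrates routing the unreachable branch of a switch through a real noreturn, non-inlined function instead of an optimizer hint. The attribute macros are an assumption so the example also builds with GCC or Clang.

```c
#include <stdio.h>
#include <stdlib.h>

/* Portability shims (assumption): MSVC spellings first, GCC/Clang otherwise. */
#if defined(_MSC_VER)
#  define SKETCH_NORETURN __declspec(noreturn)
#  define SKETCH_NOINLINE __declspec(noinline)
#else
#  define SKETCH_NORETURN __attribute__((noreturn))
#  define SKETCH_NOINLINE __attribute__((noinline))
#endif

/* A real out-of-line function that reports the location and aborts. */
static SKETCH_NORETURN SKETCH_NOINLINE void
sketch_unreachable(const char *file, int line)
{
    fprintf(stderr, "%s:%d: unreachable code reached\n", file, line);
    abort();
}

/* Used where __assume(0) would otherwise mark dead code. */
#define SKETCH_UNREACHABLE() sketch_unreachable(__FILE__, __LINE__)

enum sketch_opcode { SKETCH_NOP = 0, SKETCH_INC = 1, SKETCH_DEC = 2 };

/* Tiny stand-in for one step of an eval switch-case. */
static int
sketch_step(int opcode, int acc)
{
    switch (opcode) {
    case SKETCH_NOP:
        return acc;
    case SKETCH_INC:
        return acc + 1;
    case SKETCH_DEC:
        return acc - 1;
    default:
        SKETCH_UNREACHABLE();   /* never returns */
    }
}
```

A call such as sketch_step(SKETCH_INC, 41) takes the normal path; only an invalid opcode reaches the abort.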

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue45116>
_______________________________________

