[issue45020] Freeze all modules imported during startup.

Eric Snow report at bugs.python.org
Mon Aug 30 12:08:02 EDT 2021


Eric Snow <ericsnowcurrently at gmail.com> added the comment:

On Fri, Aug 27, 2021 at 6:29 PM Guido van Rossum <report at bugs.python.org> wrote:
> The plot thickens. By searching my extensive GMail archives for Jeethu Rao I found
> an email from Sept. 14 to python-dev by Larry Hastings titled "Store startup modules
> as C structures for 20%+ startup speed improvement?"

Thanks for finding that, Guido!

On Fri, Aug 27, 2021 at 6:37 PM Guido van Rossum <report at bugs.python.org> wrote:
> Either way it's a suboptimal experience for people contributing to those modules. But
> we stand to gain a ~20% startup time improvement.

Agreed, and I think a solution shouldn't be too hard to reach.

On Fri, Aug 27, 2021 at 7:48 PM Larry Hastings <report at bugs.python.org> wrote:
> In experimenting with the prototype, I observed that simply calling stat() to ensure
> the frozen .py file hadn't changed on disk lost us about half the performance win
> from this approach.

Yeah, this is an approach others had suggested and I'd considered.  We
have other solutions available that don't have that penalty.
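
For a rough feel of the stat() cost Larry mentions, a sketch like the
following times os.stat() over the source files of whatever modules are
already loaded. It is not code from the prototype: the helper name
time_stat_calls is hypothetical, and the numbers depend heavily on the
OS and filesystem.

```python
import os
import sys
import time


def time_stat_calls(paths, repeat=100):
    """Return the average seconds spent calling os.stat() once per path."""
    start = time.perf_counter()
    for _ in range(repeat):
        for path in paths:
            os.stat(path)
    return (time.perf_counter() - start) / repeat


# Source files of the modules already imported in this process.
paths = [
    m.__file__
    for m in list(sys.modules.values())
    if isinstance(getattr(m, "__file__", None), str) and os.path.isfile(m.__file__)
]
print(f"{len(paths)} files, {time_stat_calls(paths) * 1e6:.1f} us per pass")
```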

On Fri, Aug 27, 2021 at 8:08 PM Larry Hastings <report at bugs.python.org> wrote:
> There should be a boolean flag that enables/disables cached copies of .py files from
> Lib/.  You should be able to turn it off with either an environment variable or a
> command-line option, and when it's off it skips all the internal cached stuff and
> uses the normal .py / .pyc machinery.
>
> With that in place, it'd be great to pre-cache all the .py files automatically read in
> at startup.

Yeah, something along these lines should be good enough.

> [snip]
> But then I'm not sure this is a very good analogy--the workflow for making Clinic
> changes is very different from people hacking on Lib/*.py.

Agreed.

On Fri, Aug 27, 2021 at 10:06 PM Guido van Rossum
<report at bugs.python.org> wrote:
> [snip]
> FWIW in my attempts to time this, it looks like the perf benefits of Eric's approach are
> close to those of deep-freezing. And deep-freezing causes much more bloat of the
> source code and of the resulting binary.

The question of freeze vs deep-freeze (i.e. whether deep-freeze is
enough of an improvement) is one we can discuss separately, and your
point here is probably central to that discussion.  However, I don't
think it has a lot of bearing on the change proposed in this issue.

> [snip]
> I think the only solution here was hinted at in the python-dev thread from 2018: have
> a command-line flag to turn it on or off (e.g. -X deepfreeze=1/0) and have a policy for
> what the default for that flag should be (e.g. on by default in production builds, off by
> default in developer builds -- anything that doesn't use --enable-optimizations).

Agreed.
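
The on/off flag and default policy discussed above could be sketched
roughly as follows. This is only an illustration in Python, not how the
interpreter would implement it: the function name, the PYTHONDEEPFREEZE
environment variable, and the precedence order (command line, then
environment, then build default) are all assumptions. Only the
"-X deepfreeze=1/0" spelling comes from the quoted suggestion.

```python
def use_frozen_modules(xoptions, environ, optimized_build):
    """Decide whether startup modules come from frozen copies.

    xoptions:        mapping shaped like sys._xoptions ("-X name=value").
    environ:         mapping shaped like os.environ.
    optimized_build: True for a production (--enable-optimizations) build.
    """
    # Assumed precedence: explicit -X option first, then the (hypothetical)
    # environment variable, then the build-dependent default.
    for value in (xoptions.get("deepfreeze"), environ.get("PYTHONDEEPFREEZE")):
        if value is None:
            continue
        if value is True:  # bare "-X deepfreeze" with no "=value"
            return True
        return value != "0"
    # Policy from the thread: on in production builds, off in developer builds.
    return optimized_build
```

With this shape, a developer build stays on the normal .py / .pyc
machinery unless someone opts in, e.g.
use_frozen_modules({"deepfreeze": "1"}, {}, optimized_build=False).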

> [snip]
> it wasn't so clear that code objects should be immutable -- that realization came later,
> when Greg Stein proposed making them ROM-able. That didn't work out, but the
> notion that code objects should be strictly immutable (to the python user, at least)
> was born

This sounds like an interesting story.  Do you have any mailing list
links handy?  (Otherwise I can search the archives.)

> In fact, Eric's approach freezes everything in the encodings package, which turns out
> to be a lot of files and a lot of code (lots of simple data tables expressed in code), and
> I found that for basic startup time, it's best not to deep-freeze the encodings module
> except for __init__.py, aliases.py and utf_8.py.

Yeah, this is something to consider.  FWIW, in my testing, dropping
encodings.* from the list of frozen modules reduced the performance
gains (from 20 ms to 21 ms).
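
To see which modules (including which parts of encodings) a given
interpreter actually imports during startup, a small sketch like this
can help. The helper name startup_modules is hypothetical, and the
result varies by Python version, platform, and locale.

```python
import subprocess
import sys


def startup_modules(executable=sys.executable):
    """Return sorted names of modules loaded by a bare interpreter startup."""
    code = "import sys; print(' '.join(sorted(sys.modules)))"
    result = subprocess.run(
        [executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.split()


names = startup_modules()
encodings_names = [n for n in names if n.split(".")[0] == "encodings"]
print(f"{len(names)} modules at startup; encodings: {encodings_names}")
```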

-eric

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue45020>
_______________________________________

