[Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

Neil Schemenauer nas-python at arctrix.com
Fri Sep 14 17:54:24 EDT 2018


On 2018-09-14, Larry Hastings wrote:
[...]
> improvement 0.21242667903482038 %

I assume that should be 21.2 %; otherwise I recommend you abandon the
idea. ;-P

> The downside of the patch: for these modules it ignores the Python files on
> disk--it doesn't even stat them.

Having a command-line/env var to turn this on/off would be an
acceptable fix, IMHO.  If I'm running Python on a server, I don't need
to be editing .py modules and have them be recognized.  Maybe have
it turned off by default, at least at first.
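
A minimal sketch of what such a switch could look like, written in Python
for brevity.  The variable name PYTHONEMBEDDEDMODULES and the helper
use_embedded_modules() are made up for illustration; the patch under
discussion defines neither.  The default is off, as suggested above.

```python
import os

# Hypothetical variable name, for illustration only; the patch being
# discussed does not define this setting.
ENV_VAR = "PYTHONEMBEDDEDMODULES"


def use_embedded_modules(environ=None, default=False):
    """Return True if the embedded (C structure) modules should be used."""
    if environ is None:
        environ = os.environ
    value = environ.get(ENV_VAR)
    if value is None:
        return default  # off unless explicitly requested
    return value.strip().lower() in {"1", "on", "true", "yes"}
```

A server deployment would set the variable to "1"; a developer editing
the .py files would leave it unset and keep the normal import behaviour.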

> Is it worth working on?

I wonder how much of the speedup relies on putting it in the data
segment (i.e. using linker/loader to essentially handle the
unmarshal).  What if you had a new marshal format that only needed a
light 2nd pass in order to fix up the data loaded from disk?  Yuri
suggested looking at formats like Cap'n Proto.  If the cost of the
2nd pass was not bad, you wouldn't have to rely on the platform C
toolchain.  Instead we can write .pyc files that hold this data.
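
To make the "light 2nd pass" idea concrete, here is a toy sketch.  It is
not Cap'n Proto and not a proposed .pyc layout; the record layout and the
names dump() and fix_up() are assumptions for illustration.  The file
holds a table of (offset, length) records followed by the payload bytes,
and the second pass only turns offsets into zero-copy views instead of
rebuilding every object.

```python
import struct

# Each record is (offset of payload, length of payload), little-endian.
RECORD = struct.Struct("<II")
COUNT = struct.Struct("<I")


def dump(strings):
    """Serialize byte strings as a record table followed by the payloads."""
    table_size = COUNT.size + RECORD.size * len(strings)
    table, blob = [], b""
    for s in strings:
        # Offsets are relative to the start of the buffer.
        table.append(RECORD.pack(table_size + len(blob), len(s)))
        blob += s
    return COUNT.pack(len(strings)) + b"".join(table) + blob


def fix_up(buf):
    """Light second pass: resolve stored offsets into views on the buffer."""
    view = memoryview(buf)
    (count,) = COUNT.unpack_from(buf, 0)
    out = []
    for i in range(count):
        offset, length = RECORD.unpack_from(buf, COUNT.size + i * RECORD.size)
        out.append(view[offset:offset + length])  # no payload copy
    return out
```

Here fix_up() touches only the small record table, so its cost grows with
the number of objects rather than with the size of their contents.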

Then the speedup can work on all compiled Python modules, not just
the ones that go through the special process of being linked into the
data segment.  I suppose that might mean that .pyc files become arch
specific.  Maybe that's okay.
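
If .pyc files did become arch specific, one way to keep them apart would
be to put the pointer width and byte order into the cache tag.  A rough
sketch follows; arch_cache_tag() is a hypothetical helper and not part of
importlib, while sys.implementation.cache_tag, sys.byteorder and
struct.calcsize("P") are existing stdlib facilities.

```python
import struct
import sys


def arch_cache_tag():
    """Build a cache tag that also encodes pointer width and byte order."""
    pointer_bits = struct.calcsize("P") * 8  # size of a C pointer, in bits
    return f"{sys.implementation.cache_tag}-{pointer_bits}{sys.byteorder}"
```

On a 64-bit little-endian CPython 3.7 build this would give something
like "cpython-37-64little", so files written for one architecture would
not be picked up on another.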

As you said last night, there doesn't seem to be much low-hanging
fruit around anymore.  So, 21% looks pretty decent.

Regards,

  Neil

