[Python-ideas] add a hash to .pyc to avoid mix-ups between .py and .pyc

Xavier Combelle xavier.combelle at gmail.com
Sun Aug 14 19:05:47 EDT 2016


I have stumbled several times on the following problem:
I delete a module and its .pyc stays around, and by "magic" Python still
uses the .pyc.
A similar error happens (but less often) when, through some file system
manipulation, the .pyc ends up
newer than the .py but corresponds to an older version of the .py. It is not
a major problem, but it is a real one.

I'm not the first one to have this problem. A Stack Overflow search leads
to quite a lot of relevant answers:
http://stackoverflow.com/search?q=old+pyc
and so does a Google search:
https://www.google.fr/search?q=old+pyc
Moreover, several of the Google results point to the bug trackers of various
projects. (These results also include cases where .pyc files
are stored in VCS repositories, but that is a separate, unrelated problem.)
I even found a blog post about using .pyc files as a backdoor:
http://secureallthethings.blogspot.fr/2015/11/backdooring-python-via-pyc-pi-wa-si_9.html

My idea to kill both birds with one stone would be to add a hash (likely
a cryptographic one) of the .py file to the .pyc file, then read the .py file
and check its hash against the stored one before using the .pyc.
On the first startup, the additional cost would be just the hash
calculation, which I think is cheap compared to other factors
(especially input/output).
On subsequent startups, the main additional cost would be
reading the .py files, plus the cheap hash calculations.
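
To make the idea concrete, here is a minimal sketch of the check. It is
only an illustration under stated assumptions: the cache layout (a SHA-256
digest followed by the marshalled code object) is invented for this example
and is not the real .pyc format, and the names write_cache, load_cache and
demo are hypothetical.

```python
import hashlib
import marshal
import os
import tempfile

# Assumed layout for this sketch only: <32-byte SHA-256 digest><marshalled code>.
DIGEST_SIZE = hashlib.sha256().digest_size


def source_digest(source_bytes):
    """Return the SHA-256 digest of the raw source bytes."""
    return hashlib.sha256(source_bytes).digest()


def write_cache(py_path, cache_path):
    """Compile py_path and store its code object prefixed by the source hash."""
    with open(py_path, "rb") as f:
        source = f.read()
    code = compile(source, py_path, "exec")
    with open(cache_path, "wb") as f:
        f.write(source_digest(source))
        f.write(marshal.dumps(code))


def load_cache(py_path, cache_path):
    """Return the cached code object, or None if the cache must not be used."""
    try:
        with open(py_path, "rb") as f:
            source = f.read()
        with open(cache_path, "rb") as f:
            data = f.read()
    except OSError:
        return None  # source deleted or cache missing: do not use the cache
    stored, payload = data[:DIGEST_SIZE], data[DIGEST_SIZE:]
    if stored != source_digest(source):
        return None  # source changed since the cache was written
    return marshal.loads(payload)


def demo():
    """Exercise the three cases: unchanged, edited, and deleted source."""
    with tempfile.TemporaryDirectory() as d:
        py = os.path.join(d, "mod.py")
        cache = os.path.join(d, "mod.cache")
        with open(py, "w") as f:
            f.write("x = 1\n")
        write_cache(py, cache)
        fresh = load_cache(py, cache) is not None
        with open(py, "w") as f:
            f.write("x = 2\n")
        edited_rejected = load_cache(py, cache) is None
        os.remove(py)
        deleted_rejected = load_cache(py, cache) is None
        return fresh, edited_rejected, deleted_rejected
```

In this sketch the cache is rejected both when the source has been edited
and when it has been deleted, regardless of file timestamps. The price is
that the .py file is read and hashed on every load.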

I believe removing these bugs would be worth the performance cost.

I know that some use cases rely on shipping only the .pyc and not keeping
the .py around, for example to avoid distributing the source files.
But in my view, these use cases should be handled by an opt-in decision
and not by default. Several opt-in mechanisms could be envisioned:
environment variables, command line switches, or a special compilation of
the .pyc which explicitly asks not to check the hash.
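
As a rough illustration of how those three opt-in mechanisms might combine,
here is a small sketch. Everything named in it is hypothetical: the
environment variable PYC_SKIP_HASH_CHECK, the cli_skip argument standing in
for a command line switch, and the FLAG_NO_CHECK bit standing in for a
marker written by a special compilation do not exist in Python.

```python
import os

# Hypothetical names, invented for this sketch only.
ENV_VAR = "PYC_SKIP_HASH_CHECK"
FLAG_NO_CHECK = 0x01  # imagined flag bit stored in the cache file header


def hash_check_enabled(environ=None, cli_skip=False, pyc_flags=0):
    """Return True unless one of the opt-in mechanisms disables the check."""
    if environ is None:
        environ = os.environ
    if environ.get(ENV_VAR) == "1":
        return False  # opted out through the environment
    if cli_skip:
        return False  # opted out through a command line switch
    if pyc_flags & FLAG_NO_CHECK:
        return False  # the cache file itself asks not to be checked
    return True  # default: always verify the hash
```

The point of the design is that checking is the default, and a
source-less deployment has to ask explicitly to skip it.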

--
Xavier


