[Python-Dev] Mysterious Python pyc file corruption problems

Brett Cannon brett at python.org
Thu May 16 23:30:26 CEST 2013


On Thu, May 16, 2013 at 5:19 PM, Guido van Rossum <guido at python.org> wrote:
> This reminds me of the following bug, which can happen when two
> processes are both writing the .pyc file and a third is reading it.
> First some background.
>
> When writing a .pyc file, we use the following strategy:
>
> - open the file for writing
> - write a dummy header (four null bytes)
> - write the .py file's mtime
> - write the marshalled code object
> - replace the dummy header with the correct magic word
>

Just so people know, this is how we used to do it. In importlib we
write the entire file to a temp file and then do an atomic rename.
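
The temp-file-plus-rename approach can be sketched roughly as follows.
This is a minimal illustration, not importlib's actual code: the helper
name write_atomic and the temp-file naming are assumptions made for the
example. It relies on os.replace() (available since Python 3.3), which
swaps the file into place in one step when source and destination are on
the same filesystem.

```python
import os
import tempfile


def write_atomic(path, data):
    """Write bytes to path so that readers see either the old file or the
    complete new one, never a half-written one (hypothetical helper)."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the rename stays on one
    # filesystem; a cross-filesystem rename would not be atomic.
    fd, tmp_path = tempfile.mkstemp(
        dir=directory, prefix=os.path.basename(path) + "."
    )
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        # Swap the finished file into place in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave a stray temp file behind if anything failed.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "demo.pyc")
        write_atomic(target, b"first version")
        write_atomic(target, b"second version")
        with open(target, "rb") as f:
            print(f.read())
```

A process that already has the old file open keeps reading the old
contents, since the rename replaces the directory entry rather than
truncating the existing file.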

> Even py_compile.py (used by compileall.py) uses this strategy.

As of Python 3.4, py_compile just uses importlib directly, so it
matches its semantics.
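
For reference, a small sketch of driving py_compile by hand and checking
that the resulting file starts with the current magic number. The helper
name compile_and_check and the demo file names are made up for the
example; py_compile.compile() and importlib.util.MAGIC_NUMBER are the
standard library names in Python 3.4 and later.

```python
import importlib.util
import os
import py_compile
import tempfile


def compile_and_check(source_text):
    """Compile source_text to a .pyc in a scratch directory and report
    whether the file begins with the interpreter's magic number."""
    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "demo.py")
        with open(src, "w") as f:
            f.write(source_text)
        # doraise=True turns compile errors into exceptions instead of
        # messages printed on stderr.
        pyc = py_compile.compile(
            src, cfile=os.path.join(d, "demo.pyc"), doraise=True
        )
        with open(pyc, "rb") as f:
            return f.read(4) == importlib.util.MAGIC_NUMBER


if __name__ == "__main__":
    print(compile_and_check("x = 1\n"))
```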

-Brett

>
> When reading a .pyc file, we ignore it when the magic word isn't there
> (or when the mtime doesn't match that of the .py file exactly), and
> then we will write it back as described above.
>
> Now consider the following scenario. It involves *three* processes.
>
> - Two unrelated processes both start and want to import the same module.
> - They both see the .pyc file is missing/corrupt and decide to write it.
> - The first process finishes writing the file, writing the correct header.
> - Now a third process wants to import the module, sees the valid
> header, and starts reading the file.
> - However, while this is going on, the second process gets ready to
> write the file.
> - The second process truncates the file, writes the dummy header, and
> then stalls.
> - At this point the third process (which thought it was reading a
> valid file) sees an unexpected EOF because the file has been
> truncated.
>
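
The scenario above can be replayed deterministically inside a single
process by interleaving the steps by hand. This is only a sketch under
simplifying assumptions: FAKE_MAGIC is a made-up stand-in for the real
magic word, the mtime field is left out, the payload is arbitrary bytes
rather than a marshalled code object, and the reader is opened
unbuffered so that it cannot serve the body from a buffer filled before
the truncation.

```python
import os
import tempfile

FAKE_MAGIC = b"\xAA\xBB\xCC\xDD"  # hypothetical stand-in for the magic word


def simulate_truncation_race(path, payload=b"marshalled-code-object"):
    """Replay the three-process scenario step by step in one process."""
    # Process 1 finishes: a complete file with a valid header exists.
    with open(path, "wb") as f:
        f.write(FAKE_MAGIC + payload)

    # Process 3 opens the file and sees the valid header.
    reader = open(path, "rb", buffering=0)
    try:
        header_ok = reader.read(4) == FAKE_MAGIC

        # Process 2 truncates the file, writes the dummy header, and stalls.
        writer = open(path, "wb")
        try:
            writer.write(b"\0\0\0\0")
            writer.flush()

            # Process 3 resumes and tries to read the rest of the file.
            body = reader.read()
        finally:
            writer.close()
    finally:
        reader.close()
    return header_ok, body


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        ok, body = simulate_truncation_race(os.path.join(d, "demo.pyc"))
        print("header accepted:", ok, "bytes left to read:", len(body))
```

The reader accepts the header and then finds nothing left to read, which
is the unexpected EOF described in the last step.
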
> Now, this would explain the EOFError, but not necessarily the
> ValueError with "unknown type code". However, it looks like marshal
> doesn't always check for EOF immediately (sometimes it calls getc()
> without checking the result, and sometimes it doesn't check the error
> state after calling r_string()), so I think all the errors are
> actually explainable from this scenario.
>
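
To see how marshal reacts to cut-off input on a given interpreter, one
can feed it every strict prefix of a valid marshalled code object and
record what happens. The helper name truncated_load_errors is made up
for the example. The marshal documentation lists EOFError, ValueError
and TypeError as the possible failures; which one shows up at which
offset may differ between Python versions, so this sketch only records
the results and makes no claim about them.

```python
import marshal


def truncated_load_errors(source="x = [1, 2, 3]\n"):
    """Map each cut point to the name of the exception that
    marshal.loads() raises for the data truncated at that point."""
    data = marshal.dumps(compile(source, "<demo>", "exec"))
    outcomes = {}
    for cut in range(len(data)):  # strict prefixes only
        try:
            marshal.loads(data[:cut])
        except (EOFError, ValueError, TypeError) as exc:
            outcomes[cut] = type(exc).__name__
        else:
            outcomes[cut] = "loaded"
    return outcomes


if __name__ == "__main__":
    results = truncated_load_errors()
    print(sorted(set(results.values())))
```
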
> --
> --Guido van Rossum (python.org/~guido)

