CRC-checksum failed in gzip

Laszlo Nagy gandalf at shopzeus.com
Wed Aug 1 09:27:26 EDT 2012


- The file is written with the Linux gzip program.
- No, I can't reproduce the error with the exact same file that
failed; that's what is really puzzling.

How do you make sure that no process is reading the file before it is 
fully flushed to disk?

Possible way of testing for this kind of error: before you open a file, 
use os.stat to determine its size, and write out the size and the file 
path into a log file. Whenever an error occurs, compare the actual size 
of the file with the logged value. If they are different, then you have 
tried to read from a file that was growing at that time.
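
A minimal sketch of that check, assuming Python 3 and the standard gzip module. The function name read_gzip_logged and the logger name are made up for illustration; adapt them to your program.

```python
import gzip
import logging
import os
import zlib

# Hypothetical logger name, chosen only for this example.
log = logging.getLogger("gzip_size_check")


def read_gzip_logged(path):
    """Read a gzip file, logging its size first and re-checking it on failure."""
    size_before = os.stat(path).st_size
    log.info("opening %s size=%d", path, size_before)
    try:
        with gzip.open(path, "rb") as f:
            return f.read()
    except (OSError, EOFError, zlib.error):
        # Compare the size now with the size logged before the read.
        size_after = os.stat(path).st_size
        if size_after != size_before:
            log.error("%s changed size during the read: %d -> %d",
                      path, size_before, size_after)
        else:
            log.error("%s failed to decompress, size unchanged at %d",
                      path, size_before)
        raise
```

If the two sizes differ in the log, the reader most likely opened the file while another process was still writing it.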

Suggestion: from the other process, write the data under a temporary 
name (for example, "file.gz.tmp"). Once the file is flushed and closed, use 
os.rename() to give it its final name. On POSIX systems, the rename() 
operation is atomic.
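
Something along these lines on the writer side, as a rough sketch. The helper name write_gzip_atomically is hypothetical, and the os.fsync() call is my addition to make sure the data has reached the disk before the rename. The temporary file must be on the same filesystem as the final one, otherwise the rename is not a plain rename.

```python
import gzip
import os


def write_gzip_atomically(final_path, data):
    """Write gzip-compressed data to a temporary name, then rename it."""
    tmp_path = final_path + ".tmp"
    with open(tmp_path, "wb") as raw:
        # Closing the GzipFile writes the gzip trailer (CRC and length),
        # but leaves the underlying file object open.
        with gzip.GzipFile(fileobj=raw, mode="wb") as gz:
            gz.write(data)
        raw.flush()
        os.fsync(raw.fileno())  # assumption: we want the data on disk first
    # Readers see either no file or the complete file under final_path.
    os.rename(tmp_path, final_path)
```

Readers should then only ever look for "file.gz" and ignore anything ending in ".tmp".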


>    There seems to be no clear pattern; it just fails randomly. The file
> is also only opened for reading by this program,
>    so in theory there is no way that it can be corrupted.
Yes, there is. Gzip stores a CRC-32 of the uncompressed data at the end 
of the stream. So if the file is not yet fully flushed to disk, you read 
only a fragment of the stream, and the check cannot succeed.
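
You can see the effect without NFS at all. As far as I know, the gzip format keeps a CRC-32 and the uncompressed length in the last eight bytes of the stream. The sketch below (Python 3, in-memory data; the helper name try_read is made up) reads a complete stream, a truncated one, and one with a damaged CRC field.

```python
import gzip
import io
import zlib

payload = b"example data " * 1000
complete = gzip.compress(payload)

# Simulate a file that is only half written.
truncated = complete[:len(complete) // 2]

# Simulate a damaged trailer: flip the first byte of the stored CRC-32.
damaged = bytearray(complete)
damaged[-8] ^= 0xFF
damaged = bytes(damaged)


def try_read(blob):
    """Return the decompressed bytes, or the exception raised while reading."""
    try:
        return gzip.GzipFile(fileobj=io.BytesIO(blob)).read()
    except (OSError, EOFError, zlib.error) as exc:
        return exc


if __name__ == "__main__":
    for name, blob in [("complete", complete),
                       ("truncated", truncated),
                       ("damaged", damaged)]:
        result = try_read(blob)
        if isinstance(result, Exception):
            print(name, "->", type(result).__name__, result)
        else:
            print(name, "->", len(result), "bytes read")
```

Which exception you get for an incomplete file depends on the Python version, so treat any of them as a hint that the file may not have been complete.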
>
>    I also checked with lsof if there are processes that opened it but
> nothing appears..
lsof doesn't work very well over NFS. You can have other processes on 
different computers (!) writing the file. lsof only lists the processes 
on the system it is executed on.
>
> - can't really try on the local disk, might take ages unfortunately
> (we are rewriting this system from scratch anyway)
>



