CRC-checksum failed in gzip

andrea crotti andrea.crotti.0 at gmail.com
Wed Aug 1 09:52:59 EDT 2012


2012/8/1 Laszlo Nagy <gandalf at shopzeus.com>:
>>    there seems to be no clear pattern and it just fails randomly. The
>> file is also only opened for reading by this program,
>>    so in theory there is no way that it can be corrupted.
>
> Yes, there is. Gzip stores a CRC-32 of the uncompressed data at the end of
> the stream. So if the file is not fully flushed to disk, you can read only
> part of the data, and the CRC check fails.
>
>>
>>    I also checked with lsof if there are processes that opened it but
>> nothing appears..
>
> lsof doesn't work very well over NFS. You can have other processes on
> different computers (!) writing the file. lsof only lists the processes on
> the system it is executed on.
>
>>
>> - can't really try on the local disk, might take ages unfortunately
>> (we are rewriting this system from scratch anyway)
>>
>


Thanks a lot. Someone writing to the file while it is being read might
be an explanation; the problem is that everyone claims that they are
only reading the file.

Apparently this file is generated once and, a long time afterwards, only
read by two different tools (in sequence), so in theory this should not
be possible either. I'll investigate further in this direction, since
it's the only reasonable explanation I've found so far.
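
One way to test the "someone is writing while we read" theory could be
to stat the file before and after a full decompression pass and compare
size and mtime. The following is only a rough sketch, assuming Python 3;
the function name read_gzip_checked and the chunk_size parameter are
made up for illustration and are not part of our tools. Over NFS the
attributes may be cached, so an unchanged stat is not proof that nobody
wrote to the file.

```python
import gzip
import os
import zlib


def read_gzip_checked(path, chunk_size=1 << 20):
    """Decompress `path` completely and report whether it changed meanwhile.

    Hypothetical helper: returns (ok, changed, error), where `ok` is True
    if the whole stream decompressed and passed the CRC check, `changed`
    is True if size or mtime differ before and after the read, and
    `error` is the exception that was caught (or None).
    """
    before = os.stat(path)
    error = None
    try:
        with gzip.open(path, "rb") as fh:
            # Read to the end; the CRC is only verified once the
            # trailer of the gzip stream has been reached.
            while fh.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error) as exc:
        # OSError covers a failed CRC check, EOFError a truncated
        # file, zlib.error corrupted compressed data.
        error = exc
    after = os.stat(path)
    changed = (before.st_size, before.st_mtime) != (after.st_size, after.st_mtime)
    return error is None, changed, error
```

If a failure comes back together with changed == True, a concurrent
writer is the likely culprit; a failure with an unchanged stat would
point more towards the file having been incomplete from the start.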


