CRC-checksum failed in gzip

Laszlo Nagy gandalf at shopzeus.com
Thu Aug 2 06:21:24 EDT 2012


> One last thing I would like to do before I add this fix is to actually
> be able to reproduce this behaviour, and I thought I could just do the
> following:
>
> import gzip
> import threading
>
>
> class OpenAndRead(threading.Thread):
>      def run(self):
>          fz = gzip.open('out2.txt.gz')
>          fz.read()
>          fz.close()
>
>
> if __name__ == '__main__':
>      for i in range(100):
>          OpenAndRead().start()
>
>
> But no matter how many threads I start, I can't reproduce the CRC
> error, any idea how I can try to help it happening?
Your example does not share the file object between threads. Here is an
example that does:

import gzip
import threading

class OpenAndRead(threading.Thread):
    def run(self):
        # fz is the single module-level file object used by every thread
        global fz
        fz.read(100)

if __name__ == '__main__':
    fz = gzip.open('out2.txt.gz')
    for i in range(10):
        OpenAndRead().start()

Try this with a huge file. And here is one that should never throw a
CRC error, because the file object is protected by a lock:

class OpenAndRead(threading.Thread):
    def run(self):
        global fz
        global fl
        with fl:
            fz.read(100)

if __name__ == '__main__':
    fz = gzip.open('out2.txt.gz')
    fl = threading.Lock()
    for i in range(2):
        OpenAndRead().start()
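
If you want to run the locked variant without hunting for a test file, the following is a minimal sketch that builds its own. The helper names (make_test_file, locked_read_all, demo), the payload size and the thread count are my assumptions for illustration, not part of your setup. Several threads pull 100-byte chunks from one shared gzip file object under a lock, and the result is compared with what was written:

```python
import gzip
import os
import tempfile
import threading

CHUNK = 100  # bytes requested per read, as in the example above


def make_test_file(path, size=200000):
    """Write `size` bytes to a gzip file and return them (hypothetical helper)."""
    payload = bytes(bytearray(i % 251 for i in range(size)))
    with gzip.open(path, 'wb') as out:
        out.write(payload)
    return payload


def locked_read_all(path, n_threads=4):
    """Read the whole file through ONE shared file object, guarded by a lock."""
    fz = gzip.open(path, 'rb')
    fl = threading.Lock()
    chunks = []  # appended to only while holding the lock, so file order is kept

    def worker():
        while True:
            with fl:
                data = fz.read(CHUNK)
                if not data:  # empty result means end of file
                    return
                chunks.append(data)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    fz.close()
    return b''.join(chunks)


def demo():
    """Return True if the locked, shared read reproduces the written data."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'out2.txt.gz')
        payload = make_test_file(path)
        return locked_read_all(path) == payload


if __name__ == '__main__':
    print(demo())
```

Because the read and the append happen inside the same locked section, only one thread touches the file object at a time, and the chunks are collected in file order.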

>
> The code in run should be shared by all the threads since there are no
> locks, right?
The code is shared, but the file object is not. In your example, a new
file object is created every time a thread is started.
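
To make that concrete, here is a small sketch of the per-thread case. The function names, the payload and the thread count are assumptions of mine. Each thread calls gzip.open itself, so every thread gets a private file object, and each one can read the complete file without interfering with the others:

```python
import gzip
import os
import tempfile
import threading


def read_with_private_objects(path, n_threads=8):
    """Each thread opens its own gzip file object, as in the quoted example."""
    results = [None] * n_threads
    handles = [None] * n_threads  # keep the objects alive so they can be compared

    def worker(i):
        fz = gzip.open(path, 'rb')  # a new file object for this thread only
        handles[i] = fz
        results[i] = fz.read()      # reads to the end of this thread's own object
        fz.close()

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # True if no two threads ended up with the same file object
    distinct = len(set(id(h) for h in handles)) == n_threads
    return distinct, results


def demo_private():
    """Return True if all objects were distinct and every read was complete."""
    payload = b'some text to compress\n' * 10000
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'out2.txt.gz')
        with gzip.open(path, 'wb') as out:
            out.write(payload)
        distinct, results = read_with_private_objects(path)
    return distinct and all(r == payload for r in results)


if __name__ == '__main__':
    print(demo_private())
```

Nothing is shared here except the code of the worker function, which is why no lock is needed.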



