CRC-checksum failed in gzip

andrea crotti andrea.crotti.0 at gmail.com
Thu Aug 2 06:57:06 EDT 2012


2012/8/2 Laszlo Nagy <gandalf at shopzeus.com>:
>
> Your example did not share the file object between threads. Here is an
> example that does:
>
> class OpenAndRead(threading.Thread):
>     def run(self):
>         global fz
>         fz.read(100)
>
> if __name__ == '__main__':
>
>     fz = gzip.open('out2.txt.gz')
>     for i in range(10):
>         OpenAndRead().start()
>
> Try this with a huge file. And here is one that should never throw a CRC
> error, because the file object is protected by a lock:
>
> class OpenAndRead(threading.Thread):
>     def run(self):
>         global fz
>         global fl
>         with fl:
>             fz.read(100)
>
> if __name__ == '__main__':
>
>     fz = gzip.open('out2.txt.gz')
>     fl = threading.Lock()
>     for i in range(2):
>         OpenAndRead().start()
>
>
>>
>> The code in run should be shared by all the threads since there are no
>> locks, right?
>
> The code is shared but the file object is not. In your example, a new file
> object is created every time a thread is started.
>


OK, that makes sense, but then this explanation may no longer apply,
because I'm quite sure that the file object is *not* shared between
threads in our code: everything happens inside one thread.

I managed to get some errors doing this with a big file:

import gzip
import threading

class OpenAndRead(threading.Thread):
    def run(self):
        # fz is the single module-level GzipFile shared by every thread
        global fz
        fz.read(100)

if __name__ == '__main__':
    fz = gzip.open('bigfile.avi.gz')
    for i in range(20):
        OpenAndRead().start()

It doesn't fail without the *global*. But sharing is definitely not
what our real code does: there every thread gets a new file object, so
nothing is shared.
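
For comparison, here is a minimal sketch of the non-shared pattern, where
each thread opens and reads its own file object. The names make_sample,
read_in_threads and sample.gz are made up for illustration, and the random
payload is only a stand-in for the real file; this is not our production
code.

```python
import gzip
import os
import tempfile
import threading


def make_sample(path, size=200000):
    """Write a gzip file to read back (hypothetical stand-in for real data)."""
    with gzip.open(path, "wb") as f:
        f.write(os.urandom(size))


class OpenAndRead(threading.Thread):
    """Each thread opens its own GzipFile, so no reader state is shared."""

    def __init__(self, path):
        super().__init__()
        self.path = path
        self.data = None

    def run(self):
        with gzip.open(self.path, "rb") as fz:
            self.data = fz.read(100)


def read_in_threads(path, n_threads=20):
    """Start n_threads readers, wait for them, and return what each one read."""
    threads = [OpenAndRead(path) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [t.data for t in threads]


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.gz")
        make_sample(path)
        return read_in_threads(path)
```

Since every thread starts from the beginning of its own handle, they should
all get the same first 100 bytes.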

Anyway, we'll either read once for all the threads or add the lock.
Hopefully that solves the problem, even if I'm not yet convinced that
this was the cause.
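
A rough sketch of the two options, under the assumption that the workers
only need the decompressed bytes. read_once, LockedGzipReader and
read_chunks_with_lock are hypothetical names, and the payload in demo is
made-up test data.

```python
import gzip
import os
import tempfile
import threading


def read_once(path):
    """Option 1: decompress in one thread and hand the bytes to the workers."""
    with gzip.open(path, "rb") as fz:
        return fz.read()


class LockedGzipReader:
    """Option 2: one shared GzipFile whose reads are serialized by a lock."""

    def __init__(self, path):
        self._fz = gzip.open(path, "rb")
        self._lock = threading.Lock()

    def read(self, size):
        # Only one thread at a time may advance the shared reader.
        with self._lock:
            return self._fz.read(size)

    def close(self):
        with self._lock:
            self._fz.close()


def read_chunks_with_lock(path, n_threads=20, size=100):
    """Let n_threads threads each pull one chunk from the shared reader."""
    reader = LockedGzipReader(path)
    chunks = []
    chunks_lock = threading.Lock()

    def worker():
        data = reader.read(size)
        with chunks_lock:
            chunks.append(data)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    reader.close()
    return chunks


def demo():
    payload = bytes(range(256)) * 1000  # made-up test data
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.gz")
        with gzip.open(path, "wb") as f:
            f.write(payload)
        return payload, read_once(path), read_chunks_with_lock(path)
```

With the lock, each thread gets a different consecutive chunk, in whatever
order the threads happen to run. Reading once avoids sharing the file
object at all, at the cost of holding the whole decompressed content in
memory.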


