uncompressed size of .gz file

Benjamin Niemann b.niemann at betternet.de
Mon Sep 20 09:11:55 EDT 2004


It should be noted that this information may not be reliable: the stored size 
can be faked in a modified .gz file. "Bad guys" might use this to create a file 
that fills up your HD when you try to unpack it (e.g. as a DoS attack against 
virus scanners that analyse compressed mail attachments). If you have to handle 
.gz files from unknown sources and you want to defend against such attacks, the 
mentioned method is not sufficient.
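
One possible defence is to ignore the stored size and cap the output while 
decompressing. The following is only a minimal sketch, not a complete solution: 
the name gunzip_limited and its limit parameter are made up for illustration, 
and it assumes the whole single-member .gz file is already in memory as bytes.

```python
import zlib


def gunzip_limited(data, limit):
    """Decompress gzip bytes, refusing to produce more than `limit` bytes.

    Hypothetical helper for illustration; handles a single gzip member only.
    """
    # wbits = 16 + 15 tells zlib to expect a gzip header and trailer.
    d = zlib.decompressobj(31)
    # Ask for at most limit + 1 bytes, so that one extra byte reveals overflow
    # without ever allocating much more than the limit.
    out = d.decompress(data, limit + 1)
    if len(out) > limit:
        raise ValueError("uncompressed data exceeds limit")
    return out
```

Since the cap is enforced on the bytes actually produced, the value stored at 
the end of the file no longer matters for safety.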

Fredrik Lundh wrote:
> "frankabel at tesla.cujae.edu.cu" wrote:
> 
> 
>>What Python function gives me the uncompressed size of a .gz file, like
>>"gzip -l name_of_compress_file"?
> 
> 
> the size (modulo 2**32) is stored as a 32-bit little-endian integer at the
> end of the file.  to get it, you can use something like:
> 
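> def getsize(gzipfile):
>     # Read the ISIZE trailer: uncompressed size modulo 2**32, little-endian.
>     import struct
>     with open(gzipfile, "rb") as f:
>         if f.read(2) != b"\x1f\x8b":  # gzip magic bytes
>             raise IOError("not a gzip file")
>         f.seek(-4, 2)  # position at the last four bytes of the file
>         # "<I" is unsigned; "<i" goes negative for sizes of 2 GiB or more
>         return struct.unpack("<I", f.read(4))[0]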
> 
> usage:
> 
> 
> >>> print(getsize("Python-2.4a3.tgz"))
> 
> 38758400
> 
> hope this helps!
> 
> </F> 
> 
> 
> 
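
To illustrate why the trailer alone cannot be trusted, here is a small sketch 
that reads the same four bytes from an in-memory .gz blob and then overwrites 
them. The helper name trailer_size is made up for this example; the payload is 
arbitrary test data.

```python
import gzip
import struct


def trailer_size(blob):
    """Return the ISIZE field: the last four bytes, little-endian unsigned."""
    return struct.unpack("<I", blob[-4:])[0]


payload = b"x" * 5000
blob = gzip.compress(payload)

# Overwrite only the stored size; the compressed data itself is untouched.
forged = blob[:-4] + struct.pack("<I", 10)

print(trailer_size(blob))    # 5000
print(trailer_size(forged))  # 10
```

The forged blob still contains 5000 bytes of data, but anything that only 
looks at the trailer will report 10.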