uncompressed size of .gz file

Fredrik Lundh fredrik at pythonware.com
Mon Sep 20 06:16:38 EDT 2004


Heiko Wundram wrote:

> You should be fine using "<I", but I'd rather look out whether it's really
> always "<", as this means little-endian. I don't know whether gzipped files
> are always written as little endian (this would feel strange, as big-endian
> is the "portable way")

If it had been "native", I'd have used "=" instead of "<".  It never hurts
to read the relevant RFC (RFC 1952) before posting:

    All multi-byte numbers in the format described here are stored
    with the least-significant byte first (at the lower memory address).

and

    ISIZE (Input SIZE)
    This contains the size of the original (uncompressed) input data
    modulo 2^32.

and yes, "<I" (unsigned) is better than "<i" unless you want full "gzip -l"
compatibility (according to the docs, it only reports correct sizes for files
up to 2 gigabytes).  It still won't work for files larger than 4 gigs, but
there's not much you can do about that (unless you know what kind of data
you have in the file, of course, in which case you can use the typical
compression ratio to find the right 4-gigabyte window in many cases).
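
To make the above concrete, here is a minimal sketch of both steps: reading
ISIZE from the last four bytes of the file with "<I", and picking a
4-gigabyte window from a typical compression ratio.  The names gzip_isize
and guess_full_size are made up for illustration, and typical_ratio
(uncompressed size divided by compressed size) is something you have to
supply from your own knowledge of the data.  The sketch also assumes a
single-member file with no trailing padding, since ISIZE only describes the
last member.

```python
import io
import struct


def gzip_isize(path):
    """Return the ISIZE trailer field: the uncompressed size modulo 2**32."""
    with open(path, "rb") as f:
        # ISIZE is stored in the last four bytes, least-significant byte first.
        f.seek(-4, io.SEEK_END)
        (isize,) = struct.unpack("<I", f.read(4))
    return isize


def guess_full_size(isize, compressed_size, typical_ratio):
    """Hypothetical helper: choose the size congruent to isize modulo 2**32
    that lies closest to compressed_size * typical_ratio.

    This is only a guess; it is wrong whenever the real ratio is far
    enough from typical_ratio to land in a different 4-gigabyte window.
    """
    estimate = compressed_size * typical_ratio
    windows = max(0, round((estimate - isize) / 2**32))
    return isize + windows * 2**32
```

For files below 4 gigabytes, gzip_isize alone gives the exact size; the
second function only matters when the compressed file is large enough that
the size may have wrapped around.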

</F> 





