How to "gunzip-iterate" over a file?

kj no.email at please.post
Wed Jul 29 16:05:36 EDT 2009




I need to iterate over the lines of *very* large (>1 GB) gzipped
files.  I would like to do this without first reading the full
compressed contents into memory just to apply zlib.decompress to
them.  I would also like to avoid gunzipping the file (i.e. creating
an uncompressed copy of it in the filesystem) before iterating over
it.

Basically I'm looking for something that will give me the same
functionality as Perl's gzip IO layer, which looks like this (from
the documentation):

         use PerlIO::gzip;
         open FOO, "<:gzip", "file.gz" or die $!;
         print while <FOO>; # And it will be uncompressed...

What's the best way to achieve the same functionality in Python?
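
For reference, one candidate seems to be the standard library's gzip
module, whose gzip.open returns a file-like object that decompresses
as it is read.  Below is a minimal sketch of that idea, not a
definitive answer.  It assumes Python 3 and UTF-8 text, and the helper
name iter_gzip_lines is made up for illustration:

```python
import gzip


def iter_gzip_lines(path, encoding="utf-8"):
    """Yield decompressed lines from a gzipped text file, one at a time.

    Hypothetical helper; the name and the UTF-8 default are assumptions.
    """
    # "rt" opens the file in text mode. The data is decompressed in
    # chunks as lines are requested, so neither the compressed nor the
    # uncompressed contents are loaded into memory in full, and no
    # uncompressed copy is written to disk.
    with gzip.open(path, "rt", encoding=encoding) as handle:
        for line in handle:
            yield line
```

Usage would then look much like the Perl version, e.g.
"for line in iter_gzip_lines('file.gz'): print(line, end='')".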

TIA!

kynn


