How to read gzipped utf8 file in Python?

John Nagle nagle at animats.com
Thu Nov 22 15:07:53 EST 2007


   I have a large (gigabytes) file which is encoded in UTF-8 and then
compressed with gzip.  I'd like to read it with the "gzip" module
and "utf8" decoding.  The obvious approach is

	fd = gzip.open(fname, 'rb', encoding='utf8')

But "gzip.open" doesn't support an "encoding" parameter.  (It
probably should, for consistency.)  Is there some way to do this?
Is it possible to express "unzip, then decode utf8" via
"codecs.open"?

				John Nagle
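
One possible way to express "unzip, then decode utf8", offered as a minimal sketch rather than a definitive answer: open the file with "gzip.open" in binary mode and wrap the result in the stream reader returned by "codecs.getreader('utf-8')". The decoding then happens incrementally, so a multi-gigabyte file should not need to fit in memory. The helper name "open_gzipped_utf8", the "demo" function, and the sample text are hypothetical and exist only for illustration.

```python
import codecs
import gzip
import os
import tempfile


def open_gzipped_utf8(fname):
    """Return a file-like object that yields decoded text from a gzipped UTF-8 file.

    Hypothetical helper name, used here only for illustration.
    """
    raw = gzip.open(fname, 'rb')           # yields decompressed bytes
    return codecs.getreader('utf-8')(raw)  # decodes those bytes incrementally


def demo():
    """Write a tiny gzipped UTF-8 file, read it back, and return its lines."""
    # Sample data is made up for this sketch.
    fd, path = tempfile.mkstemp(suffix='.gz')
    os.close(fd)
    try:
        with gzip.open(path, 'wb') as out:
            out.write(u'caf\u00e9\nna\u00efve\n'.encode('utf-8'))
        reader = open_gzipped_utf8(path)
        try:
            # Iterating the stream reader returns one decoded line at a time.
            return [line for line in reader]
        finally:
            reader.close()  # also closes the underlying gzip file
    finally:
        os.remove(path)


if __name__ == '__main__':
    for line in demo():
        print(repr(line))
```

In Python 3.3 and later, "gzip.open" itself accepts a text mode together with an encoding, so gzip.open(fname, 'rt', encoding='utf-8') should give the same result without the wrapper.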


