Silly question about ungzipping a file

Lemniscate d_blade8 at hotmail.com
Thu Sep 26 17:11:56 EDT 2002


Hi everyone,

This may be a ridiculously easy question, but I've kind of hit a wall
and I was wondering if I am just missing something.  I want to
automate the retrieval and unzipping of a *.gz file.  The issue is
that the file, when it is unzipped, usually has a size somewhere in
the range of 128MB.  Occasionally, I get memory errors when I try to
run it.  Here is a quick idea of what I am doing...

>>> import gzip
>>> myFile = gzip.open('LL_tmpl.gz')
>>> file('output.txt', 'w').write(myFile.read())
>>> 

Now, is there any way to do this so that less memory is used?  I mean,
if I wanted to do some processing on the resulting output file, I
would use xreadlines or something like that to keep memory consumption
to a minimum.  Is there something roughly equivalent that I am not
noticing in the gzip documentation?  Let me also say that I have tried
the following as well:

>>> myFile = gzip.open('LL_tmpl.gz')
>>> fout = file('output.txt', 'w')
>>> while myFile.readline():
... 	fout.write(myFile.readline())
... 	fout.write('\n')
... 	
>>> fout.close()
>>> myFile = gzip.open('LL_tmpl.gz')
>>> fout = file('output2.txt', 'w')
>>> while myFile.readline():
... 	fout.writelines(myFile.readline())
... 	
>>> fout.close()


These do solve my memory problem, but there are other issues.  First
of all, my CPU gets pegged and it takes FOREVER.  Okay, not forever,
but (let me test real quick) at least 4-5 times as long, and my
computer is pretty much useless during that time.  (Side note: can
you tell I am working on a woefully underpowered machine?)  Is there
something in-between that anybody can think of?  The other, and much
more immediate, issue is puzzling to me.  It seems that the files
produced by the two loops are only about 65MB (64.7 to be exact)
versus the expected 129MB.  I'm sure I'm just missing something
simple, but why is that?
Thanks for your time.

Lem
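
One possible in-between approach is to copy the decompressed data in
fixed-size blocks, rather than reading it all at once or line by line.
The sketch below is only an illustration and assumes a current
Python 3, which is newer than the interpreter used in the session
above.  The names gunzip_to_file and block_size are made up for this
example and are not part of the gzip module.

```python
import gzip
import shutil


def gunzip_to_file(src_path, dst_path, block_size=64 * 1024):
    """Decompress src_path into dst_path one block at a time.

    Hypothetical helper for illustration; block_size is an assumed
    default, not a tuned value.
    """
    with gzip.open(src_path, 'rb') as src, open(dst_path, 'wb') as dst:
        # copyfileobj reads at most block_size bytes per call, so memory
        # use stays bounded regardless of the decompressed size.
        shutil.copyfileobj(src, dst, block_size)
```

With the file names from the post, a call might look like
gunzip_to_file('LL_tmpl.gz', 'output.txt').  As for the size
difference: each pass through the while loops calls readline() twice,
once in the loop test and once in the body, and the line read by the
test is never written.  That would account for output files of roughly
half the expected size.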


