[Tutor] gzip
Stefan Behnel
stefan_ml at behnel.de
Mon Aug 8 10:34:15 CEST 2011
questions anon, 08.08.2011 01:57:
> Thank you, I didn't realise that was all I needed.
> Moving on to the next problem:
> I would like to loop through a number of directories and decompress each
> *.gz file and leave them in the same folder but the code I have written only
> seems to focus on the last folder. Not sure where I have gone wrong.
> Any feedback will be greatly appreciated.
>
>
> import gzip
> import os
>
> MainFolder=r"D:/DSE_work/temp_samples/"
>
> for (path, dirs, files) in os.walk(MainFolder):
>     for dir in dirs:
>         outputfolder=os.path.join(path,dir)
>         print "the path and dirs are:", outputfolder
>     for gzfiles in files:
>         print gzfiles
>         if gzfiles[-3:]=='.gz':
>             print 'dealing with gzfiles:', dir, gzfiles
>             f_in=os.path.join(outputfolder,gzfiles)
>             print f_in
>             compresseddata=gzip.GzipFile(f_in, "rb")
>             newFile=compresseddata.read()
>             f_out=open(f_in[:-3], "wb")
>             f_out.write(newFile)
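Note how "outputfolder" is set and reset in the first inner loop, *before*
the second inner loop starts. By the time the files are processed, it only
holds the last subdirectory, which is why only the last folder seems to be
handled. Instead, build the file paths directly from "path", which is the
directory that contains the entries in "files", and drop the loop over the
directories (which, as far as I understand your intention, you can ignore
completely).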
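
A minimal sketch of the loop without the directory handling, assuming the
goal is simply to find every *.gz file below the main folder. os.walk()
already visits each directory in turn, so "path" can be joined with the
file name directly. The function name find_gz_files is made up for
illustration.

```python
import os

def find_gz_files(main_folder):
    """Yield the full path of every *.gz file below main_folder."""
    # os.walk() visits every directory itself, so "dirs" can be ignored.
    for path, dirs, files in os.walk(main_folder):
        for name in files:
            if name.endswith('.gz'):
                # "path" is the directory that actually contains "name".
                yield os.path.join(path, name)
```

Each path it yields can then be handed to whatever does the decompression.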

Also, see the shutil module. Its copyfileobj() function efficiently copies
data between open file(-like) objects. With that, you can avoid reading the
whole file into memory.
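
A rough sketch of how that could look, assuming the function meant here is
shutil.copyfileobj(), which copies the data in chunks. The helper name
gunzip_file is hypothetical, and it assumes the output should sit next to
the input with the ".gz" suffix dropped, as in your code. Using gzip.open()
in a "with" statement needs Python 2.7 or later.

```python
import gzip
import shutil

def gunzip_file(gz_path):
    """Decompress gz_path into the same folder and return the new path."""
    out_path = gz_path[:-3]  # drop the trailing ".gz"
    with gzip.open(gz_path, 'rb') as f_in:
        with open(out_path, 'wb') as f_out:
            # Copies chunk by chunk instead of reading everything at once.
            shutil.copyfileobj(f_in, f_out)
    return out_path
```

The "with" blocks also close both files, which the code above never does.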
Stefan