[Tutor] gzip

questions anon questions.anon at gmail.com
Tue Aug 9 01:40:40 CEST 2011


Thank you, I will look into shutil. Below is the code that did work:

import gzip
import os

MainFolder=r"E:/data"

for (path, dirs, files) in os.walk(MainFolder):
    for dirname in dirs:
        print(dirname)
    for gzfiles in files:
        if gzfiles.endswith('.gz'):
            print("dealing with gzfiles: %s" % gzfiles)
            # path is the folder that holds this file, so join it directly
            compressedfiles=os.path.join(path,gzfiles)
            f_in=gzip.GzipFile(compressedfiles, "rb")
            newFile=f_in.read()
            f_in.close()
            # write the decompressed data next to the original, minus ".gz"
            f_out=open(compressedfiles[:-3], "wb")
            f_out.write(newFile)
            f_out.close()
print("end of processing")
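
For the record, here is a rough sketch of how the shutil suggestion quoted below might look. It is not the code I ran: it is written in Python 3 syntax, the function name gunzip_tree is made up for illustration, and it assumes every file ending in ".gz" really is gzip data.

```python
import gzip
import os
import shutil


def gunzip_tree(main_folder):
    """Decompress every *.gz file under main_folder, next to the original."""
    written = []
    for path, dirs, files in os.walk(main_folder):
        for name in files:
            if not name.endswith(".gz"):
                continue
            src = os.path.join(path, name)
            dst = src[:-3]  # strip the ".gz" suffix
            # copyfileobj copies in chunks, so the whole file is never
            # held in memory at once
            with gzip.open(src, "rb") as f_in, open(dst, "wb") as f_out:
                shutil.copyfileobj(f_in, f_out)
            written.append(dst)
    return written
```

Calling gunzip_tree(r"E:/data") would then return the list of files it wrote. The with statement also closes both files, even if decompression fails partway through.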



On Mon, Aug 8, 2011 at 6:34 PM, Stefan Behnel <stefan_ml at behnel.de> wrote:

> questions anon, 08.08.2011 01:57:
>
>> Thank you, I didn't realise that was all I needed.
>> Moving on to the next problem:
>> I would like to loop through a number of directories and decompress each
>> *.gz file and leave them in the same folder, but the code I have written
>> only seems to focus on the last folder. Not sure where I have gone wrong.
>> Any feedback will be greatly appreciated.
>>
>>
>> import gzip
>> import os
>>
>> MainFolder=r"D:/DSE_work/temp_samples/"
>>
>> for (path, dirs, files) in os.walk(MainFolder):
>>     for dir in dirs:
>>         outputfolder=os.path.join(path,dir)
>>         print "the path and dirs are:", outputfolder
>>     for gzfiles in files:
>>         print gzfiles
>>         if gzfiles[-3:]=='.gz':
>>             print 'dealing with gzfiles:', dir, gzfiles
>>             f_in=os.path.join(outputfolder,gzfiles)
>>             print f_in
>>             compresseddata=gzip.GzipFile(f_in, "rb")
>>             newFile=compresseddata.read()
>>             f_out=open(f_in[:-3], "wb")
>>             f_out.write(newFile)
>>
>
> Note how "outputfolder" is set and reset in the first inner loop, *before*
> starting the second inner loop. Instead, build the output directory name
> once, without looping over the directories (which, as far as I understand
> your intention, you can ignore completely).
>
> Also, see the shutil module. Its copyfileobj() function efficiently copies
> data between open file(-like) objects. With that, you can avoid reading the
> whole file into memory.
>
> Stefan
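
To illustrate Stefan's point about not looping over the directories: for each folder it visits, os.walk yields the files that sit directly in that folder, so the yielded path is already the output folder. A small sketch in Python 3 syntax (the helper name list_gz_files and the throwaway folder tree are invented for the demonstration):

```python
import os
import tempfile


def list_gz_files(main_folder):
    """Return the full path of every *.gz file below main_folder."""
    found = []
    for path, dirs, files in os.walk(main_folder):
        # 'files' are the entries directly inside 'path', so there is
        # no need to loop over 'dirs' to build the folder name
        for name in files:
            if name.endswith(".gz"):
                found.append(os.path.join(path, name))
    return sorted(found)


# Build a temporary tree with one empty .gz-named file per folder
root = tempfile.mkdtemp()
for sub in ("", "one", "two"):
    folder = os.path.join(root, sub)
    os.makedirs(folder, exist_ok=True)
    open(os.path.join(folder, "data.txt.gz"), "wb").close()

for full_path in list_gz_files(root):
    print(os.path.relpath(full_path, root))
```

Each file is reported with the folder it actually lives in, which is what went wrong when outputfolder was taken from the last pass of the loop over dirs.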