remove header line when reading/writing files

Marc 'BlackJack' Rintsch bj_666 at gmx.net
Fri Oct 12 02:39:00 EDT 2007


On Thu, 11 Oct 2007 22:52:55 +0000, RyanL wrote:

> I'm a newbie with a large number of data files in multiple
> directories.  I want to uncompress, read, and copy the contents of
> each file into one master data file.  The code below seems to be doing
> this perfectly.  The problem is each of the data files has a header
> row in the first line, which I do not want in the master file.  How
> can I skip that first line when writing to the master file?  Any help
> is much appreciated.  Thank you.

Untested version with `itertools.islice()`:

import glob
import gzip
import os
from itertools import islice


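def main():
    zipdir = 'G:/Research/Data/'
    outfilename = 'G:/Research/Data/master_data.txt'
    out_file = open(outfilename, 'w')
    # Start in the data directory so the relative names below resolve.
    os.chdir(zipdir)
    for name in os.listdir(os.curdir):
        if os.path.isdir(name):
            os.chdir(name)
            for zip_name in glob.glob('*.gz'):
                in_file = gzip.GzipFile(zip_name, 'r')
                # islice(..., 1, None) yields everything from the second
                # line on, i.e. it drops the header row.
                out_file.writelines(islice(in_file, 1, None))
                in_file.close()
            os.chdir(os.pardir)
    out_file.close()


if __name__ == '__main__':
    main()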
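
For anyone on Python 3, where `gzip.GzipFile` yields bytes rather than str, a rough sketch of the same `islice()` idea might look like the following. The function name `merge_without_headers` and its two parameters are made up for illustration. It assumes the `.gz` files sit exactly one directory level below the root and contain text in the platform's default encoding.

```python
import glob
import gzip
import os
from itertools import islice


def merge_without_headers(root_dir, out_path):
    """Append every *.gz file found one level below root_dir to out_path,
    dropping the first line of each file.

    Hypothetical helper, not part of any library.
    """
    pattern = os.path.join(root_dir, '*', '*.gz')
    with open(out_path, 'w') as out_file:
        # sorted() only makes the output order reproducible.
        for gz_name in sorted(glob.glob(pattern)):
            # 'rt' decompresses and decodes to text, one line per iteration.
            with gzip.open(gz_name, 'rt') as in_file:
                # Start at index 1, i.e. skip line 0 (the header).
                out_file.writelines(islice(in_file, 1, None))
```

Building full paths with `os.path.join()` avoids the `os.chdir()` calls, and the `with` blocks close the files even if an error occurs partway through.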

Ciao,
	Marc 'BlackJack' Rintsch


