Read a gzip file from inside a tar file

rohisingh at gmail.com
Mon Dec 13 11:14:37 EST 2004


I have a tar file. Its contents are as follows:

rohits at sandman 12-08-04 $ tar tvf 20041208.tar
drwxr-xr-x root/root         0 2004-12-08 21:39:19 20041208/
-rw-r--r-- root/root      1576 2004-12-08 21:39:19 20041208/README
drwxr-xr-x root/root         0 2004-12-08 21:27:31 20041208/snapshot_01/
-rw-r--r-- was/was   103010606 2004-12-08 16:37:38 20041208/snapshot_01/tpv-20041208-1350.xml.gz


What is the best way to read the contents of
tpv-20041208-1350.xml.gz?

I want to do the following with minimum code :-)
1) read above tar file
2) find the gzip file
3) read the content of this file
4) perform operations on content
5) continue
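
A minimal sketch of the steps above, assuming Python 3 (not the
interpreter this post was written against) and assuming the XML is
UTF-8 text. The function name iter_gzip_lines and its parameters are
made up for illustration. It streams each matching member instead of
loading the whole 100 MB file into memory first.

```python
import gzip
import tarfile


def iter_gzip_lines(tar_path, name_fragment="tpv"):
    """Yield (member_name, line) for every line of each matching .gz member."""
    # "r:" opens an uncompressed tar archive, as in the listing above.
    with tarfile.open(tar_path, "r:") as tar:
        for member in tar:
            # Skip directories and anything that is not a regular file.
            if not member.isreg():
                continue
            # Keep only gzip members whose name contains the fragment.
            if name_fragment not in member.name:
                continue
            if not member.name.endswith(".gz"):
                continue
            raw = tar.extractfile(member)  # file-like view of the member
            # gzip.open accepts an existing file object; "rt" decodes
            # the decompressed bytes to text (UTF-8 is an assumption).
            with gzip.open(raw, "rt", encoding="utf-8") as lines:
                for line in lines:
                    yield member.name, line
```

Looping over iter_gzip_lines("20041208.tar") would then hand back one
line at a time, and the per-line operations go in the loop body.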

I tried various combinations of the following code, but it does not
work as intended:

fileName = sys.argv[1]
print "File Name is ", fileName
tar = tarfile.open(fileName, "r:")
for tarinfo in tar:
    if tarinfo.isreg():
        print tarinfo.name
        if tarinfo.name.find("tpv") != -1:
            # read the gzip file
            print "\thttp plugin file"
            fileLike = tar.extractfile(tarinfo)
            fileText = fileLike.read()
            stringio = StringIO.StringIO(fileText)
            fileRead = gzip.GzipFile(stringio)
            for aLine in fileRead:
                print aLine
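
One likely culprit in the snippet above, offered as a guess rather
than a confirmed diagnosis: the first positional parameter of
gzip.GzipFile is a filename, so an in-memory buffer has to be passed
with the fileobj keyword. A small sketch of that idea, assuming
Python 3 and using io.BytesIO in place of StringIO; the helper name
gunzip_bytes is hypothetical.

```python
import gzip
import io


def gunzip_bytes(data):
    """Decompress gzip-compressed bytes that are already in memory."""
    # Pass the buffer as fileobj=...; given positionally it would be
    # taken as the filename argument instead of a file object.
    with gzip.GzipFile(fileobj=io.BytesIO(data), mode="rb") as gz:
        return gz.read()
```

With this, the bytes returned by reading the extracted tar member
could be handed to gunzip_bytes and then split into lines.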



