Help! calling tar from Python on Linux

Guido van Rossum guido at cnri.reston.va.us
Tue Oct 12 22:43:58 EDT 1999


Donn Cave <donn at u.washington.edu> writes:

> Quoth Preston Landers <prestonlanders at my-deja.com>:
> | I'm trying to untar a large file from within a Python script.
> | I'm calling tar like this:
> |
> | filename = "/tmp/data.tar.gz"
> | status, result = commands.getstatusoutput("tar xzvf %s" % filename)
> | if status:
> |    print "error! tar said:\n%s" % result
> | else:
> |    pass  # continue
> |
> | The problem is that every time I get an error status from tar, though
> | tar seems to completely extract all files.  I get this as the last 2
> | lines of output:
> |
> | gzip: stdout: Broken pipe
> | tar: Child returned status 1
> |
> | If I try to perform this same operation from the command line, it works
> | fine.  But if I try to do it in Python, I get this error status.  Has
> | anybody seen anything like this?
> 
> I think there may be a bug in commands.getstatusoutput.  Here's what I
> see in commands.py:
> 
> def getstatusoutput(cmd):
>     """Return (status, output) of executing cmd in a shell."""
>     import os
>     pipe = os.popen('{ ' + cmd + '; } 2>&1', 'r')
>     text = pipe.read()
>     sts = pipe.close()
>     if sts == None: sts = 0
>     if text[-1:] == '\n': text = text[:-1]
>     return sts, text
>  
> I suspect your problem may go away if you rewrite that function, changing
> "text = pipe.read()"  to something like this:
> 
>     text = ''
>     while 1:
>         frag = pipe.read()
>         if frag == '':
>             break
>         text = text + frag
> 
> Basically, you need this if your command is going to write more than once
> to output.  The way it's written now, that read() catches the first output
> and then closes the pipe, so the command will get a SIGPIPE the next time
> it writes output.

Donn, I don't think there's a bug in getstatusoutput().  Called without
a size argument, the read() method on Python file objects already
contains the loop you are suggesting: it keeps reading until EOF.
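
To illustrate the point about read(), here is a minimal sketch in
present-day Python 3.  It uses subprocess rather than the commands module
shown above, and the child script and the name read_all_output are made
up for this illustration.  The child writes to its stdout several times
with pauses in between; a single read() on the pipe should still collect
everything, because it only returns at EOF.

```python
import subprocess
import sys

# Hypothetical child program: writes three separate chunks, flushing and
# pausing between them, so the parent sees several distinct writes.
CHILD_SLOW_WRITER = r"""
import sys, time
for i in range(3):
    sys.stdout.write("chunk %d\n" % i)
    sys.stdout.flush()
    time.sleep(0.1)
"""


def read_all_output():
    """Run the child and return (status, text) using one read() call."""
    with subprocess.Popen(
        [sys.executable, "-c", CHILD_SLOW_WRITER],
        stdout=subprocess.PIPE,
        text=True,
    ) as proc:
        # No explicit loop: read() without a size keeps reading until EOF,
        # i.e. until the child has closed its end of the pipe.
        text = proc.stdout.read()
    # Leaving the with-block closes the pipe and waits for the child.
    return proc.returncode, text


if __name__ == "__main__":
    status, text = read_all_output()
    print(status)
    print(text, end="")
```

If read() stopped after the first write, the text would contain only the
first chunk and the child would see a broken pipe on a later write.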

My guess is that he's getting this error because tar somehow stops
reading the gzip output before gzip is done.  Maybe the rest of the tar
output (if there's any) would be a clue.
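
The general mechanism behind that guess can be sketched as follows.
This is only an illustration in present-day Python 3, not a diagnosis of
the tar/gzip case: the child script and the function name are
hypothetical, and the child stands in for any writer (such as gzip) whose
reader goes away early.  The parent reads a little, closes the pipe, and
the writer then fails with a broken pipe and exits with a nonzero status.

```python
import subprocess
import sys

# Hypothetical writer: keeps writing until the pipe breaks, then exits
# with status 1, much like the "Broken pipe" / "status 1" pair above.
CHILD_ENDLESS_WRITER = r"""
import os
try:
    while True:
        os.write(1, b"x" * 4096)
except BrokenPipeError:
    os._exit(1)
"""


def writer_status_after_early_close(nbytes=1024):
    """Read only nbytes from the writer, close the pipe, return its status."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_ENDLESS_WRITER],
        stdout=subprocess.PIPE,
    )
    proc.stdout.read(nbytes)  # consume only part of the output
    proc.stdout.close()       # stop reading before the writer is done
    return proc.wait()        # the writer's next write fails with EPIPE


if __name__ == "__main__":
    print(writer_status_after_early_close())
```

The data read before the close is still perfectly good, which would be
consistent with all files appearing to extract even though an error
status comes back.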

--Guido van Rossum (home page: http://www.python.org/~guido/)



