creating size-limited tar files

Tue Nov 13 11:25:20 EST 2012

On Tue, Nov 13, 2012 at 9:07 AM, Ian Kelly <ian.g.kelly at gmail.com> wrote:
> It'll look something like this:
>
>>>> p1 = subprocess.Popen(cmd1, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
>>>> p2 = subprocess.Popen(cmd2, shell=True, stdin=p1.stdout, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
>>>> p1.communicate()
> ('', '')
>>>> p2.communicate()
> ('', '')
>>>> p1.wait()
> 0
>>>> p2.wait()
> 0
>
> Note that there's a subtle potential for deadlock here.  During the
> p1.communicate() call, if the p2 output buffer fills up, then it will
> stop accepting input from p1 until p2.communicate() can be called, and
> then if that buffer also fills up, p1 will hang.  Additionally, if p2
> needs to wait on the parent process for some reason, then you end up
> effectively serializing the two processes.
>
> Solution would be to poll all the open-ended pipes in a select() loop
> instead of using communicate(), or perhaps make the two communicate
> calls simultaneously in separate threads.
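
For illustration, the threaded alternative mentioned in the quoted text
might be sketched roughly as below.  This is a sketch under stated
assumptions, not code from the thread: run_pipeline is a hypothetical
helper name, the commands are passed as argument lists rather than with
shell=True, and the parent closes its own copy of p1.stdout so that only
p2 reads from that pipe.

```python
import subprocess
import sys
import threading


def run_pipeline(cmd1, cmd2):
    """Run cmd1 | cmd2 and collect the three remaining pipes concurrently.

    Hypothetical helper: returns (p1_stderr, p2_stdout, p2_stderr, rc1, rc2).
    """
    p1 = subprocess.Popen(cmd1, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    p2 = subprocess.Popen(cmd2, stdin=p1.stdout,
                          stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # The parent must not read p1's stdout; close our copy so that p2 is
    # the only reader and p1 sees a broken pipe if p2 exits early.
    p1.stdout.close()

    p1_err = []

    def drain_p1_stderr():
        # Read p1's stderr to EOF so p1 never blocks on a full stderr pipe.
        p1_err.append(p1.stderr.read())

    reader = threading.Thread(target=drain_p1_stderr)
    reader.start()
    # communicate() drains p2's stdout and stderr together in this thread.
    p2_out, p2_err = p2.communicate()
    reader.join()
    p1.stderr.close()
    return p1_err[0], p2_out, p2_err, p1.wait(), p2.wait()


if __name__ == "__main__":
    # Assumed demo commands: a producer that writes a lot to both streams
    # and a consumer that upper-cases its input.
    producer = [sys.executable, "-c",
                "import sys; sys.stdout.write('x' * 200000); "
                "sys.stderr.write('e' * 200000)"]
    consumer = [sys.executable, "-c",
                "import sys; sys.stdout.write(sys.stdin.read().upper())"]
    err1, out2, err2, rc1, rc2 = run_pipeline(producer, consumer)
    print(len(err1), len(out2), rc1, rc2)
```

Because p1's stderr and p2's two output pipes are all being drained at
the same time, neither child should stall on a full pipe buffer while
the parent is busy with the other one.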

Sorry, the example in my earlier message (quoted above) is wrong.  If
you're calling p1.communicate(), then you first need to detach the
p1.stdout pipe from the Popen object.  Otherwise, the communicate() call
will try to read from that pipe itself and may "steal" data that was
meant for p2.  It should look more like this:

>>> p1 = subprocess.Popen(cmd1, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
>>> p2 = subprocess.Popen(cmd2, shell=True, stdin=p1.stdout, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
>>> p1.stdout = None