[Tutor] capture output from a subprocess while it is being produced

Wed Jun 15 00:48:03 EDT 2016

On Tue, Jun 14, 2016 at 8:03 PM, Albert-Jan Roskam
<sjeik_appie at hotmail.com> wrote:
>
>     proc = Popen("ls -lF", cwd=cwd, shell=True, stdout=PIPE, stderr=PIPE)

Don't use shell=True if you can avoid it. "ls" is an external program,
not a shell built-in, so no shell is needed to run it. Pass the
arguments as a list instead: ['ls', '-lF'].
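
A minimal sketch of the list form, assuming a POSIX system with "ls" on
the PATH. The helper name list_dir is made up for illustration, and
communicate() is used only to keep the sketch short (it waits for the
child to exit):

```python
from subprocess import Popen, PIPE

def list_dir(cwd="."):
    # Hypothetical helper: run ls directly, with no intermediate shell.
    proc = Popen(["ls", "-lF"], cwd=cwd, stdout=PIPE, stderr=PIPE)
    # communicate() drains both pipes and waits for the child to exit.
    output, errors = proc.communicate()
    return proc.returncode, output, errors
```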

The child process probably buffers its output when stdout isn't a
terminal. A typical standard I/O buffer size is 4 KiB. Reading from
the pipe will block until the child flushes the buffer to the pipe. It
might fflush() at checkpoints, but otherwise this happens
automatically when the buffer is full or when the child exits
normally.

On Linux you can use stdbuf to try forcing a program's stdout to use
either no buffering (-o0) or line buffering (-oL) -- e.g. `stdbuf -o0
ls -lF`. That said, "ls" should use full buffering in a pipe, and it
exits so quickly that it isn't a useful test. You need to test using a
long-running process with intermittent output. You could use a Python
script that sleeps for a random interval between writes to stdout and
stderr.
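
One possible shape for such a test script, as a sketch only. The
function name emit and its parameters are made up for illustration:

```python
import random
import sys
import time

def emit(count=5, max_delay=2.0, out=sys.stdout, err=sys.stderr):
    # Write numbered lines, alternating between the two streams, and
    # sleep for a random interval before each write.
    for i in range(count):
        time.sleep(random.uniform(0, max_delay))
        stream = out if i % 2 == 0 else err
        stream.write("line %d\n" % i)
        # No explicit flush here: buffering is left to the runtime,
        # since that is the behavior under test.
```

Saved to a file with a final emit() call, this can stand in for the
child process while experimenting with the parent's reading code.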

On Windows there's no program like stdbuf to control standard I/O
buffering. If the target is a Python script, you can force interactive
mode using the "-i" option or disable buffering using "-u" (but -u
doesn't work for the 2.x REPL on Windows, since it switches to binary
mode).
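
As a rough illustration of the "-u" option, the sketch below starts a
Python child with buffering disabled and reads its stdout line by line.
The child code in CHILD and the function name are assumptions made up
for the example:

```python
import sys
from subprocess import Popen, PIPE

# Hypothetical child: prints three lines, pausing between them.
CHILD = "import time\nfor i in range(3):\n    print(i)\n    time.sleep(0.2)\n"

def read_unbuffered_child():
    # -u asks the child interpreter not to buffer stdout and stderr.
    proc = Popen([sys.executable, "-u", "-c", CHILD], stdout=PIPE)
    lines = []
    # readline() returns b"" only at end of file, which ends the loop.
    for line in iter(proc.stdout.readline, b""):
        lines.append(line)  # each line is available as it is written
    proc.stdout.close()
    proc.wait()
    return lines
```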

>             #output, errors = proc.communicate() # "Wait for process to terminate."
>             output = proc.stdout.read()  # docs: "Use communicate() rather than .stdin.write, .stdout.read or .stderr.read"
>             errors = proc.stderr.read()

Start a dedicated thread to read from each pipe. Otherwise you can end
up with a deadlock. For example, the child could block while writing
to a full stderr pipe, while the parent blocks while reading from an
empty stdout pipe. Also, read() with no argument doesn't return until
end of file, so you should read by lines using readline(), or a byte
at a time using read(1).
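
A sketch of the one-thread-per-pipe approach, assuming Python 3. The
names pump and run_and_stream are made up for illustration, and the
queue is just one way to hand lines back to the main thread:

```python
import sys
import threading
from queue import Queue
from subprocess import Popen, PIPE

def pump(pipe, name, q):
    # Forward each line to the queue as soon as it arrives.
    for line in iter(pipe.readline, b""):
        q.put((name, line))
    pipe.close()
    q.put((name, None))  # sentinel: this pipe reached end of file

def run_and_stream(args):
    proc = Popen(args, stdout=PIPE, stderr=PIPE)
    q = Queue()
    # One reader thread per pipe, so neither pipe can fill up unread.
    for pipe, name in ((proc.stdout, "stdout"), (proc.stderr, "stderr")):
        threading.Thread(target=pump, args=(pipe, name, q),
                         daemon=True).start()
    received = []
    open_pipes = 2
    while open_pipes:
        name, line = q.get()
        if line is None:
            open_pipes -= 1
        else:
            received.append((name, line))  # handle the line here
    # Both pipes are drained, so waiting for the exit code is safe.
    return proc.wait(), received
```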

>             if errors:
>                 raise RuntimeError(str(errors))

Data on stderr doesn't necessarily indicate an error. It could simply
be logging, warnings, or debug output. An error is indicated by a
non-zero exit code. Use proc.wait(), which returns the exit code, and
check that instead.
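
A minimal sketch of checking the exit code rather than stderr. The
helper name run_checked and the WARN_ONLY child are made up for
illustration; no pipes are used, so the child's output goes straight
to the parent's own stdout and stderr:

```python
import sys
from subprocess import Popen

# Hypothetical child: writes a warning to stderr but exits with code 0.
WARN_ONLY = "import sys; print('warning: minor issue', file=sys.stderr)"

def run_checked(args):
    proc = Popen(args)
    returncode = proc.wait()  # blocks until the child exits
    if returncode != 0:
        raise RuntimeError("command failed with exit code %d" % returncode)
    return returncode
```

Here run_checked([sys.executable, "-c", WARN_ONLY]) returns 0 even
though the child wrote to stderr, while a child that calls sys.exit(3)
raises RuntimeError.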