subprocess.popen, capturing stdout
Rafael Villar Burke
pachi at rvburke.com
Tue Oct 30 20:44:48 CET 2007
Arnau Sanchez wrote:
> I assume you cannot touch the application and that it is not written
> in Python (otherwise you would call it with -u). In that case there is
> nothing left but to give in and move to python-pexpect, which runs
> programs by simulating a terminal (so the output is "unbuffered"):
Pexpect does not work on Windows (except under cygwin, I think).
The pexpect FAQ has an explanation of this issue (I paste it below).
The problem is, indeed, the buffering mode of the standard stdio
library: almost any output that does not go to a terminal, unless the
application flushes explicitly, ends up in a buffer that kills
interactivity. The bad news is that the applications we run into are
quite likely to be built on stdio :(.
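For the easy case mentioned in the quote (the child is itself a Python script), here is a minimal sketch of the -u approach. The child code and the function name are made up for illustration; with -u the child's stdout is unbuffered, so the parent can handle each line as it arrives on the pipe.

```python
import subprocess
import sys

# Hypothetical child program, used only for illustration.
CHILD_CODE = "for i in range(3): print('line', i)"


def read_lines_unbuffered(code):
    """Run a Python child with -u and collect its stdout line by line."""
    cmd = [sys.executable, "-u", "-c", code]
    lines = []
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
        # The child does not buffer its output, so lines show up on the
        # pipe as soon as they are printed, not only when it exits.
        for line in proc.stdout:
            lines.append(line.rstrip("\n"))
    return lines


if __name__ == "__main__":
    print(read_lines_unbuffered(CHILD_CODE))
```

This does nothing for a child that is not Python, which is the case the FAQ below deals with.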
*Q: Why not just use a pipe (popen())?*
A: A pipe works fine for getting the output from non-interactive
programs. If you just want to get the output from ls, uname, or ping
then this works. Pipes do not work very well for interactive
programs and pipes will almost certainly fail for most applications
that ask for passwords such as telnet, ftp, or ssh.
There are two reasons for this.
First an application may bypass stdout and print directly to its
controlling TTY. Something like SSH will do this when it asks you
for a password. This is why you cannot redirect the password prompt
because it does not go through stdout or stderr.
The second reason is because most applications are built using the C
Standard IO Library (anything that uses #include <stdio.h>). One of
the features of the stdio library is that it buffers all input and
output. Normally output is *line buffered* when a program is
printing to a TTY (your terminal screen). Every time the program
prints a line-feed the currently buffered data will get printed to
your screen. The problem comes when you connect a pipe. The stdio
library is smart and can tell that it is printing to a pipe instead
of a TTY. In that case it switches from line buffered mode to
*block buffered*. In this mode the buffered data is flushed only when
the buffer is full. This causes most interactive programs to
deadlock. Block buffering is more efficient when writing to disks
and pipes. Take the situation where a program prints a message
"Enter your user name:\n" and then waits for you to type
something. In block buffered mode, the stdio library will not put
the message into the pipe even though a linefeed is printed. The
result is that you never receive the message, yet the child
application will sit and wait for you to type a response. Don't
confuse the stdio lib's buffer with the pipe's buffer. The pipe
buffer is another area that can cause problems. You could flush the
input side of a pipe, whereas you have no control over the stdio
library buffer.
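The same effect can be reproduced from Python, whose own I/O layer (not C stdio, but it applies a similar policy) also block-buffers text written to a pipe by default. This is only a sketch: the function name is hypothetical, and it assumes a Unix-like system, since select() on pipe descriptors does not work on Windows.

```python
import os
import select


def visible_after_write(buffering):
    """Write one prompt line into a pipe through a buffered writer and
    report whether the reading end can see it before any flush."""
    read_fd, write_fd = os.pipe()
    # buffering=-1: default policy (block buffered, since a pipe is not a TTY)
    # buffering=1:  line buffered (text mode only)
    writer = open(write_fd, "w", buffering=buffering)
    try:
        writer.write("Enter your user name:\n")
        # Wait briefly to see whether any data reached the pipe.
        readable, _, _ = select.select([read_fd], [], [], 0.2)
        return bool(readable)
    finally:
        writer.close()  # flushes whatever is still pending
        os.close(read_fd)


if __name__ == "__main__":
    print("block buffered:", visible_after_write(-1))
    print("line buffered: ", visible_after_write(1))
```

With the default policy the prompt stays in the writer's buffer, which is exactly the situation in which the reader waits forever for a message that was already "printed".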
More information: the Standard IO library has three states for a
FILE *. These are: _IOFBF for block buffered; _IOLBF for line
buffered; and _IONBF for unbuffered. The STDIO lib will use block
buffering when talking to a block file descriptor such as a pipe.
This is usually not helpful for interactive programs. Short of
recompiling your program to include fflush() everywhere or
recompiling a custom stdio library there is not much a controlling
application can do about this if talking over a pipe.
The program may have put data in its output that remains unflushed
because the output buffer is not full; the program will then
deadlock while waiting for input, which you never send because you
are still waiting for its output (still stuck in the STDIO's
output buffer).
The "answer" is to use a pseudo-tty. A TTY device will force *line*
buffering (as opposed to block buffering). Line buffering means that
you will get each line when the child program sends a line feed.
This corresponds to the way most interactive programs operate --
send a line of output then wait for a line of input.
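As a rough sketch of the pseudo-tty idea without pexpect, Python's standard pty module can hand the child a terminal for its stdout. It is Unix-only, like pexpect; run_in_pty is a hypothetical helper name, and the EIO handling reflects Linux behaviour, which is an assumption about the platform.

```python
import os
import pty
import subprocess
import sys


def run_in_pty(argv):
    """Run argv with stdout attached to a pseudo-terminal and return
    everything it wrote, as bytes."""
    master_fd, slave_fd = pty.openpty()
    try:
        try:
            proc = subprocess.Popen(
                argv, stdin=subprocess.DEVNULL, stdout=slave_fd
            )
        finally:
            # The parent only needs the master end.
            os.close(slave_fd)
        chunks = []
        while True:
            try:
                data = os.read(master_fd, 1024)
            except OSError:
                # On Linux, reading the master fails with EIO once the
                # child has closed its end of the terminal.
                break
            if not data:
                break
            chunks.append(data)
        proc.wait()
        return b"".join(chunks)
    finally:
        os.close(master_fd)


if __name__ == "__main__":
    check = "import sys; print(sys.stdout.isatty())"
    print(run_in_pty([sys.executable, "-c", check]))
```

The child believes it is talking to a terminal, so a stdio-based program would stay line buffered. Note that the terminal turns each "\n" into "\r\n" on output.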
I put "answer" in quotes because it's an ugly solution and because
there is no POSIX standard for pseudo-TTY devices (even though they
have a TTY standard...). What would make more sense to me would be
to have some way to set a mode on a file descriptor so that it will
tell the STDIO to be line-buffered. I have investigated, and I don't
think there is a way to set the buffered state of a child process.
The STDIO Library does not maintain any external state in the kernel
or whatnot, so I don't think there is any way for you to alter it.
I'm not quite sure how this line-buffered/block-buffered state
change happens internally in the STDIO library. I think the STDIO
lib looks at the file descriptor and decides to change behavior
based on whether it's a TTY or a block file (see isatty()).
I hope that this qualifies as helpful.
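The isatty() check the FAQ refers to is easy to observe from Python. In this small sketch (the function name is made up) a child process is asked whether its stdout is a TTY while that stdout is connected to a pipe.

```python
import subprocess
import sys


def child_stdout_is_tty():
    """Ask a child Python process whether its stdout is a TTY when
    that stdout is a pipe created by the parent."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.isatty())"],
        stdout=subprocess.PIPE,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "True"


if __name__ == "__main__":
    print(child_stdout_is_tty())
```

A child launched this way sees a pipe rather than a terminal, which is the condition under which stdio falls back to block buffering.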
Regards,
Rafael Villar Burke
_______________________________________________
Python-es mailing list
http://listas.aditel.org/listinfo/python-es
FAQ: http://listas.aditel.org/faqpyes