subprocess.Popen, capturing stdout

Rafael Villar Burke pachi at rvburke.com
Tue Oct 30 20:44:48 CET 2007


Arnau Sanchez wrote:
> I assume you cannot touch the application and that it is not written
> in Python (otherwise you would call it with -u). In that case there is
> nothing left but to give in and fall back on python-pexpect, which
> runs programs by simulating a terminal (so the output is
> "unbuffered"):
Pexpect does not work on Windows (except under Cygwin, I think).
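
For the case where the child does happen to be a Python script, here is a
minimal sketch of the -u route mentioned in the quote. The child script, its
prompt text and the function name ask() are made up for illustration, and it
assumes Python 3.7 or later (for text=True).

```python
import subprocess
import sys

# Hypothetical child: prints a prompt, waits for a line, then answers.
CHILD = (
    "import sys\n"
    "print('Enter your user name:')\n"
    "name = sys.stdin.readline().strip()\n"
    "print('hello ' + name)\n"
)

def ask(name):
    """Hold a one-question dialogue with the child over plain pipes."""
    with subprocess.Popen(
        [sys.executable, "-u", "-c", CHILD],  # -u: child stdout is unbuffered
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
        bufsize=1,  # line-buffer the parent's side of the pipes
    ) as proc:
        prompt = proc.stdout.readline()  # available before the child blocks on stdin
        proc.stdin.write(name + "\n")
        proc.stdin.flush()
        reply = proc.stdout.readline()
    return prompt.strip(), reply.strip()
```

Without -u the first readline() would be expected to hang, because the
prompt would still be sitting in the child's output buffer while the child
waits for input.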

The pexpect FAQ has an explanation of this issue (I include it below).
The problem is, indeed, the buffering mode of the standard stdio
library. As a result, almost any output that does not go to a terminal
ends up in a buffer that kills interactivity, unless the application
flushes explicitly. The bad news is that the applications we run into
are quite likely to be built on stdio :(.
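
A rough way to see the effect from Python, as a sketch only: the child below
is a made-up Python one-liner standing in for any stdio-style program, the
name prompt_arrives() is hypothetical, and select() on pipes assumes a
POSIX system. It checks whether the prompt reaches the pipe while the child
is blocked waiting for input.

```python
import os
import select
import subprocess
import sys

# Hypothetical child: prints a prompt and then blocks reading stdin.
CHILD = (
    "import sys\n"
    "print('Enter your user name:')\n"
    "sys.stdin.readline()\n"
)

def prompt_arrives(unbuffered, timeout=2.0):
    """Return True if the prompt shows up on the pipe within `timeout` seconds."""
    cmd = [sys.executable] + (["-u"] if unbuffered else []) + ["-c", CHILD]
    env = dict(os.environ)
    env.pop("PYTHONUNBUFFERED", None)  # make sure only -u controls buffering
    proc = subprocess.Popen(
        cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, env=env
    )
    # Wait for data on the pipe without reading it.
    ready, _, _ = select.select([proc.stdout], [], [], timeout)
    proc.stdin.write(b"\n")  # unblock the child so it can exit
    proc.stdin.close()
    proc.wait()
    proc.stdout.close()
    return bool(ready)
```

With unbuffered=False the prompt stays in the child's buffer until the child
exits, so nothing is readable within the timeout; with unbuffered=True it is
readable almost at once.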

    *Q: Why not just use a pipe (popen())?*

    A: A pipe works fine for getting the output from non-interactive
    programs. If you just want to get the output from ls, uname, or ping
    then this works. Pipes do not work very well for interactive
    programs and pipes will almost certainly fail for most applications
    that ask for passwords such as telnet, ftp, or ssh.

    There are two reasons for this.

    First, an application may bypass stdout and print directly to its
    controlling TTY. Something like SSH will do this when it asks you
    for a password. This is why you cannot redirect the password prompt:
    it does not go through stdout or stderr.

    The second reason is that most applications are built using the C
    Standard IO Library (anything that uses #include <stdio.h>). One of
    the features of the stdio library is that it buffers all input and
    output. Normally output is *line buffered* when a program is
    printing to a TTY (your terminal screen). Every time the program
    prints a line-feed the currently buffered data will get printed to
    your screen. The problem comes when you connect a pipe. The stdio
    library is smart and can tell that it is printing to a pipe instead
    of a TTY. In that case it switches from line buffer mode to *block
    buffered*. In this mode the currently buffered data is flushed when
    the buffer is full. This causes most interactive programs to
    deadlock. Block buffering is more efficient when writing to disks
    and pipes. Take the situation where a program prints a message
    "Enter your user name:\n" and then waits for you to type
    something. In block buffered mode, the stdio library will not put
    the message into the pipe even though a linefeed is printed. The
    result is that you never receive the message, yet the child
    application will sit and wait for you to type a response. Don't
    confuse the stdio lib's buffer with the pipe's buffer. The pipe
    buffer is another area that can cause problems. You could flush the
    input side of a pipe, whereas you have no control over the stdio
    library buffer.

    More information: the Standard IO library has three states for a
    FILE *. These are: _IOFBF for block buffered; _IOLBF for line
    buffered; and _IONBF for unbuffered. The STDIO lib will use block
    buffering when talking to a block file descriptor such as a pipe.
    This is usually not helpful for interactive programs. Short of
    recompiling your program to include fflush() everywhere or
    recompiling a custom stdio library there is not much a controlling
    application can do about this if talking over a pipe.

    The program may have put data in its output that remains unflushed
    because the output buffer is not full; then the program will go and
    deadlock while waiting for input, which you never send because you
    are still waiting for its output (still stuck in the STDIO's output
    buffer).

    The "answer" is to use a pseudo-tty. A TTY device will force *line*
    buffering (as opposed to block buffering). Line buffering means that
    you will get each line when the child program sends a line feed.
    This corresponds to the way most interactive programs operate --
    send a line of output then wait for a line of input.

    I put "answer" in quotes because it's an ugly solution and because
    there is no POSIX standard for pseudo-TTY devices (even though they
    have a TTY standard...). What would make more sense to me would be
    to have some way to set a mode on a file descriptor so that it will
    tell the STDIO to be line-buffered. I have investigated, and I don't
    think there is a way to set the buffered state of a child process.
    The STDIO Library does not maintain any external state in the kernel
    or whatnot, so I don't think there is any way for you to alter it.
    I'm not quite sure how this line-buffered/block-buffered state
    change happens internally in the STDIO library. I think the STDIO
    lib looks at the file descriptor and decides to change behavior
    based on whether it's a TTY or a block file (see isatty()).

    I hope that this qualifies as helpful.
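
The pseudo-tty approach from the FAQ can be sketched with the standard pty
module, without pexpect. This is only an illustration under some
assumptions: it is POSIX-only (so it does not help on Windows either), the
child is the same kind of made-up Python script, and the names
read_line_from_pty() and ask_through_pty() are hypothetical. The child runs
without -u; only its stdout is attached to the pty.

```python
import os
import pty
import select
import subprocess
import sys

# Hypothetical child: prints a prompt, waits for a line, then answers.
CHILD = (
    "import sys\n"
    "print('Enter your user name:')\n"
    "name = sys.stdin.readline().strip()\n"
    "print('hello ' + name)\n"
)

def read_line_from_pty(fd, timeout=5.0):
    """Collect bytes from the pty master until a newline or the timeout."""
    data = b""
    while not data.endswith(b"\n"):
        ready, _, _ = select.select([fd], [], [], timeout)
        if not ready:
            break
        try:
            chunk = os.read(fd, 1024)
        except OSError:  # Linux reports EIO once the child side is closed
            break
        if not chunk:
            break
        data += chunk
    return data.decode().strip()  # the tty turns "\n" into "\r\n"

def ask_through_pty(name):
    """Same dialogue as over a pipe, but the child's stdout is a pty."""
    master_fd, slave_fd = pty.openpty()
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],  # note: no -u here
        stdin=subprocess.PIPE,
        stdout=slave_fd,
    )
    os.close(slave_fd)  # the parent only needs the master side
    try:
        prompt = read_line_from_pty(master_fd)
        proc.stdin.write((name + "\n").encode())
        proc.stdin.flush()
        reply = read_line_from_pty(master_fd)
        proc.stdin.close()
        proc.wait()
    finally:
        os.close(master_fd)
    return prompt, reply
```

Because the child sees a terminal on stdout, it switches to line buffering
by itself and the prompt arrives before the child blocks on input.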


Regards,

Rafael Villar Burke
_______________________________________________
Python-es mailing list
http://listas.aditel.org/listinfo/python-es
FAQ: http://listas.aditel.org/faqpyes




