Delays getting data on sys.stdin.readline() ?

Christian Convey christian.convey at gmail.com
Sun Nov 20 16:03:00 EST 2005


On 11/19/05, Mike Meyer <mwm at mired.org> wrote:
> Christian Convey <christian.convey at gmail.com> writes:
> > I've got a program that (ideally) perpetually monitors sys.stdin for
> > lines of text. As soon as a line comes in, my program takes some
> > action.
> > The problem is, it seems like a very large amount of data must
> > accumulate on sys.stdin before even my first invocation of readline()
> > returns.  This delay prevents my program from being responsive in the
> > way it must be.
>
> readline normally returns as soon as it sees a newline. External
> conditions may cause this to change, or make it impossible. Without
> knowing those external conditions, the best we can do is guess as to
> what might be the problem.
>
> > Has anyone else seen this effect?  If so, is there a reasonable workaround?
>
> Yes, and maybe. Depends on what's causing the problem. Tell us more
> about the program, and what sys.stdin is connected to, and the
> platform you're running on, and someone should be able to provide
> explicit information.

OK, I've fixed it, but I don't understand why the fix works.

Let's say I've got two Python programs, which I'll call "producer" and
"consumer". "producer" runs for a long time and occasionally while
running sends lines of text to stdout. "consumer" is typically blocked
in a call to sys.stdin.readline().
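
To make the setup concrete, here is a rough sketch of what "consumer" does. This is not my actual program; the names consume() and handle are made up for illustration, and the "action" is just echoing the line back.

```python
import sys

def consume(stream, handle):
    """Call handle(text) for each line as soon as readline() returns it."""
    while True:
        line = stream.readline()   # blocks until a full line or EOF arrives
        if not line:               # empty string means EOF
            break
        handle(line.rstrip("\n"))

if __name__ == "__main__":
    # Hypothetical action: echo each received line with a prefix.
    consume(sys.stdin, lambda text: sys.stdout.write("got: %s\n" % text))
```

The loop uses readline() directly rather than iterating over the file object, so nothing in "consumer" itself should be holding lines back.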

When I run "producer" on its own, I see its output appear on the
console pretty much immediately after calling the "print" command.

But when I pipe "producer"'s output to "consumer"'s stdin on the Linux
command line, "consumer" stays blocked on its first call to
sys.stdin.readline() until the "producer" program terminates. At that
point, "consumer" seems to immediately get access to all of the stdout
produced by "producer".

I've found I can fix this problem by modifying "producer" so that
immediately after each "print" command, I call sys.stdout.flush(). 
When I make this modification, I find that "consumer" has access to
the output of "producer" immediately after "producer" issues a "print"
statement.
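
Roughly, the modified "producer" looks like the sketch below. The helper name emit() and the messages are invented for this example, and I've used write() instead of "print" only so the snippet runs unchanged on any Python version; the important part is the flush() after every line.

```python
import sys
import time

def emit(line, out=None):
    """Write one line, then flush so a reader on a pipe sees it right away."""
    if out is None:
        out = sys.stdout           # looked up at call time, not import time
    out.write(line + "\n")
    out.flush()                    # the call that made "consumer" responsive

if __name__ == "__main__":
    for i in range(3):
        emit("message %d" % i)
        time.sleep(1)              # stand-in for long-running work
```

Running something like "python producer.py | python consumer.py" (file names assumed) with and without the flush() call reproduces the difference I'm describing.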

So here's what I don't get: If "producer" was retaining its output for
a while for the sake of efficiency, I would expect to see that effect
when I just run "producer" on the command line. That is, I would
expect the console to not show any output from "producer" until
"producer" terminates.  But instead, I see the output immediately. So
why, when I pipe the output to "consumer", doesn't "consumer" get
access to that data as it's produced unless "producer" is explicitly
calling sys.stdout.flush()?
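
The only thing that differs between the two runs is what "producer"'s stdout is attached to, so one experiment that might help narrow this down is to have "producer" report that on stderr. This is just a diagnostic sketch; describe_stream() is a name I made up, and I'm assuming isatty() is the right thing to look at.

```python
import sys

def describe_stream(stream):
    """Return a short label saying whether stream is attached to a terminal."""
    if stream.isatty():
        return "terminal"
    return "not a terminal (pipe or file?)"

if __name__ == "__main__":
    # Report on stderr so the message is visible even when stdout is piped.
    sys.stderr.write("stdout is: %s\n" % describe_stream(sys.stdout))
```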

Any thoughts?

Thanks,
Christian


