Running a command line program and reading the result as it runs

Peter Otten __peter__ at web.de
Fri Aug 23 12:39:47 EDT 2013


random832 at fastmail.us wrote:

> On Fri, Aug 23, 2013, at 7:14, Peter Otten wrote:
>> The following works on my linux system:
>> 
>> instream = iter(p.stdout.readline, "")
>>
>> for line in instream:
>>     print line.rstrip()
>> 
>> I don't have Windows available to test, but if it works there, too, the
>> problem is the internal buffer used by Python's implementation of file
>> iteration rather than the OS.
> 
> I can confirm this on Windows.
> 
> Doesn't this surprising difference between for line in
> iter(f.readline,'') vs for line in f violate TOOWTDI? We're led to
> believe from the documentation that iterating over a file does _not_
> read lines into memory before returning them. It's not clear to me what
> performance benefit can be gained from waiting when there is no more
> data available, either.
> 
> I don't understand how it's even happening - from looking at the code,
> it looks like next() just calls readline() once, no fancy buffering
> specific to itself.

Maybe you are looking at the wrong version?

For 2.x you can use the file_iternext() function as a starting point; see:

http://hg.python.org/cpython/file/1ea833ecaf5a/Objects/fileobject.c#l2316
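
For reference, here is a self-contained sketch of the readline-based loop 
quoted above, written for Python 3. The child program (CHILD) and the helper 
name read_lines() are made up for the example, and I assume the child flushes 
its output after every line; if it doesn't, the child's own buffering can 
still delay what you see.

```python
import subprocess
import sys

# Hypothetical child process: prints three lines, flushing after each one.
CHILD = (
    "import sys\n"
    "for i in range(3):\n"
    "    print('line', i)\n"
    "    sys.stdout.flush()\n"
)


def read_lines(cmd):
    """Yield each line of cmd's stdout as soon as readline() returns it."""
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, universal_newlines=True)
    # iter(callable, sentinel) keeps calling p.stdout.readline() until it
    # returns the empty string, which signals end of file.
    for line in iter(p.stdout.readline, ""):
        yield line.rstrip()
    p.stdout.close()
    p.wait()


lines = list(read_lines([sys.executable, "-c", CHILD]))
for line in lines:
    print(line)
```

The two-argument form of iter() goes through readline() on every step, so it 
never touches the read-ahead logic that file iteration uses in 2.x.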

Python 3 uses a different approach that allows you to mix iteration and
readline(), whereas Python 2 raises a ValueError:

$ python -c 'f = open("tmp.txt"); next(f); f.readline()'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ValueError: Mixing iteration and read methods would lose data
$ python3 -c 'f = open("tmp.txt"); next(f); f.readline()'
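
The python3 command in the session above prints nothing, so here is a small 
Python 3 sketch that makes the result visible. The temporary file and its 
three lines are stand-ins for tmp.txt, not the file I actually used.

```python
import os
import tempfile

# Hypothetical three-line sample file standing in for tmp.txt.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\nthird\n")
    path = tmp.name

try:
    with open(path) as f:
        a = next(f)       # iteration
        b = f.readline()  # explicit read method
        c = next(f)       # iteration again
finally:
    os.remove(path)

print(repr(a), repr(b), repr(c))
```

Each call picks up where the previous one stopped, so no line is skipped or 
returned twice.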

The relevant code is probably in the Modules/_io/ directory. The design is
described in PEP 3116:

[New I/O] http://www.python.org/dev/peps/pep-3116/



