Unbuffer stdout?

Donn Cave donn at u.washington.edu
Wed Mar 26 13:27:40 EST 2003


Quoth "Francis Avila" <franga at shentel.net>:
| "Donn Cave" <donn at drizzle.com> wrote in message
| news:1048659589.577001 at yasure...

|>| >>> sys.stdout.write( os.popen2("yes", 'r', 0)[1].readline() )
|>
|> It would, if the pipe ever closed, but the above doesn't make that
|> happen, I imagine because some internal references keep the file
|> object from being deleted.  By the way, I don't think your parameters
|> to popen2() are right.
|
| You're right.  It's meaningless to specify 'r' or 'w' on popen?(), since
| they return separate file objects for reading and writing.  However, the
| function doesn't seem to complain....
|
| I'm still curious as to why popen2() is not closing the pipe in this
| context, while popen() is.

I haven't looked into the details.  This matter of closing file objects
automatically touches on a couple of general issues that you may find
interesting to know more about: "reference counting" (the mechanism
that determines whether an object may be retired), and "finalization"
(the actions that are taken on retirement).  Also see "circular reference."
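
For anyone who wants to see those three terms in action, here is a
minimal sketch.  It assumes a current CPython 3 (other implementations
may finalize later), and the Resource class and the events list are
made up for the illustration; they stand in for a file object and its
close-on-delete behavior.

```python
import gc

events = []  # records the order in which objects are finalized


class Resource:
    """Hypothetical stand-in for an object that closes itself on deletion."""

    def __init__(self, name):
        self.name = name

    def __del__(self):
        # Finalization: runs when the object is retired.
        events.append(self.name)


gc.disable()  # keep the automatic cycle collector out of the way

# No cycle: the reference count drops to zero at 'del', so CPython
# finalizes the object immediately.
plain = Resource("plain")
del plain
plain_finalized_immediately = events == ["plain"]

# Circular reference: the object refers to itself, so its count never
# reaches zero and it lingers until the cycle collector runs.
cyclic = Resource("cyclic")
cyclic.me = cyclic
del cyclic
cyclic_pending = "cyclic" not in events

gc.collect()
cyclic_finalized_after_gc = "cyclic" in events
gc.enable()
```

A pipe whose file object is caught in a situation like the second case
stays open until something else breaks the reference, which is why an
explicit close() is the safer habit.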

...
|> If you don't want to read a line at a time but just whatever input is
|> available, then I think it makes more sense (as usual) to forget the
|> file object and use the pipe file descriptor directly --
|>   fp = os.popen('yes')
|>   fd = fp.fileno()
|>   while 1:
|>       data = os.read(fd, 16000)
|>       if data:
|>           os.write(1, data)
|>       else:
|>           break
|>   fp.close()
|
| Is there any appreciable advantage to reading from the file descriptor
| directly versus passing an argument to read() or readline()?  It seems as
| though:
|
| p = os.popen('yes')
| while 1:
|     data = p.read(16000)
|     if data:
|         sys.stdout.write(data)
|     else:
|         break
|
| would do much the same, and be more intuitive.  I guess it'd be faster,
| since there are fewer layers to cut through.

On the contrary, it will be slower.  The file object's read()
(basically fread(3)) still has to do a system-level read(2), so you're
just adding an extraneous buffer copy.  Nor is it the same result:
just like the p.read() that didn't work for you, p.read(sz) is going
to block even when it could return data, until it actually has sz bytes
or reaches EOF.  The size parameter in os.read(fd, sz) means only
that no more than sz bytes should be returned.  You'll get whatever
is there, up to that amount; the limit just relieves the function from
having to allocate an unbounded amount of storage in advance.  So
they're not as similar as they may appear.  File objects are almost
always the right thing for disk files, but for input from a slow
device like a pipe or socket you have to consider whether it's worth
the tradeoffs.  It isn't a good idea to just turn off the buffering
for the sake of keeping a familiar interface, because functions like
readline and read are designed for buffered input and may become very
inefficient when unbuffered.
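
To make the difference concrete, here is a small sketch that uses
os.pipe() in place of a child process, so it runs anywhere.  It is
written for a current Python 3, which is an assumption on my part, and
the variable names are only for illustration.

```python
import os

# A bare pipe stands in for the child's stdout.
r, w = os.pipe()

# Only 5 bytes are available, and the writer is still open.
os.write(w, b"hello")

# os.read() returns whatever is there, up to the limit, without
# waiting for 16000 bytes to accumulate.
chunk = os.read(r, 16000)

# Queue more data, then close the write end so the reader sees EOF.
os.write(w, b"world")
os.close(w)

with os.fdopen(r, "rb") as fp:
    # The buffered read(16000) returns here only because the writer
    # has been closed; with the writer still open it would block,
    # waiting for the remaining bytes.
    rest = fp.read(16000)
    # At EOF both styles of read return an empty bytes object.
    eof = fp.read(16000)
```

After this runs, chunk holds the first 5 bytes and rest holds the
second 5, but only the os.read() call could have returned while the
writer was still open.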

	Donn Cave, donn at u.washington.edu



