Is it just me, or..

Mon Jul 5 09:27:34 EDT 1999

Markus Stenberg <mstenber at cc.Helsinki.FI> wrote:
: "Michael P. Reilly" <Michael.P..Reilly at p98.f112.n480.z2.fidonet.org> writes:
:> From: "Michael P. Reilly" <arcege at shore.net>
:> Markus Stenberg <mstenber at cc.Helsinki.FI> wrote:
:> : Is there a bug in the handling of pipes with the Python? Esp. in
:> : non-blocking way.
:> : Example:
:> :  (r,w)=os.pipe()
:> :  r = os.fdopen(r, "r", 0)
:> :  w = os.fdopen(w, "w", 16384)
:> :  w.write('foobarbaz')
:> :  select.select([r],[],[]) # returns r, as it should
:> :  r.read(1) # returns 'f'
:> :  select.select([r],[],[]) # blocks!
:> 
:> : Now select _blocks_, despite there being obviously some data that at least
:> : should not have been buffered elsewhere. Also, fdopened stuff's reading
:> : features are bit annoying, as doing read(bigNumber) results in blocking
:> : call, as opposed to recv(bigNumber)'s semantics. Thus instead of pipes
:> : (=fast) I am using TCP/UNIX sockets (=slow) but they seem to work better
:> : for what I need. Am I missing something really obvious?
:> 
:> : This is on Linux, if it matters.
:> 
:> At first glance, I'd say that it is the buffering in the fdopen(w, "w",
:> 16384) call.  Put the bufsize to 0 or call the flush method.

: Oh, yes, I was flushing it (that example was done by hand out of existing
: code that is actually in use). Yet, still no dice. Also, buffer size for
: "r" does not matter (0,1,N); behavior is same always. Frustrating. (Oh
: well. UNIX sockets work, I suppose)

No, the buffer size of the writer ("w") should be 0.

Just tried the following on Linux (and on Solaris 2.6):

Python 1.5.2 (#1, Jun 15 1999, 01:24:40)  [GCC 2.7.2.3] on linux2
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> import sys, select
>>> import os
>>> os.uname()
('Linux', 'starship.python.net', '2.0.36', '#1 Tue Oct 13 22:17:11 EDT 1998', 'i586')
>>> (r, w) = os.pipe()
>>> rf = fdopen(r, 'r', 0)
Traceback (innermost last):
  File "<stdin>", line 1, in ?
NameError: fdopen
>>> rf = os.fdopen(r, 'r', 0)
>>> wf = os.fdopen(w, 'w', 0)
>>> wf.write('foobarbaz')
>>> select.select([rf], [], [])
([<open file '(fdopen)', mode 'r' at 80e9cc8>], [], [])
>>> rf.read(1)
'f'
>>> rf.read(1)      # success; why? no buffering in the writer
'o'
>>> rf.read()
[hang - what happened to 'obarbaz'?]

The requested data must already be in the pipe or the read will hang.
A read() with no length (the same as read(-1)) reads until EOF, so it
will always block while the write end of the pipe is still open.
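
To illustrate why the last read() hangs: a read without a length only
returns at EOF, and a pipe only reports EOF once every write descriptor
has been closed. A minimal sketch, assuming present-day Python 3 rather
than the 1.5.2 used above (so the data is bytes, and buffering=0 needs
binary mode); the variable names are only for illustration:

```python
import os

r, w = os.pipe()
rf = os.fdopen(r, "rb", 0)   # unbuffered reader (Python 3: binary mode required)
wf = os.fdopen(w, "wb", 0)   # unbuffered writer, as suggested above

wf.write(b"foobarbaz")
first = rf.read(1)           # one byte is already in the pipe
second = rf.read(1)

wf.close()                   # closing the only write end lets the reader see EOF
rest = rf.read()             # now returns the remainder instead of hanging
rf.close()

print(first, second, rest)
```

Without the wf.close() line, the final rf.read() would wait forever,
which matches the hang in the transcript.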

Personally, I never trusted the C standard I/O library for anything
other than plain files.  But that's just me. ;)

I would suggest using the POSIX calls (os.read, os.write):

>>> (r, w) = os.pipe()
>>> os.write(w, 'foobarbaz')
9
>>> select.select([r], [], [])
([3], [], [])
>>> os.read(r, 1)
'f'
>>> os.read(r, 1)
'o'
>>>

A rule of thumb: use Python file objects with things that were created
as file objects (popen, socket), but when working with POSIX file
descriptors, use the POSIX functions. Otherwise there are too many
buffering and timing considerations that you have to take care of
yourself.
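
As a sketch of what that rule looks like in practice, the helper below
gives recv()-like behaviour on a pipe: it returns whatever is there,
up to a maximum, without waiting for more. This assumes present-day
Python 3 (bytes instead of strings); read_available and its parameters
are hypothetical names, not part of any library:

```python
import os
import select

def read_available(fd, max_bytes=4096, timeout=0.0):
    """Return up to max_bytes that are in the pipe now, or b'' if nothing is ready."""
    ready, _, _ = select.select([fd], [], [], timeout)
    if not ready:
        return b""
    # os.read returns at most max_bytes and does not wait for the full amount.
    return os.read(fd, max_bytes)

r, w = os.pipe()
os.write(w, b"foobarbaz")
chunk1 = os.read(r, 1)        # one byte
chunk2 = read_available(r)    # everything else currently in the pipe
chunk3 = read_available(r)    # nothing left, and select says so
os.close(r)
os.close(w)
print(chunk1, chunk2, chunk3)
```

Note that in this sketch b'' is ambiguous: os.read also returns b'' at
EOF, so real code would want to distinguish "not ready" from "closed".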

You might also want to think about named pipes (os.mkfifo).
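
A rough sketch of the named-pipe variant, assuming present-day Python 3
on a UNIX system; the temporary directory and the file name demo.fifo
are made up for the example. Opening a FIFO normally blocks until the
other end is opened too, so within a single process the reader is
opened non-blocking first:

```python
import os
import tempfile

tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "demo.fifo")   # hypothetical name
os.mkfifo(path)
try:
    # A non-blocking open for reading returns at once, even with no writer yet.
    rfd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    # A reader now exists, so the open for writing does not block.
    wfd = os.open(path, os.O_WRONLY)
    os.write(wfd, b"foobarbaz")
    data = os.read(rfd, 100)
    os.close(wfd)
    os.close(rfd)
finally:
    os.remove(path)
    os.rmdir(tmpdir)
print(data)
```

In the usual case the reader and writer are separate processes, and
each one just opens the path in its own direction.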

Tips&Tricks with UNIX pipes (named or otherwise):
* when the read side is closed, processes that write to the pipe
  will receive a SIGPIPE signal; this is important when using Python
  file objects, which close when destroyed.
* when the write side is closed, processes opened for reading will
  receive an EOF.
* there is an OS-imposed limit to how much data can sit in a pipe;
  be conscious of this when you write your protocol.
* POSIX write to a pipe is atomic (for writes of up to PIPE_BUF bytes);
  STDIO write is not necessarily atomic.
* reads must contain a length; a file object's read will wait until
  that much data is in the pipe (or EOF) - think about non-blocking reads:
    import fcntl, FCNTL; fcntl.fcntl(r, FCNTL.F_SETFL, FCNTL.O_NDELAY)
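
The tips above can be sketched roughly as follows, assuming present-day
Python 3 (3.5 or later for os.set_blocking, which sets the same flag as
the fcntl call above). One difference worth labeling: Python ignores
SIGPIPE by default, so a write to a pipe with no reader shows up as a
BrokenPipeError rather than killing the process. The variable names
are only for illustration:

```python
import os

r, w = os.pipe()
os.set_blocking(r, False)          # non-blocking reads on the read end

# Nothing written yet: a non-blocking read fails at once instead of hanging.
try:
    os.read(r, 100)
    empty_result = "data"
except BlockingIOError:
    empty_result = "would block"

os.write(w, b"foobarbaz")
partial = os.read(r, 100)          # returns what is there, not 100 bytes

os.close(w)                        # write side closed ...
eof = os.read(r, 100)              # ... so the reader gets EOF (empty bytes)
os.close(r)

# Read side closed: the next write fails.
r2, w2 = os.pipe()
os.close(r2)
try:
    os.write(w2, b"x")
    broken = False
except BrokenPipeError:
    broken = True
os.close(w2)

print(empty_result, partial, eof, broken)
```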

  -Arcege