Fixing socket.makefile()

Donn Cave donn at u.washington.edu
Tue Aug 10 14:10:52 EDT 2004


In article <pJXRc.2959$5j7.2643 at newssvr27.news.prodigy.com>,
 Bryan Olson <fakeaddress at nowhere.org> wrote:
...
> The problem is that makefile() returns a Python object that has
> its own local buffer.  The recv() call reads directly from the
> socket, oblivious to any data queued in the file object's
> buffer.  The problem is not limited to recv(); select(), and
> perhaps other calls, will ignore the buffer and look directly at
> the socket.  Output buffering appears to have a similar problem.
> 
> Now look up socket.makefile().readline().  It gets one byte at a
> time. It will get the byte from the Python buffer if the buffer
> is non-empty, otherwise it will try to recv() one byte at a
> time, directly from the socket.  By itself, readline() never
> over-reads the socket; if select() and recv() would work
> correctly before the readline(), they'll work after.  While
> correct, reading one byte at a time is painfully slow.

I don't get this.  Has socket.py changed this much since 2.2?
The readline I'm looking at calls self._sock.recv(self._rbufsize),
so you would only get byte-at-a-time behavior if you specified a
buffer size of 1 or less.  read() does the same, so you could do
this to yourself, but it isn't specific to readline.
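
To make the over-read concrete, here is a minimal sketch. It assumes
a modern Python 3, where makefile("rb") returns a buffered reader (not
the 2.2-era socket.py discussed above), and a platform that provides
socket.socketpair().  One readline() pulls both lines into the file
object's buffer, after which select() on the socket reports nothing
to read:

```python
import select
import socket

# Assumption: socket.socketpair() exists here (Unix, or Windows on Python 3.5+).
a, b = socket.socketpair()

b.sendall(b"one\ntwo\n")      # both lines are queued on the socket together
f = a.makefile("rb")          # file object with its own private buffer

first = f.readline()          # reads a whole chunk, not just one line

# select() looks only at the socket; the second line now sits in f's buffer.
readable, _, _ = select.select([a], [], [], 0)

second = f.readline()         # served from the buffer, no recv() needed

print(first, readable, second)

f.close()
a.close()
b.close()
```

On a local socket pair the two lines normally arrive in one chunk, so
the select() call comes back empty even though a full line is still
waiting in the file object.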

At any rate, I think it would put this in better perspective
to recall that pipes, terminals and in general any "slow"
devices have the same issues, and that they work out the same
in Python as in the original C, with socket file descriptors
in place of socket objects and stdio file pointers in place
of file objects.
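
The same effect can be sketched with a pipe instead of a socket.
This assumes a Unix-like platform, where select() accepts pipe file
descriptors, and a modern Python 3 buffered file object standing in
for the stdio file pointer:

```python
import os
import select

r, w = os.pipe()
os.write(w, b"one\ntwo\n")    # both lines go into the pipe in one write

f = os.fdopen(r, "rb")        # buffered file object over the read end
first = f.readline()          # drains the pipe into the file object's buffer

# select() checks the descriptor only, so it sees an empty pipe.
readable, _, _ = select.select([r], [], [], 0)

second = f.readline()         # still available, but only through f

print(first, readable, second)

os.close(w)
f.close()                     # also closes the read end r
```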

It's definitely a problem, and some kind of solution might be
well received, but it needs to be portable (so forget MSG_PEEK
unless you're really confident that it will be supported on
every platform that now supports sockets to some useful degree),
and it would be nice if it applied to the problem in general and
not just to sockets.  I think the root of the problem is that
select() doesn't look at the process-side buffers in fileobject
instances, and it can't be made to do so because that information
isn't available from the stdio file pointer underneath the
fileobject.  So, to start with, you need a replacement for
fileobject.
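
As a rough sketch of what such a replacement might look like (the
names LineReader, pending() and select_readable() are hypothetical,
not part of any library), the reader below keeps its buffer where the
caller can see it, and a small wrapper around select() counts buffered
data as readable.  It uses only recv() and select(), with no MSG_PEEK,
and assumes modern Python 3 with socket.socketpair() for the demo:

```python
import select
import socket


class LineReader:
    """Hypothetical stand-in for the socket file object with a visible buffer."""

    def __init__(self, sock, bufsize=8192):
        self.sock = sock
        self.bufsize = bufsize
        self.buf = b""

    def fileno(self):
        # Lets select.select() accept the reader directly.
        return self.sock.fileno()

    def pending(self):
        # True if some data is already buffered on the process side.
        return bool(self.buf)

    def readline(self):
        # Read in large chunks, keeping any over-read data in self.buf.
        while b"\n" not in self.buf:
            chunk = self.sock.recv(self.bufsize)
            if not chunk:                     # EOF: hand back what is left
                line, self.buf = self.buf, b""
                return line
            self.buf += chunk
        line, _, self.buf = self.buf.partition(b"\n")
        return line + b"\n"


def select_readable(readers, timeout=None):
    """Like select() for reading, but counts buffered data as readable."""
    ready = [r for r in readers if r.pending()]
    waiting = [r for r in readers if not r.pending()]
    if ready:
        timeout = 0                           # something is ready: just poll the rest
    if waiting:
        ready += select.select(waiting, [], [], timeout)[0]
    return ready


# Demo; assumes socket.socketpair() is available on this platform.
a, b = socket.socketpair()
b.sendall(b"one\ntwo\n")
reader = LineReader(a)

first = reader.readline()
ready = select_readable([reader], timeout=0)   # buffered line is noticed
second = reader.readline()
after = select_readable([reader], timeout=0)   # nothing buffered, nothing on the socket

a.close()
b.close()
```

Note that pending() only says that some bytes are buffered, not that a
complete line is, so a readline() after it could still block; a fuller
version would have to decide what "readable" should mean for each kind
of read.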

   Donn Cave, donn at u.washington.edu


