[Python-Dev] sock.close() not closing?

Guido van Rossum guido at python.org
Wed May 7 18:42:10 CEST 2008


On Wed, May 7, 2008 at 7:37 AM, Sjoerd Mullender <sjoerd at acm.org> wrote:
> On 2008-05-07 13:37, Amaury Forgeot d'Arc wrote:
>
> > Hello,
> >
> > 2008/5/7 Sjoerd Mullender <sjoerd at acm.org>:
> >
> > > Why does sock.close() not actually close sock?
> > >
> > >  If I run the code
> > >
> > >  import socket
> > >  sock = socket.socket()
> > >  ...
> > >  sock.close()
> > >
> > >  I would expect that a system call is done to actually close the socket
> > > and free the file descriptor.  But that does not happen.  Look at the
> > > code in socket.py.  It merely replaces the socket instance with a dummy
> > > instance so that all subsequent calls on the sock object fail, but it
> > > does nothing else!
> > >
> >
> > It does close the socket:
> >
> > In socket.py, when self._sock is replaced, its __del__ method will be
> > called.
> > This __del__ is implemented in C, in socketmodule.c:
> >
> > static void
> > sock_dealloc(PySocketSockObject *s)
> > {
> >        if (s->sock_fd != -1)
> >                (void) SOCKETCLOSE(s->sock_fd);
> >        Py_TYPE(s)->tp_free((PyObject *)s);
> > }
> >
> >
> > Of course, if you call sock.dup() or sock.makefile(),
> > there is another reference to the underlying _sock, and you must
> > close() all these objects.
> >
> >
>
>  I have to question the design of this.  When I close() an object I
>  expect it to be closed there and then and not at some indeterminate
>  later time (well, it is determinate when you're fully aware of all
>  references, but often you aren't--trust me, I understand reference
>  counting).
>
>  Then there also seems to be a bug in imaplib.IMAP4_SSL since in its
>  shutdown method it closes the socket but leaves the sslobj untouched.  I
>  assume that that object keeps a reference to the socket.
>
>  But as I said, I expect the close() to actually close.

I agree that the design is incredibly fragile. It came about from a
desire to remain compatible with the semantics enshrined in the first
implementation supporting makefile() -- once you call makefile(), you
have multiple references to the socket, and the choice was made to
dup() the file descriptor for each reference. This meant that, while
.close() on each object would close the file descriptor, the socket
connection would not be closed until the last fd was closed,
essentially letting the Unix kernel do our reference counting for us.
Example:

$ python2.2
Python 2.2.3+ (#1, Sep  6 2005, 04:14:07)
[GCC 4.0.2 20050808 (prerelease) (Debian 4.0.1-4ubuntu6)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import socket
>>> s = socket.socket()
>>> s.connect(('python.org', 80))
>>> f = s.makefile()
>>> s.fileno()
3
>>> f.fileno()
4
>>>
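
The same "let the kernel count the references" behaviour can be sketched
on a current Python 3. This is only an illustration under assumptions that
are not in the sessions shown here: socket.socketpair() stands in for the
connection to python.org, s.dup() stands in for what makefile() did back
then, and the function name is made up.

```python
import socket

def dup_keeps_connection_open():
    # A connected pair stands in for a real network connection.
    a, b = socket.socketpair()
    c = a.dup()                      # new socket object with its own descriptor
    distinct = a.fileno() != c.fileno()
    a.close()                        # closes only a's descriptor
    b.sendall(b"ping")
    data = c.recv(4)                 # the connection is still alive through c
    c.close()                        # last descriptor gone, so the kernel closes it
    peer_saw_eof = b.recv(4) == b""  # the peer now reads end-of-file
    b.close()
    return distinct, data, peer_saw_eof
```

The peer only sees end-of-file after the second close(), which is the
property the dup()-per-reference design relied on.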

Unfortunately we couldn't use this architecture on Windows, which
doesn't (or at the time didn't) have a dup() system call, and I
believe that's where the code that doesn't close the socket but relies
on refcounting comes from. The SSL architecture also caused additional
problems. In the end a version of this approach was used on all
platforms, so that we now have the following:

$ python2.6
Python 2.6a1 (trunk:61189, Mar  2 2008, 17:07:23)
[GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import socket
>>> s = socket.socket()
>>> s.connect(('python.org', 80))
>>> f = s.makefile()
>>> s.fileno()
3
>>> f.fileno()
3
>>>

Note that s and f now share a file descriptor! Python 2.4 behaves the
same way. I don't recall if we changed this in 2.3 or in 2.4.
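
A quick way to check the sharing on whatever interpreter is at hand is
sketched below. It assumes a current Python 3 and uses a socketpair
instead of python.org; neither comes from the sessions in this message,
and the function name is hypothetical.

```python
import socket

def makefile_shares_descriptor():
    # A connected pair avoids needing network access.
    a, b = socket.socketpair()
    f = a.makefile("rb")
    try:
        # True when the stream reports the socket's own descriptor
        # rather than a dup()ed one.
        return a.fileno() == f.fileno()
    finally:
        f.close()
        a.close()
        b.close()
```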

All this was because of the requirement that you should be allowed to
close (or simply delete) the original socket even when you continue to
use the stream(s) derived from it using makefile(). This requirement
existed out of the desire to be backwards compatible with the original
Unix implementation, which had this property.
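
As a minimal sketch of that requirement, assuming a current CPython 3
(where, as far as I understand it, the socket defers the real close until
the last makefile() stream is closed) and a socketpair in place of a real
connection; the function name is hypothetical:

```python
import socket

def stream_outlives_socket():
    a, b = socket.socketpair()
    f = a.makefile("rb")
    a.close()                  # request the close; the stream still holds a reference
    b.sendall(b"hello\n")
    line = f.readline()        # reading through the derived stream still works
    f.close()                  # last reference released, descriptor really closed
    fd_after = a.fileno()      # -1 once the descriptor has been released
    b.close()
    return line, fd_after
```

Here close() on the socket is only a request; the descriptor goes away
when the stream is closed too.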

I would be okay with changing the requirements in Py3k so that you are
required to keep the socket object open at least as long as you plan
on using the derived stream(s), but this will require some careful
redesign, especially in the light of SSL support. (Read ssl.py and
_ssl.c to understand the level of intertwinement.) In 2.6 I think we
should leave the design alone (to the extent that it hasn't already
changed due to Bill Janssen's new, enhanced SSL implementation) for
best compatibility.

I don't know anything about the IMAP issue, sorry.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

