[Python-Dev] socket.try_reuse_address()

Trent Nelson tnelson at onresolve.com
Tue Apr 29 15:58:02 CEST 2008


Since the recent changes to networking-oriented tests (issue 2550, r62234 and r62237), I think it's safe to say stability of the test suite on all the non-Windows platforms has improved significantly in this area (I don't recall seeing any socket errors in *nix buildbot logs since those commits).

However, Windows buildbots are still periodically failing.  More specifically, my Windows buildbots are still failing.  One thing that's different about my buildbots is that two are being run at the same time for both trunk and py3k -- one doing an x86 build, the other doing an x64 build.

Since the changes in the aforementioned revisions, the behaviour of my buildbots has definitely improved -- they no longer completely wedge on test_async(chat|core), mainly due to abolishing all use of SO_REUSEADDR as a socket option in any network-oriented tests.

However, periodically, they're still dying/failing in a variety of ways -- see relevant log snippets at the end of this e-mail for examples.  I attribute this to the fact that SO_REUSEADDR is still set as a socket option in asyncore.py and SocketServer.py.  Basically, SO_REUSEADDR should *never* be set on Windows for TCP/IP sockets.  Using asyncore.py as an example, here are two ways we could handle this:

1. Hard code the Windows opt-out:
--- asyncore.py (revision 62509)
+++ asyncore.py (working copy)
@@ -267,6 +267,8 @@

     def set_reuse_addr(self):
         # try to re-use a server port if possible
+        if os.name == 'nt' and self.socket.type != socket.SOCK_DGRAM:
+            return
         try:
             self.socket.setsockopt(
                 socket.SOL_SOCKET, socket.SO_REUSEADDR,

2. Introduce socket.try_reuse_address():
--- asyncore.py (revision 62509)
+++ asyncore.py (working copy)
@@ -266,15 +266,7 @@
         self.add_channel(map)

     def set_reuse_addr(self):
-        # try to re-use a server port if possible
-        try:
-            self.socket.setsockopt(
-                socket.SOL_SOCKET, socket.SO_REUSEADDR,
-                self.socket.getsockopt(socket.SOL_SOCKET,
-                                       socket.SO_REUSEADDR) | 1
-                )
-        except socket.error:
-            pass
+        self.socket.try_reuse_address()


With try_reuse_address implemented as follows:

--- socket.py   (revision 62509)
+++ socket.py   (working copy)
@@ -197,6 +197,10 @@
         Return a new socket object connected to the same system resource."""
         return _socketobject(_sock=self._sock)

+    def try_reuse_address(self):
+        if not (os.name == 'nt' and self._sock.type != SOCK_DGRAM):
+            self._sock.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
+
     def makefile(self, mode='r', bufsize=-1):
         """makefile([mode[, bufsize]]) -> file object

I prefer the latter as it's cleaner, easier to document and encapsulates what we're trying to do relatively well.  The affected modules would be asyncore.py, SocketServer.py and idlelib/rpc.py.  Thoughts?
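
For illustration only, the same check can be sketched as a standalone function.  This is a rough sketch and not the patch itself: it assumes a free function taking the socket as an argument instead of a method on the socket object, and the True/False return value is a hypothetical addition so the caller can tell whether the option was set.

```python
import os
import socket


def try_reuse_address(sock):
    """Set SO_REUSEADDR unless sock is a non-datagram socket on Windows.

    Returns True if the option was set, False if it was skipped.
    (The return value is an illustrative addition, not part of the patch.)
    """
    # Same condition as in the socket.py diff: opt out on Windows for
    # anything that is not a datagram socket.
    if os.name == 'nt' and sock.type != socket.SOCK_DGRAM:
        return False
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    return True


if __name__ == '__main__':
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        was_set = try_reuse_address(listener)
        # Port 0 lets the OS pick a free port, so no fixed port is needed.
        listener.bind(('127.0.0.1', 0))
        listener.listen(1)
        print(was_set, listener.getsockname())
    finally:
        listener.close()
```

Callers such as asyncore.py or SocketServer.py would then call this one helper before bind() rather than each carrying its own platform check.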

Regards,

        Trent.


<eg1>
test_ftplib

remoteFailed: [Failure instance: Traceback (failure with no frames): twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion.
]
</eg1>

<eg2>
test_asynchat
test test_asynchat failed -- errors occurred; run in verbose mode for details
[snip to bottom of log where test_asynchat is re-run]
1 test failed:
    test_asynchat
33 tests skipped:
    test__locale test_aepack test_applesingle test_cProfile
    test_commands test_crypt test_curses test_dbm test_epoll
    test_fcntl test_fork1 test_gdbm test_grp test_ioctl test_kqueue
    test_macostools test_mhlib test_nis test_openpty test_ossaudiodev
    test_pipes test_poll test_posix test_pty test_pwd test_resource
    test_scriptpackages test_signal test_syslog test_threadsignals
    test_wait3 test_wait4 test_zipfile64
Those skips are all expected on win32.
Re-running failed tests in verbose mode
Re-running test 'test_asynchat' in verbose mode
test_close_when_done (test.test_asynchat.TestAsynchat) ... ok
test_empty_line (test.test_asynchat.TestAsynchat) ... ok
test_line_terminator1 (test.test_asynchat.TestAsynchat) ... ok
test_line_terminator2 (test.test_asynchat.TestAsynchat) ... ok
test_line_terminator3 (test.test_asynchat.TestAsynchat) ... ok
test_none_terminator (test.test_asynchat.TestAsynchat) ... ok
test_numeric_terminator1 (test.test_asynchat.TestAsynchat) ... ok
test_numeric_terminator2 (test.test_asynchat.TestAsynchat) ... ok
test_simple_producer (test.test_asynchat.TestAsynchat) ... ok
test_string_producer (test.test_asynchat.TestAsynchat) ... ok
test_close_when_done (test.test_asynchat.TestAsynchat_WithPoll) ... ok
test_empty_line (test.test_asynchat.TestAsynchat_WithPoll) ... ok
test_line_terminator1 (test.test_asynchat.TestAsynchat_WithPoll) ... ok
test_line_terminator2 (test.test_asynchat.TestAsynchat_WithPoll) ... ok
test_line_terminator3 (test.test_asynchat.TestAsynchat_WithPoll) ... ok
test_none_terminator (test.test_asynchat.TestAsynchat_WithPoll) ... ok
test_numeric_terminator1 (test.test_asynchat.TestAsynchat_WithPoll) ... ok
test_numeric_terminator2 (test.test_asynchat.TestAsynchat_WithPoll) ... ok
test_simple_producer (test.test_asynchat.TestAsynchat_WithPoll) ... ok
test_string_producer (test.test_asynchat.TestAsynchat_WithPoll) ... ok
test_find_prefix_at_end (test.test_asynchat.TestHelperFunctions) ... ok
test_basic (test.test_asynchat.TestFifo) ... ok
test_given_list (test.test_asynchat.TestFifo) ... ok

----------------------------------------------------------------------
Ran 23 tests in 11.812s

OK
</eg2>
(Note that re-running the test here didn't result in the test failing again.)

<eg3>
1 test failed:
    test_smtplib

Traceback (most recent call last):
  File "S:\buildbots\python\3.0.nelson-windows\build\lib\threading.py", line 493, in _bootstrap_inner
    self.run()
  File "S:\buildbots\python\3.0.nelson-windows\build\lib\threading.py", line 449, in run
    self._target(*self._args, **self._kwargs)
  File "S:\buildbots\python\3.0.nelson-windows\build\lib\test\test_smtplib.py", line 106, in debugging_server
    poll_fun(0.01, asyncore.socket_map)
  File "S:\buildbots\python\3.0.nelson-windows\build\lib\asyncore.py", line 132, in poll
    read(obj)
  File "S:\buildbots\python\3.0.nelson-windows\build\lib\asyncore.py", line 72, in read
    obj.handle_error()
  File "S:\buildbots\python\3.0.nelson-windows\build\lib\asyncore.py", line 68, in read
    obj.handle_read_event()
  File "S:\buildbots\python\3.0.nelson-windows\build\lib\asyncore.py", line 390, in handle_read_event
    self.handle_read()
  File "S:\buildbots\python\3.0.nelson-windows\build\lib\test\test_ssl.py", line 524, in handle_read
    data = self.recv(1024)
  File "S:\buildbots\python\3.0.nelson-windows\build\lib\asyncore.py", line 342, in recv
    data = self.socket.recv(buffer_size)
  File "S:\buildbots\python\3.0.nelson-windows\build\lib\ssl.py", line 247, in recv
    return self.read(buflen)
  File "S:\buildbots\python\3.0.nelson-windows\build\lib\ssl.py", line 162, in read
    v = self._sslobj.read(len or 1024)
socket.error: [Errno 10053] An established connection was aborted by the software in your host machine
</eg3>



