[Python-ideas] WSAPoll and tulip

Tue Nov 27 13:33:25 CET 2012

    A weekend or two ago, I was planning to work on some ideas I had
    regarding IOCP and the tulip/async-IO discussion.

    I ended up getting distracted by WSAPoll.  WSAPoll is a function
    that Microsoft introduced with Windows Vista/Server 2008 and that
    is intended to be semantically equivalent to poll() on UNIX.

    I decided to play around and see what it would take to make it
    available via select.poll() on Windows, eventually hacking it
    into a working state.

    Issue: http://bugs.python.org/issue16507
    Patch: http://bugs.python.org/file28038/wsapoll.patch

    So, it basically works.  poll() on Windows, who would have thought?

    It's almost impossible to test with our current infrastructure; all
    our unit tests seem to pass pipes and other objects that aren't
    Winsock-backed sockets to poll(), which, as with select() on
    Windows, isn't supported.

    I suspect Twisted's test suite would give it a much better workout
    (CC'd Glyph just so it's on his radar).  I ended up having to verify
    it worked with some admittedly hacky dual-python-console sessions,
    one poll()'ing as a server, the other connecting as a client.  It
    definitely works, so it's worth keeping in mind for the future.
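
    For anyone wanting to repeat that check without two consoles, the
    same server/client exercise can be sketched in a single process.
    This is a minimal illustration, not part of the patch: the name
    poll_accept_demo and the loopback address are made up for the
    example, and it assumes a platform (or a patched build) where
    select.poll() exists.

```python
import select
import socket


def poll_accept_demo(timeout_ms=2000):
    """Return the listening socket's fd and whatever poll() reports for it."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    server.listen(1)

    poller = select.poll()          # assumption: select.poll is available here
    poller.register(server.fileno(), select.POLLIN)

    # Stand-in for the second console: connect as a client.
    client = socket.create_connection(server.getsockname())
    try:
        # A pending connection makes the listening socket readable.
        events = poller.poll(timeout_ms)
        return server.fileno(), events
    finally:
        client.close()
        server.close()


if __name__ == "__main__":
    fd, events = poll_accept_demo()
    for ready_fd, mask in events:
        print(ready_fd == fd, bool(mask & select.POLLIN))
```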

    It's still O(N) in the number of sockets, like poll() on UNIX, but
    that's better than the 64/512 limit select() is currently stuck
    with on Windows.

    Didn't have much luck trying to get the patched Python working with
    tulip's PollReactor, unfortunately, so I just wanted to provide some
    feedback on that experience.

    First bit of feedback: man, debugging `yield from` stuff is *hard*.
    Tulip just flat out didn't work with the PollReactor from the start,
    and it was dying in a non-obvious way.

    So, I attached both pdb and the Visual Studio debugger and
    tried to step through everything to figure out why the first call
    to poll() was blowing up (can't remember the exact error message
    but it was along the lines of "you can't poll() whatever it is you
    just asked me to poll(), it's defo' not a socket").

    I eventually, almost by pure luck, traced the problem to the fact
    that PollReactor's __init__ ends up running code that calls poll()
    on two os.pipe() file descriptors (in EventLoop, I think).

    However, when I was looking at the code, it appeared as though the
    first poll() came from the getaddrinfo() call.  So all my
    breakpoints and whatnot were geared towards that, but none of them
    were being hit, and yet poll() was still being called somehow,
    somewhere.

    I ended up having to spend ages traipsing through every line in
    Visual Studio's debugger to try to figure out what the heck was
    going on.  I believe the `yield from` aspect made that so much more
    of an
    arduous affair -- one moment I'm in socketmodule.c's getaddrinfo(),
    then I'm suddenly deep in the bowels of some cryptic eval frame
    black magic, then one 'step' later, I'm over in some completely
    different part of selectmodule.c, and so on.

    I think the reason I found it so tough was that when you're
    single-stepping through each line of a C program, you can pretty
    much always predict what's going to happen when you "step" to the
    next line.

    In this case though, a step of an eval frame would wildly jump
    to seemingly unrelated parts of C code.  As far as I could tell,
    there was no easy/obvious way to figure out where it would go
    before stepping that instruction either (e.g. by probing the
    various locals and whatnot).
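
    At the Python level, one way to see where a `yield from` chain is
    currently parked is to walk the generators' gi_yieldfrom links.
    What follows is only a sketch: the generator names and the
    delegation_chain helper are made up for the example, gi_yieldfrom
    needs Python 3.5 or later (newer than this message), and none of it
    helps with the C-level stepping described above.

```python
import inspect


def inner():
    # Innermost generator: this is where execution is really suspended.
    reply = yield "waiting-in-inner"
    return reply


def middle():
    result = yield from inner()
    return result


def outer():
    result = yield from middle()
    return result


def delegation_chain(gen):
    """Follow gi_yieldfrom links; return the names of the suspended generators."""
    names = []
    while inspect.isgenerator(gen):
        names.append(gen.__name__)
        gen = gen.gi_yieldfrom  # None once a generator is not delegating
    return names


task = outer()
first = next(task)                        # runs through middle() into inner()
chain = delegation_chain(task)            # outermost generator first
state = inspect.getgeneratorstate(task)   # state of the outermost generator
```

    Printing chain before each resume shows which generator will
    actually receive the next send(), which the top-level frame alone
    does not reveal.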

    So, that's the main feedback from that weekend, I guess.  Granted,
    it's more of a commentary on `yield from` than tulip per se, but I
    figured it would be worth offering up my experience nevertheless.

    I ended up with the following patch to avoid the initial poll()
    against os.pipe() file descriptors:

--- a/polling.py        Sat Nov 03 13:54:14 2012 -0700
+++ b/polling.py        Tue Nov 27 07:05:10 2012 -0500
@@ -41,6 +41,7 @@
 import os
 import select
 import time
+import sys
 
 
 class PollsterBase:
@@ -459,6 +460,10 @@
     """
 
     def __init__(self, eventloop, executor=None):
+        if sys.platform == 'win32':
+            # Work around the fact that we can't poll pipes on Windows.
+            if isinstance(eventloop.pollster, PollPollster):
+                eventloop = EventLoop(SelectPollster())
         self.eventloop = eventloop
         self.executor = executor  # Will be constructed lazily.
         self.pipe_read_fd, self.pipe_write_fd = os.pipe()

    By that stage it was pretty late in the day and I accepted defeat.
    My patch didn't really work; it just allowed the test to run to
    completion without the poll() OSError being raised.
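
    One possible direction, sketched here as an assumption rather than
    anything tulip or the patch above actually does: build the wakeup
    channel from a connected socket pair instead of os.pipe(), so that
    everything handed to the pollster is a real socket.  SocketWakeup
    is a hypothetical helper name, and socket.socketpair() is only
    available on Windows from Python 3.5 onward, which postdates this
    message.

```python
import select
import socket


class SocketWakeup:
    """Hypothetical self-wakeup channel built from sockets, not os.pipe()."""

    def __init__(self):
        # Assumption: socketpair() is available (on Windows, Python 3.5+).
        self.reader, self.writer = socket.socketpair()
        self.reader.setblocking(False)
        self.writer.setblocking(False)

    def fileno(self):
        # Lets select()/poll() accept this object in place of a raw fd.
        return self.reader.fileno()

    def wake(self):
        # One byte is enough to make the read end readable.
        self.writer.send(b"x")

    def drain(self):
        # Consume pending wakeup bytes; return b"" if there are none.
        try:
            return self.reader.recv(4096)
        except BlockingIOError:
            return b""

    def close(self):
        self.reader.close()
        self.writer.close()


wakeup = SocketWakeup()
idle, _, _ = select.select([wakeup], [], [], 0)      # nothing written yet
wakeup.wake()
ready, _, _ = select.select([wakeup], [], [], 1.0)   # read end now readable
data = wakeup.drain()
wakeup.close()
```

    The example uses select.select() only to stay portable; the reader
    socket could be registered with a poll object in the same way.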

        Trent.