[Python-ideas] Tulip / PEP 3156 event loop implementation question: CPU vs. I/O starvation

Sat Jan 12 00:41:05 CET 2013

Here's an interesting puzzle. Check out the core of Tulip's event
loop: http://code.google.com/p/tulip/source/browse/tulip/unix_events.py#672

Specifically, each iteration of the loop does something like this:

1. poll for I/O, appending any ready handlers to the _ready queue

2. append any handlers scheduled for a time <= now to the _ready queue

3. while _ready:
       handler = _ready.popleft()
       call handler

It is the loop in step 3 that causes me some concern. In theory it is
possible for a bad callback to make this loop never finish, as
follows:

def hogger():
    tulip.get_event_loop().call_soon(hogger)

Because call_soon() appends the handler to the _ready queue, the while
loop will never finish.
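
To make the failure mode concrete, here is a minimal sketch that uses a
toy stand-in for the loop rather than Tulip's real code. The names
ToyLoop and run_once, and the max_calls safety cap, are assumptions made
up for this illustration; the cap exists only so the demo terminates.

```python
from collections import deque


class ToyLoop:
    """Toy stand-in for the event loop; not Tulip's actual code."""

    def __init__(self):
        self._ready = deque()

    def call_soon(self, callback):
        self._ready.append(callback)

    def run_once(self, max_calls):
        """Step 3 above: drain _ready until it is empty.

        max_calls is a hypothetical safety cap so this demo terminates;
        the loop described above has no such cap.
        Returns the number of callbacks that were run.
        """
        calls = 0
        while self._ready and calls < max_calls:
            handler = self._ready.popleft()
            handler()
            calls += 1
        return calls


loop = ToyLoop()


def hogger():
    loop.call_soon(hogger)  # reschedules itself on every call


loop.call_soon(hogger)
calls = loop.run_once(max_calls=1000)
still_pending = len(loop._ready)
print(calls, still_pending)  # the cap is hit and hogger is still queued
```

Without the cap the queue is never empty when the while condition is
checked, so steps 1 and 2 are never reached again.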

There is a simple enough solution (Tornado uses this AFAIK):

now_ready = list(_ready)
_ready.clear()
for handler in now_ready:
    call handler

However this implies that we go back to the I/O polling code more
frequently. While the I/O polling code sets the timeout to zero when
there's anything in the _ready queue, so it won't block, it still
isn't free; it's an expensive system call that we'd like to put off
until we have nothing better to do.
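
The cost difference can be sketched by counting how often each strategy
returns to the poll step for a chain of callbacks that each schedule
the next one. This is a simulation under stated assumptions: run_chain
and make_step are hypothetical helpers, and the poll is modelled as a
plain counter instead of a real system call.

```python
from collections import deque


def run_chain(n, snapshot):
    """Run a chain of n callbacks, each scheduling the next one.

    Returns how many times the simulated I/O poll was entered.
    snapshot=False drains _ready in place; snapshot=True copies it first.
    """
    ready = deque()
    polls = 0

    def make_step(i):
        def step():
            if i + 1 < n:
                ready.append(make_step(i + 1))
        return step

    ready.append(make_step(0))
    while ready:
        polls += 1  # stands in for the zero-timeout poll system call
        if snapshot:
            now_ready = list(ready)
            ready.clear()
            for handler in now_ready:
                handler()
        else:
            while ready:
                ready.popleft()()
    return polls


drain_polls = run_chain(100, snapshot=False)
snapshot_polls = run_chain(100, snapshot=True)
print(drain_polls, snapshot_polls)
```

In this model the in-place drain enters the poll once for the whole
chain, while the snapshot version enters it once per link.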

I can imagine various patterns where handlers append other handlers to
the _ready queue for immediate execution, and I'd like to make such
patterns efficient (i.e. the user shouldn't have to worry about the cost
of the I/O poll compared to the amount of work appended to the _ready
queue).
It is also convenient to say that a hogger that really wants to hog
the CPU can do so anyway, e.g.:

def hogger():
    while True:
        pass

However this would pretty much assume malice; real-life versions of
the former hogger pattern may be spread across many callbacks and
could be hard to recognize or anticipate.
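
As an illustration of how the pattern can hide, here is a sketch in
which three callbacks schedule each other in a cycle. The callback
names parse, validate and store are invented for this example, and the
iteration cap is again there only so the demo stops.

```python
from collections import deque

ready = deque()


def parse():
    ready.append(validate)


def validate():
    ready.append(store)


def store():
    ready.append(parse)  # closes the cycle; no single callback looks like a hogger


ready.append(parse)
calls = 0
while ready and calls < 999:  # cap added only so the demo terminates
    ready.popleft()()
    calls += 1
print(calls, len(ready))  # the queue is still non-empty when the cap is hit
```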

So what's more important: avoiding I/O starvation at all costs, or making
the callbacks-posting-callbacks pattern efficient? I can see several
outcomes of this discussion: we could end up deciding that one or the
other strategy is always best; we could also leave it up to the
implementation (but then I still would want guidance for what to do in
Tulip); we could even decide this is so important that the user needs
to be able to control the policy here (though I hate having many
configuration options, since in practice few people bother to take
control, and you might as well have hard-coded the default...).

Thoughts? Do I need to explain it better?

-- 
--Guido van Rossum (python.org/~guido)