[Python-ideas] An alternate approach to async IO

Guido van Rossum guido at python.org
Wed Nov 28 03:59:41 CET 2012


On Tue, Nov 27, 2012 at 5:09 PM, Sturla Molden <sturla at molden.no> wrote:
>
> Den 28. nov. 2012 kl. 01:15 skrev Trent Nelson <trent at snakebite.org>:
>
>>
>>    Right, that's why I proposed using non-Python types as buffers
>>    whilst in the background IO threads.  Once the thread finishes
>>    processing the event, it pushes the necessary details onto a
>>    global interlocked list.  ("Details" being event type and possibly
>>    a data buffer if the event was 'data received'.)
>>
>>    Then, when aio.events() is called, CPython code (holding the GIL)
>>    does an interlocked/atomic flush/pop_all, creates the relevant
>>    Python objects for the events, and returns them in a list for
>>    the calling code to iterate over.
>>
>>    The rationale for all of this is that this approach should scale
>>    better when heavily loaded (i.e. tens of thousands of connections
>>    and/or Gb/s traffic).  When you're dealing with that sort of load
>>    on a many-core machine (let's say 16+ cores), an interlocked list
>>    is going to reduce latency versus 16+ threads constantly vying for
>>    the GIL.
>>
>
> Sorry. I changed my mind. I believe you are right after all :-)

It's always great to see people change their mind.

> I see two benefits:

I may not understand the proposal any more, but...

> 1. It avoids contention for the GIL and avoids excessive context switches in the CPython interpreter.

Then why not just have one thread?

> 2. It potentially keeps the thread that runs the CPython interpreter in cache, as it is always active. And thus it also keeps the objects associated with the CPython interpreter in cache.

So what code runs in the other threads? I think I'm confused...

> So yes, it might be better after all :-)
>
>
> I don't think it would matter much for multicore scalability, as the Python processing is likely the more expensive part.

To benefit from multicore, you need to find something that requires a
lot of CPU time and can be done without holding on to the GIL. If it's
not copying bytes, what is it?
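
As an illustration of the kind of work that meets both conditions, here is
a minimal sketch, not something proposed in this thread. It assumes
CPython, whose hashlib documentation states that the GIL is released while
hashing data larger than 2047 bytes. The helper names digest() and
digest_all() are hypothetical.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def digest(buf: bytes) -> str:
    # For large buffers, CPython's hashlib releases the GIL while hashing,
    # so several of these calls can run on different cores at once.
    return hashlib.sha256(buf).hexdigest()


def digest_all(buffers, workers=4):
    # Hypothetical helper: hash every buffer on a small thread pool.
    # pool.map() returns results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(digest, buffers))


if __name__ == "__main__":
    # Eight distinct 1 MiB buffers, standing in for received data.
    buffers = [bytes([i]) * (1 << 20) for i in range(8)]
    for hexdigest in digest_all(buffers):
        print(hexdigest[:16])
```

Whether this runs faster than a single thread depends on the buffer sizes
and the machine. Pure-Python processing in the same threads would
serialize on the GIL again.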

-- 
--Guido van Rossum (python.org/~guido)
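
The scheme Trent describes in the quoted text above (background IO threads
push event details onto a shared list, and aio.events() drains it in one
step) can be sketched roughly as follows. This is only an illustration
under loud assumptions: the proposal uses native threads that do not hold
the GIL, non-Python buffers, and an interlocked list, whereas this sketch
uses ordinary Python threads and a lock-protected list as stand-ins. The
names EventList, io_worker, and events are hypothetical and are not part
of any real aio module.

```python
import threading


class EventList:
    """Stand-in for the 'global interlocked list' (assumption: a lock
    replaces the atomic operations a native implementation would use)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._items = []

    def push(self, event_type, data=None):
        # Called from the background IO threads.
        with self._lock:
            self._items.append((event_type, data))

    def pop_all(self):
        # Flush: swap the whole list out in one short critical section.
        with self._lock:
            items, self._items = self._items, []
        return items


def io_worker(event_list, conn_id, chunks):
    # Hypothetical IO thread: reports each received chunk, then the close.
    for chunk in chunks:
        event_list.push("data_received", (conn_id, bytes(chunk)))
    event_list.push("closed", (conn_id, None))


def events(event_list):
    # Rough analogue of aio.events(): drain everything queued so far and
    # return it as a list for the calling code to iterate over.
    return event_list.pop_all()


if __name__ == "__main__":
    shared = EventList()
    threads = [
        threading.Thread(target=io_worker, args=(shared, i, [b"a", b"b"]))
        for i in range(4)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    for event_type, (conn_id, data) in events(shared):
        print(conn_id, event_type, data)
```

Events from one worker keep their relative order. Events from different
workers may interleave in any order.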


