Asyncio question (rmlibre)

rmlibre at riseup.net rmlibre at riseup.net
Thu Feb 27 18:37:31 EST 2020


What resources are you trying to conserve? 


If you want to conserve time, you shouldn't have to worry about
starting too many background tasks. Asyncio was designed to handle
large numbers of concurrent tasks efficiently: each task is a
lightweight coroutine, not an OS thread, and a task suspended on an
await costs almost nothing while it waits.
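To illustrate (a minimal sketch; the worker/queue names are mine, not
from your code), here are a thousand tasks all parked on their queues at
once, each waking only when an item arrives:

```python
import asyncio

async def worker(queue: asyncio.Queue) -> None:
    # Each worker sleeps on its queue until an item arrives.
    await queue.get()
    queue.task_done()

async def main() -> int:
    queues = [asyncio.Queue() for _ in range(1000)]
    tasks = [asyncio.create_task(worker(q)) for q in queues]
    # All 1000 tasks are now suspended on queue.get(), consuming no CPU.
    for q in queues:
        q.put_nowait(None)          # wake every worker
    await asyncio.gather(*tasks)
    return len(tasks)

print(asyncio.run(main()))  # 1000
```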

For your application, starting a background task per session that
awaits on that session's queue seems like a good idea. It takes full
advantage of async concurrency, while still letting you control the
order of execution within each session.
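A rough sketch of that pattern (the Session class and handle method are
assumptions, not taken from your post):

```python
import asyncio

class Session:
    """One queue and one worker task per session; requests are handled
    strictly in arrival order, so each is effectively atomic."""

    def __init__(self) -> None:
        self.queue: asyncio.Queue = asyncio.Queue()
        self.task = asyncio.create_task(self._worker())

    async def _worker(self) -> None:
        while True:
            request = await self.queue.get()
            if request is None:        # sentinel: session closed
                break
            await self.handle(request)
            self.queue.task_done()

    async def handle(self, request: str) -> None:
        # Stand-in for the real request handling.
        print("handled", request)

    async def close(self) -> None:
        await self.queue.put(None)
        await self.task

async def main() -> None:
    session = Session()
    await session.queue.put("req-1")
    await session.queue.put("req-2")
    await session.close()

asyncio.run(main())
```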

There may be other efficiency gains to be had as well, for instance by
running everything concurrently except the precise changes that need to
be atomic.


However, if you want to conserve CPU cycles per unit time, then
processing requests sequentially, one at a time, is the best option,
although there's little need for async code in that case.
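That approach reduces to a single shared queue drained by one loop (a
sketch; it could equally be plain synchronous code):

```python
import asyncio

async def main() -> list:
    queue: asyncio.Queue = asyncio.Queue()
    for i in range(3):
        queue.put_nowait(f"req-{i}")

    handled = []
    # One consumer drains the queue, so requests never overlap:
    # minimal CPU contention, but no concurrency either.
    while not queue.empty():
        handled.append(await queue.get())
    return handled

print(asyncio.run(main()))  # ['req-0', 'req-1', 'req-2']
```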


Or, if you'd like to conserve memory, making the code more
generator-based is a good option. Lazy computation is quite efficient
in both memory and time. That said, rewriting a codebase around
generators can be a lot of work, and the gains won't really be felt
unless your code handles "big data" or very large requests.
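The async-generator form of this idea looks something like the sketch
below (read_records is a hypothetical stand-in for whatever produces
your data): only the record currently being processed is held in
memory, rather than the whole result set.

```python
import asyncio

async def read_records(n: int):
    # An async generator yields one record at a time instead of
    # building a full list in memory.
    for i in range(n):
        await asyncio.sleep(0)   # stand-in for awaiting a chunk of I/O
        yield i

async def main() -> int:
    total = 0
    async for record in read_records(1000):
        total += record           # process and discard each record
    return total

print(asyncio.run(main()))  # 499500
```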


In any case, you'd probably want to run benchmark and profiling tools
against a mock-up of your code, and optimize/experiment only after
you've observed an efficiency problem and deduced its causes. Barring
that, it's guesswork and may just be a waste of time.
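A minimal way to do that with the standard library (the workload here
is a placeholder for a mock-up of your real request handling):

```python
import asyncio
import cProfile
import io
import pstats
import time

async def workload() -> None:
    # Placeholder for a mock-up of the real request handling.
    await asyncio.gather(*(asyncio.sleep(0) for _ in range(1000)))

def profile_run() -> str:
    profiler = cProfile.Profile()
    start = time.perf_counter()
    profiler.enable()
    asyncio.run(workload())
    profiler.disable()
    elapsed = time.perf_counter() - start
    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
    return f"elapsed {elapsed:.4f}s\n" + out.getvalue()

print(profile_run())
```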




On 2020-02-21 17:00, python-list-request at python.org wrote:
> Hi all

> I use asyncio in my project, and it works very well without my having to understand what goes on under the hood. It is a multi-user client/server system, and I want it to scale to many concurrent users. I have a situation where I have to decide between two approaches, and I want to choose the least resource-intensive, but I find it hard to reason about which, if either, is better.
>
> I use HTTP. On the initial connection from a client, I set up a session object, and the session id is passed to the client. All subsequent requests from that client include the session id, and the request is passed to the session object for handling.
>
> It is possible for a new request to be received from a client before the previous one has been completed, and I want each request to be handled atomically, so each session maintains its own asyncio.Queue(). The main routine gets the session id from the request and 'puts' the request in the appropriate queue. The session object 'gets' from the queue and handles the request. It works well.
>
> The question is, how to arrange for each session to 'await' its queue. My first attempt was to create a background task for each session which runs for the life-time of the session, and 'awaits' its queue. It works, but I was concerned about having a lot a background tasks active at the same time.
>
> Then I came up with what I thought was a better idea. On the initial connection, I create the session object, send the response to the client, and then 'await' the method that sets up the session's queue. This also works, and there is no background task involved. However, I then realised that the initial response handler never completes, and will 'await' until the session is closed.
>
> Is this better, worse, or does it make no difference? If it makes no difference, I will lean towards the first approach, as it is easier to reason about what is going on.
>
> Thanks for any advice.
>
> Frank Millman


More information about the Python-list mailing list