[Python-Dev] A more flexible task creation

Gustavo Carneiro gjcarneiro at gmail.com
Thu Jun 14 13:33:44 EDT 2018


On Thu, 14 Jun 2018 at 17:40, Tin Tvrtković <tinchester at gmail.com> wrote:

> Hi,
>
> I've been using asyncio a lot lately and have encountered this problem
> several times. Imagine you want to run a lot of queries against a database;
> spawning 10000 tasks in parallel will probably cause a lot of them to fail.
> What you need is a task pool of sorts, to limit concurrency and do only 20
> requests in parallel.
>
> If we were doing this synchronously, we wouldn't spawn 10000 threads using
> 10000 connections, we would use a thread pool with a limited number of
> threads and submit the jobs into its queue.
>
> To me, tasks are (somewhat) logically analogous to threads. The solution
> that first comes to mind is to create an AsyncioTaskExecutor with a
> submit(coro, *args, **kwargs) method. Put a reference to the coroutine and
> its arguments into an asyncio queue. Spawn n tasks pulling from this queue
> and awaiting the coroutines.
>

> It'd probably be useful to have this in the stdlib at some point.
>

Probably a good idea, yes, because it seems a rather common use case.
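
For illustration, the queue-plus-workers idea described above could be
sketched roughly as follows.  This is only a sketch, not an existing asyncio
API: the name TaskPool, its size parameter, and the choice to return a plain
Future from submit() are all assumptions made for the example, and submit()
takes a coroutine function plus arguments so that nothing is created until a
worker is free.

```python
import asyncio


class TaskPool:
    """Hypothetical pool: at most `size` submitted coroutines run at once."""

    def __init__(self, size):
        self._size = size
        self._queue = None
        self._workers = []

    async def __aenter__(self):
        # Create the queue inside the running loop, then spawn the workers.
        self._queue = asyncio.Queue()
        self._workers = [
            asyncio.ensure_future(self._worker()) for _ in range(self._size)
        ]
        return self

    async def __aexit__(self, *exc_info):
        # Wait for every queued job to finish, then shut the workers down.
        await self._queue.join()
        for worker in self._workers:
            worker.cancel()
        await asyncio.gather(*self._workers, return_exceptions=True)

    def submit(self, func, *args, **kwargs):
        # Must be called from inside the running loop.
        # The future lets the caller await or inspect the job's outcome.
        fut = asyncio.get_running_loop().create_future()
        self._queue.put_nowait((fut, func, args, kwargs))
        return fut

    async def _worker(self):
        while True:
            fut, func, args, kwargs = await self._queue.get()
            try:
                result = await func(*args, **kwargs)
            except asyncio.CancelledError:
                raise
            except Exception as exc:
                if not fut.cancelled():
                    fut.set_exception(exc)
            else:
                if not fut.cancelled():
                    fut.set_result(result)
            finally:
                self._queue.task_done()


async def demo():
    running = 0
    peak = 0

    async def query(i):
        # Stand-in for a database query; tracks how many run concurrently.
        nonlocal running, peak
        running += 1
        peak = max(peak, running)
        await asyncio.sleep(0.01)
        running -= 1
        return i * 2

    async with TaskPool(size=20) as pool:
        futures = [pool.submit(query, i) for i in range(200)]
    return [f.result() for f in futures], peak


if __name__ == "__main__":
    results, peak = asyncio.run(demo())
    print(len(results), "results, peak concurrency", peak)
```

A plain asyncio.Semaphore around each query would also cap concurrency, but
it still creates all 10000 tasks up front; the queue keeps the number of
live tasks at the pool size.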

OTOH, I did something similar but for a different use case.  In my case, I
have a Watchdog class, that takes a list of (coro, *args, **kwargs).  What
it does is ensure there is always a task running for each of the
co-routines, and watch the tasks: if they crash, they are automatically
restarted (with logging).  Then there is a stop() method to cancel the
watchdog-managed tasks and await them.  My use case arises because I tend to
write a lot of singleton-style objects, which need bookkeeping tasks, or
redis pubsub listening tasks, and my primary concern is not starting lots
of tasks, it is that the few tasks I have must be restarted if they crash,
forever.
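
To give an idea of the shape, here is a stripped-down sketch of that kind of
watchdog.  It is not my actual class: the constructor signature, the
restart_delay parameter, and the decision to also restart a coroutine that
returns normally are assumptions made for the example.

```python
import asyncio
import logging

log = logging.getLogger("watchdog")


class Watchdog:
    """Keep one task alive per (coroutine function, args, kwargs) spec."""

    def __init__(self, specs, restart_delay=1.0):
        self._specs = list(specs)
        self._restart_delay = restart_delay  # assumed pause between restarts
        self._tasks = []

    def start(self):
        # Must be called from inside a running event loop.
        self._tasks = [
            asyncio.ensure_future(self._supervise(func, args, kwargs))
            for func, args, kwargs in self._specs
        ]

    async def _supervise(self, func, args, kwargs):
        while True:
            try:
                await func(*args, **kwargs)
                log.warning("%s returned; restarting", func.__name__)
            except asyncio.CancelledError:
                raise  # stop() was called: let the cancellation through
            except Exception:
                log.exception("%s crashed; restarting", func.__name__)
            await asyncio.sleep(self._restart_delay)

    async def stop(self):
        # Cancel the supervised tasks and wait until they have finished.
        for task in self._tasks:
            task.cancel()
        await asyncio.gather(*self._tasks, return_exceptions=True)
        self._tasks = []


async def demo():
    attempts = []

    async def flaky(name):
        # Crashes twice, then behaves like a long-running listener.
        attempts.append(name)
        if len(attempts) < 3:
            raise RuntimeError("boom")
        await asyncio.sleep(3600)

    dog = Watchdog([(flaky, ("listener",), {})], restart_delay=0.01)
    dog.start()
    await asyncio.sleep(0.2)
    await dog.stop()
    return attempts


if __name__ == "__main__":
    print(asyncio.run(demo()))
```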

This is why I think it's not that hard to write "sugar" APIs on top of
asyncio, and everyone's needs will be different.

The strict API compatibility requirements of core Python stdlib, coupled
with the very long feature release life-cycles of Python, make me think
this sort of thing perhaps is better built in a utility library on top of
asyncio, rather than inside asyncio itself?  18 months is a long long time
to iterate on these features.  I can't wait for Python 3.8...


>
> Date: Wed, 13 Jun 2018 22:45:22 +0200
>> From: Michel Desmoulin <desmoulinmichel at gmail.com>
>> To: python-dev at python.org
>> Subject: [Python-Dev] A more flexible task creation
>> Message-ID: <bca6b319-c436-c8c2-bb0e-6707f0495c49 at gmail.com>
>> Content-Type: text/plain; charset=utf-8
>>
>> I was working on concurrency-limiting code for asyncio, so the user
>> may submit as many tasks as one wants, but only a max number of tasks
>> will be submitted to the event loop at the same time.
>>
>> However, I wanted passing an awaitable to always return a task, no
>> matter whether the task was currently scheduled or not. The goal is that
>> you could add done callbacks to it, decide to force schedule it, etc.
>>
>> I dug in the asyncio.Task code, and encountered:
>>
>>     def __init__(self, coro, *, loop=None):
>>         ...
>>         self._loop.call_soon(self._step)
>>         self.__class__._all_tasks.add(self)
>>
>> I was surprised to see that instantiating a Task class has any side
>> effect at all, let alone 2, and one of them being to be immediately
>> scheduled for execution.
>>
>> I couldn't find a clean way to do what I wanted: either you
>> loop.create_task() and you get a task but it runs, or you don't run
>> anything, but you don't get a nice task object to hold on to.
>>
>> I tried several alternatives, like returning a future, and binding the
>> future awaiting to the submission of a task, but that was complicated
>> code that duplicated a lot of things.
>>
>> I tried creating a custom task, but it was even harder, setting a custom
>> event policy, to provide a custom event loop with my own create_task()
>> accepting parameters. That's a lot to do just to provide a parameter to
>> Task, especially if you already use a custom event loop (e.g. uvloop). I
>> was expecting to have to create a task factory only, but task factories
>> can't get any additional parameters from create_task().
>>
>> Additionally, I can't use ensure_future(), as it doesn't allow passing
>> any parameter to the underlying Task, so if I want to accept any
>> awaitable in my signature, I need to provide my own custom
>> ensure_future().
>>
>> All those implementations access a lot of _private_api, and do other
>> shady things that linters hate; plus they are fragile at best. What's
>> more, Task being rewritten in C prevents things like setting self._coro,
>> so we can only inherit from the pure Python slow version.
>>
>> In the end, I can't even await the lazy task, because it blocks the
>> entire program.
>>
>> Hence I have two related but independent proposals:
>>
>> - Allow Task to be created but not scheduled for execution, and add a
>> parameter to ensure_future() and create_task() to control this. Awaiting
>> such a task would just behave like asyncio.sleep(0) until it is scheduled
>> for execution.
>>
>> - Add a parameter to ensure_future() and create_task() named "kwargs"
>> that accepts a mapping to be passed as **kwargs to the underlying
>> created Task.
>>
>> I insist on the fact that the 2 proposals are independent, so please
>> don't reject both if you don't like one or the other. Passing a
>> parameter to the underlying custom Task is still of value even without
>> the unscheduled instantiation, and vice versa.
>>
>> Also, if somebody has any idea on how to make a LazyTask that we can
>> await on without blocking everything, I'll take it.
>>
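
On the LazyTask question at the end of the quoted message: one way to get an
awaitable handle without touching private Task attributes might be a small
wrapper around a plain Future.  The sketch below is only that, a sketch; the
names LazyTask and schedule() are made up for the example, and it is not a
real Task subclass, so it will not show up in all_tasks() and does not
support cancel().

```python
import asyncio


class LazyTask:
    """Hypothetical wrapper: holds a coroutine, runs it only after schedule()."""

    def __init__(self, coro):
        # Must be created inside a running event loop.
        self._coro = coro
        self._task = None
        self._done = asyncio.get_running_loop().create_future()

    def schedule(self):
        # Hand the coroutine to the event loop; safe to call more than once.
        if self._task is None:
            self._task = asyncio.ensure_future(self._coro)
            self._task.add_done_callback(self._copy_outcome)
        return self._task

    def _copy_outcome(self, task):
        # Mirror the real task's outcome onto the public future.
        if self._done.done():
            return
        if task.cancelled():
            self._done.cancel()
        elif task.exception() is not None:
            self._done.set_exception(task.exception())
        else:
            self._done.set_result(task.result())

    def add_done_callback(self, callback):
        self._done.add_done_callback(lambda _fut: callback(self))

    def result(self):
        return self._done.result()

    def __await__(self):
        # Only the awaiting coroutine is suspended; the loop keeps running.
        # shield() keeps one cancelled waiter from cancelling the shared future.
        return asyncio.shield(self._done).__await__()


async def demo():
    events = []

    async def job():
        events.append("job ran")
        return 42

    lazy = LazyTask(job())
    lazy.add_done_callback(lambda t: events.append("callback"))
    await asyncio.sleep(0.01)  # the loop runs, but job() has not started
    events.append("before schedule")
    lazy.schedule()
    value = await lazy
    return value, events


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

A concurrency limiter could return such objects immediately and call
schedule() on them as slots free up.  Note that a LazyTask that is never
scheduled leaves its coroutine un-awaited, which Python reports with a
RuntimeWarning.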


-- 
Gustavo J. A. M. Carneiro
Gambit Research
"The universe is always one step beyond logic." -- Frank Herbert