[Async-sig] "Coroutines" sometimes run without being scheduled on an event loop

Dima Tisnek dimaqq at gmail.com
Thu May 3 21:52:32 EDT 2018


My 2c: don't use py3.4; in fact don't use 3.5 either :)
If you decide to support older Python versions, it's only fair that
a separate implementation may be needed.

Re: overall problem, why not try the following:
wrap each individual task in an async def that staggers, connects,
resolves, and handles its own cancellation (if it didn't win the race).
IMO that's easier to reason about and debug, and it works around your problem ;)
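
A minimal sketch of that idea, assuming Python 3.7+ (for
asyncio.get_running_loop()). The names connect_one and staggered_connect and
the stagger parameter are hypothetical, not from asyncio or from this thread,
and the fixed per-attempt delay is a simplification rather than a full
"happy eyeballs" implementation. The point is only that sock_connect() is
called inside the coroutine body, so nothing connects before the wrapper is
scheduled:

```python
import asyncio
import socket


async def connect_one(addr, delay):
    """Wait `delay` seconds, then try to connect a fresh socket to `addr`."""
    await asyncio.sleep(delay)  # the stagger; no network activity before this
    sock = socket.socket()
    sock.setblocking(False)
    try:
        # sock_connect() is only called here, inside the coroutine body,
        # so it cannot start before this wrapper has been scheduled.
        await asyncio.get_running_loop().sock_connect(sock, addr)
    except BaseException:
        sock.close()  # failed, or cancelled because another attempt won
        raise
    return sock


async def staggered_connect(addresses, stagger=0.25, connect=connect_one):
    """Return the first successful result; cancel and clean up the rest."""
    tasks = [asyncio.ensure_future(connect(addr, i * stagger))
             for i, addr in enumerate(addresses)]
    winner = None
    try:
        for fut in asyncio.as_completed(tasks):
            try:
                winner = await fut
                return winner
            except OSError:
                continue  # this attempt failed; wait for the next one
        raise OSError("all connection attempts failed")
    finally:
        for task in tasks:
            task.cancel()  # no-op for tasks that are already done
        results = await asyncio.gather(*tasks, return_exceptions=True)
        for res in results:
            # Close any socket that connected but did not win the race.
            if isinstance(res, socket.socket) and res is not winner:
                res.close()
```

Usage would look something like
`sock = await staggered_connect([("192.0.2.1", 80), ("192.0.2.2", 80)])`,
with the addresses here being placeholders.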

On Fri, 4 May 2018 at 9:34 AM, twisteroid ambassador <
twisteroid.ambassador at gmail.com> wrote:

> The real problem I'm playing with is implementing "happy eyeballs",
> where I may have several sockets attempting to connect simultaneously,
> and the first one to successfully connect gets used. I had the idea of
> preparing all of the loop.sock_connect() coroutine objects in advance,
> and scheduling them one by one on the loop, but wanted to make double
> sure that the sockets won't start connecting before the coroutines are
> scheduled. I wanted to write something like this:
>
> successful_socket = await staggered_start(
>     [loop.sock_connect(socket.socket(), addr) for addr in addresses])
>
> where async def staggered_start(coros) is some kind of reusable
> scheduling logic. As it turns out, I can't actually depend on
> loop.sock_connect() doing the Right Thing (TM) if I want to support
> Python 3.4.
>
> On Fri, May 4, 2018 at 12:37 AM, Andrew Svetlov
> <andrew.svetlov at gmail.com> wrote:
> > What real problem do you want to solve?
> > Correct code should always use `await loop.sock_connect(sock, addr)`; in
> > this case the behavior difference never hurts you.
> >
> > On Thu, May 3, 2018 at 7:04 PM twisteroid ambassador
> > <twisteroid.ambassador at gmail.com> wrote:
> >>
> >> Hi,
> >>
> >> tl;dr: coroutine functions and regular functions returning Futures
> >> behave differently: the latter may start running immediately without
> >> being scheduled on a loop, or even with no loop running. This might be
> >> bad since the two are sometimes advertised to be interchangeable.
> >>
> >>
> >> I find that sometimes I want to construct a coroutine object, store it
> >> for some time, and run it later. Most times it works like one would
> >> expect: I call a coroutine function which gives me a coroutine object,
> >> I hold on to the coroutine object, I later await it or use
> >> loop.create_task(), asyncio.gather(), etc. on it, and only then it
> >> starts to run.
> >>
> >> However, I have found some cases where the "coroutine" starts running
> >> immediately. The first example is loop.run_in_executor(). I guess this
> >> is somewhat unsurprising since the passed function doesn't actually run
> >> in the event loop. Demonstrated below with strace and the interactive
> >> console:
> >>
> >> $ strace -e connect -f python3
> >> Python 3.6.5 (default, Apr  4 2018, 15:01:18)
> >> [GCC 7.3.1 20180303 (Red Hat 7.3.1-5)] on linux
> >> Type "help", "copyright", "credits" or "license" for more information.
> >> >>> import asyncio
> >> >>> import socket
> >> >>> s = socket.socket()
> >> >>> loop = asyncio.get_event_loop()
> >> >>> coro = loop.sock_connect(s, ('127.0.0.1', 80))
> >> >>> loop.run_until_complete(asyncio.sleep(1))
> >> >>> task = loop.create_task(coro)
> >> >>> loop.run_until_complete(asyncio.sleep(1))
> >> connect(3, {sa_family=AF_INET, sin_port=htons(80),
> >> sin_addr=inet_addr("127.0.0.1")}, 16) = -1 ECONNREFUSED (Connection
> >> refused)
> >> >>> s.close()
> >> >>> s = socket.socket()
> >> >>> coro2 = loop.run_in_executor(None, s.connect, ('127.0.0.1', 80))
> >> strace: Process 13739 attached
> >> >>> [pid 13739] connect(3, {sa_family=AF_INET, sin_port=htons(80),
> >> sin_addr=inet_addr("127.0.0.1")}, 16) = -1 ECONNREFUSED (Connection
> >> refused)
> >>
> >> >>> coro2
> >> <Future pending cb=[_chain_future.<locals>._call_check_cancel() at
> >> /usr/lib64/python3.6/asyncio/futures.py:403]>
> >> >>> loop.run_until_complete(asyncio.sleep(1))
> >> >>> coro2
> >> <Future finished exception=ConnectionRefusedError(111, 'Connection
> >> refused')>
> >> >>>
> >>
> >> Note that with loop.sock_connect(), the connect syscall is only run
> >> after loop.create_task() is called on the coroutine AND the loop is
> >> running. On the other hand, as soon as loop.run_in_executor() is
> >> called on socket.connect, the connect syscall gets called, without the
> >> event loop running at all.
> >>
> >> Another such case is with Python 3.4.2, where even loop.sock_connect()
> >> will run immediately:
> >>
> >> $ strace -e connect -f python3
> >> Python 3.4.2 (default, Oct  8 2014, 10:45:20)
> >> [GCC 4.9.1] on linux
> >> Type "help", "copyright", "credits" or "license" for more information.
> >> >>> import socket
> >> >>> import asyncio
> >> >>> loop = asyncio.get_event_loop()
> >> >>> s = socket.socket()
> >> >>> c = loop.sock_connect(s, ('127.0.0.1', 82))
> >> connect(7, {sa_family=AF_INET, sin_port=htons(82),
> >> sin_addr=inet_addr("127.0.0.1")}, 16) = -1 ECONNREFUSED (Connection
> >> refused)
> >> >>> c
> >> <Future finished exception=ConnectionRefusedError(111, 'Connection
> >> refused')>
> >> >>>
> >>
> >> In both of these cases, the misbehaving "coroutines" aren't actually
> >> defined as coroutine functions, but regular functions returning a
> >> Future, which is probably why they don't act like coroutines. However,
> >> coroutine functions and regular functions returning Futures are often
> >> used interchangeably: Python docs Section 18.5.3.1 even says:
> >>
> >> > Note: In this documentation, some methods are documented as coroutines,
> >> > even if they are plain Python functions returning a Future. This is
> >> > intentional to have a freedom of tweaking the implementation of these
> >> > functions in the future.
> >>
> >> In particular, both run_in_executor() and sock_connect() are
> >> documented as coroutines.
> >>
> >> If an asyncio API may change from a function returning a Future to a
> >> coroutine function, or vice versa, at any time, then one cannot rely
> >> on creating the "coroutine object" without the coroutine running
> >> immediately. This seems like an important gotcha waiting to bite
> >> someone.
> >>
> >> Back to the scenario in the beginning. If I want to write a function
> >> that takes coroutine objects and schedules them to run later, and some
> >> coroutine objects turn out to be misbehaving like above, then they
> >> will run too early. To avoid this, I could either 1. pass the
> >> coroutine functions and their arguments separately "callback style",
> >> 2. use functools.partial or lambdas, or 3. always pass in real
> >> coroutine objects returned from coroutine functions defined with
> >> "async def". Does this sound right?
> >>
> >> Thanks,
> >>
> >> twistero
> >> _______________________________________________
> >> Async-sig mailing list
> >> Async-sig at python.org
> >> https://mail.python.org/mailman/listinfo/async-sig
> >> Code of Conduct: https://www.python.org/psf/codeofconduct/
> >
> > --
> > Thanks,
> > Andrew Svetlov