From nugend at gmail.com Fri Mar 22 12:59:06 2019
From: nugend at gmail.com (Daniel Nugent)
Date: Fri, 22 Mar 2019 12:59:06 -0400
Subject: [Async-sig] Inadvertent layering of synchronous code as frameworks adopt asyncio
In-Reply-To: <8524428679A18B459EA4EF0DD068CDABB915BF7F@PWSSMTEXMBX002.AD.MLP.com>
References: <8524428679A18B459EA4EF0DD068CDABB915BF7F@PWSSMTEXMBX002.AD.MLP.com>
Message-ID: <5d5ce6ae-3fe7-4c2d-b3e7-5e62202756c5@Spark>

Hello, I was hoping that the Async SIG might have some suggestions on how to deal with this sort of issue:

More frameworks are adopting asyncio as time marches on. A notable example of this is Jupyter and the Python kernels it supports (please see the announcement here: blog.jupyter.org/ipython-7-0-async-repl-a35ce050f7f7). This was enabled by a change in Tornado version 5.0 to support the asyncio event loop.

The problem is that this makes any code which inadvertently runs an asyncio event loop (that is, calls through a blocking API provided by a library implemented in asyncio) fail. The Jupyter developers seem to feel that this is a deficiency in the asyncio event loop model and suggest all users encountering such a problem adopt the patch module nest_asyncio (github.com/jupyter/notebook/issues/3397#issuecomment-419474214).

However, it is my understanding that the Python team strongly feels this is not the correct path: bugs.python.org/issue33523, bugs.python.org/issue29558, bugs.python.org/issue22239

I have been trying to figure out the right way to work around this issue, such that a library implemented with asyncio that provides a synchronous API will not cause a problem, and have come up short thus far. I was considering investigating the janus sync/async queue as a way of facilitating communication between the different modes, but I am not sure that the scenario I describe reflects the intended usage. That is, an outer asyncio-driven program fragment calls into middle synchronous code, which calls into inner asynchronous code. It seems that janus is mostly intended to facilitate communication between a single outer asynchronous layer and an inner synchronous layer. However, the documentation is a little sparse, so I may just not understand it yet.

I don't believe I'm the only person struggling to figure out how to deal with this sort of situation, so I think it would be useful for the community to work out a solid answer. For example, I found this blog post, which outlines the same sort of problem and notes that the authors elected to use nest_asyncio: threespeedlogic.com/python-tworoutines.html

If anyone could provide guidance on how to go forward, I would appreciate it.

I would also like to understand the decision making around not allowing event loop nesting/reentrancy, as seen in the bugs.python.org issues I referenced, so that I may better explain the tradeoffs of possibly adopting the nest_asyncio patch module to my peers (for the sake of argument, let's ignore the possible issues with non-standard event loops).

Thank you,

-Dan Nugent
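(A minimal sketch of the failure mode described above. The names are illustrative, not from any particular library: a synchronous facade over an async implementation works from a plain script, but fails as soon as it is called while a loop is already running.)

    import asyncio

    async def fetch():
        await asyncio.sleep(0.1)
        return 42

    def fetch_sync():
        # Blocking facade over the async implementation -- fine from a
        # plain script, where no event loop is running yet.
        return asyncio.get_event_loop().run_until_complete(fetch())

    async def outer():
        # Called while a loop is already running (as in a Jupyter cell),
        # the nested run_until_complete() raises
        # "RuntimeError: This event loop is already running".
        return fetch_sync()

    print(fetch_sync())   # works: no loop running yet
    asyncio.run(outer())  # raises RuntimeError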
From guido at python.org Mon Mar 25 19:37:07 2019
From: guido at python.org (Guido van Rossum)
Date: Mon, 25 Mar 2019 16:37:07 -0700
Subject: [Async-sig] Inadvertent layering of synchronous code as frameworks adopt asyncio
In-Reply-To: <5d5ce6ae-3fe7-4c2d-b3e7-5e62202756c5@Spark>
References: <8524428679A18B459EA4EF0DD068CDABB915BF7F@PWSSMTEXMBX002.AD.MLP.com> <5d5ce6ae-3fe7-4c2d-b3e7-5e62202756c5@Spark>
Message-ID:

Thanks for bringing this up -- I think it will be good to get to the bottom of this, before the Jupyter folks accidentally get everyone to use an approach that is unsound. Maybe they can be redirected to a better strategy, or maybe they can convince us to change asyncio: it's totally possible that the reasoning behind this restriction is no longer really valid.

I expect that Yury will have to jump in, but I believe he's busy with a release. I also hope Nathaniel has something to say -- I wonder if trio supports nested event loops? (And maybe a Tornado developer?)

In the meantime, my take on this is that a nested event loop invocation by a nominally synchronous function, i.e. something that calls run_until_complete(), violates a guarantee that asyncio makes: callbacks may only run when `await` is called, and thus any state that is shared between callbacks or tasks is implicitly protected from mutation *between* `await` calls. This means that asyncio programmers don't have to worry about a class of nasty threading bugs caused by arbitrary interleaving of threads. For example, take these three lines from asyncio/queues.py:

    self._put(item)
    self._unfinished_tasks += 1
    self._finished.clear()

Because there's no `await` visible here, as a reader I know that the accounting of finished and unfinished tasks here will always be in a consistent state when some other piece of code receives control.

Also, I looked into nest_asyncio, and I definitely think it should not be recommended -- it disables use of the asyncio accelerator classes implemented in C (starting with Python 3.6).

One final thing. What we're talking about here is nested invocation of the "event pump". There's another form of nested event loop invocation, where two separate event loop objects exist. That is a much more worrisome scenario, because callbacks associated with one event loop won't run at all while one is waiting for a task on the other loop. Fortunately that's not what is requested here. :-)

--Guido

On Fri, Mar 22, 2019 at 10:00 AM Daniel Nugent wrote:
> Hello, I was hoping that the Async SIG might have some suggestions on how to deal with this sort of issue:
>
> More frameworks are adopting asyncio as time marches on. A notable example of this is Jupyter and the Python kernels it supports (please see the announcement here: blog.jupyter.org/ipython-7-0-async-repl-a35ce050f7f7). This was enabled by a change in Tornado version 5.0 to support the asyncio event loop.
>
> The problem is that this makes any code which inadvertently runs an asyncio event loop (that is, calls through a blocking API provided by a library implemented in asyncio) fail. The Jupyter developers seem to feel that this is a deficiency in the asyncio event loop model and suggest all users encountering such a problem adopt the patch module nest_asyncio (github.com/jupyter/notebook/issues/3397#issuecomment-419474214).
>
> However, it is my understanding that the Python team strongly feels this is not the correct path: bugs.python.org/issue33523, bugs.python.org/issue29558, bugs.python.org/issue22239
>
> I have been trying to figure out the right way to work around this issue, such that a library implemented with asyncio that provides a synchronous API will not cause a problem, and have come up short thus far. I was considering investigating the janus sync/async queue as a way of facilitating communication between the different modes, but I am not sure that the scenario I describe reflects the intended usage. That is, an outer asyncio-driven program fragment calls into middle synchronous code, which calls into inner asynchronous code. It seems that janus is mostly intended to facilitate communication between a single outer asynchronous layer and an inner synchronous layer. However, the documentation is a little sparse, so I may just not understand it yet.
>
> I don't believe I'm the only person struggling to figure out how to deal with this sort of situation, so I think it would be useful for the community to work out a solid answer. For example, I found this blog post, which outlines the same sort of problem and notes that the authors elected to use nest_asyncio: threespeedlogic.com/python-tworoutines.html
>
> If anyone could provide guidance on how to go forward, I would appreciate it.
>
> I would also like to understand the decision making around not allowing event loop nesting/reentrancy, as seen in the bugs.python.org issues I referenced, so that I may better explain the tradeoffs of possibly adopting the nest_asyncio patch module to my peers (for the sake of argument, let's ignore the possible issues with non-standard event loops).
>
> Thank you,
>
> -Dan Nugent

-- 
--Guido van Rossum (python.org/~guido)

From ben at bendarnell.com Mon Mar 25 19:54:46 2019
From: ben at bendarnell.com (Ben Darnell)
Date: Mon, 25 Mar 2019 19:54:46 -0400
Subject: [Async-sig] Inadvertent layering of synchronous code as frameworks adopt asyncio
In-Reply-To:
References: <8524428679A18B459EA4EF0DD068CDABB915BF7F@PWSSMTEXMBX002.AD.MLP.com> <5d5ce6ae-3fe7-4c2d-b3e7-5e62202756c5@Spark>
Message-ID:

On Mon, Mar 25, 2019 at 7:37 PM Guido van Rossum wrote:
> Thanks for bringing this up -- I think it will be good to get to the bottom of this, before the Jupyter folks accidentally get everyone to use an approach that is unsound. Maybe they can be redirected to a better strategy, or maybe they can convince us to change asyncio: it's totally possible that the reasoning behind this restriction is no longer really valid.
>
> I expect that Yury will have to jump in, but I believe he's busy with a release. I also hope Nathaniel has something to say -- I wonder if trio supports nested event loops? (And maybe a Tornado developer?)

Tornado does allow for nested event loops (or did, before we adopted asyncio). It doesn't allow nested invocations of the *same* event loop.

> One final thing. What we're talking about here is nested invocation of the "event pump". There's another form of nested event loop invocation, where two separate event loop objects exist.
> That is a much more worrisome scenario, because callbacks associated with one event loop won't run at all while one is waiting for a task on the other loop. Fortunately that's not what is requested here. :-)

I actually think that nesting multiple event loops is not so problematic, or at least not so problematic as to be worth explicitly prohibiting. You wouldn't want to run_forever an inner event loop while an outer one is blocked, but using an inner short-lived event loop is not so bad. It's not good, because it does block the outer event loop, but there are plenty of things you could do that do that - use requests instead of an async http client, use an inner event loop from a different library that you can't detect, etc. Why single out nesting one asyncio event loop inside another as something to prohibit?

In the past, when I converted a django app to use tornado, I went through a phase where there were multiple nested IOLoops. First, convert all the outgoing network calls (which I guess were urllib2 at the time; requests didn't exist yet) to spin up a short-lived IOLoop and run tornado's AsyncHTTPClient (using the equivalent of IOLoop.run_sync, although that method hadn't been added yet). Then, replace the outer django handlers with tornado handlers one at a time (using tornado's WSGIContainer to run the django parts). Once WSGIContainer was gone, I could change all the run_sync calls to yields so everything ran on the outer event loop. It wasn't the prettiest or fastest thing I've ever done, but it worked.

As for jupyter, I think the best thing for them to do is run all notebook user code in a separate thread dedicated to that purpose, and hide the fact that the notebook itself is running asyncio as much as possible. That user thread can start up its own event loop if it wants, but that's not the jupyter kernel's concern. Until it can be refactored to use separate threads, I think it would be reasonable to let it start up new event loops (and run them for finite durations), although asyncio currently disallows that as long as you're on the same thread as an outer event loop.

-Ben

From guido at python.org Mon Mar 25 20:01:59 2019
From: guido at python.org (Guido van Rossum)
Date: Mon, 25 Mar 2019 17:01:59 -0700
Subject: [Async-sig] Inadvertent layering of synchronous code as frameworks adopt asyncio
In-Reply-To:
References: <8524428679A18B459EA4EF0DD068CDABB915BF7F@PWSSMTEXMBX002.AD.MLP.com> <5d5ce6ae-3fe7-4c2d-b3e7-5e62202756c5@Spark>
Message-ID:

On Mon, Mar 25, 2019 at 4:54 PM Ben Darnell wrote:
> On Mon, Mar 25, 2019 at 7:37 PM Guido van Rossum wrote:
>> Thanks for bringing this up -- I think it will be good to get to the bottom of this, before the Jupyter folks accidentally get everyone to use an approach that is unsound. Maybe they can be redirected to a better strategy, or maybe they can convince us to change asyncio: it's totally possible that the reasoning behind this restriction is no longer really valid.
>>
>> I expect that Yury will have to jump in, but I believe he's busy with a release. I also hope Nathaniel has something to say -- I wonder if trio supports nested event loops? (And maybe a Tornado developer?)
>
> Tornado does allow for nested event loops (or did, before we adopted asyncio). It doesn't allow nested invocations of the *same* event loop.

Good to know.

>> One final thing. What we're talking about here is nested invocation of the "event pump".
>> There's another form of nested event loop invocation, where two separate event loop objects exist. That is a much more worrisome scenario, because callbacks associated with one event loop won't run at all while one is waiting for a task on the other loop. Fortunately that's not what is requested here. :-)
>
> I actually think that nesting multiple event loops is not so problematic, or at least not so problematic as to be worth explicitly prohibiting. You wouldn't want to run_forever an inner event loop while an outer one is blocked, but using an inner short-lived event loop is not so bad. It's not good, because it does block the outer event loop, but there are plenty of things you could do that do that - use requests instead of an async http client, use an inner event loop from a different library that you can't detect, etc. Why single out nesting one asyncio event loop inside another as something to prohibit?

Hm, I didn't mean to single out nesting asyncio. According to (the extreme version of) asyncio's philosophy, *anything* that does blocking I/O is a no-no. (Yes, some people feel even disk I/O should be done asynchronously, and there's a real implementation of that somewhere. Trio supports this: https://trio.readthedocs.io/en/latest/reference-io.html#asynchronous-filesystem-i-o .)

> In the past, when I converted a django app to use tornado, I went through a phase where there were multiple nested IOLoops. First, convert all the outgoing network calls (which I guess were urllib2 at the time; requests didn't exist yet) to spin up a short-lived IOLoop and run tornado's AsyncHTTPClient (using the equivalent of IOLoop.run_sync, although that method hadn't been added yet). Then, replace the outer django handlers with tornado handlers one at a time (using tornado's WSGIContainer to run the django parts). Once WSGIContainer was gone, I could change all the run_sync calls to yields so everything ran on the outer event loop. It wasn't the prettiest or fastest thing I've ever done, but it worked.

I have to admit that practicality probably beats purity here.

> As for jupyter, I think the best thing for them to do is run all notebook user code in a separate thread dedicated to that purpose, and hide the fact that the notebook itself is running asyncio as much as possible. That user thread can start up its own event loop if it wants, but that's not the jupyter kernel's concern. Until it can be refactored to use separate threads, I think it would be reasonable to let it start up new event loops (and run them for finite durations), although asyncio currently disallows that as long as you're on the same thread as an outer event loop.

Given PBP, I wonder if we should just relent and have a configurable flag (off by default) to allow nested loop invocations (both the same loop and a different loop).

-- 
--Guido van Rossum (python.org/~guido)
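(For reference, a rough asyncio equivalent of the tornado IOLoop.run_sync pattern Ben describes -- a sketch only; note that asyncio's current check rejects this when the calling thread is already running a loop.)

    import asyncio

    def run_sync(coro):
        # Spin up a short-lived loop, run one coroutine to completion,
        # then tear the loop down again.
        loop = asyncio.new_event_loop()
        try:
            return loop.run_until_complete(coro)
        finally:
            loop.close()

    # Fine from purely synchronous code. From inside a running loop it
    # currently raises "RuntimeError: Cannot run the event loop while
    # another loop is running" -- the check under discussion here.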
From ben at bendarnell.com Mon Mar 25 20:11:04 2019
From: ben at bendarnell.com (Ben Darnell)
Date: Mon, 25 Mar 2019 20:11:04 -0400
Subject: [Async-sig] Inadvertent layering of synchronous code as frameworks adopt asyncio
In-Reply-To:
References: <8524428679A18B459EA4EF0DD068CDABB915BF7F@PWSSMTEXMBX002.AD.MLP.com> <5d5ce6ae-3fe7-4c2d-b3e7-5e62202756c5@Spark>
Message-ID:

On Mon, Mar 25, 2019 at 8:02 PM Guido van Rossum wrote:
> Given PBP, I wonder if we should just relent and have a configurable flag (off by default) to allow nested loop invocations (both the same loop and a different loop).

Allowing reentrant calls to the same loop is not a good idea IMO. At best, you'll need to carefully ensure that the event loop and task implementations are themselves reentrancy-safe (including the C accelerators and third parties like uvloop?), and then it just invites subtle issues in the applications built on top of it. I don't think there's a good reason to allow or support this (and nest_asyncio should be heavily discouraged). I do, however, think that PBP is a good enough reason to allow opt-in use of multiple event loops nested inside each other (maybe something on the EventLoopPolicy for configuration?).

-Ben

From guido at python.org Mon Mar 25 20:19:02 2019
From: guido at python.org (Guido van Rossum)
Date: Mon, 25 Mar 2019 17:19:02 -0700
Subject: [Async-sig] Inadvertent layering of synchronous code as frameworks adopt asyncio
In-Reply-To:
References: <8524428679A18B459EA4EF0DD068CDABB915BF7F@PWSSMTEXMBX002.AD.MLP.com> <5d5ce6ae-3fe7-4c2d-b3e7-5e62202756c5@Spark>
Message-ID:

On Mon, Mar 25, 2019 at 5:11 PM Ben Darnell wrote:
> On Mon, Mar 25, 2019 at 8:02 PM Guido van Rossum wrote:
>> Given PBP, I wonder if we should just relent and have a configurable flag (off by default) to allow nested loop invocations (both the same loop and a different loop).
>
> Allowing reentrant calls to the same loop is not a good idea IMO. At best, you'll need to carefully ensure that the event loop and task implementations are themselves reentrancy-safe (including the C accelerators and third parties like uvloop?), and then it just invites subtle issues in the applications built on top of it. I don't think there's a good reason to allow or support this (and nest_asyncio should be heavily discouraged). I do, however, think that PBP is a good enough reason to allow opt-in use of multiple event loops nested inside each other (maybe something on the EventLoopPolicy for configuration?).

Well, at least I am not alone in being very wary about nest_asyncio (and disappointed that Jupyter recommends it). It would certainly require carefully ensuring reentrancy of the asyncio implementation. I guess that's one reason why nest_asyncio disables the C accelerators and doesn't work with uvloop.

Regarding reentrancy of applications, I think that would be somewhat limited -- the critical section I showed in my first message in this thread would still be safe, as long as the queue implementation chooses not to call out to code that uses run_until_complete(). We might need a convention to document whether something runs an event loop (in the strict asyncio philosophy this convention is `async def`, of course :-). I guess calling out to a different event loop is no worse than calling out to requests -- I consider both strong violations of asyncio's ideals.

-- 
--Guido van Rossum (python.org/~guido)
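(To make the critical-section point concrete: a sketch of how a reentrant event pump would break the between-`await` guarantee. The helper is hypothetical; the point is only that if a nominally synchronous call pumps the running loop, other tasks can interleave where the code assumes atomicity.)

    class Counter:
        def __init__(self):
            self.value = 0

        async def increment(self):
            # No `await` between the read and the write, so under
            # asyncio's guarantee no other task can run in between.
            current = self.value
            nominally_sync_helper()  # hypothetical synchronous call
            self.value = current + 1

    def nominally_sync_helper():
        # If this ever pumped the running loop (nest_asyncio-style),
        # another increment() task could run here, and its update would
        # be lost when the caller writes back `current + 1`.
        pass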
From glyph at twistedmatrix.com Mon Mar 25 20:37:51 2019
From: glyph at twistedmatrix.com (Glyph)
Date: Mon, 25 Mar 2019 17:37:51 -0700
Subject: [Async-sig] Inadvertent layering of synchronous code as frameworks adopt asyncio
In-Reply-To:
References: <8524428679A18B459EA4EF0DD068CDABB915BF7F@PWSSMTEXMBX002.AD.MLP.com> <5d5ce6ae-3fe7-4c2d-b3e7-5e62202756c5@Spark>
Message-ID: <94598207-bfbc-4a70-a0a9-51f907d93d02@Frontier>

> There's another form of nested event loop invocation where two separate event loop objects exist. That is a much more worrisome scenario, because callbacks associated with one event loop won't run at all while one is waiting for a task on the other loop.

This strikes me as a much *less* worrisome scenario. It's probably a bad idea in application code (just ... await the thing you want to run_until_complete?) but it allows things like debuggers and performance telemetry reporters to use asyncio internally while presenting a necessarily synchronous interface to the caller. So this might be a huge performance problem if you do it accidentally, but it'll also be relatively easy to spot, and crucially it doesn't violate any explicit guarantees of the underlying API.

From glyph at twistedmatrix.com Mon Mar 25 20:44:06 2019
From: glyph at twistedmatrix.com (Glyph)
Date: Mon, 25 Mar 2019 17:44:06 -0700
Subject: [Async-sig] Inadvertent layering of synchronous code as frameworks adopt asyncio
In-Reply-To:
References: <8524428679A18B459EA4EF0DD068CDABB915BF7F@PWSSMTEXMBX002.AD.MLP.com> <5d5ce6ae-3fe7-4c2d-b3e7-5e62202756c5@Spark>
Message-ID:

> As for jupyter, I think the best thing for them to do is run all notebook user code in a separate thread dedicated to that purpose, and hide the fact that the notebook itself is running asyncio as much as possible. That user thread can start up its own event loop if it wants, but that's not the jupyter kernel's concern. Until it can be refactored to use separate threads, I think it would be reasonable to let it start up new event loops (and run them for finite durations), although asyncio currently disallows that as long as you're on the same thread as an outer event loop.

Definitely disagree about this! If you start hiding this, then it's impossible to start background tasks which run on the event loop and update a cell; not to mention that, depending on your library (i.e. whether it's something the kernel itself wants to import), the thread that defined *some* of your classes at import time starts being different from the thread that's executing your code. I frequently use tornado.platform.twisted to do async background work in notebooks, and it would break a ton of my work to start requiring manual event-loop management that can't persist between cells. It's fine for the kernel to just block for a while if it has synchronous work to do; that's one of the core benefits of separating the kernels from the UI.
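(A sketch of the "synchronous interface over a private loop" pattern Glyph describes for debuggers and telemetry reporters. All names are made up, and, as discussed in this thread, asyncio's current check still rejects the run_until_complete() call if the calling thread is itself inside a running loop.)

    import asyncio

    class TelemetryReporter:
        """Synchronous interface; asyncio used purely as an internal detail."""

        def __init__(self):
            # A private loop, distinct from any loop the host application runs.
            self._loop = asyncio.new_event_loop()

        def report(self, payload):
            # Blocks the caller briefly while the private loop does the work.
            return self._loop.run_until_complete(self._send(payload))

        async def _send(self, payload):
            await asyncio.sleep(0)  # stand-in for real async network I/O
            return len(payload)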
From glyph at twistedmatrix.com Mon Mar 25 20:52:03 2019
From: glyph at twistedmatrix.com (Glyph)
Date: Mon, 25 Mar 2019 17:52:03 -0700
Subject: [Async-sig] Inadvertent layering of synchronous code as frameworks adopt asyncio
In-Reply-To:
References: <8524428679A18B459EA4EF0DD068CDABB915BF7F@PWSSMTEXMBX002.AD.MLP.com> <5d5ce6ae-3fe7-4c2d-b3e7-5e62202756c5@Spark>
Message-ID: <6f59bc44-4343-4969-854f-5a3a37b96124@Frontier>

> Allowing reentrant calls to the same loop is not a good idea IMO. At best, you'll need to carefully ensure that the event loop and task implementations are themselves reentrancy-safe (including the C accelerators and third parties like uvloop?), and then it just invites subtle issues in the applications built on top of it. I don't think there's a good reason to allow or support this (and nest_asyncio should be heavily discouraged). I do, however, think that PBP is a good enough reason to allow opt-in use of multiple event loops nested inside each other (maybe something on the EventLoopPolicy for configuration?).

+1 to all of this.

From dimaqq at gmail.com Mon Mar 25 21:59:06 2019
From: dimaqq at gmail.com (Dima Tisnek)
Date: Tue, 26 Mar 2019 10:59:06 +0900
Subject: [Async-sig] Inadvertent layering of synchronous code as frameworks adopt asyncio
In-Reply-To: <6f59bc44-4343-4969-854f-5a3a37b96124@Frontier>
References: <8524428679A18B459EA4EF0DD068CDABB915BF7F@PWSSMTEXMBX002.AD.MLP.com> <5d5ce6ae-3fe7-4c2d-b3e7-5e62202756c5@Spark> <6f59bc44-4343-4969-854f-5a3a37b96124@Frontier>
Message-ID:

End-user point of view, a.k.a. my 2c:

re more worrisome scenario: if "objects" from two event loops depend on each other, that's unsolvable in the general case. On the other hand, what the OP wanted was akin to DAG-like functionality or a locking hierarchy. A naive implementation would block caller callbacks until the callee completes, but that may be what the user actually wanted (?).

re ipython notebook state reuse across cells: that's a whole different can of worms, because cells can be re-evaluated in arbitrary order. As a user I would expect my async code to not interfere with the ipynb internal implementation. In fact, I'd rather see ipynb isolated into its own thread/loop/process. After all, I would, at times, like to use a debugger. (Full disclosure: I use the debugger in ipython and it never really worked for me in a sync notebook, let alone async.)

re original proposal: async code calls a synchronous function that wants to do some async work and wait for the result -- for example, a telemetry bolt-on. I would expect the 2 event loops to be isolated. Attempting to await across loops should raise an exception, as it does. When some application wants to coordinate things that happen in multiple event loops, it should be the application's problem.

I think this calls for a higher-level paradigm: something that allows suspension and resumption of entire event loops (maybe executors?), or something that allows several event loops to run without being aware of each other (threads?).

I feel that just adding a flag to allow creation/setting of an event loop is not enough. We'd need at least a stack where event loops can be pushed and popped from, and possibly more...

Cheers,
D.

On Tue, 26 Mar 2019 at 09:52, Glyph wrote:
> > Allowing reentrant calls to the same loop is not a good idea IMO. At best, you'll need to carefully ensure that the event loop and task implementations are themselves reentrancy-safe (including the C accelerators and third parties like uvloop?), and then it just invites subtle issues in the applications built on top of it. I don't think there's a good reason to allow or support this (and nest_asyncio should be heavily discouraged). I do, however, think that PBP is a good enough reason to allow opt-in use of multiple event loops nested inside each other (maybe something on the EventLoopPolicy for configuration?).
>
> +1 to all of this.

From nugend at gmail.com Tue Mar 26 13:35:19 2019
From: nugend at gmail.com (Daniel Nugent)
Date: Tue, 26 Mar 2019 13:35:19 -0400
Subject: [Async-sig] Inadvertent layering of synchronous code as frameworks adopt asyncio
In-Reply-To:
References: <8524428679A18B459EA4EF0DD068CDABB915BF7F@PWSSMTEXMBX002.AD.MLP.com> <5d5ce6ae-3fe7-4c2d-b3e7-5e62202756c5@Spark> <6f59bc44-4343-4969-854f-5a3a37b96124@Frontier>
Message-ID: <77fb5123-6cb9-4480-b830-1cda50cdceb5@Spark>

Not sure if it helps, but I got something working for the problem I was experiencing by detecting whether there was a currently running event loop and then, at the synchronous call points, creating and running a new loop on a separate thread. This makes the object in question synchronous or asynchronous, but not both.

This was kind of a pain in the butt, though, and it blocks the outer loop anyway.

I think I'm in favor of a configurable option to allow separate nested loops if possible. In the narrow situation I am concerned with (allowing library writers to provide synchronous APIs to otherwise asynchronous code that has to run in a world it can't make usage demands of), it's a good solution. In that scenario, I think the details of the underlying inner event loop likely won't leak out to the outer event loop (creating a cross-event-loop dependency) when it's being used synchronously.

-Dan Nugent

On Mar 25, 2019, 21:59 -0400, Dima Tisnek wrote:
> End-user point of view, a.k.a. my 2c:
>
> re more worrisome scenario: if "objects" from two event loops depend on each other, that's unsolvable in the general case. On the other hand, what the OP wanted was akin to DAG-like functionality or a locking hierarchy. A naive implementation would block caller callbacks until the callee completes, but that may be what the user actually wanted (?).
>
> re ipython notebook state reuse across cells: that's a whole different can of worms, because cells can be re-evaluated in arbitrary order. As a user I would expect my async code to not interfere with the ipynb internal implementation. In fact, I'd rather see ipynb isolated into its own thread/loop/process. After all, I would, at times, like to use a debugger. (Full disclosure: I use the debugger in ipython and it never really worked for me in a sync notebook, let alone async.)
>
> re original proposal: async code calls a synchronous function that wants to do some async work and wait for the result -- for example, a telemetry bolt-on. I would expect the 2 event loops to be isolated. Attempting to await across loops should raise an exception, as it does. When some application wants to coordinate things that happen in multiple event loops, it should be the application's problem.
>
> I think this calls for a higher-level paradigm: something that allows suspension and resumption of entire event loops (maybe executors?), or something that allows several event loops to run without being aware of each other (threads?).
>
> I feel that just adding a flag to allow creation/setting of an event loop is not enough. We'd need at least a stack where event loops can be pushed and popped from, and possibly more...
>
> Cheers,
> D.

From guido at python.org Tue Mar 26 14:36:45 2019
From: guido at python.org (Guido van Rossum)
Date: Tue, 26 Mar 2019 11:36:45 -0700
Subject: [Async-sig] Inadvertent layering of synchronous code as frameworks adopt asyncio
In-Reply-To: <77fb5123-6cb9-4480-b830-1cda50cdceb5@Spark>
References: <8524428679A18B459EA4EF0DD068CDABB915BF7F@PWSSMTEXMBX002.AD.MLP.com> <5d5ce6ae-3fe7-4c2d-b3e7-5e62202756c5@Spark> <6f59bc44-4343-4969-854f-5a3a37b96124@Frontier> <77fb5123-6cb9-4480-b830-1cda50cdceb5@Spark>
Message-ID:

Maybe running two *independent* loops should just always be allowed? As was said, it should be no worse than calling requests.get(). There currently is an explicit check against this -- deleting that check seems to make this work, as long as you close the nested loop explicitly. (This may be something that we should fix too; I don't have time to look into it right now.)

Are there use cases in Jupyter that wouldn't be satisfied by using a *different* event loop?

On Tue, Mar 26, 2019 at 11:01 AM Daniel Nugent wrote:
> Not sure if it helps, but I got something working for the problem I was experiencing by detecting whether there was a currently running event loop and then, at the synchronous call points, creating and running a new loop on a separate thread. This makes the object in question synchronous or asynchronous, but not both.
>
> This was kind of a pain in the butt, though, and it blocks the outer loop anyway.
>
> I think I'm in favor of a configurable option to allow separate nested loops if possible. In the narrow situation I am concerned with (allowing library writers to provide synchronous APIs to otherwise asynchronous code that has to run in a world it can't make usage demands of), it's a good solution. In that scenario, I think the details of the underlying inner event loop likely won't leak out to the outer event loop (creating a cross-event-loop dependency) when it's being used synchronously.
>
> -Dan Nugent
>
> On Mar 25, 2019, 21:59 -0400, Dima Tisnek wrote:
> > End-user point of view, a.k.a. my 2c:
> >
> > re more worrisome scenario: if "objects" from two event loops depend on each other, that's unsolvable in the general case. On the other hand, what the OP wanted was akin to DAG-like functionality or a locking hierarchy. A naive implementation would block caller callbacks until the callee completes, but that may be what the user actually wanted (?).
> >
> > re ipython notebook state reuse across cells: that's a whole different can of worms, because cells can be re-evaluated in arbitrary order. As a user I would expect my async code to not interfere with the ipynb internal implementation. In fact, I'd rather see ipynb isolated into its own thread/loop/process. After all, I would, at times, like to use a debugger. (Full disclosure: I use the debugger in ipython and it never really worked for me in a sync notebook, let alone async.)
> >
> > re original proposal: async code calls a synchronous function that wants to do some async work and wait for the result -- for example, a telemetry bolt-on. I would expect the 2 event loops to be isolated. Attempting to await across loops should raise an exception, as it does. When some application wants to coordinate things that happen in multiple event loops, it should be the application's problem.
> >
> > I think this calls for a higher-level paradigm: something that allows suspension and resumption of entire event loops (maybe executors?), or something that allows several event loops to run without being aware of each other (threads?).
> >
> > I feel that just adding a flag to allow creation/setting of an event loop is not enough. We'd need at least a stack where event loops can be pushed and popped from, and possibly more...
> >
> > Cheers,
> > D.
> >
> > On Tue, 26 Mar 2019 at 09:52, Glyph wrote:
> > > Allowing reentrant calls to the same loop is not a good idea IMO. At best, you'll need to carefully ensure that the event loop and task implementations are themselves reentrancy-safe (including the C accelerators and third parties like uvloop?), and then it just invites subtle issues in the applications built on top of it. I don't think there's a good reason to allow or support this (and nest_asyncio should be heavily discouraged). I do, however, think that PBP is a good enough reason to allow opt-in use of multiple event loops nested inside each other (maybe something on the EventLoopPolicy for configuration?).
> > >
> > > +1 to all of this.

-- 
--Guido van Rossum (python.org/~guido)
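(A sketch of the workaround Dan describes above: run the inner, independent loop on its own thread, so the outer loop's thread never nests a loop. asyncio.run is Python 3.7+; the helper name is made up. Note that, as Dan points out, the call still blocks the outer loop while it waits.)

    import asyncio
    from concurrent.futures import ThreadPoolExecutor

    _worker = ThreadPoolExecutor(max_workers=1)

    def run_in_fresh_loop(coro):
        # asyncio.run() creates, runs, and closes a brand-new loop in the
        # worker thread; the result is marshalled back to this caller.
        return _worker.submit(asyncio.run, coro).result()

A synchronous facade can call run_in_fresh_loop(some_coroutine()) whether or not the calling thread already has a running loop.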
From yselivanov at gmail.com Tue Mar 26 14:56:37 2019
From: yselivanov at gmail.com (Yury Selivanov)
Date: Tue, 26 Mar 2019 14:56:37 -0400
Subject: [Async-sig] Inadvertent layering of synchronous code as frameworks adopt asyncio
In-Reply-To:
References: <8524428679A18B459EA4EF0DD068CDABB915BF7F@PWSSMTEXMBX002.AD.MLP.com> <5d5ce6ae-3fe7-4c2d-b3e7-5e62202756c5@Spark>
Message-ID: <86EE0536-E15A-4EB5-85C8-BB80EC5191C6@gmail.com>

> On Mar 25, 2019, at 8:01 PM, Guido van Rossum wrote:
>
> Given PBP, I wonder if we should just relent and have a configurable flag (off by default) to allow nested loop invocations (both the same loop and a different loop).

I think that if we implement this feature behind a flag then some libraries will start requiring that flag to be set, which will inevitably lead us to a situation where it's impossible to use asyncio without the flag. Therefore I suppose we should either just implement this behaviour by default or defer this to 3.9 or later.

I myself am -1 on making 'run_until_complete()' reentrant. The separation of async/await code and blocking code is painful enough to some people; introducing another "hybrid" mode will ultimately do more damage than good. E.g. it's hard to reason about this even for me: I simply don't know if I can make uvloop (or asyncio) fully reentrant.

In the case of Jupyter, I don't think it's a good idea for them to advertise nest_asyncio. IMHO the right approach would be to encourage library developers to expose async/await APIs and teach Jupyter users to "await" on async code directly.

The linked Jupyter issue (https://github.com/jupyter/notebook/issues/3397) is a good example: someone tries to call "asyncio.get_event_loop().run_until_complete(foo())" and the call fails. Instead of recommending "nest_asyncio", the Jupyter REPL could simply catch the error and suggest that the user await "foo()". We can make that slightly easier by changing the exception type from RuntimeError to NestedAsyncioLoopError. In other words, in Jupyter's case, I think it's a UI/UX problem, not an asyncio problem.

Yury
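(A sketch of the REPL-side UX Yury suggests -- a hypothetical front-end helper, not an asyncio API: detect that a loop is already running and steer the user toward await instead of letting the bare RuntimeError surface.)

    import asyncio

    def repl_run(coro):
        # Hypothetical helper for a REPL front end.
        try:
            asyncio.get_running_loop()
        except RuntimeError:
            return asyncio.run(coro)  # ordinary REPL: no loop running yet
        raise RuntimeError(
            "An event loop is already running in this session; "
            "use 'await' on your coroutine instead of blocking on it.")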
From njs at pobox.com Tue Mar 26 23:33:39 2019
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 26 Mar 2019 20:33:39 -0700
Subject: [Async-sig] Inadvertent layering of synchronous code as frameworks adopt asyncio
In-Reply-To: <86EE0536-E15A-4EB5-85C8-BB80EC5191C6@gmail.com>
References: <8524428679A18B459EA4EF0DD068CDABB915BF7F@PWSSMTEXMBX002.AD.MLP.com> <5d5ce6ae-3fe7-4c2d-b3e7-5e62202756c5@Spark> <86EE0536-E15A-4EB5-85C8-BB80EC5191C6@gmail.com>
Message-ID:

On Mon, Mar 25, 2019 at 4:37 PM Guido van Rossum wrote:
> I also hope Nathaniel has something to say -- I wonder if trio supports nested event loops?

Trio does have a similar check to prevent starting a new Trio loop inside a running Trio loop, and there's currently no way to disable it: https://github.com/python-trio/trio/blob/444234392c064c0ec5e66b986a693e2e9f76bc58/trio/_core/_run.py#L1398-L1402

Like the comment says, I could imagine changing this if there's a good reason.

On Tue, Mar 26, 2019 at 11:56 AM Yury Selivanov wrote:
> I think that if we implement this feature behind a flag then some libraries will start requiring that flag to be set, which will inevitably lead us to a situation where it's impossible to use asyncio without the flag. Therefore I suppose we should either just implement this behaviour by default or defer this to 3.9 or later.

It is weird that if you have a synchronous public interface, then it acts differently depending on whether you happened to implement that interface using the socket module directly vs using asyncio.

If you want to "hide" that your synchronous API uses asyncio internally, then you can actually do that now using public/quasi-public APIs:

    def asyncio_run_encapsulated(*args, **kwargs):
        old_loop = asyncio.get_running_loop()
        try:
            asyncio._set_running_loop(None)
            return asyncio.run(*args, **kwargs)
        finally:
            asyncio._set_running_loop(old_loop)

    def my_sync_api(...):
        return asyncio_run_encapsulated(my_async_api(...))

But this is also a bit weird, because the check is useful. It's weird that a blocking socket-module-based implementation and a blocking asyncio-based implementation act differently, but arguably the way to make them consistent is to fix the socket module so that it does give an error if you try to issue blocking calls from inside asyncio, rather than remove the error from asyncio. In fact newcomers often make mistakes like using time.sleep or requests from inside async code, and a common question is how to catch this in real code bases.

I wonder if we should have an interpreter-managed thread-local flag "we're in async mode", and make blocking operations in the stdlib check it. E.g., as a straw man: sys.set_allow_blocking(True/False), sys.get_allow_blocking(), and sys.check_allow_blocking(), which raises an exception if sys.get_allow_blocking() is False; then add calls to sys.check_allow_blocking() in time.sleep, socket operations with blocking mode enabled, etc. (And encourage third-party libraries that do their own blocking I/O without going through the stdlib to add similar calls.) Async I/O libraries (asyncio/trio/twisted/...) would set the flag appropriately; and if someone like IPython *really wants* to perform blocking operations inside an async context, they can fiddle with the flag themselves.

> I myself am -1 on making 'run_until_complete()' reentrant. The separation of async/await code and blocking code is painful enough to some people; introducing another "hybrid" mode will ultimately do more damage than good. E.g. it's hard to reason about this even for me: I simply don't know if I can make uvloop (or asyncio) fully reentrant.

Yeah, pumping the I/O loop from inside a task that's running on the I/O loop is just a mess. It breaks the async/await readability guarantees, it risks stack overflow, and by the time this stuff bites you, you're going to have to backtrack a lonnng way to get to something sensible. Trio definitely does not support this, and I will fight to keep it that way :-).

Most traditional GUI I/O loops *do* allow this, and in the traditional Twisted approach of trying to support all the I/O loop APIs on top of each other, this can be a problem -- if you want an adapter to run Qt or Gtk apps on top of your favorite asyncio loop implementation, then your loop implementation needs to support reentrancy. But I guess so far people are OK with doing things the other way (implementing the asyncio APIs on top of the standard GUI event loops). In Trio I have a Cunning Scheme to avoid doing either approach, but we'll see how that goes...

> In the case of Jupyter, I don't think it's a good idea for them to advertise nest_asyncio. IMHO the right approach would be to encourage library developers to expose async/await APIs and teach Jupyter users to "await" on async code directly.
> The linked Jupyter issue (https://github.com/jupyter/notebook/issues/3397) is a good example: someone tries to call "asyncio.get_event_loop().run_until_complete(foo())" and the call fails. Instead of recommending "nest_asyncio", the Jupyter REPL could simply catch the error and suggest that the user await "foo()". We can make that slightly easier by changing the exception type from RuntimeError to NestedAsyncioLoopError. In other words, in Jupyter's case, I think it's a UI/UX problem, not an asyncio problem.

I think this might be too simplistic... Jupyter/IPython are in a tricky place, where some users reasonably want to treat them like a regular REPL, so calling 'asyncio.run(...)' should be supported (and not supporting it would be a backcompat break). But other users want first-class async/await support integrated into some persistent event loop. (And as Glyph points out, not supporting this is *also* potentially a backcompat break, though probably a much less disruptive one.)

To me the key observation is that in Jupyter/IPython, they want their async/await support to work with multiple async library backends. Therefore, they're not going to get away with letting people just assume that there's some ambient Tornado-ish loop running -- they *need* some way to hide that away as an implementation detail, and an interface for users to state which async loop they want to use. Given that, IMO it makes most sense for them to provide a sync context by default, by whatever mechanism makes sense -- for a Jupyter kernel, maybe this is a dedicated thread for running user code, whatever. And then for Glyph and everyone who wants to access ambient async functionality from inside the REPL, that's something you opt in to by running in a special mode, or writing %asyncio at the top of your notebook.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

From glyph at twistedmatrix.com Wed Mar 27 01:06:22 2019
From: glyph at twistedmatrix.com (Glyph)
Date: Tue, 26 Mar 2019 22:06:22 -0700
Subject: [Async-sig] Inadvertent layering of synchronous code as frameworks adopt asyncio
In-Reply-To:
References: <8524428679A18B459EA4EF0DD068CDABB915BF7F@PWSSMTEXMBX002.AD.MLP.com> <5d5ce6ae-3fe7-4c2d-b3e7-5e62202756c5@Spark> <6f59bc44-4343-4969-854f-5a3a37b96124@Frontier> <77fb5123-6cb9-4480-b830-1cda50cdceb5@Spark>
Message-ID:

> On Mar 26, 2019, at 11:36 AM, Guido van Rossum wrote:
>
> Maybe running two *independent* loops should just always be allowed? As was said, it should be no worse than calling requests.get(). There currently is an explicit check against this -- deleting that check seems to make this work, as long as you close the nested loop explicitly. (This may be something that we should fix too; I don't have time to look into it right now.)
>
> Are there use cases in Jupyter that wouldn't be satisfied by using a *different* event loop?

Agreed -- with the one caveat that perhaps run_until_complete specifically should complain unless you say 'reentrantly=True', just to give people who may not even realize that Jupyter (or whatever other host environment; this isn't really jupyter-specific!) is already running an event loop that they should at least try 'await'-ing first before spinning up a blocking sub-loop.

-g
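(A sketch of the opt-in Glyph proposes, written as a hypothetical wrapper rather than a change to asyncio itself -- 'reentrantly' is his suggested spelling, not an existing parameter. Under today's asyncio the nested run would still be rejected; the sketch assumes the check on independent loops is relaxed, as discussed above.)

    import asyncio

    def run_nested(coro, reentrantly=False):
        try:
            asyncio.get_running_loop()
        except RuntimeError:
            pass  # no loop running; plain blocking behavior is fine
        else:
            if not reentrantly:
                raise RuntimeError(
                    "An event loop is already running; try 'await' first, "
                    "or pass reentrantly=True to block on a nested loop.")
        inner = asyncio.new_event_loop()  # an independent loop, not the outer one
        try:
            # NOTE: today this still raises inside a running loop; assumes
            # the independent-loop check is relaxed per the discussion.
            return inner.run_until_complete(coro)
        finally:
            inner.close()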
From glyph at twistedmatrix.com Wed Mar 27 01:11:49 2019
From: glyph at twistedmatrix.com (Glyph)
Date: Tue, 26 Mar 2019 22:11:49 -0700
Subject: [Async-sig] Inadvertent layering of synchronous code as frameworks adopt asyncio
In-Reply-To: <86EE0536-E15A-4EB5-85C8-BB80EC5191C6@gmail.com>
References: <8524428679A18B459EA4EF0DD068CDABB915BF7F@PWSSMTEXMBX002.AD.MLP.com> <5d5ce6ae-3fe7-4c2d-b3e7-5e62202756c5@Spark> <86EE0536-E15A-4EB5-85C8-BB80EC5191C6@gmail.com>
Message-ID: <94644B1B-3DA9-4528-A356-8F73EE4C6A9A@twistedmatrix.com>

> On Mar 26, 2019, at 11:56 AM, Yury Selivanov wrote:
>> On Mar 25, 2019, at 8:01 PM, Guido van Rossum wrote:
>>
>> Given PBP, I wonder if we should just relent and have a configurable flag (off by default) to allow nested loop invocations (both the same loop and a different loop).
>
> I think that if we implement this feature behind a flag then some libraries will start requiring that flag to be set, which will inevitably lead us to a situation where it's impossible to use asyncio without the flag. Therefore I suppose we should either just implement this behaviour by default or defer this to 3.9 or later.

How do you feel about my proposal of making the "flag" be simply an argument to run_until_complete? If what you really want to do is start a task or await a future, you should get notified that you're reentrantly blocking; but if you're sure, just pass the arg and be on your way.

If it's a "flag" like an env var or some kind of global switch, then I totally agree with you.

> I myself am -1 on making 'run_until_complete()' reentrant. The separation of async/await code and blocking code is painful enough to some people; introducing another "hybrid" mode will ultimately do more damage than good. E.g. it's hard to reason about this even for me: I simply don't know if I can make uvloop (or asyncio) fully reentrant.

If uvloop has problems with global state that prevent reentrancy, fine -- for the use cases where you're doing this, you already kind of implicitly don't care about performance; someone can instantiate their own, safe loop. (If you can't do this with asyncio, though, I kinda wonder what's going on.)

> In the case of Jupyter, I don't think it's a good idea for them to advertise nest_asyncio. IMHO the right approach would be to encourage library developers to expose async/await APIs and teach Jupyter users to "await" on async code directly.

???

> The linked Jupyter issue (https://github.com/jupyter/notebook/issues/3397) is a good example: someone tries to call "asyncio.get_event_loop().run_until_complete(foo())" and the call fails. Instead of recommending "nest_asyncio", the Jupyter REPL could simply catch the error and suggest that the user await "foo()". We can make that slightly easier by changing the exception type from RuntimeError to NestedAsyncioLoopError. In other words, in Jupyter's case, I think it's a UI/UX problem, not an asyncio problem.

So, you may not be able to `await` right now, today, from a cell, given that that needs some additional support. But you can create_task just fine, right? Making await-with-no-indentation work seamlessly would be beautiful, but I don't think we need to wait for that modification to get made in order to enjoy the benefits of proper asynchrony.

-g
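(What Glyph's create_task point looks like in practice -- a sketch that assumes the host environment, e.g. a Jupyter kernel on Tornado 5+, already has an asyncio loop running on the thread that executes the cell.)

    import asyncio

    async def refresh_output():
        await asyncio.sleep(1)
        print("updated from a background task")

    # Even without top-level await support, work can be scheduled on the
    # already-running loop; keep a reference so the task isn't collected.
    task = asyncio.get_event_loop().create_task(refresh_output())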
From nugend at gmail.com Wed Mar 27 13:16:12 2019
From: nugend at gmail.com (Daniel Nugent)
Date: Wed, 27 Mar 2019 13:16:12 -0400
Subject: [Async-sig] Inadvertent layering of synchronous code as frameworks adopt asyncio
In-Reply-To:
References: <8524428679A18B459EA4EF0DD068CDABB915BF7F@PWSSMTEXMBX002.AD.MLP.com> <5d5ce6ae-3fe7-4c2d-b3e7-5e62202756c5@Spark> <86EE0536-E15A-4EB5-85C8-BB80EC5191C6@gmail.com>
Message-ID: <97564246-4dcd-470f-b70d-d187b9d59643@Spark>

FWIW, the asyncio_run_encapsulated approach does not work with the transport/protocol APIs, because the loop needs to stay alive concurrently with the connection in order for the awaitables to all be on the same loop.

I think the notion that allowing an option for nested loops will inevitably lead to a situation where nested loops are always required is maybe a bit pessimistic?

-Dan Nugent

On Mar 26, 2019, 23:33 -0400, Nathaniel Smith wrote:
> On Mon, Mar 25, 2019 at 4:37 PM Guido van Rossum wrote:
> > I also hope Nathaniel has something to say -- I wonder if trio supports nested event loops?
>
> Trio does have a similar check to prevent starting a new Trio loop inside a running Trio loop, and there's currently no way to disable it: https://github.com/python-trio/trio/blob/444234392c064c0ec5e66b986a693e2e9f76bc58/trio/_core/_run.py#L1398-L1402
>
> Like the comment says, I could imagine changing this if there's a good reason.
>
> On Tue, Mar 26, 2019 at 11:56 AM Yury Selivanov wrote:
> > I think that if we implement this feature behind a flag then some libraries will start requiring that flag to be set, which will inevitably lead us to a situation where it's impossible to use asyncio without the flag. Therefore I suppose we should either just implement this behaviour by default or defer this to 3.9 or later.
>
> It is weird that if you have a synchronous public interface, then it acts differently depending on whether you happened to implement that interface using the socket module directly vs using asyncio.
>
> If you want to "hide" that your synchronous API uses asyncio internally, then you can actually do that now using public/quasi-public APIs:
>
>     def asyncio_run_encapsulated(*args, **kwargs):
>         old_loop = asyncio.get_running_loop()
>         try:
>             asyncio._set_running_loop(None)
>             return asyncio.run(*args, **kwargs)
>         finally:
>             asyncio._set_running_loop(old_loop)
>
>     def my_sync_api(...):
>         return asyncio_run_encapsulated(my_async_api(...))
>
> But this is also a bit weird, because the check is useful. It's weird that a blocking socket-module-based implementation and a blocking asyncio-based implementation act differently, but arguably the way to make them consistent is to fix the socket module so that it does give an error if you try to issue blocking calls from inside asyncio, rather than remove the error from asyncio. In fact newcomers often make mistakes like using time.sleep or requests from inside async code, and a common question is how to catch this in real code bases.
>
> I wonder if we should have an interpreter-managed thread-local flag "we're in async mode", and make blocking operations in the stdlib check it. E.g., as a straw man: sys.set_allow_blocking(True/False), sys.get_allow_blocking(), and sys.check_allow_blocking(), which raises an exception if sys.get_allow_blocking() is False; then add calls to sys.check_allow_blocking() in time.sleep, socket operations with blocking mode enabled, etc.
> (And encourage third-party libraries that do their own blocking I/O without going through the stdlib to add similar calls.) Async I/O libraries (asyncio/trio/twisted/...) would set the flag appropriately; and if someone like IPython *really wants* to perform blocking operations inside an async context, they can fiddle with the flag themselves.
>
> > I myself am -1 on making 'run_until_complete()' reentrant. The separation of async/await code and blocking code is painful enough to some people; introducing another "hybrid" mode will ultimately do more damage than good. E.g. it's hard to reason about this even for me: I simply don't know if I can make uvloop (or asyncio) fully reentrant.
>
> Yeah, pumping the I/O loop from inside a task that's running on the I/O loop is just a mess. It breaks the async/await readability guarantees, it risks stack overflow, and by the time this stuff bites you, you're going to have to backtrack a lonnng way to get to something sensible. Trio definitely does not support this, and I will fight to keep it that way :-).
>
> Most traditional GUI I/O loops *do* allow this, and in the traditional Twisted approach of trying to support all the I/O loop APIs on top of each other, this can be a problem -- if you want an adapter to run Qt or Gtk apps on top of your favorite asyncio loop implementation, then your loop implementation needs to support reentrancy. But I guess so far people are OK with doing things the other way (implementing the asyncio APIs on top of the standard GUI event loops). In Trio I have a Cunning Scheme to avoid doing either approach, but we'll see how that goes...
>
> > In the case of Jupyter, I don't think it's a good idea for them to advertise nest_asyncio. IMHO the right approach would be to encourage library developers to expose async/await APIs and teach Jupyter users to "await" on async code directly.
>
> I think this might be too simplistic... Jupyter/IPython are in a tricky place, where some users reasonably want to treat them like a regular REPL, so calling 'asyncio.run(...)' should be supported (and not supporting it would be a backcompat break). But other users want first-class async/await support integrated into some persistent event loop. (And as Glyph points out, not supporting this is *also* potentially a backcompat break, though probably a much less disruptive one.)
>
> To me the key observation is that in Jupyter/IPython, they want their async/await support to work with multiple async library backends. Therefore, they're not going to get away with letting people just assume that there's some ambient Tornado-ish loop running -- they *need* some way to hide that away as an implementation detail, and an interface for users to state which async loop they want to use.
From yselivanov at gmail.com  Wed Mar 27 14:22:16 2019
From: yselivanov at gmail.com (Yury Selivanov)
Date: Wed, 27 Mar 2019 14:22:16 -0400
Subject: [Async-sig] Inadvertent layering of synchronous code as frameworks adopt asyncio
In-Reply-To: <94644B1B-3DA9-4528-A356-8F73EE4C6A9A@twistedmatrix.com>
References: <8524428679A18B459EA4EF0DD068CDABB915BF7F@PWSSMTEXMBX002.AD.MLP.com> <5d5ce6ae-3fe7-4c2d-b3e7-5e62202756c5@Spark> <86EE0536-E15A-4EB5-85C8-BB80EC5191C6@gmail.com> <94644B1B-3DA9-4528-A356-8F73EE4C6A9A@twistedmatrix.com>
Message-ID: 

> On Mar 27, 2019, at 1:11 AM, Glyph wrote:
>
>> On Mar 26, 2019, at 11:56 AM, Yury Selivanov wrote:
>>
>>> On Mar 25, 2019, at 8:01 PM, Guido van Rossum wrote:
>>>
>>> Given PBP, I wonder if we should just relent and have a configurable flag (off by default) to allow nested loop invocations (both the same loop and a different loop).
>>
>> I think that if we implement this feature behind a flag then some libraries will start requiring that flag to be set. Which will inevitably lead us to a situation where it's impossible to use asyncio without the flag. Therefore I suppose we should either just implement this behaviour by default or defer this to 3.9 or later.
>
> How do you feel about my proposal of making the "flag" be simply an argument to run_until_complete? If what you really want to do is start a task or await a future, you should get notified that you're reentrantly blocking; but if you're sure, just pass the arg and be on your way.

I'm not sure how making it an argument solves anything; please help me understand. I see two scenarios here:

1. Migrating a big codebase to asyncio. This is something that Ben mentioned, and I also happen to know that a couple of big companies were struggling with it. In this case a reentrant event loop can ease the migration pain. But if it's enabled via an argument to run_until_complete/run_forever, you will have to blindly enable it for your entire source tree in order to take advantage of it. The argument then becomes, effectively, a global flag, and it discourages you from actually refactoring and fixing your code properly.

2. The Jupyter case. Their users would still struggle with copy/pasting asyncio.run()/loop.run_until_complete() snippets and getting an error. They would still have to add "reentrant=True". From the standpoint of usability, this new argument would not be a significant improvement as I see it.
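For reference, the call-site shape being debated looks like this -- the "reentrant" parameter is hypothetical, proposed in this thread only, and does not exist in asyncio:

    import asyncio

    async def foo():
        await asyncio.sleep(1)

    loop = asyncio.new_event_loop()
    # Fine here; raises "RuntimeError: This event loop is already running"
    # if invoked while the loop is running.
    loop.run_until_complete(foo())
    # Glyph's proposed per-call opt-in (hypothetical):
    # loop.run_until_complete(foo(), reentrant=True)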
> If it's a "flag" like an env var or some kind of global switch, then I totally agree with you.
>
>> I myself am -1 on making 'run_until_complete()' reentrant. The separation of async/await code and blocking code is painful enough to some people; introducing another "hybrid" mode will ultimately do more damage than good. E.g. it's hard to reason about this even for me: I simply don't know if I can make uvloop (or asyncio) fully reentrant.
>
> If uvloop has problems with global state that prevent reentrancy, fine -- for the use cases where you're doing this, you already kind of implicitly don't care about performance; someone can instantiate their own, safe loop. (If you can't do this with asyncio, though, I kinda wonder what's going on.)

Both asyncio & uvloop have global state: child process watchers, global system hooks, and signal handlers.

Re process watchers: asyncio manages them itself by monitoring SIGCHLD etc., and uvloop offloads this problem to libuv entirely. Both still have weird bugs with multiple event loops in the same process losing track of their subprocesses.

Re signal handlers: they are set globally both for asyncio & uvloop, and running a nested loop with some code that doesn't expect to be run like that can create hard-to-debug problems. We probably need a new signals API in asyncio (I quite like the signals API in Trio).

Re global system hooks: the hooks to intercept async generator creation/GC are global. Event loops do save/restore them in their various run() methods, so it shouldn't be a problem, but this is still a piece of global state to know about.

I'm also not entirely sure that it's safe to mix uvloop with vanilla asyncio in the same process.

>> In the case of Jupyter, I don't think it's a good idea for them to advertise nest_asyncio. IMHO the right approach would be to encourage library developers to expose async/await APIs and teach Jupyter users to "await" on async code directly.
>
> ???
>
>> The linked Jupyter issue (https://github.com/jupyter/notebook/issues/3397) is a good example: someone tries to call "asyncio.get_event_loop().run_until_complete(foo())" and the call fails. Instead of recommending "nest_asyncio", the Jupyter REPL could simply catch the error and suggest that the user await "foo()". We can make that slightly easier by changing the exception type from RuntimeError to NestedAsyncioLoopError. In other words, in the Jupyter case, I think it's a UI/UX problem, not an asyncio problem.
>
> So, you may not be able to `await` right now, today, from a cell, given that that needs some additional support. But you can create_task just fine, right? Making await-with-no-indentation work seamlessly would be beautiful, but I don't think we need to wait for that modification to get made in order to enjoy the benefits of proper asynchrony.

I was under the impression that Jupyter already allows top-level "await" expressions. If that's true, then both "await foo()" and "asyncio.create_task(foo())" should work.

Yury
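That is, in an IPython 7+ / recent Jupyter cell with the default autoawait support -- assuming the kernel is running on an asyncio loop, and with foo here just an illustrative coroutine -- both spellings work without any loop plumbing:

    import asyncio

    async def foo():
        await asyncio.sleep(1)
        return 42

    result = await foo()               # top-level await in the cell
    task = asyncio.create_task(foo())  # schedule on the kernel's running loop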
From njs at pobox.com  Wed Mar 27 16:23:28 2019
From: njs at pobox.com (Nathaniel Smith)
Date: Wed, 27 Mar 2019 13:23:28 -0700
Subject: [Async-sig] Inadvertent layering of synchronous code as frameworks adopt asyncio
In-Reply-To: <97564246-4dcd-470f-b70d-d187b9d59643@Spark>
References: <8524428679A18B459EA4EF0DD068CDABB915BF7F@PWSSMTEXMBX002.AD.MLP.com> <5d5ce6ae-3fe7-4c2d-b3e7-5e62202756c5@Spark> <86EE0536-E15A-4EB5-85C8-BB80EC5191C6@gmail.com> <97564246-4dcd-470f-b70d-d187b9d59643@Spark>
Message-ID: 

On Wed, Mar 27, 2019 at 10:44 AM Daniel Nugent wrote:
>
> FWIW, the asyncio_run_encapsulated approach does not work with the transport/protocol APIs because the loop needs to stay alive concurrent with the connection in order for the awaitables to all be on the same loop.

Yeah, there are two basic approaches being discussed here: using two
different loops, versus re-entering an existing loop.
asyncio_run_encapsulated is specifically for the two-loops approach.

In this version, the outer loop, and everything running on it, stops
entirely while the inner loop is running -- which is exactly what
happens with any other synchronous, blocking API. Using
asyncio_run_encapsulated(aiohttp.get(...)) in Jupyter is exactly like
using requests.get(...), no better or worse.

-n

--
Nathaniel J. Smith -- https://vorpus.org

From guido at python.org  Wed Mar 27 16:49:06 2019
From: guido at python.org (Guido van Rossum)
Date: Wed, 27 Mar 2019 13:49:06 -0700
Subject: [Async-sig] Inadvertent layering of synchronous code as frameworks adopt asyncio
In-Reply-To: 
References: <8524428679A18B459EA4EF0DD068CDABB915BF7F@PWSSMTEXMBX002.AD.MLP.com> <5d5ce6ae-3fe7-4c2d-b3e7-5e62202756c5@Spark> <86EE0536-E15A-4EB5-85C8-BB80EC5191C6@gmail.com> <97564246-4dcd-470f-b70d-d187b9d59643@Spark>
Message-ID: 

On Wed, Mar 27, 2019 at 1:23 PM Nathaniel Smith wrote:

> On Wed, Mar 27, 2019 at 10:44 AM Daniel Nugent wrote:
> >
> > FWIW, the asyncio_run_encapsulated approach does not work with the
> transport/protocol APIs because the loop needs to stay alive concurrent
> with the connection in order for the awaitables to all be on the same loop.
>
> Yeah, there are two basic approaches being discussed here: using two
> different loops, versus re-entering an existing loop.
> asyncio_run_encapsulated is specifically for the two-loops approach.
>
> In this version, the outer loop, and everything running on it, stops
> entirely while the inner loop is running -- which is exactly what
> happens with any other synchronous, blocking API. Using
> asyncio_run_encapsulated(aiohttp.get(...)) in Jupyter is exactly like
> using requests.get(...), no better or worse.

And Yury's followup suggests that it's hard to achieve total isolation between loops, due to subprocess management and signal handling (which are global states in the OS, or at least per-thread -- the OS doesn't know about event loops).

I just had another silly idea. What if the magical decorator that can be used to create a sync version of an async def (somewhat like tworoutines) made the async version hand off control to a thread pool? Could be a tad slower, but the tenor of the discussion seems to be that performance is not that much of an issue.

--
--Guido van Rossum (python.org/~guido)
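A rough sketch of what such a decorator might look like -- the name syncable and the details are illustrative assumptions, not tworoutines' actual API. The sync wrapper ships the coroutine to a worker thread, where asyncio.run() gives it a private loop, so no loop is ever re-entered:

    import asyncio
    import functools
    from concurrent.futures import ThreadPoolExecutor

    _pool = ThreadPoolExecutor()

    def syncable(async_fn):
        @functools.wraps(async_fn)
        def sync_wrapper(*args, **kwargs):
            # asyncio.run() creates and tears down a fresh event loop in
            # the worker thread; the caller just sees a blocking call.
            return _pool.submit(asyncio.run, async_fn(*args, **kwargs)).result()
        sync_wrapper.async_fn = async_fn  # the coroutine function stays available
        return sync_wrapper

    @syncable
    async def fetch_answer():
        await asyncio.sleep(0.1)
        return 42

    print(fetch_answer())  # ordinary blocking call, even with a loop running

As Nathaniel points out in his reply below, this sidesteps loop re-entry but not the process-global signal state.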
From njs at pobox.com  Wed Mar 27 18:18:32 2019
From: njs at pobox.com (Nathaniel Smith)
Date: Wed, 27 Mar 2019 15:18:32 -0700
Subject: [Async-sig] Inadvertent layering of synchronous code as frameworks adopt asyncio
In-Reply-To: 
References: <8524428679A18B459EA4EF0DD068CDABB915BF7F@PWSSMTEXMBX002.AD.MLP.com> <5d5ce6ae-3fe7-4c2d-b3e7-5e62202756c5@Spark> <86EE0536-E15A-4EB5-85C8-BB80EC5191C6@gmail.com> <97564246-4dcd-470f-b70d-d187b9d59643@Spark>
Message-ID: 

On Wed, Mar 27, 2019 at 1:49 PM Guido van Rossum wrote:
>
> On Wed, Mar 27, 2019 at 1:23 PM Nathaniel Smith wrote:
>>
>> On Wed, Mar 27, 2019 at 10:44 AM Daniel Nugent wrote:
>> >
>> > FWIW, the asyncio_run_encapsulated approach does not work with the transport/protocol APIs because the loop needs to stay alive concurrent with the connection in order for the awaitables to all be on the same loop.
>>
>> Yeah, there are two basic approaches being discussed here: using two
>> different loops, versus re-entering an existing loop.
>> asyncio_run_encapsulated is specifically for the two-loops approach.
>>
>> In this version, the outer loop, and everything running on it, stops
>> entirely while the inner loop is running -- which is exactly what
>> happens with any other synchronous, blocking API. Using
>> asyncio_run_encapsulated(aiohttp.get(...)) in Jupyter is exactly like
>> using requests.get(...), no better or worse.
>
> And Yury's followup suggests that it's hard to achieve total isolation between loops, due to subprocess management and signal handling (which are global states in the OS, or at least per-thread -- the OS doesn't know about event loops).

The tough thing about signals is that they're all process-global state,
*not* per-thread. In Trio I think this wouldn't be a big deal -- whenever
we touch signal handlers, we save the old value and then restore it
afterwards, so the inner loop would just temporarily override the outer
loop, which I guess is what you'd expect. (And Trio's subprocess support
avoids touching signals or any global state.)

Asyncio could potentially do something similar, but its subprocess support
does rely on signals, which could get messy, since the outer loop can't be
allowed to miss any SIGCHLDs. Asyncio does have a mechanism to share
SIGCHLD handlers between loops (intended to support the case where you
have loops running in multiple threads simultaneously), and it might
handle this case too, but I don't know the details well enough to say for
sure.

> I just had another silly idea. What if the magical decorator that can be used to create a sync version of an async def (somewhat like tworoutines) made the async version hand off control to a thread pool? Could be a tad slower, but the tenor of the discussion seems to be that performance is not that much of an issue.

Unfortunately, I don't think this helps much... If your async def
doesn't use signals, then it won't interfere with the outer loop's
signal state and a thread is unnecessary. And if it *does* use
signals, then you can't put it in a thread, because Python threads are
forbidden to call any of the signal-related APIs.

-n

--
Nathaniel J. Smith -- https://vorpus.org
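The save/restore pattern Nathaniel describes is easy to picture with the stdlib signal module -- a minimal sketch of the general technique, not Trio's actual implementation:

    import signal

    def run_with_handler(signum, handler, body):
        old = signal.getsignal(signum)  # save the outer loop's handler
        signal.signal(signum, handler)  # inner loop takes over temporarily
        try:
            return body()               # e.g. pump the inner loop here
        finally:
            signal.signal(signum, old)  # outer loop's handler comes back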
From guido at python.org  Wed Mar 27 18:31:04 2019
From: guido at python.org (Guido van Rossum)
Date: Wed, 27 Mar 2019 15:31:04 -0700
Subject: [Async-sig] Inadvertent layering of synchronous code as frameworks adopt asyncio
In-Reply-To: 
References: <8524428679A18B459EA4EF0DD068CDABB915BF7F@PWSSMTEXMBX002.AD.MLP.com> <5d5ce6ae-3fe7-4c2d-b3e7-5e62202756c5@Spark> <86EE0536-E15A-4EB5-85C8-BB80EC5191C6@gmail.com> <97564246-4dcd-470f-b70d-d187b9d59643@Spark>
Message-ID: 

On Wed, Mar 27, 2019 at 3:18 PM Nathaniel Smith wrote:

> On Wed, Mar 27, 2019 at 1:49 PM Guido van Rossum wrote:
> > I just had another silly idea. What if the magical decorator that can be
> used to create a sync version of an async def (somewhat like tworoutines)
> made the async version hand off control to a thread pool? Could be a tad
> slower, but the tenor of the discussion seems to be that performance is not
> that much of an issue.
>
> Unfortunately I don't think this helps much... If your async def
> doesn't use signals, then it won't interfere with the outer loop's
> signal state and a thread is unnecessary. And if it *does* use
> signals, then you can't put it in a thread, because Python threads are
> forbidden to call any of the signal-related APIs.

One advantage might be that you can do this without any asyncio changes or monkey-patches, e.g. you could do this with Python 3.6 today. In fact it might save @tworoutine (though I can't say I find its (~foo)(args) notation very readable, nor do I agree with its choice of making the default synchronous).

I guess not being allowed to use signals doesn't strike me as a big deal, and IIRC asyncio's subprocess support still works from a thread.

--
--Guido van Rossum (python.org/~guido)