From asmodehn at gmail.com Sat Dec 2 12:50:40 2017 From: asmodehn at gmail.com (Asmodehn Shade) Date: Sun, 3 Dec 2017 02:50:40 +0900 Subject: [Async-sig] async await as basic paradigm - relation to delimited continuations ? Message-ID: Hi everyone, I have been working on distributed programming for a while now; however, I was away from Python land and didn't get onto the twisted/tornado/asyncio train until a few years ago. While writing asyncio code and exploring different abstract computing paradigms (my recent hobby seems to be exploring new ways to do computing), I recently discovered curio. I read a few GitHub issues, and although I do enjoy experiments, I am also aware of the need for mathematical foundations in order for a project to be able to compose between projects and groups of people, and ultimately scale and persist, as well as find its own roots to support an ecosystem. I was wondering if there is someone on this mailing list versed enough in maths and computer science who could see some kind of association/relation between async/await and delimited continuations... and hopefully formalize it somehow? As a side effect, this might give some improvement ideas to curio... The way I see it so far, very rough and naive: - the event loop is "doing the work", that is, a sequence of delimited continuations; - async/await denotes how the work to be done can be composed (wrapping functions that could already compose by themselves, to add additional runtime semantics); - tasks are like delimited continuations, and they will be ordered at runtime to execute in a specific sequence. Or am I completely misled? If we scale this up and "distribute" it (actor model style), one actor can match one event loop, and there are probably many associations to draw there, that could indicate for curio (or libraries using it) whether it is sensible or not to completely identify the event_loop with a thread...
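To make the rough picture above concrete, here is a toy sketch -- not curio or asyncio internals, just plain generators, with all names invented for the example -- of an "event loop" that drives suspendable computations. Each `yield` hands control back to the scheduler, and the suspended remainder of the task is, loosely speaking, a continuation delimited by the scheduler:

```python
from collections import deque

def run(tasks):
    """A toy 'event loop': resume each task round-robin until done.

    Each `yield` suspends a task and hands control back here; the
    rest of the task is then a continuation the scheduler can resume.
    """
    ready = deque(tasks)
    results = []
    while ready:
        task = ready.popleft()
        try:
            next(task)                  # resume the continuation
        except StopIteration as stop:
            results.append(stop.value)  # task ran to completion
        else:
            ready.append(task)          # reschedule the suspended task
    return results

def counter(name, n):
    total = 0
    for i in range(n):
        total += i
        yield                           # suspension point
    return (name, total)

print(run([counter("a", 2), counter("b", 3)]))  # -> [('a', 1), ('b', 3)]
```

The composition story async/await adds is then visible by analogy: `await` is `yield from`, i.e. splicing one delimited computation into another.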
As a broad overview: 1) imperative concepts can be implemented from functional/math concepts (Haskell's do notation, its relation with category theory); 2) functional concepts are currently implemented with imperative languages (Scheme interpreters, the Haskell compiler, etc.); 3) looking at async programming and functional/math concepts these days, I am wondering how these relate, and how far down the rabbit hole goes... Thanks a lot for sharing your views on this :-). -- AlexV -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.jerdonek at gmail.com Sun Dec 3 00:59:57 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Sat, 2 Dec 2017 21:59:57 -0800 Subject: [Async-sig] subclassing CancelledError Message-ID: Hi, I want to ask how people feel about the following. If you raise a subclass of CancelledError from within a task and then call task.result(), CancelledError is raised rather than the subclass. Here is some code to illustrate:

class MyCancelledError(CancelledError): pass

async def raise_my_cancel():
    raise MyCancelledError()

task = asyncio.ensure_future(raise_my_cancel())
try:
    await task
except Exception:
    pass
assert task.cancelled()
# Raises CancelledError and not MyCancelledError.
task.result()

Does this seem right to people? Is there a justification for this? If it would help for the discussion, I could provide a use case. Thanks a lot, --Chris From andrew.svetlov at gmail.com Sun Dec 3 05:11:49 2017 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Sun, 03 Dec 2017 10:11:49 +0000 Subject: [Async-sig] subclassing CancelledError In-Reply-To: References: Message-ID: IIRC at very early stages Guido van Rossum decided to *freeze* `CancelledError`: user code should not derive from the exception. Like you never derive from StopIteration. On Sun, Dec 3, 2017 at 8:00 AM Chris Jerdonek wrote: > Hi, I want to ask how people feel about the following.
> > If you raise a subclass of CancelledError from within a task and then > call task.result(), CancelledError is raised rather than the subclass. > > Here is some code to illustrate: > > class MyCancelledError(CancelledError): pass > > async def raise_my_cancel(): > raise MyCancelledError() > > task = asyncio.ensure_future(raise_my_cancel()) > try: > await task > except Exception: > pass > assert task.cancelled() > # Raises CancelledError and not MyCancelledError. > task.result() > > Does this seem right to people? Is there a justification for this? > > If it would help for the discussion, I could provide a use case. > > Thanks a lot, > --Chris > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -- Thanks, Andrew Svetlov -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Sun Dec 3 12:04:30 2017 From: guido at python.org (Guido van Rossum) Date: Sun, 3 Dec 2017 09:04:30 -0800 Subject: [Async-sig] subclassing CancelledError In-Reply-To: References: Message-ID: Sounds like an implementation issue. The Future class has a boolean flag indicating whether it's been cancelled, and everything then raises CancelledError when that flag is set. I suppose we could replace that flag with an instance of CancelledError, and when we *catch* CancelledError we set it to that. But it seems messy and I'm not sure that you should attempt this. IMO you should define an exception that does *not* derive from CancelledError and raise that -- it will properly be transmitted. What's your reason for not doing that? IOW why are you trying to pump more than a bit through the narrow "cancelled" channel? 
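For reference, a minimal sketch of the suggested alternative (the name `MyAppError` is made up for illustration): an exception that does *not* derive from CancelledError is stored on the task and re-raised intact, and the task correctly reports not-cancelled.

```python
import asyncio

class MyAppError(Exception):
    """Hypothetical app-specific error -- deliberately NOT derived
    from CancelledError, per the suggestion above."""

async def fail():
    raise MyAppError("details survive intact")

async def main():
    task = asyncio.ensure_future(fail())
    try:
        await task
    except MyAppError:
        pass  # the exact exception type arrives, nothing is flattened
    assert isinstance(task.exception(), MyAppError)
    assert not task.cancelled()  # the task failed; it wasn't cancelled
    return task

loop = asyncio.new_event_loop()
task = loop.run_until_complete(main())
loop.close()
print(type(task.exception()).__name__)  # -> MyAppError
```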
On Sun, Dec 3, 2017 at 2:11 AM, Andrew Svetlov wrote: > IIRC at very early stages Guido van Rossum decided to *freeze* > `CancelledError`: user code should not derive from the exception. Like you > never derive from StopIteration. > > On Sun, Dec 3, 2017 at 8:00 AM Chris Jerdonek > wrote: > >> Hi, I want to ask how people feel about the following. >> >> If you raise a subclass of CancelledError from within a task and then >> call task.result(), CancelledError is raised rather than the subclass. >> >> Here is some code to illustrate: >> >> class MyCancelledError(CancelledError): pass >> >> async def raise_my_cancel(): >> raise MyCancelledError() >> >> task = asyncio.ensure_future(raise_my_cancel()) >> try: >> await task >> except Exception: >> pass >> assert task.cancelled() >> # Raises CancelledError and not MyCancelledError. >> task.result() >> >> Does this seem right to people? Is there a justification for this? >> >> If it would help for the discussion, I could provide a use case. >> >> Thanks a lot, >> --Chris >> _______________________________________________ >> Async-sig mailing list >> Async-sig at python.org >> https://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >> > -- > Thanks, > Andrew Svetlov > > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.jerdonek at gmail.com Sun Dec 3 16:53:12 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Sun, 3 Dec 2017 13:53:12 -0800 Subject: [Async-sig] subclassing CancelledError In-Reply-To: References: Message-ID: On Sun, Dec 3, 2017 at 9:04 AM, Guido van Rossum wrote: > Sounds like an implementation issue. 
The Future class has a boolean flag > indicating whether it's been cancelled, and everything then raises > CancelledError when that flag is set. I suppose we could replace that flag > with an instance of CancelledError, and when we *catch* CancelledError we > set it to that. But it seems messy and I'm not sure that you should attempt > this. IMO you should define an exception that does *not* derive from > CancelledError and raise that -- it will properly be transmitted. What's > your reason for not doing that? IOW why are you trying to pump more than a > bit through the narrow "cancelled" channel? My use case is mostly for diagnostic / logging purposes. I want to log all exceptions bubbling out of a task, and I want to do so from within the coroutine itself rather than when I call task.result(). I also want to be able to detect if something wasn't logged when I call task.result() (to avoid double logging and errors passing silently, etc), and I'd also prefer that task.cancelled(), etc. still return the correct values. My first idea was to define LoggedError(Exception) and LoggedCancelledError(CancelledError) subclasses, and raise a new exception from within the task after logging. If LoggedCancelledError doesn't derive from CancelledError, then task.cancelled() won't reflect that the task was cancelled. Could it simplify things on the asyncio side if the CancelledError and non-CancelledError code paths shared more logic (e.g. by both setting an exception instead of only the generic case setting an exception)? --Chris > > On Sun, Dec 3, 2017 at 2:11 AM, Andrew Svetlov > wrote: >> >> IIRC at very early stages Guido van Rossum decided to *freeze* >> `CancelledError`: user code should not derive from the exception. Like you >> never derive from StopIteration. >> >> On Sun, Dec 3, 2017 at 8:00 AM Chris Jerdonek >> wrote: >>> >>> Hi, I want to ask how people feel about the following. 
>>> >>> If you raise a subclass of CancelledError from within a task and then >>> call task.result(), CancelledError is raised rather than the subclass. >>> >>> Here is some code to illustrate: >>> >>> class MyCancelledError(CancelledError): pass >>> >>> async def raise_my_cancel(): >>> raise MyCancelledError() >>> >>> task = asyncio.ensure_future(raise_my_cancel()) >>> try: >>> await task >>> except Exception: >>> pass >>> assert task.cancelled() >>> # Raises CancelledError and not MyCancelledError. >>> task.result() >>> >>> Does this seem right to people? Is there a justification for this? >>> >>> If it would help for the discussion, I could provide a use case. >>> >>> Thanks a lot, >>> --Chris >>> _______________________________________________ >>> Async-sig mailing list >>> Async-sig at python.org >>> https://mail.python.org/mailman/listinfo/async-sig >>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >> >> -- >> Thanks, >> Andrew Svetlov >> >> _______________________________________________ >> Async-sig mailing list >> Async-sig at python.org >> https://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >> > > > > -- > --Guido van Rossum (python.org/~guido) From guido at python.org Sun Dec 3 20:35:06 2017 From: guido at python.org (Guido van Rossum) Date: Sun, 3 Dec 2017 17:35:06 -0800 Subject: [Async-sig] subclassing CancelledError In-Reply-To: References: Message-ID: I think it's too late to change the cancellation logic. Cancellation is tricky. On Sun, Dec 3, 2017 at 1:53 PM, Chris Jerdonek wrote: > On Sun, Dec 3, 2017 at 9:04 AM, Guido van Rossum wrote: > > Sounds like an implementation issue. The Future class has a boolean flag > > indicating whether it's been cancelled, and everything then raises > > CancelledError when that flag is set. I suppose we could replace that > flag > > with an instance of CancelledError, and when we *catch* CancelledError we > > set it to that. 
But it seems messy and I'm not sure that you should > attempt > > this. IMO you should define an exception that does *not* derive from > > CancelledError and raise that -- it will properly be transmitted. What's > > your reason for not doing that? IOW why are you trying to pump more than > a > > bit through the narrow "cancelled" channel? > > My use case is mostly for diagnostic / logging purposes. I want to log > all exceptions bubbling out of a task, and I want to do so from within > the coroutine itself rather than when I call task.result(). I also > want to be able to detect if something wasn't logged when I call > task.result() (to avoid double logging and errors passing silently, > etc), and I'd also prefer that task.cancelled(), etc. still return the > correct values. > > My first idea was to define LoggedError(Exception) and > LoggedCancelledError(CancelledError) subclasses, and raise a new > exception from within the task after logging. If LoggedCancelledError > doesn't derive from CancelledError, then task.cancelled() won't > reflect that the task was cancelled. > > Could it simplify things on the asyncio side if the CancelledError and > non-CancelledError code paths shared more logic (e.g. by both setting > an exception instead of only the generic case setting an exception)? > > --Chris > > > > > > On Sun, Dec 3, 2017 at 2:11 AM, Andrew Svetlov > > > wrote: > >> > >> IIRC at very early stages Guido van Rossum decided to *freeze* > >> `CancelledError`: user code should not derive from the exception. Like > you > >> never derive from StopIteration. > >> > >> On Sun, Dec 3, 2017 at 8:00 AM Chris Jerdonek > > >> wrote: > >>> > >>> Hi, I want to ask how people feel about the following. > >>> > >>> If you raise a subclass of CancelledError from within a task and then > >>> call task.result(), CancelledError is raised rather than the subclass. 
> >>> > >>> Here is some code to illustrate: > >>> > >>> class MyCancelledError(CancelledError): pass > >>> > >>> async def raise_my_cancel(): > >>> raise MyCancelledError() > >>> > >>> task = asyncio.ensure_future(raise_my_cancel()) > >>> try: > >>> await task > >>> except Exception: > >>> pass > >>> assert task.cancelled() > >>> # Raises CancelledError and not MyCancelledError. > >>> task.result() > >>> > >>> Does this seem right to people? Is there a justification for this? > >>> > >>> If it would help for the discussion, I could provide a use case. > >>> > >>> Thanks a lot, > >>> --Chris > >>> _______________________________________________ > >>> Async-sig mailing list > >>> Async-sig at python.org > >>> https://mail.python.org/mailman/listinfo/async-sig > >>> Code of Conduct: https://www.python.org/psf/codeofconduct/ > >> > >> -- > >> Thanks, > >> Andrew Svetlov > >> > >> _______________________________________________ > >> Async-sig mailing list > >> Async-sig at python.org > >> https://mail.python.org/mailman/listinfo/async-sig > >> Code of Conduct: https://www.python.org/psf/codeofconduct/ > >> > > > > > > > > -- > > --Guido van Rossum (python.org/~guido) > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Dec 7 03:28:07 2017 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 7 Dec 2017 00:28:07 -0800 Subject: [Async-sig] ANN: Trio v0.2.0 released Message-ID: Hi all, I'm proud to announce the release of Trio v0.2.0. Trio is a new async concurrency library for Python that's obsessed with usability and correctness -- we want to make it easy to get things right. This is the second public release, and it contains major new features and bugfixes from 14 contributors. 
You can read the full release notes here: https://trio.readthedocs.io/en/latest/history.html#trio-0-2-0-2017-12-06 Some things I'm particularly excited about are: - Comprehensive support for async file I/O - The new 'nursery.start' method for clean startup of complex task trees - The new high-level networking API -- this is roughly the same level of abstraction as twisted/asyncio's protocols/transports. Includes luxuries like happy eyeballs for more robust client connections, and server helpers that integrate with nursery.start. - Complete support for using SSL/TLS encryption over arbitrary transports. You can even do SSL-over-SSL, which is useful for HTTPS proxies and AFAIK not supported by any other Python library. - Task-local storage. - Our new contributing guide: https://trio.readthedocs.io/en/latest/contributing.html To get started with Trio, the best place to start is our tutorial: https://trio.readthedocs.io/en/latest/tutorial.html It doesn't assume any prior familiarity with concurrency or async/await. Share and enjoy, -n -- Nathaniel J. Smith -- https://vorpus.org From yselivanov at gmail.com Thu Dec 7 21:03:46 2017 From: yselivanov at gmail.com (Yury Selivanov) Date: Thu, 7 Dec 2017 21:03:46 -0500 Subject: [Async-sig] APIs for high-bandwidth large I/O? In-Reply-To: <20171018200441.505e8145@fsol> References: <20171018200441.505e8145@fsol> Message-ID: <1199ac88-719c-41cb-b852-e347a566e94c@Spark> Hi Antoine, Thanks for posting this, and sorry for the delayed reply! I've known about the possibility of optimizing asyncio Protocols for a while. I noticed that `Protocol.data_received()` requires making one extra copy of the received data when I was working on the initial version of uvloop. Back then my main priority was to make uvloop fully compatible with asyncio, so I wasn't really thinking about improving asyncio's design. Let me explain the current flaw of `Protocol.data_received()` so that other people on the list can catch up with the discussion: 1.
Currently, when a Transport is reading data, it uses the `sock.recv()` call, which returns a `bytes` object, which is then pushed to `Protocol.data_received()`. Every time `sock.recv()` is called, a new bytes object is allocated. 2. Typically, protocols need to accumulate the bytes objects they receive until they have enough buffered data to be parsed. Usually a `deque` is used for that; less optimized code just concatenates all bytes objects into one. 3. When enough data is gathered and a protocol message can be parsed out of it, there's usually a need to concatenate a few buffers from the `deque` or get a slice of the concatenated buffer. At this point, we've copied the received data two times. I propose to add another Protocol base class to asyncio: BufferedProtocol. It won't have the 'data_received()' method; instead it will have 'get_buffer()' and 'buffer_updated(nbytes)' methods:

    class asyncio.BufferedProtocol:

        def get_buffer(self) -> memoryview:
            pass

        def buffer_updated(self, nbytes: int):
            pass

When the protocol's transport is ready to receive data, it will call `protocol.get_buffer()`. The latter must return an object that implements the buffer protocol. The transport will request a writable buffer over the returned object and receive data *into* that buffer. When the `sock.recv_into(buffer)` call is done, the `protocol.buffer_updated(nbytes)` method will be called, with the number of bytes received into the buffer passed as the first argument. I've implemented the proposed design in uvloop (branch 'get_buffer', [1]) and adjusted your benchmark [2] to use it. Here are benchmark results from my machine (macOS): vanilla asyncio: 120-135 Mb/s; uvloop: 320-330 Mb/s; uvloop/get_buffer: 600-650 Mb/s. The benchmark is quite unstable, but it's clear that Protocol.get_buffer() makes it possible to implement framing much more efficiently.
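As a rough illustration of the data flow just described, here is a self-contained sketch with no real sockets -- `FrameReceiver` and `fake_transport_feed` are invented for the example, and only the two proposed method names come from the email. The point is that the frame buffer is allocated once and filled in place, so no intermediate bytes objects are created on the read path:

```python
class FrameReceiver:
    """Receives fixed-size frames with no extra copies on the read path."""

    def __init__(self, frame_size):
        self._buf = bytearray(frame_size)   # allocated once, reused forever
        self._pos = 0
        self.frames = []

    def get_buffer(self):
        # Hand the transport the unfilled tail of our frame buffer.
        return memoryview(self._buf)[self._pos:]

    def buffer_updated(self, nbytes):
        self._pos += nbytes
        if self._pos == len(self._buf):     # a whole frame has arrived
            self.frames.append(bytes(self._buf))
            self._pos = 0

def fake_transport_feed(proto, data):
    """Simulate a transport's sock.recv_into() loop over in-memory data."""
    view = memoryview(data)
    while view:
        buf = proto.get_buffer()
        n = min(len(buf), len(view))
        buf[:n] = view[:n]                  # stand-in for sock.recv_into(buf)
        proto.buffer_updated(n)
        view = view[n:]

r = FrameReceiver(4)
fake_transport_feed(r, b"abcdefgh")
print(r.frames)  # two complete 4-byte frames
```

(For readers coming to this thread later: the API that eventually landed in Python 3.7 as `asyncio.BufferedProtocol` passes a `sizehint` argument to `get_buffer()`.)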
I'm also working on porting the asyncpg library to use get_buffer(), as it has a fairly good benchmark suite. So far I'm seeing a 5-15% speed boost on all benchmarks. What's more important is that get_buffer() makes asyncpg's buffer implementation simpler! I'm quite happy with these results and I propose to implement the get_buffer() API (or its equivalent) in Python 3.7. I've opened an issue [3] to discuss the implementation details. [1] https://github.com/MagicStack/uvloop/tree/get_buffer [2] https://gist.github.com/1st1/1c606e5b83ef0e9c41faf21564d75ad7 Thanks, Yury On Oct 18, 2017, 2:31 PM -0400, Antoine Pitrou , wrote: > > Hi, > > I am currently looking into ways to optimize large data transfers for a > distributed computing framework > (https://github.com/dask/distributed/). We are using Tornado but the > question is more general, as it turns out that certain kinds of API are > an impediment to such optimizations.
uvloop uses hand-crafted > Cython code + the C libuv library, still, a pure Python version of > Tornado does better thanks to an improved buffering logic in the > streaming layer. > > Even the Tornado result is not ideal. When profiling, we see that > 50% of the runtime is actual IO calls (socket.send and socket.recv), > but the rest is still overhead. Especially, buffering on the read side > still has costly memory copies (b''.join calls take 22% of the time!). > > For a framed layer, you shouldn't need so many copies. Once you've > read the frame length, you can allocate the frame upfront and read into > it. It is at odds, however, with the API exposed by asyncio's Protocol: > data_received() gives you a new bytes object as soon as data arrives. > It's already too late: a spurious memory copy will have to occur. > > Tornado's IOStream is less constrained, but it supports too many read > schemes (including several types of callbacks). So I crafted a limited > version of IOStream (*) that supports little functionality, but is able > to use socket.recv_into() when asked for a given number of bytes. When > benchmarked, this version achieves 950 MB/s. This is still without C > code! > > (*) see > https://github.com/tornadoweb/tornado/compare/master...pitrou:stream_readinto?expand=1 > > When profiling that limited version of IOStream, we see that 68% of the > runtime is actual IO calls (socket.send and socket.recv_into). > Still, 21% of the total runtime is spent allocating a 100 MB buffer for > each frame! That's 70% of the non-IO overhead! Whether or not there > are smart ways to reuse that writable buffer depends on how the > application intends to use data: does it throw it away before the next > read or not? It doesn't sound easily doable in the general case. > > > So I'm wondering which kind of APIs async libraries could expose to > make those use cases faster. I know curio and trio have socket objects > which would probably fit the bill. 
I don't know if there are > higher-level concepts that may be as adequate for achieving the highest > performance. > > Also, since asyncio is the de facto standard now, I wonder if asyncio > might grow such a new API. That may be troublesome: asyncio already > has Protocols and Streams, and people often complain about its > extensive API surface that's difficult for beginners :-) > > > Addendum: asyncio streams > ------------------------- > > I didn't think asyncio streams would be a good solution, but I still > wrote a benchmark variant for them out of curiosity, and it turns out I > was right. The results: > - vanilla asyncio streams achieve 300 MB/s > - asyncio + uvloop streams achieve 550 MB/s > > The benchmark script is at > https://gist.github.com/pitrou/202221ca9c9c74c0b48373ac89e15fd7 > > Regards > > Antoine. > > > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yselivanov at gmail.com Thu Dec 7 21:05:29 2017 From: yselivanov at gmail.com (Yury Selivanov) Date: Thu, 7 Dec 2017 21:05:29 -0500 Subject: [Async-sig] APIs for high-bandwidth large I/O? In-Reply-To: <1199ac88-719c-41cb-b852-e347a566e94c@Spark> References: <20171018200441.505e8145@fsol> <1199ac88-719c-41cb-b852-e347a566e94c@Spark> Message-ID: <0fa00196-0592-4a5a-b27b-dc9b8dfd50a4@Spark> On Dec 7, 2017, 9:03 PM -0500, Yury Selivanov , wrote: [..] > I'm quite happy with these results and I propose to implement the get_buffer() API (or its equivalent) in Python 3.7. I've opened an issue [3] to discuss the implementation details. Issue: https://bugs.python.org/issue32251 Yury -------------- next part -------------- An HTML attachment was scrubbed...
URL: From yselivanov at gmail.com Thu Dec 7 21:07:45 2017 From: yselivanov at gmail.com (Yury Selivanov) Date: Thu, 7 Dec 2017 21:07:45 -0500 Subject: [Async-sig] APIs for high-bandwidth large I/O? In-Reply-To: <1199ac88-719c-41cb-b852-e347a566e94c@Spark> References: <20171018200441.505e8145@fsol> <1199ac88-719c-41cb-b852-e347a566e94c@Spark> Message-ID: On Dec 7, 2017, 9:03 PM -0500, Yury Selivanov , wrote: > > I'm quite happy with these results and I propose to implement the get_buffer() API (or its equivalent) in Python 3.7. I've opened an issue [3] to discuss the implementation details. > I apologize for the noise, my email client is acting weird. Here's the URL of the CPython issue I created to discuss the proposed design: https://bugs.python.org/issue32251. Yury -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.jerdonek at gmail.com Mon Dec 25 00:55:47 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Sun, 24 Dec 2017 21:55:47 -0800 Subject: [Async-sig] task.result() and exception traceback display Message-ID: Hi, I noticed that if a task in asyncio raises an exception, then the displayed traceback can be "polluted" by intermediate calls to task.result(). Also, the calls to task.result() can appear out of order relative to each other and to other lines.
Here is an example: import asyncio async def raise_error(): raise ValueError() async def main(): task = asyncio.ensure_future(raise_error()) try: await task # call 1 except Exception: pass try: task.result() # call 2 except Exception: pass task.result() # call 3 asyncio.get_event_loop().run_until_complete(main()) The above outputs-- Traceback (most recent call last): File "test.py", line 24, in asyncio.get_event_loop().run_until_complete(main()) File "/Users/.../3.6.4rc1/lib/python3.6/asyncio/base_events.py", line 467, in run_until_complete return future.result() File "test.py", line 21, in main task.result() # call 3 File "test.py", line 17, in main task.result() # call 2 File "test.py", line 12, in main await task # call 1 File "test.py", line 5, in raise_error raise ValueError() ValueError Notice that the "call 2" line appears in the traceback, even though it doesn't come into play in the exception. Also, the lines don't obey the "most recent call last" rule. If this rule were followed, it should be something more like-- Traceback (most recent call last): File "test.py", line 24, in asyncio.get_event_loop().run_until_complete(main()) File "/Users/.../3.6.4rc1/lib/python3.6/asyncio/base_events.py", line 467, in run_until_complete return future.result() File "test.py", line 12, in main await task # call 1 File "test.py", line 5, in raise_error raise ValueError() File "test.py", line 17, in main task.result() # call 2 File "test.py", line 21, in main task.result() # call 3 ValueError If people agree there's an issue along these lines, I can file an issue in the tracker. I didn't seem to find one when searching for open issues with search terms like "asyncio traceback". 
Thanks, --Chris From njs at pobox.com Mon Dec 25 03:48:15 2017 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 25 Dec 2017 00:48:15 -0800 Subject: [Async-sig] task.result() and exception traceback display In-Reply-To: References: Message-ID: I haven't thought about this enough to have an opinion about whether this is correct or how it could be improved, but I can explain why you're seeing what you're seeing :-). The traceback is really a trace of where the exception went after it was raised, with new lines added to the top as it bubbles out. So the bottom line is the 'raise' statement, because that's where it was created, and then it bubbled onto the 'call 1' line and was caught. Then it was raised again and bubbled onto the 'call 2' line. Etc. So you should think of it not as a snapshot of your stack when it was created, but as a travelogue. -n On Sun, Dec 24, 2017 at 9:55 PM, Chris Jerdonek wrote: > Hi, > > I noticed that if a task in asyncio raises an exception, then the > displayed traceback can be "polluted" by intermediate calls to > task.result(). Also, the calls to task.result() can appear out of > order relative to each other and to other lines. 
> > Here is an example: > > import asyncio > > async def raise_error(): > raise ValueError() > > async def main(): > task = asyncio.ensure_future(raise_error()) > > try: > await task # call 1 > except Exception: > pass > > try: > task.result() # call 2 > except Exception: > pass > > task.result() # call 3 > > asyncio.get_event_loop().run_until_complete(main()) > > The above outputs-- > > Traceback (most recent call last): > File "test.py", line 24, in > asyncio.get_event_loop().run_until_complete(main()) > File "/Users/.../3.6.4rc1/lib/python3.6/asyncio/base_events.py", > line 467, in run_until_complete > return future.result() > File "test.py", line 21, in main > task.result() # call 3 > File "test.py", line 17, in main > task.result() # call 2 > File "test.py", line 12, in main > await task # call 1 > File "test.py", line 5, in raise_error > raise ValueError() > ValueError > > Notice that the "call 2" line appears in the traceback, even though it > doesn't come into play in the exception. Also, the lines don't obey > the "most recent call last" rule. If this rule were followed, it > should be something more like-- > > Traceback (most recent call last): > File "test.py", line 24, in > asyncio.get_event_loop().run_until_complete(main()) > File "/Users/.../3.6.4rc1/lib/python3.6/asyncio/base_events.py", > line 467, in run_until_complete > return future.result() > File "test.py", line 12, in main > await task # call 1 > File "test.py", line 5, in raise_error > raise ValueError() > File "test.py", line 17, in main > task.result() # call 2 > File "test.py", line 21, in main > task.result() # call 3 > ValueError > > If people agree there's an issue along these lines, I can file an > issue in the tracker. I didn't seem to find one when searching for > open issues with search terms like "asyncio traceback". 
> > Thanks, > --Chris > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ -- Nathaniel J. Smith -- https://vorpus.org From chris.jerdonek at gmail.com Mon Dec 25 04:46:32 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Mon, 25 Dec 2017 01:46:32 -0800 Subject: [Async-sig] task.result() and exception traceback display In-Reply-To: References: Message-ID: On Mon, Dec 25, 2017 at 12:48 AM, Nathaniel Smith wrote: > I haven't thought about this enough to have an opinion about whether > this is correct or how it could be improved, but I can explain why > you're seeing what you're seeing :-). > > The traceback is really a trace of where the exception went after it > was raised, with new lines added to the top as it bubbles out. So the > bottom line is the 'raise' statement, because that's where it was > created, and then it bubbled onto the 'call 1' line and was caught. > Then it was raised again and bubbled onto the 'call 2' line. Etc. So > you should think of it not as a snapshot of your stack when it was > created, but as a travelogue. Thanks, Nathaniel. That's a really good explanation. Also, here's a way to see the same behavior without async: def main(): exc = ValueError() try: raise exc # call 1 except Exception: pass try: raise exc # call 2 except Exception: pass raise exc # call 3 main() With this, the traceback looks like-- Traceback (most recent call last): File "test.py", line 16, in main() File "test.py", line 14, in main raise exc # call 3 File "test.py", line 10, in main raise exc # call 2 File "test.py", line 5, in main raise exc # call 1 ValueError Since you can see that the later calls are getting added on the top, it's almost as if it should read: most recent calls **first**. :) (I wonder if there's a 4-word phrase that does accurately describe what's happening.) 
--Chris

>
> -n
>
> On Sun, Dec 24, 2017 at 9:55 PM, Chris Jerdonek
> wrote:
>> Hi,
>>
>> I noticed that if a task in asyncio raises an exception, then the
>> displayed traceback can be "polluted" by intermediate calls to
>> task.result(). Also, the calls to task.result() can appear out of
>> order relative to each other and to other lines.
>>
>> Here is an example:
>>
>> import asyncio
>>
>> async def raise_error():
>>     raise ValueError()
>>
>> async def main():
>>     task = asyncio.ensure_future(raise_error())
>>
>>     try:
>>         await task # call 1
>>     except Exception:
>>         pass
>>
>>     try:
>>         task.result() # call 2
>>     except Exception:
>>         pass
>>
>>     task.result() # call 3
>>
>> asyncio.get_event_loop().run_until_complete(main())
>>
>> The above outputs--
>>
>> Traceback (most recent call last):
>>   File "test.py", line 24, in <module>
>>     asyncio.get_event_loop().run_until_complete(main())
>>   File "/Users/.../3.6.4rc1/lib/python3.6/asyncio/base_events.py",
>> line 467, in run_until_complete
>>     return future.result()
>>   File "test.py", line 21, in main
>>     task.result() # call 3
>>   File "test.py", line 17, in main
>>     task.result() # call 2
>>   File "test.py", line 12, in main
>>     await task # call 1
>>   File "test.py", line 5, in raise_error
>>     raise ValueError()
>> ValueError
>>
>> Notice that the "call 2" line appears in the traceback, even though it
>> doesn't come into play in the exception. Also, the lines don't obey
>> the "most recent call last" rule.
>> If this rule were followed, it
>> should be something more like--
>>
>> Traceback (most recent call last):
>>   File "test.py", line 24, in <module>
>>     asyncio.get_event_loop().run_until_complete(main())
>>   File "/Users/.../3.6.4rc1/lib/python3.6/asyncio/base_events.py",
>> line 467, in run_until_complete
>>     return future.result()
>>   File "test.py", line 12, in main
>>     await task # call 1
>>   File "test.py", line 5, in raise_error
>>     raise ValueError()
>>   File "test.py", line 17, in main
>>     task.result() # call 2
>>   File "test.py", line 21, in main
>>     task.result() # call 3
>> ValueError
>>
>> If people agree there's an issue along these lines, I can file an
>> issue in the tracker. I didn't seem to find one when searching for
>> open issues with search terms like "asyncio traceback".
>>
>> Thanks,
>> --Chris
>> _______________________________________________
>> Async-sig mailing list
>> Async-sig at python.org
>> https://mail.python.org/mailman/listinfo/async-sig
>> Code of Conduct: https://www.python.org/psf/codeofconduct/
>
>
>
> --
> Nathaniel J. Smith -- https://vorpus.org

From pfreixes at gmail.com Sun Dec 31 12:32:21 2017
From: pfreixes at gmail.com (Pau Freixes)
Date: Sun, 31 Dec 2017 18:32:21 +0100
Subject: [Async-sig] Asyncio loop instrumentation
Message-ID:

Hi, folks

First of all, I hope that you have had a good 2017, and I wish you the
best for 2018.

This email is a follow-up to plan B of the first proposal [1] to
articulate a way to measure the load of the asyncio loop. The main
objections to the first implementation focused on the technical debt
it imposed, considering that the feature was definitely outside the
main scope of the asyncio loop. Nathaniel proposed a plan B based on
implementing some kind of instrumentation that would allow developers
to build features such as the load metric themselves.
I put off the plan for a while, wrongly feeling that an implementation
of the loop wired with the proper events would hurt the loop's
performance. Far from it: in terms of performance penalty, the
suggested implementation is almost negligible, at least for what I
consider the happy path, meaning the case where no instruments are
listening for these events.

This new implementation of the load method - remember that it returns
a load factor between 0.0 and 1.0 that tells you how busy your loop
is - is based on an instrument and can be checked with the following
snippet:

async def coro(loop, idx):
    await asyncio.sleep(idx % 10)
    if load() > 0.9:
        return False
    start = loop.time()
    while loop.time() - start < 0.02:
        pass
    return True

async def run(loop, n):
    tasks = [coro(loop, i) for i in range(n)]
    results = await asyncio.gather(*tasks)
    abandoned = len([r for r in results if not r])
    print("Load reached for {} coros/sec: {}, abandoned {}/{}".format(
        n / 10, load(), abandoned, n))

async def main(loop):
    await run(loop, 100)

loop = asyncio.get_event_loop()
loop.add_instrument(LoadInstrument)
loop.run_until_complete(main(loop))

The `LoadInstrument` [2] meets the contract of the `LoopInstrument` [3],
which allows it to listen for the loop signals that are used to
calculate the load of the loop.

For this proposal [4], a POC, I've preferred to keep a reduced list of
events:

* `loop_start` : Executed when the loop starts for the first time.
* `tick_start` : Executed when a new loop tick is started.
* `io_start` : Executed when a new IO process starts.
* `io_end` : Executed when the IO process ends.
* `tick_end` : Executed when the loop tick ends.
* `loop_stop` : Executed when the loop stops.

The idea of giving just this short list of events is to avoid
overcomplicating third-party loop implementations, keeping to the
minimum set of events that a typical reactor has to implement.
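To make the contract concrete without checking out the patched branch, here is a rough, self-contained sketch of what an instrument built on those six events could look like. The class body and hook signatures below are my own illustration - only the linked POC defines the real ones - and the loop is simulated by calling the hooks by hand:

```python
import time

class LoadInstrument:
    """Illustrative instrument: load = time spent outside the IO wait
    divided by total wall-clock time. The six methods mirror the event
    names from the proposal; the real signatures may differ."""

    def __init__(self):
        self._busy = 0.0        # time spent running callbacks
        self._elapsed = 0.0     # total observed wall-clock time
        self._tick_start = 0.0
        self._io_start = 0.0
        self._idle = 0.0

    def loop_start(self):
        pass

    def tick_start(self):
        self._tick_start = time.monotonic()
        self._idle = 0.0

    def io_start(self):
        self._io_start = time.monotonic()

    def io_end(self):
        # Time blocked on the selector is idle time, not load.
        self._idle += time.monotonic() - self._io_start

    def tick_end(self):
        tick = time.monotonic() - self._tick_start
        self._elapsed += tick
        self._busy += max(0.0, tick - self._idle)

    def loop_stop(self):
        pass

    def load(self):
        return min(1.0, self._busy / self._elapsed) if self._elapsed else 0.0

# Driving one simulated tick by hand, the way a wired loop would:
inst = LoadInstrument()
inst.loop_start()
inst.tick_start()
inst.io_start()
time.sleep(0.02)   # pretend the selector blocked for ~20 ms
inst.io_end()
time.sleep(0.02)   # pretend callbacks then ran for ~20 ms
inst.tick_end()
inst.loop_stop()
print(round(inst.load(), 2))   # roughly 0.5: half the tick was real work
```

The point of the sketch is that the load computation lives entirely in the instrument; the loop only has to emit the six events.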
I would like to gather your feedback about this new approach and, if
you believe that it might be interesting, about which next steps
should be taken.

Cheers,

[1] https://mail.python.org/pipermail/async-sig/2017-August/000382.html
[2] https://github.com/pfreixes/asyncio_load_instrument/blob/master/asyncio_load_instrument/instrument.py#L8
[3] https://github.com/pfreixes/cpython/blob/asyncio_loop_instrumentation/Lib/asyncio/loop_instruments.py#L9
[4] https://github.com/pfreixes/cpython/commit/adc3ba46979394997c40aa89178b4724442b28eb

--
--pau

From solipsis at pitrou.net Sun Dec 31 14:02:47 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 31 Dec 2017 20:02:47 +0100
Subject: [Async-sig] Asyncio loop instrumentation
References:
Message-ID: <20171231200247.755b5c42@fsol>

On Sun, 31 Dec 2017 18:32:21 +0100
Pau Freixes wrote:
>
> This new implementation of the load method - remember that it returns
> a load factor between 0.0 and 1.0 that tells you how busy your loop
> is -

What does it mean exactly? Is it the ratio of CPU time over wall clock
time?

Depending on your needs, the `psutil` library (*) and/or the new
`time.thread_time` function (**) may also help.

(*) https://psutil.readthedocs.io/en/latest/
(**) https://docs.python.org/3.7/library/time.html#time.thread_time

> For this proposal [4], a POC, I've preferred to keep a reduced list of
> events:
>
> * `loop_start` : Executed when the loop starts for the first time.
> * `tick_start` : Executed when a new loop tick is started.
> * `io_start` : Executed when a new IO process starts.
> * `io_end` : Executed when the IO process ends.
> * `tick_end` : Executed when the loop tick ends.
> * `loop_stop` : Executed when the loop stops.

What do you call an "IO process" in this context?

Regards

Antoine.
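Antoine's "CPU time over wall clock time" reading is easy to prototype without touching the loop at all, using the `time.thread_time` function he points to (Python 3.7+). The helper below is a hypothetical illustration, not part of the proposal:

```python
import time

def cpu_ratio(fn):
    """Ratio of CPU time over wall-clock time for one call of fn,
    measured for the current thread with time.thread_time (3.7+)."""
    w0, c0 = time.monotonic(), time.thread_time()
    fn()
    wall = time.monotonic() - w0
    cpu = time.thread_time() - c0
    return cpu / wall if wall else 0.0

def mostly_sleeping():
    # An IO-bound-looking workload: the thread is blocked, not computing.
    time.sleep(0.05)

def mostly_spinning():
    # A CPU-bound workload: busy-wait for the same wall-clock duration.
    t0 = time.monotonic()
    while time.monotonic() - t0 < 0.05:
        pass

print(round(cpu_ratio(mostly_sleeping), 2))   # near 0.0
print(round(cpu_ratio(mostly_spinning), 2))   # near 1.0
```

A per-thread measurement like this is one concrete answer to "what does the load factor mean": sleeping in the selector drives the ratio toward 0, running callbacks drives it toward 1.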
From yselivanov at gmail.com Sun Dec 31 14:12:35 2017
From: yselivanov at gmail.com (Yury Selivanov)
Date: Sun, 31 Dec 2017 22:12:35 +0300
Subject: [Async-sig] Asyncio loop instrumentation
In-Reply-To: <20171231200247.755b5c42@fsol>
References: <20171231200247.755b5c42@fsol>
Message-ID: <2B052CB7-FFCB-494C-97BA-DA8859B49598@gmail.com>

When PEP 567 is accepted, I plan to implement advanced instrumentation
in uvloop to monitor basically all io/callback/loop events. I'm still
-1 on doing this in asyncio, at least in 3.7, because I'd like us to
have some time to experiment with such instrumentation in real
production code (preferably at scale).

Yury

Sent from my iPhone

> On Dec 31, 2017, at 10:02 PM, Antoine Pitrou wrote:
>
> On Sun, 31 Dec 2017 18:32:21 +0100
> Pau Freixes wrote:
>>
>> This new implementation of the load method - remember that it returns
>> a load factor between 0.0 and 1.0 that tells you how busy your loop
>> is -
>
> What does it mean exactly? Is it the ratio of CPU time over wall clock
> time?
>
> Depending on your needs, the `psutil` library (*) and/or the new
> `time.thread_time` function (**) may also help.
>
> (*) https://psutil.readthedocs.io/en/latest/
> (**) https://docs.python.org/3.7/library/time.html#time.thread_time
>
>> For this proposal [4], a POC, I've preferred to keep a reduced list of
>> events:
>>
>> * `loop_start` : Executed when the loop starts for the first time.
>> * `tick_start` : Executed when a new loop tick is started.
>> * `io_start` : Executed when a new IO process starts.
>> * `io_end` : Executed when the IO process ends.
>> * `tick_end` : Executed when the loop tick ends.
>> * `loop_stop` : Executed when the loop stops.
>
> What do you call an "IO process" in this context?
>
> Regards
>
> Antoine.
>
>
> _______________________________________________
> Async-sig mailing list
> Async-sig at python.org
> https://mail.python.org/mailman/listinfo/async-sig
> Code of Conduct: https://www.python.org/psf/codeofconduct/