From gmludo at gmail.com  Fri Oct  6 14:30:25 2017
From: gmludo at gmail.com (Ludovic Gasc)
Date: Fri, 6 Oct 2017 14:30:25 -0400
Subject: [Async-sig] Asyncio-tkinter hope this is the right thread for this
In-Reply-To: 
References: 
Message-ID: 

Hi Tim,

I know this doesn't answer your question directly, but we have already
built a small application with AsyncIO + Quamash:
https://github.com/harvimt/quamash
It uses Qt for the rendering. In my limited experience it works pretty
well, and in my opinion Qt rendering is nicer than Tk. Qt has also been
LGPL for a long time now, so the license issues of the past are gone.

Regards.
--
Ludovic Gasc (GMLudo)
Lead Developer Architect at ALLOcloud
https://be.linkedin.com/in/ludovicgasc

2017-09-02 23:26 GMT-04:00 Tim Jones via Async-sig :

> Hi, I am trying to build an application on the Raspberry Pi using
> python3.6 and tkinter for the GUI. It is an entry system for a community
> project, so I need to be able to receive FOB swipe data into the
> application, which is running a tkinter user interface.
>
> Anyhow, I have all the components working OK but can't seem to get them
> to work together. I am trying to use asyncio but keep clashing with the
> tkinter event loop, so my question is: can the asyncio loop work
> cooperatively with the tkinter loop?
>
> Do I need to create a separate thread, or is there some other solution
> available?
>
> Any help with this would be greatly appreciated.
>
> Regards, Tim
>
> Sent from my iPad
> _______________________________________________
> Async-sig mailing list
> Async-sig at python.org
> https://mail.python.org/mailman/listinfo/async-sig
> Code of Conduct: https://www.python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From solipsis at pitrou.net  Wed Oct 18 14:04:41 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 18 Oct 2017 20:04:41 +0200
Subject: [Async-sig] APIs for high-bandwidth large I/O?
Message-ID: <20171018200441.505e8145@fsol>

Hi,

I am currently looking into ways to optimize large data transfers for a
distributed computing framework (https://github.com/dask/distributed/).
We are using Tornado but the question is more general, as it turns out
that certain kinds of API are an impediment to such optimizations.

To put things short, there are a couple benchmarks discussed here:
https://github.com/tornadoweb/tornado/issues/2147#issuecomment-337187960

- for Tornado, this benchmark:
  https://gist.github.com/pitrou/0f772867008d861c4aa2d2d7b846bbf0
- for asyncio, this benchmark:
  https://gist.github.com/pitrou/719e73c1df51e817d618186833a6e2cc

Both implement a trivial form of framing using the "preferred" APIs of
each framework (IOStream for Tornado, Protocol for asyncio), and then
benchmark it over 100 MB frames using a simple echo client/server.

The results (on Python 3.6) are interesting:
- vanilla asyncio achieves 350 MB/s
- vanilla Tornado achieves 400 MB/s
- asyncio + uvloop achieves 600 MB/s
- an optimized Tornado IOStream with a more sophisticated buffering
  logic (https://github.com/tornadoweb/tornado/pull/2166) achieves 700 MB/s

The latter result is especially interesting. uvloop uses hand-crafted
Cython code + the C libuv library; still, a pure Python version of
Tornado does better thanks to an improved buffering logic in the
streaming layer.

Even the Tornado result is not ideal. When profiling, we see that
50% of the runtime is actual IO calls (socket.send and socket.recv),
but the rest is still overhead. In particular, buffering on the read
side still has costly memory copies (b''.join calls take 22% of the
time!).

For a framed layer, you shouldn't need so many copies. Once you've
read the frame length, you can allocate the frame upfront and read into
it. It is at odds, however, with the API exposed by asyncio's Protocol:
data_received() gives you a new bytes object as soon as data arrives.
It's already too late: a spurious memory copy will have to occur.

Tornado's IOStream is less constrained, but it supports too many read
schemes (including several types of callbacks). So I crafted a limited
version of IOStream (*) that supports little functionality, but is able
to use socket.recv_into() when asked for a given number of bytes. When
benchmarked, this version achieves 950 MB/s. This is still without C
code!

(*) see
https://github.com/tornadoweb/tornado/compare/master...pitrou:stream_readinto?expand=1

When profiling that limited version of IOStream, we see that 68% of the
runtime is actual IO calls (socket.send and socket.recv_into).
Still, 21% of the total runtime is spent allocating a 100 MB buffer for
each frame! That's 70% of the non-IO overhead! Whether or not there
are smart ways to reuse that writable buffer depends on how the
application intends to use the data: does it throw it away before the
next read or not? It doesn't sound easily doable in the general case.

So I'm wondering which kind of APIs async libraries could expose to
make those use cases faster. I know curio and trio have socket objects
which would probably fit the bill. I don't know if there are
higher-level concepts that may be as adequate for achieving the highest
performance.

Also, since asyncio is the de facto standard now, I wonder if asyncio
might grow such a new API. That may be troublesome: asyncio already
has Protocols and Streams, and people often complain about its
extensive API surface that's difficult for beginners :-)

Addendum: asyncio streams
-------------------------

I didn't think asyncio streams would be a good solution, but I still
wrote a benchmark variant for them out of curiosity, and it turns out I
was right. The results:
- vanilla asyncio streams achieve 300 MB/s
- asyncio + uvloop streams achieve 550 MB/s

The benchmark script is at
https://gist.github.com/pitrou/202221ca9c9c74c0b48373ac89e15fd7

Regards

Antoine.
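[The framing scheme Antoine describes — read the frame length first, allocate the frame buffer once, then read directly into it with recv_into() so no intermediate bytes objects are created — can be sketched with plain blocking sockets. This is a minimal illustration, not code from Tornado, asyncio, or the linked branch; all function names are made up for the example.]

```python
# Sketch of copy-avoiding framing: length prefix, upfront allocation,
# then socket.recv_into() on memoryview slices (slicing a memoryview
# does not copy the underlying data).
import socket
import struct

def read_exactly_into(sock: socket.socket, buf: memoryview) -> None:
    """Fill `buf` completely from `sock`, without intermediate copies."""
    while buf:
        n = sock.recv_into(buf)
        if n == 0:
            raise ConnectionError("connection closed mid-frame")
        buf = buf[n:]  # advance without copying

def read_frame(sock: socket.socket) -> bytearray:
    header = bytearray(4)
    read_exactly_into(sock, memoryview(header))
    (length,) = struct.unpack("<I", header)
    frame = bytearray(length)  # allocated upfront, exactly the right size
    read_exactly_into(sock, memoryview(frame))
    return frame

def write_frame(sock: socket.socket, payload: bytes) -> None:
    sock.sendall(struct.pack("<I", len(payload)))
    sock.sendall(payload)
```

[With a data_received()-style callback API this is impossible, because the library has already materialized each chunk as a fresh bytes object before the application sees it; the copy into the frame buffer is then unavoidable.]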
From dave at dabeaz.com  Fri Oct 20 15:31:22 2017
From: dave at dabeaz.com (David Beazley)
Date: Fri, 20 Oct 2017 14:31:22 -0500
Subject: [Async-sig] APIs for high-bandwidth large I/O?
In-Reply-To: <20171018200441.505e8145@fsol>
References: <20171018200441.505e8145@fsol>
Message-ID: <5073165E-BEA6-4A68-9BE3-87B3A265A791@dabeaz.com>

I adapted this benchmark to Curio using streams and Curio's support for
readinto(). Code is at
https://gist.github.com/dabeaz/999dc7d08ddd2c0dea790de67948e756

Support for readinto() is somewhat recent in Curio, so for testing you
will need the latest version from GitHub (https://github.com/dabeaz/curio).
However, here are the results I got on my machine:

- vanilla asyncio achieves 145 MB/s
- asyncio + uvloop achieves 340 MB/s
- Curio achieves 550 MB/s

Asyncio tests were run using:
https://gist.github.com/pitrou/719e73c1df51e817d618186833a6e2cc

Cheers,
Dave

> On Oct 18, 2017, at 1:04 PM, Antoine Pitrou wrote:
> [...]

From solipsis at pitrou.net  Sat Oct 21 05:53:01 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 21 Oct 2017 11:53:01 +0200
Subject: [Async-sig] APIs for high-bandwidth large I/O?
References: <20171018200441.505e8145@fsol> <5073165E-BEA6-4A68-9BE3-87B3A265A791@dabeaz.com>
Message-ID: <20171021115301.086c1cef@fsol>

On Fri, 20 Oct 2017 14:31:22 -0500
David Beazley wrote:
> I adapted this benchmark to Curio using streams and Curio's support for readinto().
> Code is at https://gist.github.com/dabeaz/999dc7d08ddd2c0dea790de67948e756
> Support for readinto() is somewhat recent in Curio, so for testing you
> will need the latest version from GitHub (https://github.com/dabeaz/curio).
> However, here are the results I got on my machine:
>
> - vanilla asyncio achieves 145 MB/s
> - asyncio + uvloop achieves 340 MB/s
> - Curio achieves 550 MB/s

Thank you Dave! I ran it on my machine and get roughly 910 MB/s, i.e.
more or less the same speed as a tweaked Tornado using readinto.

I also wrote a variant of the benchmark using socketserver and plain
sockets. It gets 1150 MB/s, and most of the time seems to be spent in
the kernel, so I'm not quite sure it's possible to improve on that:
https://gist.github.com/pitrou/3ac31e82b4461cbc9b4eee151a47bfee
(note that running the server in a separate process doesn't improve
things; neither does using writev() or sendmsg() to send the two buffers
at once)

Regards

Antoine.

> Asyncio tests were run using:
> https://gist.github.com/pitrou/719e73c1df51e817d618186833a6e2cc
>
> Cheers,
> Dave
>
> > On Oct 18, 2017, at 1:04 PM, Antoine Pitrou wrote:
> > [...]
From gmludo at gmail.com  Sun Oct 22 12:51:26 2017
From: gmludo at gmail.com (Ludovic Gasc)
Date: Sun, 22 Oct 2017 18:51:26 +0200
Subject: [Async-sig] APIs for high-bandwidth large I/O?
In-Reply-To: <20171018200441.505e8145@fsol>
References: <20171018200441.505e8145@fsol>
Message-ID: 

Hi Antoine,

First, thanks a lot for sharing the results of your research with us;
it's interesting, even if I'm not the first concerned by this: most of
the network data we handle with AsyncIO fits comfortably in a simple
UDP packet ;-)

Anyway, I have questions about your suggestions, see below:

2017-10-18 20:04 GMT+02:00 Antoine Pitrou :

> Also, since asyncio is the de facto standard now, I wonder if asyncio
> might grow such a new API. That may be troublesome: asyncio already
> has Protocols and Streams, and people often complain about its
> extensive API surface that's difficult for beginners :-)

From my point of view, I like coding TCP servers with the Streams API
of AsyncIO a lot. When I look at your piece of code for Tornado:
https://gist.github.com/pitrou/0f772867008d861c4aa2d2d7b846bbf0
it looks very similar to the Streams API.

To my understanding, some AsyncIO libraries, like aiohttp, don't use
the Streams API of AsyncIO and have a specific implementation instead,
especially to have full control over the buffers: based on the
information provided inside the protocol, you can know whether a small
payload or a big payload will arrive on the wire, like the
Content-Length header in HTTP.
My question is: do you think it's possible to have a simple API that
fits small payloads and big payloads at the same time, without an
efficiency impact for any use case? Or is it too tightly integrated
with the protocol implementation itself to converge on a single
solution for all use cases?

Maybe, before thinking about an AsyncIO improvement, it should be a new
library as a first step, to let people decide whether to adopt this new
buffering algorithm?

Thanks for your responses.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From antoine at python.org  Mon Oct 23 08:35:35 2017
From: antoine at python.org (Antoine Pitrou)
Date: Mon, 23 Oct 2017 14:35:35 +0200
Subject: [Async-sig] APIs for high-bandwidth large I/O?
In-Reply-To: 
References: <20171018200441.505e8145@fsol>
Message-ID: 

Hi Ludovic,

Le 22/10/2017 à 18:51, Ludovic Gasc a écrit :
>
> To my understanding, some AsyncIO libraries, like aiohttp, don't use
> the Streams API of AsyncIO and have a specific implementation instead,
> especially to have full control over the buffers: based on the
> information provided inside the protocol, you can know whether a
> small payload or a big payload will arrive on the wire, like the
> Content-Length header in HTTP.
>
> My question is: do you think it's possible to have a simple API that
> fits small payloads and big payloads at the same time, without an
> efficiency impact for any use case?

On the write side: I think it's always possible, as my PR for Tornado
shows (*). You "just" have to implement a smart buffer management
scheme.

(*) https://github.com/tornadoweb/tornado/pull/2169

On the read side: you need a readinto()-like API for large buffers.
Small buffers can still use a read()-like API for convenience. That
means a bit of complication to switch from one mode to the other
internally, but it looks doable.
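[The read-side mode switch described above can be sketched as follows. This is a hypothetical illustration over a plain blocking socket, not an existing asyncio or Tornado API: the class name and structure are made up. read(n) buffers internally and returns bytes for small payloads; readinto(buf) drains any leftover buffered bytes and then calls recv_into() directly on the caller's memory, so large payloads skip the internal buffer entirely.]

```python
# Sketch of a dual-mode reader: convenient read() for small payloads,
# copy-avoiding readinto() for large ones.
import socket

class DualModeReader:
    def __init__(self, sock: socket.socket, bufsize: int = 65536):
        self._sock = sock
        self._buffer = bytearray()  # received but not yet consumed bytes
        self._bufsize = bufsize

    def read(self, n: int) -> bytes:
        """Convenience API for small payloads: returns a new bytes object."""
        while len(self._buffer) < n:
            chunk = self._sock.recv(self._bufsize)
            if not chunk:
                raise ConnectionError("connection closed")
            self._buffer += chunk
        result = bytes(self._buffer[:n])
        del self._buffer[:n]
        return result

    def readinto(self, buf) -> None:
        """Large-payload API: fills the caller-provided buffer directly."""
        view = memoryview(buf)
        # First drain whatever read() already buffered, to preserve order.
        n = min(len(self._buffer), len(view))
        view[:n] = self._buffer[:n]
        del self._buffer[:n]
        view = view[n:]
        # Then receive the rest straight into the caller's memory.
        while view:
            received = self._sock.recv_into(view)
            if received == 0:
                raise ConnectionError("connection closed")
            view = view[received:]
```

[The "bit of complication" is visible in readinto(): it must handle bytes that a prior read() call over-buffered before it can hand the socket the caller's buffer.]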
On the topic of asyncio, however, asyncio Streams are currently layered
over the Transport / Protocol abstraction, and the data_received()
paradigm means data is *already* copied at least once when read from
the socket into a bytes object whose length isn't controlled by the
application, so it's a lost battle unless Streams are reimplemented
differently.

Regards

Antoine.
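[The write-side "smart buffer management scheme" mentioned in this thread can be sketched like this. This is a toy illustration only, not the implementation in the Tornado PR; the class name and threshold are invented for the example. The idea: coalesce small writes into one bytearray to amortize per-call overhead, but append large payloads as zero-copy memoryviews instead of copying them into the buffer.]

```python
# Sketch of a coalescing write buffer: small chunks are copied (cheap),
# large chunks are referenced via memoryview (no copy).
SMALL_WRITE_THRESHOLD = 2048  # illustrative cutoff, not a tuned value

class WriteBuffer:
    def __init__(self):
        self._small = bytearray()   # coalesced small writes
        self._pending = []          # ordered buffers ready to send

    def write(self, data: bytes) -> None:
        if len(data) < SMALL_WRITE_THRESHOLD:
            self._small += data     # copying a small chunk is cheap
        else:
            self._flush_small()     # preserve write ordering
            self._pending.append(memoryview(data))  # zero-copy reference

    def _flush_small(self) -> None:
        if self._small:
            self._pending.append(memoryview(bytes(self._small)))
            self._small.clear()

    def buffers(self):
        """Buffers in order, suitable for sendmsg()/writev()-style calls."""
        self._flush_small()
        return list(self._pending)
```

[A real implementation would also track how much of the head buffer a partial send consumed, which memoryview slicing makes cheap.]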