From manu.mirandad at gmail.com Thu Jun 8 18:32:06 2017 From: manu.mirandad at gmail.com (manuel miranda) Date: Thu, 08 Jun 2017 22:32:06 +0000 Subject: [Async-sig] async/sync library reusage Message-ID: Hello everyone, After using asyncio for a while, I'm struggling to find information about how to support both synchronous and asynchronous use cases for the same library. I.e. imagine you have a package for http requests and you want to give the user the choice to use a synchronous or an asynchronous interface. Right now the approach the community is following is creating separate libraries one for each version. This is far from ideal for several reasons, some I can think of: - Code duplication, most of the functionality is the same in both libraries, only difference is the sync/async behaviors - Some new async libraries lack functionality compared to their sync siblings. Others will introduce bugs that the sync version already solved long ago, etc. - Different interfaces for the user for the same exact functionality. In summary, in some cases it looks like reinventing the wheel. So now comes the question, is there any documentation, guide on what would be best practice supporting this kind of duality? I've been playing a bit with that on my own but I really don't know if I'm doing something stupid or not. Simple example: """ import asyncio class MyConnector: @classmethod async def get(cls, key): return key class AsyncClient: async def get(self, key): return await MyConnector.get(key) class SyncClient: def __init__(self): self.loop = asyncio.get_event_loop() def get(self, key): return self.loop.run_until_complete(MyConnector.get(key)) def sync_call(): client = SyncClient() print(client.get("sync_key")) async def async_call(): client = AsyncClient() print(await client.get("async_key")) if __name__ == "__main__": loop = asyncio.get_event_loop() loop.run_until_complete(async_call()) sync_call() """ This is in case the underlying connector is asynchronous already. If its synchronous and you want to support both modes, you have to rewrite the IO interactions of MyConnector into a new AsyncMyConnector to support asyncio and then use one or the other accordingly in the upper classes. Am I doing it right or there is another better/alternative way? Thanks for your time, Manuel -------------- next part -------------- An HTML attachment was scrubbed... URL: From luciano at ramalho.org Thu Jun 8 20:07:22 2017 From: luciano at ramalho.org (Luciano Ramalho) Date: Fri, 09 Jun 2017 00:07:22 +0000 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: Message-ID: Hello, Manuel. The answer to your problem is to refactor the libraries in the "sans I/O" style. Take a look here: http://sans-io.readthedocs.io/ On Thu, 8 Jun 2017 at 19:32 manuel miranda wrote: > Hello everyone, > > After using asyncio for a while, I'm struggling to find information about > how to support both synchronous and asynchronous use cases for the same > library. > > I.e. imagine you have a package for http requests and you want to give the > user the choice to use a synchronous or an asynchronous interface. Right > now the approach the community is following is creating separate libraries > one for each version. This is far from ideal for several reasons, some I > can think of: > > - Code duplication, most of the functionality is the same in both > libraries, only difference is the sync/async behaviors > - Some new async libraries lack functionality compared to their sync > siblings. 
Others will introduce bugs that the sync version already solved > long ago, etc. > - Different interfaces for the user for the same exact functionality. > > In summary, in some cases it looks like reinventing the wheel. So now > comes the question, is there any documentation, guide on what would be best > practice supporting this kind of duality? I've been playing a bit with that > on my own but I really don't know if I'm doing something stupid or not. > Simple example: > > """ > import asyncio > > > class MyConnector: > > @classmethod > async def get(cls, key): > return key > > > class AsyncClient: > > async def get(self, key): > return await MyConnector.get(key) > > > class SyncClient: > > def __init__(self): > self.loop = asyncio.get_event_loop() > > def get(self, key): > return self.loop.run_until_complete(MyConnector.get(key)) > > > def sync_call(): > client = SyncClient() > print(client.get("sync_key")) > > > async def async_call(): > client = AsyncClient() > print(await client.get("async_key")) > > > if __name__ == "__main__": > loop = asyncio.get_event_loop() > loop.run_until_complete(async_call()) > sync_call() > """ > > This is in case the underlying connector is asynchronous already. If its > synchronous and you want to support both modes, you have to rewrite the IO > interactions of MyConnector into a new AsyncMyConnector to support asyncio > and then use one or the other accordingly in the upper classes. > > Am I doing it right or there is another better/alternative way? > > Thanks for your time, > > Manuel > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -- Luciano Ramalho | Author of Fluent Python (O'Reilly, 2015) | http://shop.oreilly.com/product/0636920032519.do | Technical Principal at ThoughtWorks | Twitter: @ramalhoorg -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Jun 9 01:48:05 2017 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 8 Jun 2017 22:48:05 -0700 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: Message-ID: On Thu, Jun 8, 2017 at 3:32 PM, manuel miranda wrote: > Hello everyone, > > After using asyncio for a while, I'm struggling to find information about > how to support both synchronous and asynchronous use cases for the same > library. > > I.e. imagine you have a package for http requests and you want to give the > user the choice to use a synchronous or an asynchronous interface. Right now > the approach the community is following is creating separate libraries one > for each version. This is far from ideal for several reasons, some I can > think of: > > - Code duplication, most of the functionality is the same in both libraries, > only difference is the sync/async behaviors > - Some new async libraries lack functionality compared to their sync > siblings. Others will introduce bugs that the sync version already solved > long ago, etc. > - Different interfaces for the user for the same exact functionality. > > In summary, in some cases it looks like reinventing the wheel. So now comes > the question, is there any documentation, guide on what would be best > practice supporting this kind of duality? I would say that this is something that we as a community are still figuring out. 
I really like the Sans-IO approach, and it's a really valuable piece of the solution, but it doesn't solve the whole problem by itself - you still need to actually do I/O, and this means things like error handling and timeouts that aren't obviously a natural fit to the Sans-IO approach, and this means you may still have some tricky code that can end up duplicated. (Or maybe the Sans-IO approach can be extended to handle these things too?) There are active discussions happening in projects like urllib3 [1] and packaging [2] about what the best strategy to take is. And the options vary a lot depending on whether you need to support python 2 etc. If you figure out a good approach I think everyone would be interested to hear it :-) -n [1] https://github.com/shazow/urllib3/pull/1068#issuecomment-294422348 [2] Here's the same API implemented three different ways: Using deferreds: https://github.com/pypa/packaging/pull/87 "traditional" sans-IO: https://github.com/pypa/packaging/pull/88 Using the "effect" library: https://github.com/dstufft/packaging/pull/1 -- Nathaniel J. Smith -- https://vorpus.org From yarkot1 at gmail.com Fri Jun 9 02:19:51 2017 From: yarkot1 at gmail.com (Yarko Tymciurak) Date: Fri, 09 Jun 2017 06:19:51 +0000 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: Message-ID: On Fri, Jun 9, 2017 at 12:48 AM Nathaniel Smith wrote: > On Thu, Jun 8, 2017 at 3:32 PM, manuel miranda > wrote: > > Hello everyone, > > > > After using asyncio for a while, I'm struggling to find information about > > how to support both synchronous and asynchronous use cases for the same > > library. > > > > I.e. imagine you have a package for http requests and you want to give > the > > user the choice to use a synchronous or an asynchronous interface. Right > now > > the approach the community is following is creating separate libraries > one > > for each version. This is far from ideal for several reasons, some I can > > think of: > > > > - Code duplication, most of the functionality is the same in both > libraries, > > only difference is the sync/async behaviors > > - Some new async libraries lack functionality compared to their sync > > siblings. Others will introduce bugs that the sync version already solved > > long ago, etc. > > - Different interfaces for the user for the same exact functionality. > > > > In summary, in some cases it looks like reinventing the wheel. So now > comes > > the question, is there any documentation, guide on what would be best > > practice supporting this kind of duality? > > I would say that this is something that we as a community are still > figuring out. I really like the Sans-IO approach, and it's a really > valuable piece of the solution, but it doesn't solve the whole problem > by itself - you still need to actually do I/O, and this means things > like error handling and timeouts that aren't obviously a natural fit > to the Sans-IO approach, and this means you may still have some tricky > code that can end up duplicated. (Or maybe the Sans-IO approach can be > extended to handle these things too?) There are active discussions > happening in projects like urllib3 [1] and packaging [2] about what > the best strategy to take is. And the options vary a lot depending on > whether you need to support python 2 etc. 
> > If you figure out a good approach I think everyone would be interested > to hear it :-) > Just to leave this breadcrumb here - I've said this before, but not thought in depth about it a lot, but pretty sure that in something like Python4, async needs to become "first class citizen," that is from the inside out, right in the bowels of the repl loop. If async is the default, and synchronous calls just a special case (e.g. single-task async), then I'd expect two things (at least): developers would have an easier time, make fewer mistakes in async programming (the language would handle more), and libraries would be unified as async & sync would be the same. Maybe there's something that would make this not make sense, but I'd be really surprised. Larry's gil removal work intuitively seems an enabler for this kind of (potential) work... -y > -n > > [1] https://github.com/shazow/urllib3/pull/1068#issuecomment-294422348 > > [2] Here's the same API implemented three different ways: > Using deferreds: https://github.com/pypa/packaging/pull/87 > "traditional" sans-IO: https://github.com/pypa/packaging/pull/88 > Using the "effect" library: https://github.com/dstufft/packaging/pull/1 > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alex.gronholm at nextday.fi Fri Jun 9 04:05:21 2017 From: alex.gronholm at nextday.fi (=?UTF-8?Q?Alex_Gr=c3=b6nholm?=) Date: Fri, 9 Jun 2017 11:05:21 +0300 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: Message-ID: <43a11605-c6a5-c9c2-f8d0-610be4850c63@nextday.fi> Yarko Tymciurak kirjoitti 09.06.2017 klo 09:19: > > On Fri, Jun 9, 2017 at 12:48 AM Nathaniel Smith > wrote: > > On Thu, Jun 8, 2017 at 3:32 PM, manuel miranda > > wrote: > > Hello everyone, > > > > After using asyncio for a while, I'm struggling to find > information about > > how to support both synchronous and asynchronous use cases for > the same > > library. > > > > I.e. imagine you have a package for http requests and you want > to give the > > user the choice to use a synchronous or an asynchronous > interface. Right now > > the approach the community is following is creating separate > libraries one > > for each version. This is far from ideal for several reasons, > some I can > > think of: > > > > - Code duplication, most of the functionality is the same in > both libraries, > > only difference is the sync/async behaviors > > - Some new async libraries lack functionality compared to their sync > > siblings. Others will introduce bugs that the sync version > already solved > > long ago, etc. > > - Different interfaces for the user for the same exact > functionality. > > > > In summary, in some cases it looks like reinventing the wheel. > So now comes > > the question, is there any documentation, guide on what would be > best > > practice supporting this kind of duality? > > I would say that this is something that we as a community are still > figuring out. 
I really like the Sans-IO approach, and it's a really > valuable piece of the solution, but it doesn't solve the whole problem > by itself - you still need to actually do I/O, and this means things > like error handling and timeouts that aren't obviously a natural fit > to the Sans-IO approach, and this means you may still have some tricky > code that can end up duplicated. (Or maybe the Sans-IO approach can be > extended to handle these things too?) There are active discussions > happening in projects like urllib3 [1] and packaging [2] about what > the best strategy to take is. And the options vary a lot depending on > whether you need to support python 2 etc. > > If you figure out a good approach I think everyone would be interested > to hear it :-) > > > Just to leave this breadcrumb here - I've said this before, but not > thought in depth about it a lot, but pretty sure that in something > like Python4, async needs to become "first class citizen," that is > from the inside out, right in the bowels of the repl loop. > Python 4 will be nothing more than the next minor release after 3.9. Because Guido hates double digit minor versions :) > If async is the default, and synchronous calls just a special case > (e.g. single-task async), then I'd expect two things (at least): > developers would have an easier time, make fewer mistakes in async > programming (the language would handle more), and libraries would be > unified as async & sync would be the same. Are you suggesting the removal of the "await", "async with" and "async for" structures? Those were added deliberately so developers can spot the yield points in a coroutine function. Not having them would give us something like gevent where you can never tell when your task is going to be adjourned in favor of another. > > Maybe there's something that would make this not make sense, but I'd > be really surprised. Larry's gil removal work intuitively seems an > enabler for this kind of (potential) work... > > -y > > > > -n > > [1] https://github.com/shazow/urllib3/pull/1068#issuecomment-294422348 > > [2] Here's the same API implemented three different ways: > Using deferreds: https://github.com/pypa/packaging/pull/87 > "traditional" sans-IO: https://github.com/pypa/packaging/pull/88 > Using the "effect" library: > https://github.com/dstufft/packaging/pull/1 > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > > > > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... 
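To make Alex's point about explicit yield points concrete, here is a minimal, self-contained sketch (not from the original thread; the Pipe class and transfer() coroutine are made-up names) showing how async/await marks every place a task can be suspended, in contrast to gevent-style implicit switching:

import asyncio


class Pipe:
    """Toy stand-in for an async transport, for illustration only."""

    def __init__(self):
        self.buffer = b""

    async def read(self):
        await asyncio.sleep(0)   # pretend network I/O; a real transport would await a socket
        return b"payload"

    async def write(self, data):
        await asyncio.sleep(0)
        self.buffer += data


async def transfer(source, dest):
    data = await source.read()   # explicit yield point: the task may be suspended here
    length = len(data)           # ordinary call: no task switch can happen here
    await dest.write(data)       # explicit yield point again
    return length


if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    print(loop.run_until_complete(transfer(Pipe(), Pipe())))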
URL: From yarkot1 at gmail.com Fri Jun 9 04:49:00 2017 From: yarkot1 at gmail.com (Yarko Tymciurak) Date: Fri, 09 Jun 2017 08:49:00 +0000 Subject: [Async-sig] async/sync library reusage In-Reply-To: <43a11605-c6a5-c9c2-f8d0-610be4850c63@nextday.fi> References: <43a11605-c6a5-c9c2-f8d0-610be4850c63@nextday.fi> Message-ID: On Fri, Jun 9, 2017 at 3:05 AM Alex Gr?nholm wrote: > Yarko Tymciurak kirjoitti 09.06.2017 klo 09:19: > > > On Fri, Jun 9, 2017 at 12:48 AM Nathaniel Smith wrote: > >> On Thu, Jun 8, 2017 at 3:32 PM, manuel miranda >> wrote: >> > Hello everyone, >> > >> > After using asyncio for a while, I'm struggling to find information >> about >> > how to support both synchronous and asynchronous use cases for the same >> > library. >> > >> > I.e. imagine you have a package for http requests and you want to give >> the >> > user the choice to use a synchronous or an asynchronous interface. >> Right now >> > the approach the community is following is creating separate libraries >> one >> > for each version. This is far from ideal for several reasons, some I can >> > think of: >> > >> > - Code duplication, most of the functionality is the same in both >> libraries, >> > only difference is the sync/async behaviors >> > - Some new async libraries lack functionality compared to their sync >> > siblings. Others will introduce bugs that the sync version already >> solved >> > long ago, etc. >> > - Different interfaces for the user for the same exact functionality. >> > >> > In summary, in some cases it looks like reinventing the wheel. So now >> comes >> > the question, is there any documentation, guide on what would be best >> > practice supporting this kind of duality? >> >> I would say that this is something that we as a community are still >> figuring out. I really like the Sans-IO approach, and it's a really >> valuable piece of the solution, but it doesn't solve the whole problem >> by itself - you still need to actually do I/O, and this means things >> like error handling and timeouts that aren't obviously a natural fit >> to the Sans-IO approach, and this means you may still have some tricky >> code that can end up duplicated. (Or maybe the Sans-IO approach can be >> extended to handle these things too?) There are active discussions >> happening in projects like urllib3 [1] and packaging [2] about what >> the best strategy to take is. And the options vary a lot depending on >> whether you need to support python 2 etc. >> >> If you figure out a good approach I think everyone would be interested >> to hear it :-) >> > > Just to leave this breadcrumb here - I've said this before, but not > thought in depth about it a lot, but pretty sure that in something like > Python4, async needs to become "first class citizen," that is from the > inside out, right in the bowels of the repl loop. > > Python 4 will be nothing more than the next minor release after 3.9. > Because Guido hates double digit minor versions :) > > If async is the default, and synchronous calls just a special case (e.g. > single-task async), then I'd expect two things (at least): developers would > have an easier time, make fewer mistakes in async programming (the language > would handle more), and libraries would be unified as async & sync would be > the same. > > Are you suggesting the removal of the "await", "async with" and "async > for" structures? Those were added deliberately so developers can spot the > yield points in a coroutine function. 
Not having them would give us > something like gevent where you can never tell when your task is going to > be adjourned in favor of another. > actually I was bot thinking of that... but I was thinking of processing in the language, rather than a library... In any case, I don't have answers, only a vision which keeps coming up. My interest is not in providing "a solution", rather generating a reasoned discussion... > > Maybe there's something that would make this not make sense, but I'd be > really surprised. Larry's gil removal work intuitively seems an enabler > for this kind of (potential) work... > > -y > > > >> -n >> >> [1] https://github.com/shazow/urllib3/pull/1068#issuecomment-294422348 >> >> [2] Here's the same API implemented three different ways: >> Using deferreds: https://github.com/pypa/packaging/pull/87 >> "traditional" sans-IO: https://github.com/pypa/packaging/pull/88 >> Using the "effect" library: https://github.com/dstufft/packaging/pull/1 >> >> -- >> Nathaniel J. Smith -- https://vorpus.org >> _______________________________________________ >> Async-sig mailing list >> Async-sig at python.org >> https://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >> > > > _______________________________________________ > Async-sig mailing listAsync-sig at python.orghttps://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > > > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alex.gronholm at nextday.fi Fri Jun 9 04:57:46 2017 From: alex.gronholm at nextday.fi (=?UTF-8?Q?Alex_Gr=c3=b6nholm?=) Date: Fri, 9 Jun 2017 11:57:46 +0300 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <43a11605-c6a5-c9c2-f8d0-610be4850c63@nextday.fi> Message-ID: Yarko Tymciurak kirjoitti 09.06.2017 klo 11:49: > > On Fri, Jun 9, 2017 at 3:05 AM Alex Gr?nholm > wrote: > > Yarko Tymciurak kirjoitti 09.06.2017 klo 09:19: >> >> On Fri, Jun 9, 2017 at 12:48 AM Nathaniel Smith > > wrote: >> >> On Thu, Jun 8, 2017 at 3:32 PM, manuel miranda >> > wrote: >> > Hello everyone, >> > >> > After using asyncio for a while, I'm struggling to find >> information about >> > how to support both synchronous and asynchronous use cases >> for the same >> > library. >> > >> > I.e. imagine you have a package for http requests and you >> want to give the >> > user the choice to use a synchronous or an asynchronous >> interface. Right now >> > the approach the community is following is creating >> separate libraries one >> > for each version. This is far from ideal for several >> reasons, some I can >> > think of: >> > >> > - Code duplication, most of the functionality is the same >> in both libraries, >> > only difference is the sync/async behaviors >> > - Some new async libraries lack functionality compared to >> their sync >> > siblings. Others will introduce bugs that the sync version >> already solved >> > long ago, etc. >> > - Different interfaces for the user for the same exact >> functionality. >> > >> > In summary, in some cases it looks like reinventing the >> wheel. So now comes >> > the question, is there any documentation, guide on what >> would be best >> > practice supporting this kind of duality? 
>> >> I would say that this is something that we as a community are >> still >> figuring out. I really like the Sans-IO approach, and it's a >> really >> valuable piece of the solution, but it doesn't solve the >> whole problem >> by itself - you still need to actually do I/O, and this means >> things >> like error handling and timeouts that aren't obviously a >> natural fit >> to the Sans-IO approach, and this means you may still have >> some tricky >> code that can end up duplicated. (Or maybe the Sans-IO >> approach can be >> extended to handle these things too?) There are active >> discussions >> happening in projects like urllib3 [1] and packaging [2] >> about what >> the best strategy to take is. And the options vary a lot >> depending on >> whether you need to support python 2 etc. >> >> If you figure out a good approach I think everyone would be >> interested >> to hear it :-) >> >> >> Just to leave this breadcrumb here - I've said this before, but >> not thought in depth about it a lot, but pretty sure that in >> something like Python4, async needs to become "first class >> citizen," that is from the inside out, right in the bowels of the >> repl loop. >> > Python 4 will be nothing more than the next minor release after > 3.9. Because Guido hates double digit minor versions :) > >> If async is the default, and synchronous calls just a special >> case (e.g. single-task async), then I'd expect two things (at >> least): developers would have an easier time, make fewer mistakes >> in async programming (the language would handle more), and >> libraries would be unified as async & sync would be the same. > Are you suggesting the removal of the "await", "async with" and > "async for" structures? Those were added deliberately so > developers can spot the yield points in a coroutine function. Not > having them would give us something like gevent where you can > never tell when your task is going to be adjourned in favor of > another. > > > actually I was bot thinking of that... but I was thinking of > processing in the language, rather than a library... > > In any case, I don't have answers, only a vision which keeps coming > up. My interest is not in providing "a solution", rather generating a > reasoned discussion... Then explain what you mean by making async a first class citizen in Python. In my mind it already is, by courtesy of having the "async def", "await" et al added to the language syntax itself and the inclusion of the asyncio module in the standard library. The only other thing that could've been done is to tie the language syntax to a single event loop implementation but that was deliberately left out. > > > >> >> Maybe there's something that would make this not make sense, but >> I'd be really surprised. Larry's gil removal work intuitively >> seems an enabler for this kind of (potential) work... >> >> -y >> >> >> >> -n >> >> [1] >> https://github.com/shazow/urllib3/pull/1068#issuecomment-294422348 >> >> [2] Here's the same API implemented three different ways: >> Using deferreds: https://github.com/pypa/packaging/pull/87 >> "traditional" sans-IO: https://github.com/pypa/packaging/pull/88 >> Using the "effect" library: >> https://github.com/dstufft/packaging/pull/1 >> >> -- >> Nathaniel J. 
Smith -- https://vorpus.org >> _______________________________________________ >> Async-sig mailing list >> Async-sig at python.org >> https://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >> >> >> >> _______________________________________________ >> Async-sig mailing list >> Async-sig at python.org >> https://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct:https://www.python.org/psf/codeofconduct/ > > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cory at lukasa.co.uk Fri Jun 9 05:06:39 2017 From: cory at lukasa.co.uk (Cory Benfield) Date: Fri, 9 Jun 2017 10:06:39 +0100 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: Message-ID: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> > On 9 Jun 2017, at 06:48, Nathaniel Smith wrote: > > I would say that this is something that we as a community are still > figuring out. I really like the Sans-IO approach, and it's a really > valuable piece of the solution, but it doesn't solve the whole problem > by itself - you still need to actually do I/O, and this means things > like error handling and timeouts that aren't obviously a natural fit > to the Sans-IO approach, and this means you may still have some tricky > code that can end up duplicated. (Or maybe the Sans-IO approach can be > extended to handle these things too?) There are active discussions > happening in projects like urllib3 [1] and packaging [2] about what > the best strategy to take is. And the options vary a lot depending on > whether you need to support python 2 etc. Let me take a moment to elaborate on some of the thinking that has gone on for urllib3/Requests. We have an unusual set of constraints that are worth understanding, and so I?ll throw out all the ideas we had and why they were rejected (and indeed, why you may not want to reject them). 1. Implement the core library in asyncio, add a synchronous shim on top of it in terms of asyncio.run_until_complete(). This works great in many ways: you get a nice async-based library implementation, you correctly prioritise people using the async case over those using the synchronous one, and you can expect wide support and interop thanks to asyncio?s role as the common event loop implementation. However, you don?t support more novel async paradigms like those used by curio and trio. More damningly for urllib3/Requests, this also limits your supported Python versions to 3.5 and later. There are also some efficiency concerns. Finally, unless you?re willing to only support 3.7 you end up needing to pass loop arguments around which is pretty gross. 2. Have an abstract low-level I/O interface and ?bleach? it (remove the keywords async/await) on Python 2. This would require you write all your code in terms of a small number of abstract I/O operations with ?async? in front of their name, e.g. ?async def send?, ?async def recv?, and so-on. You can then implement these across multiple I/O backends, and also provide a synchronous one that still has ?async? in front of it and just doesn?t ever use the word ?await?. You can then provide a code transformation at install time on Python 2 that transforms that codebase, removing all the words ?async? and ?await? 
and leaving behind a synchronous-only codebase. The advantages here are better support for novel async paradigms (e.g. curio and trio), the ability to write more native backends for non-asyncio I/O models (e.g. Twisted/Tornado), and having a single codebase that handles sync and async. There are many myriad disadvantages. The first is the most obvious: the code your users run is not the same as the code you shipped. While the transformation is small and pretty easy to understand, that doesn?t remove its risks. It also makes debugging harder and more painful. On top of that, your Python 3 synchronous code looks pretty ugly because you have to write the word ?await? around it even though it is not in fact asynchronous (technically you *don?t* have to do that but I guarantee IDEs will get mad). More subtly, this causes problems for backpressure and task management on event loops. It turns out defining your low-level I/O primitives is not trivial. In urllib3?s case, one of the things we?d need is either the equivalent of ?async def select()? or ?async def new_task?. In the first case, to write this would require a careful management of futures/deferreds and various bits of state in order to correctly suspect execution on event loops. In the second case, the synchronous version of this is called ?threading.Thread? and that has a number of issues. I?d say that if you?re going to use threads you may as well just always use threads, but more importantly it has substantially different semantics to all async task management which make it difficult to reason about and to ensure that the code is sensible. This approach is also entirely untested, at any scale. It?s simply not clear that it works yet. All the tooling would need to be written. 3. Just use Twisted/Tornado. This variation on number (1) turns out to get you surprisingly close to our actual goal. Twisted and Tornado support Python 2 and Python 3, when async/await are present they integrate fairly nicely with them, and they give you the added advantage of allowing your Python 2 users to do asynchronous code so long as they buy into the relevant async ecosystem. It also means that you can use the run_until_complete model for your Python 2 synchronous code. However, these also have some downsides. Twisted, the library I know better, doesn?t yet integrate as cleanly with async/await as we?d like: that?s coming sometime this year, probably with the landing of 3.7. Additionally, Twisted has no equivalent of asyncio.run_until_complete(), which would mean that someone would have to add the relevant Twisted support (either restartable or instantiable reactors, neither of which Twisted has yet). This also adds a potentially sizeable external dependency, which isn?t necessarily all that fun. 4. ??? Who knows. Right now there is no clarity about what we?re going to do. It?s possible that the answer will end up being ?nothing at the moment? and that we?ll wait for the ecosystem to progress for a while before making the change. Either way, it?s clear that there is no easy answer to this problem. Cory -------------- next part -------------- An HTML attachment was scrubbed... 
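For readers who want to see what Cory's first option looks like in practice, here is a rough sketch of an asyncio core with a synchronous shim built on run_until_complete(). The AsyncHTTPClient/HTTPClient names and the fetch() method are hypothetical and only stand in for a real library's API:

import asyncio


class AsyncHTTPClient:
    """Hypothetical async core: all I/O lives in coroutines."""

    async def fetch(self, url):
        # A real implementation would open a connection and speak HTTP here;
        # sleep(0) merely stands in for awaiting network I/O.
        await asyncio.sleep(0)
        return b"response for " + url.encode()


class HTTPClient:
    """Synchronous facade with the same API, driven by run_until_complete()."""

    def __init__(self, loop=None):
        self._loop = loop or asyncio.get_event_loop()
        self._async_client = AsyncHTTPClient()

    def fetch(self, url):
        return self._loop.run_until_complete(self._async_client.fetch(url))


if __name__ == "__main__":
    # Synchronous caller goes through the shim.
    print(HTTPClient().fetch("https://example.com"))

    # Asynchronous caller uses the core directly.
    async def main():
        print(await AsyncHTTPClient().fetch("https://example.com"))

    asyncio.get_event_loop().run_until_complete(main())

As written this sketch needs Python 3.5+ and does not cover curio, trio, or the Python 2 constraint that Cory raises; it only illustrates the shape of option 1.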
URL: From yarkot1 at gmail.com Fri Jun 9 05:08:04 2017 From: yarkot1 at gmail.com (Yarko Tymciurak) Date: Fri, 09 Jun 2017 09:08:04 +0000 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <43a11605-c6a5-c9c2-f8d0-610be4850c63@nextday.fi> Message-ID: On Fri, Jun 9, 2017 at 3:57 AM Alex Gr?nholm wrote: > Yarko Tymciurak kirjoitti 09.06.2017 klo 11:49: > > > On Fri, Jun 9, 2017 at 3:05 AM Alex Gr?nholm > wrote: > >> Yarko Tymciurak kirjoitti 09.06.2017 klo 09:19: >> >> >> On Fri, Jun 9, 2017 at 12:48 AM Nathaniel Smith wrote: >> >>> On Thu, Jun 8, 2017 at 3:32 PM, manuel miranda >>> wrote: >>> > Hello everyone, >>> > >>> > After using asyncio for a while, I'm struggling to find information >>> about >>> > how to support both synchronous and asynchronous use cases for the same >>> > library. >>> > >>> > I.e. imagine you have a package for http requests and you want to give >>> the >>> > user the choice to use a synchronous or an asynchronous interface. >>> Right now >>> > the approach the community is following is creating separate libraries >>> one >>> > for each version. This is far from ideal for several reasons, some I >>> can >>> > think of: >>> > >>> > - Code duplication, most of the functionality is the same in both >>> libraries, >>> > only difference is the sync/async behaviors >>> > - Some new async libraries lack functionality compared to their sync >>> > siblings. Others will introduce bugs that the sync version already >>> solved >>> > long ago, etc. >>> > - Different interfaces for the user for the same exact functionality. >>> > >>> > In summary, in some cases it looks like reinventing the wheel. So now >>> comes >>> > the question, is there any documentation, guide on what would be best >>> > practice supporting this kind of duality? >>> >>> I would say that this is something that we as a community are still >>> figuring out. I really like the Sans-IO approach, and it's a really >>> valuable piece of the solution, but it doesn't solve the whole problem >>> by itself - you still need to actually do I/O, and this means things >>> like error handling and timeouts that aren't obviously a natural fit >>> to the Sans-IO approach, and this means you may still have some tricky >>> code that can end up duplicated. (Or maybe the Sans-IO approach can be >>> extended to handle these things too?) There are active discussions >>> happening in projects like urllib3 [1] and packaging [2] about what >>> the best strategy to take is. And the options vary a lot depending on >>> whether you need to support python 2 etc. >>> >>> If you figure out a good approach I think everyone would be interested >>> to hear it :-) >>> >> >> Just to leave this breadcrumb here - I've said this before, but not >> thought in depth about it a lot, but pretty sure that in something like >> Python4, async needs to become "first class citizen," that is from the >> inside out, right in the bowels of the repl loop. >> >> Python 4 will be nothing more than the next minor release after 3.9. >> Because Guido hates double digit minor versions :) >> >> If async is the default, and synchronous calls just a special case (e.g. >> single-task async), then I'd expect two things (at least): developers would >> have an easier time, make fewer mistakes in async programming (the language >> would handle more), and libraries would be unified as async & sync would be >> the same. >> >> Are you suggesting the removal of the "await", "async with" and "async >> for" structures? 
Those were added deliberately so developers can spot the >> yield points in a coroutine function. Not having them would give us >> something like gevent where you can never tell when your task is going to >> be adjourned in favor of another. >> > > actually I was bot thinking of that... but I was thinking of processing > in the language, rather than a library... > > In any case, I don't have answers, only a vision which keeps coming up. > My interest is not in providing "a solution", rather generating a reasoned > discussion... > > Then explain what you mean by making async a first class citizen in > Python. In my mind it already is, by courtesy of having the "async def", > "await" et al added to the language syntax itself and the inclusion of the > asyncio module in the standard library. The only other thing that could've > been done is to tie the language syntax to a single event loop > implementation but that was deliberately left out. > > i'm sorry - I thought that was clear by saying it would be in the repl loop itself and not in a library. and those it wouldn't require two versions of every library. That's what I meant. that is right now it's coming from the outside in, that is to say from applications, closer in, to an attempt at a common library. i'm suggesting it start from the inside of the language out so that all things have that support and that it is not just a library thus any code can take advantage of either single or multiple async tasks, goal being that there only need be on version of libraries. at least that's the discussion I'm calling for. does that help? > > >> >> Maybe there's something that would make this not make sense, but I'd be >> really surprised. Larry's gil removal work intuitively seems an enabler >> for this kind of (potential) work... >> >> -y >> >> >> >>> -n >>> >>> [1] https://github.com/shazow/urllib3/pull/1068#issuecomment-294422348 >>> >>> [2] Here's the same API implemented three different ways: >>> Using deferreds: https://github.com/pypa/packaging/pull/87 >>> "traditional" sans-IO: https://github.com/pypa/packaging/pull/88 >>> Using the "effect" library: https://github.com/dstufft/packaging/pull/1 >>> >>> -- >>> Nathaniel J. Smith -- https://vorpus.org >>> _______________________________________________ >>> Async-sig mailing list >>> Async-sig at python.org >>> https://mail.python.org/mailman/listinfo/async-sig >>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>> >> >> >> _______________________________________________ >> Async-sig mailing listAsync-sig at python.orghttps://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >> >> >> _______________________________________________ >> Async-sig mailing list >> Async-sig at python.org >> https://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Fri Jun 9 11:33:31 2017 From: guido at python.org (Guido van Rossum) Date: Fri, 9 Jun 2017 08:33:31 -0700 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <43a11605-c6a5-c9c2-f8d0-610be4850c63@nextday.fi> Message-ID: Yarko, I think your vision is too far out. Maybe something like that could become a reality in Python 5 -- it would require all extensions to become aware of the async stuff (adding it to Python doesn't automatically add it to C!). 
Also the GIL has nothing to do with this, async tasks all run in the same thread, and if there was no GIL it would not be any different (else two cooperating tasks could be run on different threads and you'd be back on pre-emptive scheduling and the ensuing race conditions). (Note that I refer to Python 4 as Python after the Gilectomy -- it needs to be a new major version since the C API changes dramatically as C extensions will no longer have the protection of the GIL.) --Guido On Fri, Jun 9, 2017 at 2:08 AM, Yarko Tymciurak wrote: > > On Fri, Jun 9, 2017 at 3:57 AM Alex Gr?nholm > wrote: > >> Yarko Tymciurak kirjoitti 09.06.2017 klo 11:49: >> >> >> On Fri, Jun 9, 2017 at 3:05 AM Alex Gr?nholm >> wrote: >> >>> Yarko Tymciurak kirjoitti 09.06.2017 klo 09:19: >>> >>> >>> On Fri, Jun 9, 2017 at 12:48 AM Nathaniel Smith wrote: >>> >>>> On Thu, Jun 8, 2017 at 3:32 PM, manuel miranda >>>> wrote: >>>> > Hello everyone, >>>> > >>>> > After using asyncio for a while, I'm struggling to find information >>>> about >>>> > how to support both synchronous and asynchronous use cases for the >>>> same >>>> > library. >>>> > >>>> > I.e. imagine you have a package for http requests and you want to >>>> give the >>>> > user the choice to use a synchronous or an asynchronous interface. >>>> Right now >>>> > the approach the community is following is creating separate >>>> libraries one >>>> > for each version. This is far from ideal for several reasons, some I >>>> can >>>> > think of: >>>> > >>>> > - Code duplication, most of the functionality is the same in both >>>> libraries, >>>> > only difference is the sync/async behaviors >>>> > - Some new async libraries lack functionality compared to their sync >>>> > siblings. Others will introduce bugs that the sync version already >>>> solved >>>> > long ago, etc. >>>> > - Different interfaces for the user for the same exact functionality. >>>> > >>>> > In summary, in some cases it looks like reinventing the wheel. So now >>>> comes >>>> > the question, is there any documentation, guide on what would be best >>>> > practice supporting this kind of duality? >>>> >>>> I would say that this is something that we as a community are still >>>> figuring out. I really like the Sans-IO approach, and it's a really >>>> valuable piece of the solution, but it doesn't solve the whole problem >>>> by itself - you still need to actually do I/O, and this means things >>>> like error handling and timeouts that aren't obviously a natural fit >>>> to the Sans-IO approach, and this means you may still have some tricky >>>> code that can end up duplicated. (Or maybe the Sans-IO approach can be >>>> extended to handle these things too?) There are active discussions >>>> happening in projects like urllib3 [1] and packaging [2] about what >>>> the best strategy to take is. And the options vary a lot depending on >>>> whether you need to support python 2 etc. >>>> >>>> If you figure out a good approach I think everyone would be interested >>>> to hear it :-) >>>> >>> >>> Just to leave this breadcrumb here - I've said this before, but not >>> thought in depth about it a lot, but pretty sure that in something like >>> Python4, async needs to become "first class citizen," that is from the >>> inside out, right in the bowels of the repl loop. >>> >>> Python 4 will be nothing more than the next minor release after 3.9. >>> Because Guido hates double digit minor versions :) >>> >>> If async is the default, and synchronous calls just a special case (e.g. 
>>> single-task async), then I'd expect two things (at least): developers would >>> have an easier time, make fewer mistakes in async programming (the language >>> would handle more), and libraries would be unified as async & sync would be >>> the same. >>> >>> Are you suggesting the removal of the "await", "async with" and "async >>> for" structures? Those were added deliberately so developers can spot the >>> yield points in a coroutine function. Not having them would give us >>> something like gevent where you can never tell when your task is going to >>> be adjourned in favor of another. >>> >> >> actually I was bot thinking of that... but I was thinking of processing >> in the language, rather than a library... >> >> In any case, I don't have answers, only a vision which keeps coming up. >> My interest is not in providing "a solution", rather generating a reasoned >> discussion... >> >> Then explain what you mean by making async a first class citizen in >> Python. In my mind it already is, by courtesy of having the "async def", >> "await" et al added to the language syntax itself and the inclusion of the >> asyncio module in the standard library. The only other thing that could've >> been done is to tie the language syntax to a single event loop >> implementation but that was deliberately left out. >> >> i'm sorry - I thought that was clear by saying it would be in the repl > loop itself and not in a library. > > and those it wouldn't require two versions of every library. That's what > I meant. > > that is right now it's coming from the outside in, that is to say from > applications, closer in, to an attempt at a common library. i'm > suggesting it start from the inside of the language out so that all things > have that support and that it is not just a library thus any code can take > advantage of either single or multiple async tasks, goal being that there > only need be on version of libraries. at least that's the discussion I'm > calling for. > > does that help? > > >> >> >>> >>> Maybe there's something that would make this not make sense, but I'd be >>> really surprised. Larry's gil removal work intuitively seems an enabler >>> for this kind of (potential) work... >>> >>> -y >>> >>> >>> >>>> -n >>>> >>>> [1] https://github.com/shazow/urllib3/pull/1068#issuecomment-294422348 >>>> >>>> [2] Here's the same API implemented three different ways: >>>> Using deferreds: https://github.com/pypa/packaging/pull/87 >>>> "traditional" sans-IO: https://github.com/pypa/packaging/pull/88 >>>> Using the "effect" library: https://github.com/dstufft/packaging/pull/1 >>>> >>>> -- >>>> Nathaniel J. 
Smith -- https://vorpus.org >>>> _______________________________________________ >>>> Async-sig mailing list >>>> Async-sig at python.org >>>> https://mail.python.org/mailman/listinfo/async-sig >>>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>> >>> >>> >>> _______________________________________________ >>> Async-sig mailing listAsync-sig at python.orghttps://mail.python.org/mailman/listinfo/async-sig >>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>> >>> >>> _______________________________________________ >>> Async-sig mailing list >>> Async-sig at python.org >>> https://mail.python.org/mailman/listinfo/async-sig >>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>> >> >> > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Fri Jun 9 11:40:16 2017 From: guido at python.org (Guido van Rossum) Date: Fri, 9 Jun 2017 08:40:16 -0700 Subject: [Async-sig] async/sync library reusage In-Reply-To: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: Cory, I really like your approach #1. You can make it work all the way back to Python 3.3 by using @coroutine and yield from. That's not pretty, but for libraries the goal shouldn't primarily be prettiness of the implementation -- prettiness of the API is much more important, and that's preserved by asyncio's compatibility (code you write that's compatible with Python 3.3 and the latest asyncio from PyPI should still run on Python 3.7 and provide a modern async/await-based API for applications written for 3.7). Also, I don't think the situation with explicitly passing loop= is so terrible as you seem to think. If you rely on the default event loop, you rely on there *being* a default event loop, but there will always be one unless an app goes out of its way to create an event loop and then make it not the default loop. Only the asyncio tests do that. There are a few things you can't do unless you pass an event loop (such as scheduling callbacks before the event loop is started) but other than that it's really not such a big deal as people seem to think it is. (You mostly see the pattern because asyncio itself uses that pattern, because it needs to be robust for the extreme use case where someone *does* hide the active event loop. But there will never be two active event loops.) --Guido On Fri, Jun 9, 2017 at 2:06 AM, Cory Benfield wrote: > > On 9 Jun 2017, at 06:48, Nathaniel Smith wrote: > > I would say that this is something that we as a community are still > figuring out. I really like the Sans-IO approach, and it's a really > valuable piece of the solution, but it doesn't solve the whole problem > by itself - you still need to actually do I/O, and this means things > like error handling and timeouts that aren't obviously a natural fit > to the Sans-IO approach, and this means you may still have some tricky > code that can end up duplicated. (Or maybe the Sans-IO approach can be > extended to handle these things too?) There are active discussions > happening in projects like urllib3 [1] and packaging [2] about what > the best strategy to take is. 
And the options vary a lot depending on > whether you need to support python 2 etc. > > > Let me take a moment to elaborate on some of the thinking that has gone on > for urllib3/Requests. We have an unusual set of constraints that are worth > understanding, and so I?ll throw out all the ideas we had and why they were > rejected (and indeed, why you may not want to reject them). > > 1. Implement the core library in asyncio, add a synchronous shim on top of > it in terms of asyncio.run_until_complete(). > > This works great in many ways: you get a nice async-based library > implementation, you correctly prioritise people using the async case over > those using the synchronous one, and you can expect wide support and > interop thanks to asyncio?s role as the common event loop implementation. > However, you don?t support more novel async paradigms like those used by > curio and trio. > > More damningly for urllib3/Requests, this also limits your supported > Python versions to 3.5 and later. There are also some efficiency concerns. > Finally, unless you?re willing to only support 3.7 you end up needing to > pass loop arguments around which is pretty gross. > > 2. Have an abstract low-level I/O interface and ?bleach? it (remove the > keywords async/await) on Python 2. > > This would require you write all your code in terms of a small number of > abstract I/O operations with ?async? in front of their name, e.g. ?async > def send?, ?async def recv?, and so-on. You can then implement these across > multiple I/O backends, and also provide a synchronous one that still has > ?async? in front of it and just doesn?t ever use the word ?await?. You can > then provide a code transformation at install time on Python 2 that > transforms that codebase, removing all the words ?async? and ?await? and > leaving behind a synchronous-only codebase. > > The advantages here are better support for novel async paradigms (e.g. > curio and trio), the ability to write more native backends for non-asyncio > I/O models (e.g. Twisted/Tornado), and having a single codebase that > handles sync and async. > > There are many myriad disadvantages. The first is the most obvious: the > code your users run is not the same as the code you shipped. While the > transformation is small and pretty easy to understand, that doesn?t remove > its risks. It also makes debugging harder and more painful. On top of that, > your Python 3 synchronous code looks pretty ugly because you have to write > the word ?await? around it even though it is not in fact asynchronous > (technically you *don?t* have to do that but I guarantee IDEs will get mad). > > More subtly, this causes problems for backpressure and task management on > event loops. It turns out defining your low-level I/O primitives is not > trivial. In urllib3?s case, one of the things we?d need is either the > equivalent of ?async def select()? or ?async def new_task?. In the first > case, to write this would require a careful management of futures/deferreds > and various bits of state in order to correctly suspect execution on event > loops. In the second case, the synchronous version of this is called > ?threading.Thread? and that has a number of issues. I?d say that if you?re > going to use threads you may as well just always use threads, but more > importantly it has substantially different semantics to all async task > management which make it difficult to reason about and to ensure that the > code is sensible. > > This approach is also entirely untested, at any scale. 
It?s simply not > clear that it works yet. All the tooling would need to be written. > > 3. Just use Twisted/Tornado. > > This variation on number (1) turns out to get you surprisingly close to > our actual goal. Twisted and Tornado support Python 2 and Python 3, when > async/await are present they integrate fairly nicely with them, and they > give you the added advantage of allowing your Python 2 users to do > asynchronous code so long as they buy into the relevant async ecosystem. It > also means that you can use the run_until_complete model for your Python 2 > synchronous code. > > However, these also have some downsides. Twisted, the library I know > better, doesn?t yet integrate as cleanly with async/await as we?d like: > that?s coming sometime this year, probably with the landing of 3.7. > Additionally, Twisted has no equivalent of asyncio.run_until_complete(), > which would mean that someone would have to add the relevant Twisted > support (either restartable or instantiable reactors, neither of which > Twisted has yet). > > This also adds a potentially sizeable external dependency, which isn?t > necessarily all that fun. > > 4. ??? Who knows. > > Right now there is no clarity about what we?re going to do. It?s possible > that the answer will end up being ?nothing at the moment? and that we?ll > wait for the ecosystem to progress for a while before making the change. > Either way, it?s clear that there is no easy answer to this problem. > > Cory > > > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From cory at lukasa.co.uk Fri Jun 9 11:51:16 2017 From: cory at lukasa.co.uk (Cory Benfield) Date: Fri, 9 Jun 2017 16:51:16 +0100 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: > On 9 Jun 2017, at 16:40, Guido van Rossum wrote: > > Also, I don't think the situation with explicitly passing loop= is so terrible as you seem to think. If you rely on the default event loop, you rely on there *being* a default event loop, but there will always be one unless an app goes out of its way to create an event loop and then make it not the default loop. Only the asyncio tests do that. There are a few things you can't do unless you pass an event loop (such as scheduling callbacks before the event loop is started) but other than that it's really not such a big deal as people seem to think it is. (You mostly see the pattern because asyncio itself uses that pattern, because it needs to be robust for the extreme use case where someone *does* hide the active event loop. But there will never be two active event loops.) My concern with multiple loops boils down to the fact that urllib3 supports being used in a multithreaded context where each thread can independently make forward progress on one request. To establish that with a synchronous codebase you either need one event loop per thread or you need to spawn a background thread on startup that owns the only event loop in the process. Generally speaking I?ve not had positive results with libraries spawning their own threads in Python. In my experience this has tended to lead to programs that deadlock mysteriously or that fail to terminate in the face of a Ctrl+C. 
So I tend to prefer to have users spawn their own threads, which would make me want a "one-event-loop-per-thread" model: hence, needing a loop parameter to pass around prior to 3.6. I admit that my concerns here regarding libraries spawning their own threads may be overblown: after my series of negative experiences I basically never went back to that model, and it may be that the problems were more user-error than anything else. However, I feel comfortable saying that libraries spawning their own Python threads is definitely subtle and hard to get right, at the very least. Cory From ben at bendarnell.com Fri Jun 9 12:07:51 2017 From: ben at bendarnell.com (Ben Darnell) Date: Fri, 09 Jun 2017 16:07:51 +0000 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: On Fri, Jun 9, 2017 at 11:51 AM Cory Benfield wrote: > > > My concern with multiple loops boils down to the fact that urllib3 > supports being used in a multithreaded context where each thread can > independently make forward progress on one request. To establish that with > a synchronous codebase you either need one event loop per thread or you > need to spawn a background thread on startup that owns the only event loop > in the process. > Yeah, one event loop per thread is probably the way to go for integration with synchronous codebases. A dedicated event loop thread may perform better but libraries that spawn threads are problematic. > > Generally speaking I've not had positive results with libraries spawning > their own threads in Python. In my experience this has tended to lead to > programs that deadlock mysteriously or that fail to terminate in the face > of a Ctrl+C. So I tend to prefer to have users spawn their own threads, > which would make me want a "one-event-loop-per-thread" model: hence, > needing a loop parameter to pass around prior to 3.6. > You can avoid the loop parameter on older versions of asyncio (at least as long as the default event loop policy is used) by manually setting your event loop as current before calling run_until_complete (and resetting it afterwards). Tornado's run_sync() method is equivalent to asyncio's run_until_complete(), and Tornado supports multiple IOLoops in this way. We use this to expose a synchronous version of our AsyncHTTPClient: https://github.com/tornadoweb/tornado/blob/62e47215ce12aee83f951758c96775a43e80475b/tornado/httpclient.py#L54 -Ben > > I admit that my concerns here regarding libraries spawning their own > threads may be overblown: after my series of negative experiences I > basically never went back to that model, and it may be that the problems > were more user-error than anything else. However, I feel comfortable saying > that libraries spawning their own Python threads is definitely subtle and > hard to get right, at the very least. > > Cory > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From guido at python.org Fri Jun 9 12:28:57 2017 From: guido at python.org (Guido van Rossum) Date: Fri, 9 Jun 2017 09:28:57 -0700 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: On Fri, Jun 9, 2017 at 8:51 AM, Cory Benfield wrote: > > On 9 Jun 2017, at 16:40, Guido van Rossum wrote: > > Also, I don't think the situation with explicitly passing loop= is so > terrible as you seem to think. If you rely on the default event loop, you > rely on there *being* a default event loop, but there will always be one > unless an app goes out of its way to create an event loop and then make it > not the default loop. Only the asyncio tests do that. There are a few > things you can't do unless you pass an event loop (such as scheduling > callbacks before the event loop is started) but other than that it's really > not such a big deal as people seem to think it is. (You mostly see the > pattern because asyncio itself uses that pattern, because it needs to be > robust for the extreme use case where someone *does* hide the active event > loop. But there will never be two active event loops.) > > > My concern with multiple loops boils down to the fact that urllib3 > supports being used in a multithreaded context where each thread can > independently make forward progress on one request. To establish that with > a synchronous codebase you either need one event loop per thread or you > need to spawn a background thread on startup that owns the only event loop > in the process. > > Generally speaking I've not had positive results with libraries spawning > their own threads in Python. In my experience this has tended to lead to > programs that deadlock mysteriously or that fail to terminate in the face > of a Ctrl+C. So I tend to prefer to have users spawn their own threads, > which would make me want a "one-event-loop-per-thread" model: hence, > needing a loop parameter to pass around prior to 3.6. > > I admit that my concerns here regarding libraries spawning their own > threads may be overblown: after my series of negative experiences I > basically never went back to that model, and it may be that the problems > were more user-error than anything else. However, I feel comfortable saying > that libraries spawning their own Python threads is definitely subtle and > hard to get right, at the very least. At least one of us is still confused. The one-event-loop-per-thread model is supported in asyncio without passing the loop around explicitly. The get_event_loop() implementation stores all its state in a thread-locals instance, so it returns the thread's event loop. (Because this is an "advanced" model, you have to explicitly create the event loop with new_event_loop() and make it the default loop for the thread with set_event_loop().) All in all, I'm a bit curious why you would need to use asyncio at all when you've got a thread per request anyway. I agree there are problems with threads that are hidden from an app. Hence asyncio allows you to set the executor where it runs things you pass to run_in_executor() (including some of its own, esp. getaddrinfo()). One note about the one-event-loop-per-thread model: threads should be very cautious touching each other's event loops. This should only be done using call_soon_threadsafe()! --Guido -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed...
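For reference, a minimal sketch of the pattern Ben and Guido describe above: one private loop per thread, installed as the thread's current loop only while a call runs. The run_sync() helper and the thread-local attribute are illustrative names, not an existing asyncio or urllib3 API:

"""
import asyncio
import threading

_local = threading.local()


def run_sync(coro):
    # Lazily create one event loop per calling thread.
    loop = getattr(_local, "loop", None)
    if loop is None:
        loop = _local.loop = asyncio.new_event_loop()
    # Make it the thread's default loop so code that calls
    # asyncio.get_event_loop() internally can find it...
    asyncio.set_event_loop(loop)
    try:
        return loop.run_until_complete(coro)
    finally:
        # ...and reset afterwards so no stale "current" loop is left behind.
        asyncio.set_event_loop(None)
"""

A synchronous facade can then wrap its internal coroutines in run_sync(...) from whatever thread the user happens to be on, without threading a loop= parameter through the whole stack.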
URL: From cory at lukasa.co.uk Fri Jun 9 12:55:35 2017 From: cory at lukasa.co.uk (Cory Benfield) Date: Fri, 9 Jun 2017 17:55:35 +0100 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: > On 9 Jun 2017, at 17:28, Guido van Rossum wrote: > > At least one of us is still confused. The one-event-loop-per-thread model is supported in asyncio without passing the loop around explicitly. The get_event_loop() implementation stores all its state in thread-locals instance, so it returns the thread's event loop. (Because this is an "advanced" model, you have to explicitly create the event loop with new_event_loop() and make it the default loop for the thread with set_event_loop().) Aha, ok, so the confused one is me. I did not know this. =) That definitely works a lot better. It admittedly works less well if someone is doing their own custom event loop stuff, but that?s probably an acceptable limitation up until the time that Python 2 goes quietly into the night. > All in all, I'm a bit curious why you would need to use asyncio at all when you've got a thread per request anyway. Yeah, so this is a bit of a diversion from the original topic of this thread but I think it?s an idea worth discussing in this space. I want to reframe the question a bit if you don?t mind, so shout if you think I?m not responding to quite what you were asking. In my understanding, the question you?re implicitly asking is this: "If you have a thread-safe library today (that is, one that allows users to do threaded I/O with appropriate resource pooling and management), why move to a model built on asyncio?? There are many answers to this question that differ for different libraries with different uses, but for HTTP libraries like urllib3 here are our reasons. The first is that it turns out that even for HTTP/1.1 you need to write something that amounts to a partial event loop to properly handle the protocol. Good HTTP clients need to watch for responses while they?re uploading body data because if a response arrives during that process body upload should be terminated immediately. This is also required for sensibly handling things like Expect: 100-continue, as well as spotting other intermediate responses and connection teardowns sensibly and without throwing exceptions. Today urllib3 does not do this, and it has caused us pain, so our v2 branch includes a backport of the Python 3 selectors module and a hand-written partially-complete event loop that only handles the specific cases we need. This is an extra thing for us to debug and maintain, and ultimately it?d be easier to just delegate the whole thing to event loops written by others who promise to maintain them and make them efficient. The second answer is that I believe good asyncio support in libraries is a vital part of the future of this language, and ?good? asyncio support IMO does as little as possible to block the main event loop. Running all of the complex protocol parsing and state manipulation of the Requests stack on a background thread is not cheap, and involves a lot of GIL swapping around. We have found several bug reports complaining about using Requests with largish-numbers of threads, indicating that our big stack of Python code really does cause contention on the GIL if used heavily. In general, having to defer to a thread to run *Python* code in asyncio is IMO a nasty anti-pattern that should be avoided where possible. 
It is much less bad to defer to a thread to then block on a syscall (e.g. to get an ?async? getaddrinfo), but doing so to run a big big stack of Python code is vastly less pleasant for the main event loop. For this reason, we?d ideally treat asyncio as the first-class citizen and retrofit on the threaded support, rather than the other way around. This goes doubly so when you consider the other reasons for wanting to use asyncio. The third answer is that HTTP/2 makes all of this much harder. HTTP/2 is a *highly* concurrent protocol. Connections send a lot of control frames back and forth that are invisible to the user working at the semantic HTTP level but that nonetheless need relatively low-latency turnaround (e.g. PING frames). It turns out that in the traditional synchronous HTTP model urllib3 only gets access to the socket to do work when the user calls into our code. If the user goes a ?long? time without calling into urllib3, we take a long time to process any data off the connection. In the best case this causes latency spikes as we process all the data that queued up in the socket. In the worst case, this causes us to lose connections we should have been able to keep because we failed to respond to a PING frame in a timely manner. My experience is that purely synchronous libraries handling HTTP/2 simply cannot provide a positive user experience. HTTP/2 flat-out *requires* either an event loop or a dedicated background thread, and in practice in your dedicated background thread you?d also just end up writing an event loop (see answer 1 again). For this reason, it is basically mandatory for HTTP/2 support in Python to either use an event loop or to spawn out a dedicated C thread that does not hold the GIL to do the I/O (as this thread will be regularly woken up to handle I/O events). Hopefully this (admittedly horrifyingly long) response helps illuminate why we?re interested in asyncio support. It should be noted that if we find ourselves unable to get it in the short term we may simply resort to offering an ?async? API that involves us doing the rough equivalent of running in a thread-pool executor, but I won?t be thrilled about it. ;) Cory -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Fri Jun 9 14:23:53 2017 From: guido at python.org (Guido van Rossum) Date: Fri, 9 Jun 2017 11:23:53 -0700 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: Great write-up! I actually find the async nature of HTTP (both versions) a compelling reason to switch to asyncio. For HTTP/1.1 this sounds mostly like it would make the implementation easier; for HTTP/2 it sounds like it would just be better for the user-side as well (if the user just wants one resource they can safely continue to use the synchronous HTTP/1.1 version of the API.) On Fri, Jun 9, 2017 at 9:55 AM, Cory Benfield wrote: > > On 9 Jun 2017, at 17:28, Guido van Rossum wrote: > > At least one of us is still confused. The one-event-loop-per-thread model > is supported in asyncio without passing the loop around explicitly. The > get_event_loop() implementation stores all its state in thread-locals > instance, so it returns the thread's event loop. (Because this is an > "advanced" model, you have to explicitly create the event loop with > new_event_loop() and make it the default loop for the thread with > set_event_loop().) > > > Aha, ok, so the confused one is me. I did not know this. 
=) That > definitely works a lot better. It admittedly works less well if someone is > doing their own custom event loop stuff, but that?s probably an acceptable > limitation up until the time that Python 2 goes quietly into the night. > > All in all, I'm a bit curious why you would need to use asyncio at all > when you've got a thread per request anyway. > > > Yeah, so this is a bit of a diversion from the original topic of this > thread but I think it?s an idea worth discussing in this space. I want to > reframe the question a bit if you don?t mind, so shout if you think I?m not > responding to quite what you were asking. In my understanding, the question > you?re implicitly asking is this: > > "If you have a thread-safe library today (that is, one that allows users > to do threaded I/O with appropriate resource pooling and management), why > move to a model built on asyncio?? > > There are many answers to this question that differ for different > libraries with different uses, but for HTTP libraries like urllib3 here are > our reasons. > > The first is that it turns out that even for HTTP/1.1 you need to write > something that amounts to a partial event loop to properly handle the > protocol. Good HTTP clients need to watch for responses while they?re > uploading body data because if a response arrives during that process body > upload should be terminated immediately. This is also required for sensibly > handling things like Expect: 100-continue, as well as spotting other > intermediate responses and connection teardowns sensibly and without > throwing exceptions. > > Today urllib3 does not do this, and it has caused us pain, so our v2 > branch includes a backport of the Python 3 selectors module and a > hand-written partially-complete event loop that only handles the specific > cases we need. This is an extra thing for us to debug and maintain, and > ultimately it?d be easier to just delegate the whole thing to event loops > written by others who promise to maintain them and make them efficient. > > The second answer is that I believe good asyncio support in libraries is a > vital part of the future of this language, and ?good? asyncio support IMO > does as little as possible to block the main event loop. Running all of the > complex protocol parsing and state manipulation of the Requests stack on a > background thread is not cheap, and involves a lot of GIL swapping around. > We have found several bug reports complaining about using Requests with > largish-numbers of threads, indicating that our big stack of Python code > really does cause contention on the GIL if used heavily. In general, having > to defer to a thread to run *Python* code in asyncio is IMO a nasty > anti-pattern that should be avoided where possible. It is much less bad to > defer to a thread to then block on a syscall (e.g. to get an ?async? > getaddrinfo), but doing so to run a big big stack of Python code is vastly > less pleasant for the main event loop. > > For this reason, we?d ideally treat asyncio as the first-class citizen and > retrofit on the threaded support, rather than the other way around. This > goes doubly so when you consider the other reasons for wanting to use > asyncio. > > The third answer is that HTTP/2 makes all of this much harder. HTTP/2 is a > *highly* concurrent protocol. Connections send a lot of control frames back > and forth that are invisible to the user working at the semantic HTTP level > but that nonetheless need relatively low-latency turnaround (e.g. PING > frames). 
It turns out that in the traditional synchronous HTTP model > urllib3 only gets access to the socket to do work when the user calls into > our code. If the user goes a ?long? time without calling into urllib3, we > take a long time to process any data off the connection. In the best case > this causes latency spikes as we process all the data that queued up in the > socket. In the worst case, this causes us to lose connections we should > have been able to keep because we failed to respond to a PING frame in a > timely manner. > > My experience is that purely synchronous libraries handling HTTP/2 simply > cannot provide a positive user experience. HTTP/2 flat-out *requires* > either an event loop or a dedicated background thread, and in practice in > your dedicated background thread you?d also just end up writing an event > loop (see answer 1 again). For this reason, it is basically mandatory for > HTTP/2 support in Python to either use an event loop or to spawn out a > dedicated C thread that does not hold the GIL to do the I/O (as this thread > will be regularly woken up to handle I/O events). > > Hopefully this (admittedly horrifyingly long) response helps illuminate > why we?re interested in asyncio support. It should be noted that if we find > ourselves unable to get it in the short term we may simply resort to > offering an ?async? API that involves us doing the rough equivalent of > running in a thread-pool executor, but I won?t be thrilled about it. ;) > > Cory > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From yarkot1 at gmail.com Fri Jun 9 15:52:36 2017 From: yarkot1 at gmail.com (Yarko Tymciurak) Date: Fri, 9 Jun 2017 14:52:36 -0500 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: ...so I really am enjoying the conversation. Guido - re: "vision too far out": yes, for people trying to struggle w/ async support in their libraries, now... but that is also part of my motivation. Python 5? Sure... (I may have to watch it come to use from the grave, but hopefully not... ;-) ). Anyway, from back-porting and tactical "implement now" concerns, to plans for next release, to plans for next version of python, to brainstorming much less concrete future versions - all are an interesting continuum. Re: GIL... sure, sort of, and sort of not. I was thinking "as long as major changes are going on... think about additional structural changes..." More to the point: as I see it, people have a hard time thinking about async in the cooperative-multitasking (CMT) sense, and thus disappointments happen around blocking (missed, or unexpects, e.g. hardware failures). Cory (in his reply - and, yeah: nice writeup!) hints to what I generally structurally like: "...we?d ideally treat asyncio as the first-class citizen and retrofit on the threaded support, rather than the other way around" Structurally, async is light-weight overhead compared to threads, which are lightweight compared to processes, and so a sort of natural app flow seems from lightest-weight, on out. To me, this seems practical for making life easier for developers, because you can imagine "promoting" an async task caught unexpectedly blocking, to a thread, while still having the lightest-weight loop have control over it (promotion out, as well as cancellation while promoted). 
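The closest existing mechanism to that kind of "promotion" is the run_in_executor() hook Guido mentioned earlier: the loop hands the blocking piece to a worker thread while keeping ownership of the awaiting task. A rough, purely illustrative sketch:

"""
import asyncio
import socket


async def resolve(host, port, loop=None):
    loop = loop or asyncio.get_event_loop()
    # The blocking syscall runs in the loop's default thread pool; the
    # awaiting task can still be cancelled from the loop's side, although
    # the worker thread itself will run to completion in the background.
    return await loop.run_in_executor(None, socket.getaddrinfo, host, port)
"""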
As for multiple task loops, or loops off in a thread, I haven't thought about it too much, but this seems like nothing new nor unreasonable. I'm thinking of the base-stations we talk over in our mobile connections, which are multiple diskless servers, and hot-promote to "master" server status on hardware failure (or live capacity upgrade, i.e. inserting processors). This pattern seems both reasonable and useful in this context, i.e. the concept of a master loop (which implies communication/control channels - a complication). With some thought, some reasonable ground rules and simplifications, and I would expect much can be done. Appreciate the discussions! - Yarko On Fri, Jun 9, 2017 at 1:23 PM, Guido van Rossum wrote: > Great write-up! I actually find the async nature of HTTP (both versions) a > compelling reason to switch to asyncio. For HTTP/1.1 this sounds mostly > like it would make the implementation easier; for HTTP/2 it sounds like it > would just be better for the user-side as well (if the user just wants one > resource they can safely continue to use the synchronous HTTP/1.1 version > of the API.) > > On Fri, Jun 9, 2017 at 9:55 AM, Cory Benfield wrote: > >> >> On 9 Jun 2017, at 17:28, Guido van Rossum wrote: >> >> At least one of us is still confused. The one-event-loop-per-thread model >> is supported in asyncio without passing the loop around explicitly. The >> get_event_loop() implementation stores all its state in thread-locals >> instance, so it returns the thread's event loop. (Because this is an >> "advanced" model, you have to explicitly create the event loop with >> new_event_loop() and make it the default loop for the thread with >> set_event_loop().) >> >> >> Aha, ok, so the confused one is me. I did not know this. =) That >> definitely works a lot better. It admittedly works less well if someone is >> doing their own custom event loop stuff, but that?s probably an acceptable >> limitation up until the time that Python 2 goes quietly into the night. >> >> All in all, I'm a bit curious why you would need to use asyncio at all >> when you've got a thread per request anyway. >> >> >> Yeah, so this is a bit of a diversion from the original topic of this >> thread but I think it?s an idea worth discussing in this space. I want to >> reframe the question a bit if you don?t mind, so shout if you think I?m not >> responding to quite what you were asking. In my understanding, the question >> you?re implicitly asking is this: >> >> "If you have a thread-safe library today (that is, one that allows users >> to do threaded I/O with appropriate resource pooling and management), why >> move to a model built on asyncio?? >> >> There are many answers to this question that differ for different >> libraries with different uses, but for HTTP libraries like urllib3 here are >> our reasons. >> >> The first is that it turns out that even for HTTP/1.1 you need to write >> something that amounts to a partial event loop to properly handle the >> protocol. Good HTTP clients need to watch for responses while they?re >> uploading body data because if a response arrives during that process body >> upload should be terminated immediately. This is also required for sensibly >> handling things like Expect: 100-continue, as well as spotting other >> intermediate responses and connection teardowns sensibly and without >> throwing exceptions. 
>> >> Today urllib3 does not do this, and it has caused us pain, so our v2 >> branch includes a backport of the Python 3 selectors module and a >> hand-written partially-complete event loop that only handles the specific >> cases we need. This is an extra thing for us to debug and maintain, and >> ultimately it?d be easier to just delegate the whole thing to event loops >> written by others who promise to maintain them and make them efficient. >> >> The second answer is that I believe good asyncio support in libraries is >> a vital part of the future of this language, and ?good? asyncio support IMO >> does as little as possible to block the main event loop. Running all of the >> complex protocol parsing and state manipulation of the Requests stack on a >> background thread is not cheap, and involves a lot of GIL swapping around. >> We have found several bug reports complaining about using Requests with >> largish-numbers of threads, indicating that our big stack of Python code >> really does cause contention on the GIL if used heavily. In general, having >> to defer to a thread to run *Python* code in asyncio is IMO a nasty >> anti-pattern that should be avoided where possible. It is much less bad to >> defer to a thread to then block on a syscall (e.g. to get an ?async? >> getaddrinfo), but doing so to run a big big stack of Python code is vastly >> less pleasant for the main event loop. >> >> For this reason, we?d ideally treat asyncio as the first-class citizen >> and retrofit on the threaded support, rather than the other way around. >> This goes doubly so when you consider the other reasons for wanting to use >> asyncio. >> >> The third answer is that HTTP/2 makes all of this much harder. HTTP/2 is >> a *highly* concurrent protocol. Connections send a lot of control frames >> back and forth that are invisible to the user working at the semantic HTTP >> level but that nonetheless need relatively low-latency turnaround (e.g. >> PING frames). It turns out that in the traditional synchronous HTTP model >> urllib3 only gets access to the socket to do work when the user calls into >> our code. If the user goes a ?long? time without calling into urllib3, we >> take a long time to process any data off the connection. In the best case >> this causes latency spikes as we process all the data that queued up in the >> socket. In the worst case, this causes us to lose connections we should >> have been able to keep because we failed to respond to a PING frame in a >> timely manner. >> >> My experience is that purely synchronous libraries handling HTTP/2 simply >> cannot provide a positive user experience. HTTP/2 flat-out *requires* >> either an event loop or a dedicated background thread, and in practice in >> your dedicated background thread you?d also just end up writing an event >> loop (see answer 1 again). For this reason, it is basically mandatory for >> HTTP/2 support in Python to either use an event loop or to spawn out a >> dedicated C thread that does not hold the GIL to do the I/O (as this thread >> will be regularly woken up to handle I/O events). >> >> Hopefully this (admittedly horrifyingly long) response helps illuminate >> why we?re interested in asyncio support. It should be noted that if we find >> ourselves unable to get it in the short term we may simply resort to >> offering an ?async? API that involves us doing the rough equivalent of >> running in a thread-pool executor, but I won?t be thrilled about it. 
;) >> >> Cory >> > > > > -- > --Guido van Rossum (python.org/~guido) > > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pfreixes at gmail.com Mon Jun 12 07:39:48 2017 From: pfreixes at gmail.com (Pau Freixes) Date: Mon, 12 Jun 2017 13:39:48 +0200 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: Sorry a bit of topic, but I would like to figure out why older python versions, prior this commit [1], the get_event_loop is not considered deterministic does anybody know the reason behind this change? [1] https://github.com/python/cpython/commit/600a349781bfa0a8239e1cb95fac29c7c4a3302e On Fri, Jun 9, 2017 at 6:07 PM, Ben Darnell wrote: > On Fri, Jun 9, 2017 at 11:51 AM Cory Benfield wrote: >> >> >> >> My concern with multiple loops boils down to the fact that urllib3 >> supports being used in a multithreaded context where each thread can >> independently make forward progress on one request. To establish that with a >> synchronous codebase you either need one event loop per thread or you need >> to spawn a background thread on startup that owns the only event loop in the >> process. > > > Yeah, one event loop per thread is probably the way to go for integration > with synchronous codebases. A dedicated event loop thread may perform better > but libraries that spawn threads are problematic. > >> >> >> Generally speaking I?ve not had positive results with libraries spawning >> their own threads in Python. In my experience this has tended to lead to >> programs that deadlock mysteriously or that fail to terminate in the face of >> a Ctrl+C. So I tend to prefer to have users spawn their own threads, which >> would make me want a ?one-event-loop-per-thread? model: hence, needing a >> loop parameter to pass around prior to 3.6. > > > You can avoid the loop parameter on older versions of asyncio (at least as > long as the default event loop policy is used) by manually setting your > event loop as current before calling run_until_complete (and resetting it > afterwards). > > Tornado's run_sync() method is equivalent to asyncio's run_until_complete(), > and Tornado supports multiple IOLoops in this way. We use this to expose a > synchronous version of our AsyncHTTPClient: > https://github.com/tornadoweb/tornado/blob/62e47215ce12aee83f951758c96775a43e80475b/tornado/httpclient.py#L54 > > -Ben > >> >> >> I admit that my concerns here regarding libraries spawning their own >> threads may be overblown: after my series of negative experiences I >> basically never went back to that model, and it may be that the problems >> were more user-error than anything else. However, I feel comfortable saying >> that libraries spawning their own Python threads is definitely subtle and >> hard to get right, at the very least. 
>> >> Cory >> _______________________________________________ >> Async-sig mailing list >> Async-sig at python.org >> https://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct: https://www.python.org/psf/codeofconduct/ > > > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -- --pau From guido at python.org Mon Jun 12 11:36:01 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 12 Jun 2017 08:36:01 -0700 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: In theory it's possible to create two event loops (using new_event_loop()), then set one as the default event loop (using set_event_loop()), then run the other one (using run_forever() or run_until_complete()). To tasks running in the latter event loop, get_event_loop() would nevertheless return the former. On Mon, Jun 12, 2017 at 4:39 AM, Pau Freixes wrote: > Sorry a bit of topic, but I would like to figure out why older python > versions, prior this commit [1], the get_event_loop is not considered > deterministic > > does anybody know the reason behind this change? > > > [1] https://github.com/python/cpython/commit/ > 600a349781bfa0a8239e1cb95fac29c7c4a3302e > > On Fri, Jun 9, 2017 at 6:07 PM, Ben Darnell wrote: > > On Fri, Jun 9, 2017 at 11:51 AM Cory Benfield wrote: > >> > >> > >> > >> My concern with multiple loops boils down to the fact that urllib3 > >> supports being used in a multithreaded context where each thread can > >> independently make forward progress on one request. To establish that > with a > >> synchronous codebase you either need one event loop per thread or you > need > >> to spawn a background thread on startup that owns the only event loop > in the > >> process. > > > > > > Yeah, one event loop per thread is probably the way to go for integration > > with synchronous codebases. A dedicated event loop thread may perform > better > > but libraries that spawn threads are problematic. > > > >> > >> > >> Generally speaking I?ve not had positive results with libraries spawning > >> their own threads in Python. In my experience this has tended to lead to > >> programs that deadlock mysteriously or that fail to terminate in the > face of > >> a Ctrl+C. So I tend to prefer to have users spawn their own threads, > which > >> would make me want a ?one-event-loop-per-thread? model: hence, needing a > >> loop parameter to pass around prior to 3.6. > > > > > > You can avoid the loop parameter on older versions of asyncio (at least > as > > long as the default event loop policy is used) by manually setting your > > event loop as current before calling run_until_complete (and resetting it > > afterwards). > > > > Tornado's run_sync() method is equivalent to asyncio's > run_until_complete(), > > and Tornado supports multiple IOLoops in this way. We use this to expose > a > > synchronous version of our AsyncHTTPClient: > > https://github.com/tornadoweb/tornado/blob/ > 62e47215ce12aee83f951758c96775a43e80475b/tornado/httpclient.py#L54 > > > > -Ben > > > >> > >> > >> I admit that my concerns here regarding libraries spawning their own > >> threads may be overblown: after my series of negative experiences I > >> basically never went back to that model, and it may be that the problems > >> were more user-error than anything else. 
However, I feel comfortable > saying > >> that libraries spawning their own Python threads is definitely subtle > and > >> hard to get right, at the very least. > >> > >> Cory > >> _______________________________________________ > >> Async-sig mailing list > >> Async-sig at python.org > >> https://mail.python.org/mailman/listinfo/async-sig > >> Code of Conduct: https://www.python.org/psf/codeofconduct/ > > > > > > _______________________________________________ > > Async-sig mailing list > > Async-sig at python.org > > https://mail.python.org/mailman/listinfo/async-sig > > Code of Conduct: https://www.python.org/psf/codeofconduct/ > > > > > > -- > --pau > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From pfreixes at gmail.com Mon Jun 12 11:49:41 2017 From: pfreixes at gmail.com (Pau Freixes) Date: Mon, 12 Jun 2017 17:49:41 +0200 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: And what about the rationale of having multiple loop instances in the same thread switching btw them. Im still trying to find out what patterns need this... Do you have an example? Btw thanks for the first explanation El 12/06/2017 17:36, "Guido van Rossum" escribi?: > In theory it's possible to create two event loops (using > new_event_loop()), then set one as the default event loop (using > set_event_loop()), then run the other one (using run_forever() or > run_until_complete()). To tasks running in the latter event loop, > get_event_loop() would nevertheless return the former. > > On Mon, Jun 12, 2017 at 4:39 AM, Pau Freixes wrote: > >> Sorry a bit of topic, but I would like to figure out why older python >> versions, prior this commit [1], the get_event_loop is not considered >> deterministic >> >> does anybody know the reason behind this change? >> >> >> [1] https://github.com/python/cpython/commit/600a349781bfa0a8239 >> e1cb95fac29c7c4a3302e >> >> On Fri, Jun 9, 2017 at 6:07 PM, Ben Darnell wrote: >> > On Fri, Jun 9, 2017 at 11:51 AM Cory Benfield >> wrote: >> >> >> >> >> >> >> >> My concern with multiple loops boils down to the fact that urllib3 >> >> supports being used in a multithreaded context where each thread can >> >> independently make forward progress on one request. To establish that >> with a >> >> synchronous codebase you either need one event loop per thread or you >> need >> >> to spawn a background thread on startup that owns the only event loop >> in the >> >> process. >> > >> > >> > Yeah, one event loop per thread is probably the way to go for >> integration >> > with synchronous codebases. A dedicated event loop thread may perform >> better >> > but libraries that spawn threads are problematic. >> > >> >> >> >> >> >> Generally speaking I?ve not had positive results with libraries >> spawning >> >> their own threads in Python. In my experience this has tended to lead >> to >> >> programs that deadlock mysteriously or that fail to terminate in the >> face of >> >> a Ctrl+C. So I tend to prefer to have users spawn their own threads, >> which >> >> would make me want a ?one-event-loop-per-thread? model: hence, needing >> a >> >> loop parameter to pass around prior to 3.6. 
>> > >> > >> > You can avoid the loop parameter on older versions of asyncio (at least >> as >> > long as the default event loop policy is used) by manually setting your >> > event loop as current before calling run_until_complete (and resetting >> it >> > afterwards). >> > >> > Tornado's run_sync() method is equivalent to asyncio's >> run_until_complete(), >> > and Tornado supports multiple IOLoops in this way. We use this to >> expose a >> > synchronous version of our AsyncHTTPClient: >> > https://github.com/tornadoweb/tornado/blob/62e47215ce12aee83 >> f951758c96775a43e80475b/tornado/httpclient.py#L54 >> > >> > -Ben >> > >> >> >> >> >> >> I admit that my concerns here regarding libraries spawning their own >> >> threads may be overblown: after my series of negative experiences I >> >> basically never went back to that model, and it may be that the >> problems >> >> were more user-error than anything else. However, I feel comfortable >> saying >> >> that libraries spawning their own Python threads is definitely subtle >> and >> >> hard to get right, at the very least. >> >> >> >> Cory >> >> _______________________________________________ >> >> Async-sig mailing list >> >> Async-sig at python.org >> >> https://mail.python.org/mailman/listinfo/async-sig >> >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >> > >> > >> > _______________________________________________ >> > Async-sig mailing list >> > Async-sig at python.org >> > https://mail.python.org/mailman/listinfo/async-sig >> > Code of Conduct: https://www.python.org/psf/codeofconduct/ >> > >> >> >> >> -- >> --pau >> _______________________________________________ >> Async-sig mailing list >> Async-sig at python.org >> https://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >> > > > > -- > --Guido van Rossum (python.org/~guido) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Jun 12 11:58:14 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 12 Jun 2017 08:58:14 -0700 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: Multiple loops in the same thread is purely theoretical -- the API allows it but there's no use case. It might be necessary if a platform has a UI-only event loop that cannot be extended to do I/O -- the only solution to do background I/O might be to alternate between two loops. (Though in that case I would still prefer a thread for the background I/O.) On Mon, Jun 12, 2017 at 8:49 AM, Pau Freixes wrote: > And what about the rationale of having multiple loop instances in the same > thread switching btw them. Im still trying to find out what patterns need > this... Do you have an example? > > Btw thanks for the first explanation > > El 12/06/2017 17:36, "Guido van Rossum" escribi?: > >> In theory it's possible to create two event loops (using >> new_event_loop()), then set one as the default event loop (using >> set_event_loop()), then run the other one (using run_forever() or >> run_until_complete()). To tasks running in the latter event loop, >> get_event_loop() would nevertheless return the former. >> >> On Mon, Jun 12, 2017 at 4:39 AM, Pau Freixes wrote: >> >>> Sorry a bit of topic, but I would like to figure out why older python >>> versions, prior this commit [1], the get_event_loop is not considered >>> deterministic >>> >>> does anybody know the reason behind this change? 
>>> >>> >>> [1] https://github.com/python/cpython/commit/600a349781bfa0a8239 >>> e1cb95fac29c7c4a3302e >>> >>> On Fri, Jun 9, 2017 at 6:07 PM, Ben Darnell wrote: >>> > On Fri, Jun 9, 2017 at 11:51 AM Cory Benfield >>> wrote: >>> >> >>> >> >>> >> >>> >> My concern with multiple loops boils down to the fact that urllib3 >>> >> supports being used in a multithreaded context where each thread can >>> >> independently make forward progress on one request. To establish that >>> with a >>> >> synchronous codebase you either need one event loop per thread or you >>> need >>> >> to spawn a background thread on startup that owns the only event loop >>> in the >>> >> process. >>> > >>> > >>> > Yeah, one event loop per thread is probably the way to go for >>> integration >>> > with synchronous codebases. A dedicated event loop thread may perform >>> better >>> > but libraries that spawn threads are problematic. >>> > >>> >> >>> >> >>> >> Generally speaking I?ve not had positive results with libraries >>> spawning >>> >> their own threads in Python. In my experience this has tended to lead >>> to >>> >> programs that deadlock mysteriously or that fail to terminate in the >>> face of >>> >> a Ctrl+C. So I tend to prefer to have users spawn their own threads, >>> which >>> >> would make me want a ?one-event-loop-per-thread? model: hence, >>> needing a >>> >> loop parameter to pass around prior to 3.6. >>> > >>> > >>> > You can avoid the loop parameter on older versions of asyncio (at >>> least as >>> > long as the default event loop policy is used) by manually setting your >>> > event loop as current before calling run_until_complete (and resetting >>> it >>> > afterwards). >>> > >>> > Tornado's run_sync() method is equivalent to asyncio's >>> run_until_complete(), >>> > and Tornado supports multiple IOLoops in this way. We use this to >>> expose a >>> > synchronous version of our AsyncHTTPClient: >>> > https://github.com/tornadoweb/tornado/blob/62e47215ce12aee83 >>> f951758c96775a43e80475b/tornado/httpclient.py#L54 >>> > >>> > -Ben >>> > >>> >> >>> >> >>> >> I admit that my concerns here regarding libraries spawning their own >>> >> threads may be overblown: after my series of negative experiences I >>> >> basically never went back to that model, and it may be that the >>> problems >>> >> were more user-error than anything else. However, I feel comfortable >>> saying >>> >> that libraries spawning their own Python threads is definitely subtle >>> and >>> >> hard to get right, at the very least. >>> >> >>> >> Cory >>> >> _______________________________________________ >>> >> Async-sig mailing list >>> >> Async-sig at python.org >>> >> https://mail.python.org/mailman/listinfo/async-sig >>> >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>> > >>> > >>> > _______________________________________________ >>> > Async-sig mailing list >>> > Async-sig at python.org >>> > https://mail.python.org/mailman/listinfo/async-sig >>> > Code of Conduct: https://www.python.org/psf/codeofconduct/ >>> > >>> >>> >>> >>> -- >>> --pau >>> _______________________________________________ >>> Async-sig mailing list >>> Async-sig at python.org >>> https://mail.python.org/mailman/listinfo/async-sig >>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>> >> >> >> >> -- >> --Guido van Rossum (python.org/~guido) >> > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
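The behaviour Guido describes, and the change in the commit Pau links, can be demonstrated in a few lines. This is illustrative only; the version split is noted in the comments:

"""
import asyncio


async def current():
    return asyncio.get_event_loop()

default_loop = asyncio.new_event_loop()
other_loop = asyncio.new_event_loop()
asyncio.set_event_loop(default_loop)   # the thread's default loop

seen = other_loop.run_until_complete(current())
# Before the linked commit (asyncio prior to 3.5.3/3.6): seen is default_loop,
# the thread's default, even though other_loop is the one actually running.
# After it: seen is other_loop, because get_event_loop() prefers the running loop.
"""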
URL: From andrew.svetlov at gmail.com Mon Jun 12 12:25:23 2017 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Mon, 12 Jun 2017 16:25:23 +0000 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: Unit tests at least. Running every test in own loop is crucial fro tests isolation. On Mon, Jun 12, 2017 at 7:04 PM Guido van Rossum wrote: > Multiple loops in the same thread is purely theoretical -- the API allows > it but there's no use case. It might be necessary if a platform has a > UI-only event loop that cannot be extended to do I/O -- the only solution > to do background I/O might be to alternate between two loops. (Though in > that case I would still prefer a thread for the background I/O.) > > On Mon, Jun 12, 2017 at 8:49 AM, Pau Freixes wrote: > >> And what about the rationale of having multiple loop instances in the >> same thread switching btw them. Im still trying to find out what patterns >> need this... Do you have an example? >> >> Btw thanks for the first explanation >> >> El 12/06/2017 17:36, "Guido van Rossum" escribi?: >> >>> In theory it's possible to create two event loops (using >>> new_event_loop()), then set one as the default event loop (using >>> set_event_loop()), then run the other one (using run_forever() or >>> run_until_complete()). To tasks running in the latter event loop, >>> get_event_loop() would nevertheless return the former. >>> >>> On Mon, Jun 12, 2017 at 4:39 AM, Pau Freixes wrote: >>> >>>> Sorry a bit of topic, but I would like to figure out why older python >>>> versions, prior this commit [1], the get_event_loop is not considered >>>> deterministic >>>> >>>> does anybody know the reason behind this change? >>>> >>>> >>>> [1] >>>> https://github.com/python/cpython/commit/600a349781bfa0a8239e1cb95fac29c7c4a3302e >>>> >>>> On Fri, Jun 9, 2017 at 6:07 PM, Ben Darnell wrote: >>>> > On Fri, Jun 9, 2017 at 11:51 AM Cory Benfield >>>> wrote: >>>> >> >>>> >> >>>> >> >>>> >> My concern with multiple loops boils down to the fact that urllib3 >>>> >> supports being used in a multithreaded context where each thread can >>>> >> independently make forward progress on one request. To establish >>>> that with a >>>> >> synchronous codebase you either need one event loop per thread or >>>> you need >>>> >> to spawn a background thread on startup that owns the only event >>>> loop in the >>>> >> process. >>>> > >>>> > >>>> > Yeah, one event loop per thread is probably the way to go for >>>> integration >>>> > with synchronous codebases. A dedicated event loop thread may perform >>>> better >>>> > but libraries that spawn threads are problematic. >>>> > >>>> >> >>>> >> >>>> >> Generally speaking I?ve not had positive results with libraries >>>> spawning >>>> >> their own threads in Python. In my experience this has tended to >>>> lead to >>>> >> programs that deadlock mysteriously or that fail to terminate in the >>>> face of >>>> >> a Ctrl+C. So I tend to prefer to have users spawn their own threads, >>>> which >>>> >> would make me want a ?one-event-loop-per-thread? model: hence, >>>> needing a >>>> >> loop parameter to pass around prior to 3.6. >>>> > >>>> > >>>> > You can avoid the loop parameter on older versions of asyncio (at >>>> least as >>>> > long as the default event loop policy is used) by manually setting >>>> your >>>> > event loop as current before calling run_until_complete (and >>>> resetting it >>>> > afterwards). 
>>>> > >>>> > Tornado's run_sync() method is equivalent to asyncio's >>>> run_until_complete(), >>>> > and Tornado supports multiple IOLoops in this way. We use this to >>>> expose a >>>> > synchronous version of our AsyncHTTPClient: >>>> > >>>> https://github.com/tornadoweb/tornado/blob/62e47215ce12aee83f951758c96775a43e80475b/tornado/httpclient.py#L54 >>>> > >>>> > -Ben >>>> > >>>> >> >>>> >> >>>> >> I admit that my concerns here regarding libraries spawning their own >>>> >> threads may be overblown: after my series of negative experiences I >>>> >> basically never went back to that model, and it may be that the >>>> problems >>>> >> were more user-error than anything else. However, I feel comfortable >>>> saying >>>> >> that libraries spawning their own Python threads is definitely >>>> subtle and >>>> >> hard to get right, at the very least. >>>> >> >>>> >> Cory >>>> >> _______________________________________________ >>>> >> Async-sig mailing list >>>> >> Async-sig at python.org >>>> >> https://mail.python.org/mailman/listinfo/async-sig >>>> >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>> > >>>> > >>>> > _______________________________________________ >>>> > Async-sig mailing list >>>> > Async-sig at python.org >>>> > https://mail.python.org/mailman/listinfo/async-sig >>>> > Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>> > >>>> >>>> >>>> >>>> -- >>>> --pau >>>> _______________________________________________ >>>> Async-sig mailing list >>>> Async-sig at python.org >>>> https://mail.python.org/mailman/listinfo/async-sig >>>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>> >>> >>> >>> >>> -- >>> --Guido van Rossum (python.org/~guido) >>> >> > > > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -- Thanks, Andrew Svetlov -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Jun 12 12:37:12 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 12 Jun 2017 09:37:12 -0700 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: Yes, but not co-existing, I hope! On Mon, Jun 12, 2017 at 9:25 AM, Andrew Svetlov wrote: > Unit tests at least. Running every test in own loop is crucial fro tests > isolation. > > On Mon, Jun 12, 2017 at 7:04 PM Guido van Rossum wrote: > >> Multiple loops in the same thread is purely theoretical -- the API allows >> it but there's no use case. It might be necessary if a platform has a >> UI-only event loop that cannot be extended to do I/O -- the only solution >> to do background I/O might be to alternate between two loops. (Though in >> that case I would still prefer a thread for the background I/O.) >> >> On Mon, Jun 12, 2017 at 8:49 AM, Pau Freixes wrote: >> >>> And what about the rationale of having multiple loop instances in the >>> same thread switching btw them. Im still trying to find out what patterns >>> need this... Do you have an example? 
>>> >>> Btw thanks for the first explanation >>> >>> El 12/06/2017 17:36, "Guido van Rossum" escribi?: >>> >>>> In theory it's possible to create two event loops (using >>>> new_event_loop()), then set one as the default event loop (using >>>> set_event_loop()), then run the other one (using run_forever() or >>>> run_until_complete()). To tasks running in the latter event loop, >>>> get_event_loop() would nevertheless return the former. >>>> >>>> On Mon, Jun 12, 2017 at 4:39 AM, Pau Freixes >>>> wrote: >>>> >>>>> Sorry a bit of topic, but I would like to figure out why older python >>>>> versions, prior this commit [1], the get_event_loop is not considered >>>>> deterministic >>>>> >>>>> does anybody know the reason behind this change? >>>>> >>>>> >>>>> [1] https://github.com/python/cpython/commit/ >>>>> 600a349781bfa0a8239e1cb95fac29c7c4a3302e >>>>> >>>>> On Fri, Jun 9, 2017 at 6:07 PM, Ben Darnell >>>>> wrote: >>>>> > On Fri, Jun 9, 2017 at 11:51 AM Cory Benfield >>>>> wrote: >>>>> >> >>>>> >> >>>>> >> >>>>> >> My concern with multiple loops boils down to the fact that urllib3 >>>>> >> supports being used in a multithreaded context where each thread can >>>>> >> independently make forward progress on one request. To establish >>>>> that with a >>>>> >> synchronous codebase you either need one event loop per thread or >>>>> you need >>>>> >> to spawn a background thread on startup that owns the only event >>>>> loop in the >>>>> >> process. >>>>> > >>>>> > >>>>> > Yeah, one event loop per thread is probably the way to go for >>>>> integration >>>>> > with synchronous codebases. A dedicated event loop thread may >>>>> perform better >>>>> > but libraries that spawn threads are problematic. >>>>> > >>>>> >> >>>>> >> >>>>> >> Generally speaking I?ve not had positive results with libraries >>>>> spawning >>>>> >> their own threads in Python. In my experience this has tended to >>>>> lead to >>>>> >> programs that deadlock mysteriously or that fail to terminate in >>>>> the face of >>>>> >> a Ctrl+C. So I tend to prefer to have users spawn their own >>>>> threads, which >>>>> >> would make me want a ?one-event-loop-per-thread? model: hence, >>>>> needing a >>>>> >> loop parameter to pass around prior to 3.6. >>>>> > >>>>> > >>>>> > You can avoid the loop parameter on older versions of asyncio (at >>>>> least as >>>>> > long as the default event loop policy is used) by manually setting >>>>> your >>>>> > event loop as current before calling run_until_complete (and >>>>> resetting it >>>>> > afterwards). >>>>> > >>>>> > Tornado's run_sync() method is equivalent to asyncio's >>>>> run_until_complete(), >>>>> > and Tornado supports multiple IOLoops in this way. We use this to >>>>> expose a >>>>> > synchronous version of our AsyncHTTPClient: >>>>> > https://github.com/tornadoweb/tornado/blob/ >>>>> 62e47215ce12aee83f951758c96775a43e80475b/tornado/httpclient.py#L54 >>>>> > >>>>> > -Ben >>>>> > >>>>> >> >>>>> >> >>>>> >> I admit that my concerns here regarding libraries spawning their own >>>>> >> threads may be overblown: after my series of negative experiences I >>>>> >> basically never went back to that model, and it may be that the >>>>> problems >>>>> >> were more user-error than anything else. However, I feel >>>>> comfortable saying >>>>> >> that libraries spawning their own Python threads is definitely >>>>> subtle and >>>>> >> hard to get right, at the very least. 
>>>>> >> >>>>> >> Cory >>>>> >> _______________________________________________ >>>>> >> Async-sig mailing list >>>>> >> Async-sig at python.org >>>>> >> https://mail.python.org/mailman/listinfo/async-sig >>>>> >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>> > >>>>> > >>>>> > _______________________________________________ >>>>> > Async-sig mailing list >>>>> > Async-sig at python.org >>>>> > https://mail.python.org/mailman/listinfo/async-sig >>>>> > Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>> > >>>>> >>>>> >>>>> >>>>> -- >>>>> --pau >>>>> _______________________________________________ >>>>> Async-sig mailing list >>>>> Async-sig at python.org >>>>> https://mail.python.org/mailman/listinfo/async-sig >>>>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>> >>>> >>>> >>>> >>>> -- >>>> --Guido van Rossum (python.org/~guido) >>>> >>> >> >> >> -- >> --Guido van Rossum (python.org/~guido) >> _______________________________________________ >> Async-sig mailing list >> Async-sig at python.org >> https://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >> > -- > Thanks, > Andrew Svetlov > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrew.svetlov at gmail.com Mon Jun 12 12:57:29 2017 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Mon, 12 Jun 2017 16:57:29 +0000 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: Yes, but with one exception: default event loop created on module import stage might co-exist with a loop created for test. It leads to mystic hangs, you know. Please recall code like: class A: mongodb = motor.motor_asyncio.AsyncIOMotorClient() On Mon, Jun 12, 2017 at 7:37 PM Guido van Rossum wrote: > Yes, but not co-existing, I hope! > > On Mon, Jun 12, 2017 at 9:25 AM, Andrew Svetlov > wrote: > >> Unit tests at least. Running every test in own loop is crucial fro tests >> isolation. >> >> On Mon, Jun 12, 2017 at 7:04 PM Guido van Rossum >> wrote: >> >>> Multiple loops in the same thread is purely theoretical -- the API >>> allows it but there's no use case. It might be necessary if a platform has >>> a UI-only event loop that cannot be extended to do I/O -- the only solution >>> to do background I/O might be to alternate between two loops. (Though in >>> that case I would still prefer a thread for the background I/O.) >>> >>> On Mon, Jun 12, 2017 at 8:49 AM, Pau Freixes wrote: >>> >>>> And what about the rationale of having multiple loop instances in the >>>> same thread switching btw them. Im still trying to find out what patterns >>>> need this... Do you have an example? >>>> >>>> Btw thanks for the first explanation >>>> >>>> El 12/06/2017 17:36, "Guido van Rossum" escribi?: >>>> >>>>> In theory it's possible to create two event loops (using >>>>> new_event_loop()), then set one as the default event loop (using >>>>> set_event_loop()), then run the other one (using run_forever() or >>>>> run_until_complete()). To tasks running in the latter event loop, >>>>> get_event_loop() would nevertheless return the former. 
>>>>> >>>>> On Mon, Jun 12, 2017 at 4:39 AM, Pau Freixes >>>>> wrote: >>>>> >>>>>> Sorry a bit of topic, but I would like to figure out why older python >>>>>> versions, prior this commit [1], the get_event_loop is not considered >>>>>> deterministic >>>>>> >>>>>> does anybody know the reason behind this change? >>>>>> >>>>>> >>>>>> [1] >>>>>> https://github.com/python/cpython/commit/600a349781bfa0a8239e1cb95fac29c7c4a3302e >>>>>> >>>>>> On Fri, Jun 9, 2017 at 6:07 PM, Ben Darnell >>>>>> wrote: >>>>>> > On Fri, Jun 9, 2017 at 11:51 AM Cory Benfield >>>>>> wrote: >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> >> My concern with multiple loops boils down to the fact that urllib3 >>>>>> >> supports being used in a multithreaded context where each thread >>>>>> can >>>>>> >> independently make forward progress on one request. To establish >>>>>> that with a >>>>>> >> synchronous codebase you either need one event loop per thread or >>>>>> you need >>>>>> >> to spawn a background thread on startup that owns the only event >>>>>> loop in the >>>>>> >> process. >>>>>> > >>>>>> > >>>>>> > Yeah, one event loop per thread is probably the way to go for >>>>>> integration >>>>>> > with synchronous codebases. A dedicated event loop thread may >>>>>> perform better >>>>>> > but libraries that spawn threads are problematic. >>>>>> > >>>>>> >> >>>>>> >> >>>>>> >> Generally speaking I?ve not had positive results with libraries >>>>>> spawning >>>>>> >> their own threads in Python. In my experience this has tended to >>>>>> lead to >>>>>> >> programs that deadlock mysteriously or that fail to terminate in >>>>>> the face of >>>>>> >> a Ctrl+C. So I tend to prefer to have users spawn their own >>>>>> threads, which >>>>>> >> would make me want a ?one-event-loop-per-thread? model: hence, >>>>>> needing a >>>>>> >> loop parameter to pass around prior to 3.6. >>>>>> > >>>>>> > >>>>>> > You can avoid the loop parameter on older versions of asyncio (at >>>>>> least as >>>>>> > long as the default event loop policy is used) by manually setting >>>>>> your >>>>>> > event loop as current before calling run_until_complete (and >>>>>> resetting it >>>>>> > afterwards). >>>>>> > >>>>>> > Tornado's run_sync() method is equivalent to asyncio's >>>>>> run_until_complete(), >>>>>> > and Tornado supports multiple IOLoops in this way. We use this to >>>>>> expose a >>>>>> > synchronous version of our AsyncHTTPClient: >>>>>> > >>>>>> https://github.com/tornadoweb/tornado/blob/62e47215ce12aee83f951758c96775a43e80475b/tornado/httpclient.py#L54 >>>>>> > >>>>>> > -Ben >>>>>> > >>>>>> >> >>>>>> >> >>>>>> >> I admit that my concerns here regarding libraries spawning their >>>>>> own >>>>>> >> threads may be overblown: after my series of negative experiences I >>>>>> >> basically never went back to that model, and it may be that the >>>>>> problems >>>>>> >> were more user-error than anything else. However, I feel >>>>>> comfortable saying >>>>>> >> that libraries spawning their own Python threads is definitely >>>>>> subtle and >>>>>> >> hard to get right, at the very least. 
>>>>>> >> >>>>>> >> Cory >>>>>> >> _______________________________________________ >>>>>> >> Async-sig mailing list >>>>>> >> Async-sig at python.org >>>>>> >> https://mail.python.org/mailman/listinfo/async-sig >>>>>> >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>>> > >>>>>> > >>>>>> > _______________________________________________ >>>>>> > Async-sig mailing list >>>>>> > Async-sig at python.org >>>>>> > https://mail.python.org/mailman/listinfo/async-sig >>>>>> > Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>>> > >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> --pau >>>>>> _______________________________________________ >>>>>> Async-sig mailing list >>>>>> Async-sig at python.org >>>>>> https://mail.python.org/mailman/listinfo/async-sig >>>>>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> --Guido van Rossum (python.org/~guido) >>>>> >>>> >>> >>> >>> -- >>> --Guido van Rossum (python.org/~guido) >>> _______________________________________________ >>> Async-sig mailing list >>> Async-sig at python.org >>> https://mail.python.org/mailman/listinfo/async-sig >>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>> >> -- >> Thanks, >> Andrew Svetlov >> > > > > -- > --Guido van Rossum (python.org/~guido) > -- Thanks, Andrew Svetlov -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben at bendarnell.com Mon Jun 12 12:14:08 2017 From: ben at bendarnell.com (Ben Darnell) Date: Mon, 12 Jun 2017 16:14:08 +0000 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: In Tornado this comes up sometimes in initialization scenarios: def main(): # Since main is synchronous, we need a synchronous HTTP client with tornado.httpclient.HTTPClient() as client: # HTTPClient creates its own event loop and runs it behind the scenes. # This is not the same as the event loop under which main() is running. resp = client.fetch(url) if __name__ == '__main__': IOLoop.current().add_callback(main) IOLoop.current().start() This is never an ideal scenario (it would be better to make main() a coroutine and use an async HTTP client), but it does sometimes come up as the most expedient option. This scenario is also why methods like EventLoop.is_running() tend to be misguided - the question of "can I use this event loop" is not directly related to "is this event loop running". -Ben On Mon, Jun 12, 2017 at 11:58 AM Guido van Rossum wrote: > Multiple loops in the same thread is purely theoretical -- the API allows > it but there's no use case. It might be necessary if a platform has a > UI-only event loop that cannot be extended to do I/O -- the only solution > to do background I/O might be to alternate between two loops. (Though in > that case I would still prefer a thread for the background I/O.) > > On Mon, Jun 12, 2017 at 8:49 AM, Pau Freixes wrote: > >> And what about the rationale of having multiple loop instances in the >> same thread switching btw them. Im still trying to find out what patterns >> need this... Do you have an example? >> >> Btw thanks for the first explanation >> >> El 12/06/2017 17:36, "Guido van Rossum" escribi?: >> >>> In theory it's possible to create two event loops (using >>> new_event_loop()), then set one as the default event loop (using >>> set_event_loop()), then run the other one (using run_forever() or >>> run_until_complete()). 
To tasks running in the latter event loop, >>> get_event_loop() would nevertheless return the former. >>> >>> On Mon, Jun 12, 2017 at 4:39 AM, Pau Freixes wrote: >>> >>>> Sorry a bit of topic, but I would like to figure out why older python >>>> versions, prior this commit [1], the get_event_loop is not considered >>>> deterministic >>>> >>>> does anybody know the reason behind this change? >>>> >>>> >>>> [1] >>>> https://github.com/python/cpython/commit/600a349781bfa0a8239e1cb95fac29c7c4a3302e >>>> >>>> On Fri, Jun 9, 2017 at 6:07 PM, Ben Darnell wrote: >>>> > On Fri, Jun 9, 2017 at 11:51 AM Cory Benfield >>>> wrote: >>>> >> >>>> >> >>>> >> >>>> >> My concern with multiple loops boils down to the fact that urllib3 >>>> >> supports being used in a multithreaded context where each thread can >>>> >> independently make forward progress on one request. To establish >>>> that with a >>>> >> synchronous codebase you either need one event loop per thread or >>>> you need >>>> >> to spawn a background thread on startup that owns the only event >>>> loop in the >>>> >> process. >>>> > >>>> > >>>> > Yeah, one event loop per thread is probably the way to go for >>>> integration >>>> > with synchronous codebases. A dedicated event loop thread may perform >>>> better >>>> > but libraries that spawn threads are problematic. >>>> > >>>> >> >>>> >> >>>> >> Generally speaking I?ve not had positive results with libraries >>>> spawning >>>> >> their own threads in Python. In my experience this has tended to >>>> lead to >>>> >> programs that deadlock mysteriously or that fail to terminate in the >>>> face of >>>> >> a Ctrl+C. So I tend to prefer to have users spawn their own threads, >>>> which >>>> >> would make me want a ?one-event-loop-per-thread? model: hence, >>>> needing a >>>> >> loop parameter to pass around prior to 3.6. >>>> > >>>> > >>>> > You can avoid the loop parameter on older versions of asyncio (at >>>> least as >>>> > long as the default event loop policy is used) by manually setting >>>> your >>>> > event loop as current before calling run_until_complete (and >>>> resetting it >>>> > afterwards). >>>> > >>>> > Tornado's run_sync() method is equivalent to asyncio's >>>> run_until_complete(), >>>> > and Tornado supports multiple IOLoops in this way. We use this to >>>> expose a >>>> > synchronous version of our AsyncHTTPClient: >>>> > >>>> https://github.com/tornadoweb/tornado/blob/62e47215ce12aee83f951758c96775a43e80475b/tornado/httpclient.py#L54 >>>> > >>>> > -Ben >>>> > >>>> >> >>>> >> >>>> >> I admit that my concerns here regarding libraries spawning their own >>>> >> threads may be overblown: after my series of negative experiences I >>>> >> basically never went back to that model, and it may be that the >>>> problems >>>> >> were more user-error than anything else. However, I feel comfortable >>>> saying >>>> >> that libraries spawning their own Python threads is definitely >>>> subtle and >>>> >> hard to get right, at the very least. 
>>>> >> >>>> >> Cory >>>> >> _______________________________________________ >>>> >> Async-sig mailing list >>>> >> Async-sig at python.org >>>> >> https://mail.python.org/mailman/listinfo/async-sig >>>> >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>> > >>>> > >>>> > _______________________________________________ >>>> > Async-sig mailing list >>>> > Async-sig at python.org >>>> > https://mail.python.org/mailman/listinfo/async-sig >>>> > Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>> > >>>> >>>> >>>> >>>> -- >>>> --pau >>>> _______________________________________________ >>>> Async-sig mailing list >>>> Async-sig at python.org >>>> https://mail.python.org/mailman/listinfo/async-sig >>>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>> >>> >>> >>> >>> -- >>> --Guido van Rossum (python.org/~guido) >>> >> > > > -- > --Guido van Rossum (python.org/~guido) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Jun 12 14:50:26 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 12 Jun 2017 11:50:26 -0700 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: Honestly I think we're in agreement. There's never a use for one loop running while another is the default. There are some rare use cases for multiple loops running but before the mentioned commit it was up to the app to ensure to switch the default loop when running a loop. The commit took the ability to screw up there out of the user's hand. On Mon, Jun 12, 2017 at 9:57 AM, Andrew Svetlov wrote: > Yes, but with one exception: default event loop created on module import > stage might co-exist with a loop created for test. > It leads to mystic hangs, you know. > Please recall code like: > class A: > mongodb = motor.motor_asyncio.AsyncIOMotorClient() > > On Mon, Jun 12, 2017 at 7:37 PM Guido van Rossum wrote: > >> Yes, but not co-existing, I hope! >> >> On Mon, Jun 12, 2017 at 9:25 AM, Andrew Svetlov > > wrote: >> >>> Unit tests at least. Running every test in own loop is crucial fro tests >>> isolation. >>> >>> On Mon, Jun 12, 2017 at 7:04 PM Guido van Rossum >>> wrote: >>> >>>> Multiple loops in the same thread is purely theoretical -- the API >>>> allows it but there's no use case. It might be necessary if a platform has >>>> a UI-only event loop that cannot be extended to do I/O -- the only solution >>>> to do background I/O might be to alternate between two loops. (Though in >>>> that case I would still prefer a thread for the background I/O.) >>>> >>>> On Mon, Jun 12, 2017 at 8:49 AM, Pau Freixes >>>> wrote: >>>> >>>>> And what about the rationale of having multiple loop instances in the >>>>> same thread switching btw them. Im still trying to find out what patterns >>>>> need this... Do you have an example? >>>>> >>>>> Btw thanks for the first explanation >>>>> >>>>> El 12/06/2017 17:36, "Guido van Rossum" escribi?: >>>>> >>>>>> In theory it's possible to create two event loops (using >>>>>> new_event_loop()), then set one as the default event loop (using >>>>>> set_event_loop()), then run the other one (using run_forever() or >>>>>> run_until_complete()). To tasks running in the latter event loop, >>>>>> get_event_loop() would nevertheless return the former. 
>>>>>> >>>>>> On Mon, Jun 12, 2017 at 4:39 AM, Pau Freixes >>>>>> wrote: >>>>>> >>>>>>> Sorry a bit of topic, but I would like to figure out why older python >>>>>>> versions, prior this commit [1], the get_event_loop is not considered >>>>>>> deterministic >>>>>>> >>>>>>> does anybody know the reason behind this change? >>>>>>> >>>>>>> >>>>>>> [1] https://github.com/python/cpython/commit/ >>>>>>> 600a349781bfa0a8239e1cb95fac29c7c4a3302e >>>>>>> >>>>>>> On Fri, Jun 9, 2017 at 6:07 PM, Ben Darnell >>>>>>> wrote: >>>>>>> > On Fri, Jun 9, 2017 at 11:51 AM Cory Benfield >>>>>>> wrote: >>>>>>> >> >>>>>>> >> >>>>>>> >> >>>>>>> >> My concern with multiple loops boils down to the fact that urllib3 >>>>>>> >> supports being used in a multithreaded context where each thread >>>>>>> can >>>>>>> >> independently make forward progress on one request. To establish >>>>>>> that with a >>>>>>> >> synchronous codebase you either need one event loop per thread or >>>>>>> you need >>>>>>> >> to spawn a background thread on startup that owns the only event >>>>>>> loop in the >>>>>>> >> process. >>>>>>> > >>>>>>> > >>>>>>> > Yeah, one event loop per thread is probably the way to go for >>>>>>> integration >>>>>>> > with synchronous codebases. A dedicated event loop thread may >>>>>>> perform better >>>>>>> > but libraries that spawn threads are problematic. >>>>>>> > >>>>>>> >> >>>>>>> >> >>>>>>> >> Generally speaking I?ve not had positive results with libraries >>>>>>> spawning >>>>>>> >> their own threads in Python. In my experience this has tended to >>>>>>> lead to >>>>>>> >> programs that deadlock mysteriously or that fail to terminate in >>>>>>> the face of >>>>>>> >> a Ctrl+C. So I tend to prefer to have users spawn their own >>>>>>> threads, which >>>>>>> >> would make me want a ?one-event-loop-per-thread? model: hence, >>>>>>> needing a >>>>>>> >> loop parameter to pass around prior to 3.6. >>>>>>> > >>>>>>> > >>>>>>> > You can avoid the loop parameter on older versions of asyncio (at >>>>>>> least as >>>>>>> > long as the default event loop policy is used) by manually setting >>>>>>> your >>>>>>> > event loop as current before calling run_until_complete (and >>>>>>> resetting it >>>>>>> > afterwards). >>>>>>> > >>>>>>> > Tornado's run_sync() method is equivalent to asyncio's >>>>>>> run_until_complete(), >>>>>>> > and Tornado supports multiple IOLoops in this way. We use this to >>>>>>> expose a >>>>>>> > synchronous version of our AsyncHTTPClient: >>>>>>> > https://github.com/tornadoweb/tornado/blob/ >>>>>>> 62e47215ce12aee83f951758c96775a43e80475b/tornado/httpclient.py#L54 >>>>>>> > >>>>>>> > -Ben >>>>>>> > >>>>>>> >> >>>>>>> >> >>>>>>> >> I admit that my concerns here regarding libraries spawning their >>>>>>> own >>>>>>> >> threads may be overblown: after my series of negative experiences >>>>>>> I >>>>>>> >> basically never went back to that model, and it may be that the >>>>>>> problems >>>>>>> >> were more user-error than anything else. However, I feel >>>>>>> comfortable saying >>>>>>> >> that libraries spawning their own Python threads is definitely >>>>>>> subtle and >>>>>>> >> hard to get right, at the very least. 
>>>>>>> >> >>>>>>> >> Cory >>>>>>> >> _______________________________________________ >>>>>>> >> Async-sig mailing list >>>>>>> >> Async-sig at python.org >>>>>>> >> https://mail.python.org/mailman/listinfo/async-sig >>>>>>> >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>>>> > >>>>>>> > >>>>>>> > _______________________________________________ >>>>>>> > Async-sig mailing list >>>>>>> > Async-sig at python.org >>>>>>> > https://mail.python.org/mailman/listinfo/async-sig >>>>>>> > Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>>>> > >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> --pau >>>>>>> _______________________________________________ >>>>>>> Async-sig mailing list >>>>>>> Async-sig at python.org >>>>>>> https://mail.python.org/mailman/listinfo/async-sig >>>>>>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> --Guido van Rossum (python.org/~guido) >>>>>> >>>>> >>>> >>>> >>>> -- >>>> --Guido van Rossum (python.org/~guido) >>>> _______________________________________________ >>>> Async-sig mailing list >>>> Async-sig at python.org >>>> https://mail.python.org/mailman/listinfo/async-sig >>>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>> >>> -- >>> Thanks, >>> Andrew Svetlov >>> >> >> >> >> -- >> --Guido van Rossum (python.org/~guido) >> > -- > Thanks, > Andrew Svetlov > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrew.svetlov at gmail.com Mon Jun 12 15:05:22 2017 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Mon, 12 Jun 2017 19:05:22 +0000 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: Agree in general but current asyncio still may shoot your leg. The solution (at least for my unittest example) might be in adding top level functions for running asyncio code (asyncio.run() and asyncio.run_forever() as Yury Selivanov proposed in https://github.com/python/asyncio/pull/465) After this we could raise a warning in `asyncio.get_event_loop()` if the loop was not set explicitly by `asyncio.set_event_loop()`. On Mon, Jun 12, 2017 at 9:50 PM Guido van Rossum wrote: > Honestly I think we're in agreement. There's never a use for one loop > running while another is the default. There are some rare use cases for > multiple loops running but before the mentioned commit it was up to the app > to ensure to switch the default loop when running a loop. The commit took > the ability to screw up there out of the user's hand. > > On Mon, Jun 12, 2017 at 9:57 AM, Andrew Svetlov > wrote: > >> Yes, but with one exception: default event loop created on module import >> stage might co-exist with a loop created for test. >> It leads to mystic hangs, you know. >> Please recall code like: >> class A: >> mongodb = motor.motor_asyncio.AsyncIOMotorClient() >> >> On Mon, Jun 12, 2017 at 7:37 PM Guido van Rossum >> wrote: >> >>> Yes, but not co-existing, I hope! >>> >>> On Mon, Jun 12, 2017 at 9:25 AM, Andrew Svetlov < >>> andrew.svetlov at gmail.com> wrote: >>> >>>> Unit tests at least. Running every test in own loop is crucial fro >>>> tests isolation. >>>> >>>> On Mon, Jun 12, 2017 at 7:04 PM Guido van Rossum >>>> wrote: >>>> >>>>> Multiple loops in the same thread is purely theoretical -- the API >>>>> allows it but there's no use case. 
It might be necessary if a platform has >>>>> a UI-only event loop that cannot be extended to do I/O -- the only solution >>>>> to do background I/O might be to alternate between two loops. (Though in >>>>> that case I would still prefer a thread for the background I/O.) >>>>> >>>>> On Mon, Jun 12, 2017 at 8:49 AM, Pau Freixes >>>>> wrote: >>>>> >>>>>> And what about the rationale of having multiple loop instances in the >>>>>> same thread switching btw them. Im still trying to find out what patterns >>>>>> need this... Do you have an example? >>>>>> >>>>>> Btw thanks for the first explanation >>>>>> >>>>>> El 12/06/2017 17:36, "Guido van Rossum" escribi?: >>>>>> >>>>>>> In theory it's possible to create two event loops (using >>>>>>> new_event_loop()), then set one as the default event loop (using >>>>>>> set_event_loop()), then run the other one (using run_forever() or >>>>>>> run_until_complete()). To tasks running in the latter event loop, >>>>>>> get_event_loop() would nevertheless return the former. >>>>>>> >>>>>>> On Mon, Jun 12, 2017 at 4:39 AM, Pau Freixes >>>>>>> wrote: >>>>>>> >>>>>>>> Sorry a bit of topic, but I would like to figure out why older >>>>>>>> python >>>>>>>> versions, prior this commit [1], the get_event_loop is not >>>>>>>> considered >>>>>>>> deterministic >>>>>>>> >>>>>>>> does anybody know the reason behind this change? >>>>>>>> >>>>>>>> >>>>>>>> [1] >>>>>>>> https://github.com/python/cpython/commit/600a349781bfa0a8239e1cb95fac29c7c4a3302e >>>>>>>> >>>>>>>> On Fri, Jun 9, 2017 at 6:07 PM, Ben Darnell >>>>>>>> wrote: >>>>>>>> > On Fri, Jun 9, 2017 at 11:51 AM Cory Benfield >>>>>>>> wrote: >>>>>>>> >> >>>>>>>> >> >>>>>>>> >> >>>>>>>> >> My concern with multiple loops boils down to the fact that >>>>>>>> urllib3 >>>>>>>> >> supports being used in a multithreaded context where each thread >>>>>>>> can >>>>>>>> >> independently make forward progress on one request. To establish >>>>>>>> that with a >>>>>>>> >> synchronous codebase you either need one event loop per thread >>>>>>>> or you need >>>>>>>> >> to spawn a background thread on startup that owns the only event >>>>>>>> loop in the >>>>>>>> >> process. >>>>>>>> > >>>>>>>> > >>>>>>>> > Yeah, one event loop per thread is probably the way to go for >>>>>>>> integration >>>>>>>> > with synchronous codebases. A dedicated event loop thread may >>>>>>>> perform better >>>>>>>> > but libraries that spawn threads are problematic. >>>>>>>> > >>>>>>>> >> >>>>>>>> >> >>>>>>>> >> Generally speaking I?ve not had positive results with libraries >>>>>>>> spawning >>>>>>>> >> their own threads in Python. In my experience this has tended to >>>>>>>> lead to >>>>>>>> >> programs that deadlock mysteriously or that fail to terminate in >>>>>>>> the face of >>>>>>>> >> a Ctrl+C. So I tend to prefer to have users spawn their own >>>>>>>> threads, which >>>>>>>> >> would make me want a ?one-event-loop-per-thread? model: hence, >>>>>>>> needing a >>>>>>>> >> loop parameter to pass around prior to 3.6. >>>>>>>> > >>>>>>>> > >>>>>>>> > You can avoid the loop parameter on older versions of asyncio (at >>>>>>>> least as >>>>>>>> > long as the default event loop policy is used) by manually >>>>>>>> setting your >>>>>>>> > event loop as current before calling run_until_complete (and >>>>>>>> resetting it >>>>>>>> > afterwards). >>>>>>>> > >>>>>>>> > Tornado's run_sync() method is equivalent to asyncio's >>>>>>>> run_until_complete(), >>>>>>>> > and Tornado supports multiple IOLoops in this way. 
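Spelled out, that "install as current, run, restore" pattern is roughly the following (a minimal sketch for pre-3.6 asyncio, not Tornado's actual implementation; it assumes the calling thread already has, or may lazily create, a default loop):

import asyncio

def run_sync(coro):
    # Drive a coroutine to completion from synchronous code on a private
    # loop, temporarily making that loop the thread's current one so that
    # anything relying on get_event_loop() picks it up.
    previous = asyncio.get_event_loop()
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        return loop.run_until_complete(coro)
    finally:
        asyncio.set_event_loop(previous)
        loop.close()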
We use this to >>>>>>>> expose a >>>>>>>> > synchronous version of our AsyncHTTPClient: >>>>>>>> > >>>>>>>> https://github.com/tornadoweb/tornado/blob/62e47215ce12aee83f951758c96775a43e80475b/tornado/httpclient.py#L54 >>>>>>>> > >>>>>>>> > -Ben >>>>>>>> > >>>>>>>> >> >>>>>>>> >> >>>>>>>> >> I admit that my concerns here regarding libraries spawning their >>>>>>>> own >>>>>>>> >> threads may be overblown: after my series of negative >>>>>>>> experiences I >>>>>>>> >> basically never went back to that model, and it may be that the >>>>>>>> problems >>>>>>>> >> were more user-error than anything else. However, I feel >>>>>>>> comfortable saying >>>>>>>> >> that libraries spawning their own Python threads is definitely >>>>>>>> subtle and >>>>>>>> >> hard to get right, at the very least. >>>>>>>> >> >>>>>>>> >> Cory >>>>>>>> >> _______________________________________________ >>>>>>>> >> Async-sig mailing list >>>>>>>> >> Async-sig at python.org >>>>>>>> >> https://mail.python.org/mailman/listinfo/async-sig >>>>>>>> >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>>>>> > >>>>>>>> > >>>>>>>> > _______________________________________________ >>>>>>>> > Async-sig mailing list >>>>>>>> > Async-sig at python.org >>>>>>>> > https://mail.python.org/mailman/listinfo/async-sig >>>>>>>> > Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>>>>> > >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> --pau >>>>>>>> _______________________________________________ >>>>>>>> Async-sig mailing list >>>>>>>> Async-sig at python.org >>>>>>>> https://mail.python.org/mailman/listinfo/async-sig >>>>>>>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> --Guido van Rossum (python.org/~guido) >>>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> --Guido van Rossum (python.org/~guido) >>>>> _______________________________________________ >>>>> Async-sig mailing list >>>>> Async-sig at python.org >>>>> https://mail.python.org/mailman/listinfo/async-sig >>>>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>> >>>> -- >>>> Thanks, >>>> Andrew Svetlov >>>> >>> >>> >>> >>> -- >>> --Guido van Rossum (python.org/~guido) >>> >> -- >> Thanks, >> Andrew Svetlov >> > > > > -- > --Guido van Rossum (python.org/~guido) > -- Thanks, Andrew Svetlov -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Jun 12 15:09:27 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 12 Jun 2017 12:09:27 -0700 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: I think we're getting way beyond the rationale Pau Freixes requested... On Mon, Jun 12, 2017 at 12:05 PM, Andrew Svetlov wrote: > Agree in general but current asyncio still may shoot your leg. > The solution (at least for my unittest example) might be in adding top > level functions for running asyncio code (asyncio.run() and > asyncio.run_forever() as Yury Selivanov proposed in > https://github.com/python/asyncio/pull/465) > After this we could raise a warning in `asyncio.get_event_loop()` if the > loop was not set explicitly by `asyncio.set_event_loop()`. > > On Mon, Jun 12, 2017 at 9:50 PM Guido van Rossum wrote: > >> Honestly I think we're in agreement. There's never a use for one loop >> running while another is the default. 
There are some rare use cases for >> multiple loops running but before the mentioned commit it was up to the app >> to ensure to switch the default loop when running a loop. The commit took >> the ability to screw up there out of the user's hand. >> >> On Mon, Jun 12, 2017 at 9:57 AM, Andrew Svetlov > > wrote: >> >>> Yes, but with one exception: default event loop created on module import >>> stage might co-exist with a loop created for test. >>> It leads to mystic hangs, you know. >>> Please recall code like: >>> class A: >>> mongodb = motor.motor_asyncio.AsyncIOMotorClient() >>> >>> On Mon, Jun 12, 2017 at 7:37 PM Guido van Rossum >>> wrote: >>> >>>> Yes, but not co-existing, I hope! >>>> >>>> On Mon, Jun 12, 2017 at 9:25 AM, Andrew Svetlov < >>>> andrew.svetlov at gmail.com> wrote: >>>> >>>>> Unit tests at least. Running every test in own loop is crucial fro >>>>> tests isolation. >>>>> >>>>> On Mon, Jun 12, 2017 at 7:04 PM Guido van Rossum >>>>> wrote: >>>>> >>>>>> Multiple loops in the same thread is purely theoretical -- the API >>>>>> allows it but there's no use case. It might be necessary if a platform has >>>>>> a UI-only event loop that cannot be extended to do I/O -- the only solution >>>>>> to do background I/O might be to alternate between two loops. (Though in >>>>>> that case I would still prefer a thread for the background I/O.) >>>>>> >>>>>> On Mon, Jun 12, 2017 at 8:49 AM, Pau Freixes >>>>>> wrote: >>>>>> >>>>>>> And what about the rationale of having multiple loop instances in >>>>>>> the same thread switching btw them. Im still trying to find out what >>>>>>> patterns need this... Do you have an example? >>>>>>> >>>>>>> Btw thanks for the first explanation >>>>>>> >>>>>>> El 12/06/2017 17:36, "Guido van Rossum" escribi?: >>>>>>> >>>>>>>> In theory it's possible to create two event loops (using >>>>>>>> new_event_loop()), then set one as the default event loop (using >>>>>>>> set_event_loop()), then run the other one (using run_forever() or >>>>>>>> run_until_complete()). To tasks running in the latter event loop, >>>>>>>> get_event_loop() would nevertheless return the former. >>>>>>>> >>>>>>>> On Mon, Jun 12, 2017 at 4:39 AM, Pau Freixes >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Sorry a bit of topic, but I would like to figure out why older >>>>>>>>> python >>>>>>>>> versions, prior this commit [1], the get_event_loop is not >>>>>>>>> considered >>>>>>>>> deterministic >>>>>>>>> >>>>>>>>> does anybody know the reason behind this change? >>>>>>>>> >>>>>>>>> >>>>>>>>> [1] https://github.com/python/cpython/commit/ >>>>>>>>> 600a349781bfa0a8239e1cb95fac29c7c4a3302e >>>>>>>>> >>>>>>>>> On Fri, Jun 9, 2017 at 6:07 PM, Ben Darnell >>>>>>>>> wrote: >>>>>>>>> > On Fri, Jun 9, 2017 at 11:51 AM Cory Benfield >>>>>>>>> wrote: >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> My concern with multiple loops boils down to the fact that >>>>>>>>> urllib3 >>>>>>>>> >> supports being used in a multithreaded context where each >>>>>>>>> thread can >>>>>>>>> >> independently make forward progress on one request. To >>>>>>>>> establish that with a >>>>>>>>> >> synchronous codebase you either need one event loop per thread >>>>>>>>> or you need >>>>>>>>> >> to spawn a background thread on startup that owns the only >>>>>>>>> event loop in the >>>>>>>>> >> process. >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > Yeah, one event loop per thread is probably the way to go for >>>>>>>>> integration >>>>>>>>> > with synchronous codebases. 
A dedicated event loop thread may >>>>>>>>> perform better >>>>>>>>> > but libraries that spawn threads are problematic. >>>>>>>>> > >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> Generally speaking I?ve not had positive results with libraries >>>>>>>>> spawning >>>>>>>>> >> their own threads in Python. In my experience this has tended >>>>>>>>> to lead to >>>>>>>>> >> programs that deadlock mysteriously or that fail to terminate >>>>>>>>> in the face of >>>>>>>>> >> a Ctrl+C. So I tend to prefer to have users spawn their own >>>>>>>>> threads, which >>>>>>>>> >> would make me want a ?one-event-loop-per-thread? model: hence, >>>>>>>>> needing a >>>>>>>>> >> loop parameter to pass around prior to 3.6. >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > You can avoid the loop parameter on older versions of asyncio >>>>>>>>> (at least as >>>>>>>>> > long as the default event loop policy is used) by manually >>>>>>>>> setting your >>>>>>>>> > event loop as current before calling run_until_complete (and >>>>>>>>> resetting it >>>>>>>>> > afterwards). >>>>>>>>> > >>>>>>>>> > Tornado's run_sync() method is equivalent to asyncio's >>>>>>>>> run_until_complete(), >>>>>>>>> > and Tornado supports multiple IOLoops in this way. We use this >>>>>>>>> to expose a >>>>>>>>> > synchronous version of our AsyncHTTPClient: >>>>>>>>> > https://github.com/tornadoweb/tornado/blob/ >>>>>>>>> 62e47215ce12aee83f951758c96775a43e80475b/tornado/httpclient.py#L54 >>>>>>>>> > >>>>>>>>> > -Ben >>>>>>>>> > >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> I admit that my concerns here regarding libraries spawning >>>>>>>>> their own >>>>>>>>> >> threads may be overblown: after my series of negative >>>>>>>>> experiences I >>>>>>>>> >> basically never went back to that model, and it may be that the >>>>>>>>> problems >>>>>>>>> >> were more user-error than anything else. However, I feel >>>>>>>>> comfortable saying >>>>>>>>> >> that libraries spawning their own Python threads is definitely >>>>>>>>> subtle and >>>>>>>>> >> hard to get right, at the very least. 
>>>>>>>>> >> >>>>>>>>> >> Cory >>>>>>>>> >> _______________________________________________ >>>>>>>>> >> Async-sig mailing list >>>>>>>>> >> Async-sig at python.org >>>>>>>>> >> https://mail.python.org/mailman/listinfo/async-sig >>>>>>>>> >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > _______________________________________________ >>>>>>>>> > Async-sig mailing list >>>>>>>>> > Async-sig at python.org >>>>>>>>> > https://mail.python.org/mailman/listinfo/async-sig >>>>>>>>> > Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>>>>>> > >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> --pau >>>>>>>>> _______________________________________________ >>>>>>>>> Async-sig mailing list >>>>>>>>> Async-sig at python.org >>>>>>>>> https://mail.python.org/mailman/listinfo/async-sig >>>>>>>>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> --Guido van Rossum (python.org/~guido) >>>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> --Guido van Rossum (python.org/~guido) >>>>>> _______________________________________________ >>>>>> Async-sig mailing list >>>>>> Async-sig at python.org >>>>>> https://mail.python.org/mailman/listinfo/async-sig >>>>>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>>> >>>>> -- >>>>> Thanks, >>>>> Andrew Svetlov >>>>> >>>> >>>> >>>> >>>> -- >>>> --Guido van Rossum (python.org/~guido) >>>> >>> -- >>> Thanks, >>> Andrew Svetlov >>> >> >> >> >> -- >> --Guido van Rossum (python.org/~guido) >> > -- > Thanks, > Andrew Svetlov > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From manu.mirandad at gmail.com Mon Jun 12 17:20:17 2017 From: manu.mirandad at gmail.com (manuel miranda) Date: Mon, 12 Jun 2017 21:20:17 +0000 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: So, I've been playing a bit with the information I saw in this thread (thank you all for the responses) and I got something super simple working: https://gist.github.com/argaen/056a43b083a29f76ac6e2fa97b3e08d1 What I like about this (and that's what I was aiming for) is that the user uses the same class/interface no matter if its inside asyncio world or not. So both `await fn()` and `fn()` work producing the expected results. Now some cons (that in the case of my library are acceptable): - This aims only for asyncio compatibility, other async frameworks like trio, curio, etc. wouldn't work - No python2 compatibility (although Nathaniel's idea of bleaching could still be applied) - I guess it adds some overhead to both sync and async versions, I will do some benchmarking when I have time (actually this one will be the one deciding whether I do the integration or not) Pros: - User is agnostic to the async/sync implementation. If you are in asyncio world, just use `async fn()` and if not `fn()`. Both will work - There is compatibility between classes using this approach - No duplication of code I haven't thought yet about async context managers, iterations and so but I guess there is a way to fix that too (or not, I have no idea). One fun part of all this is if its possible (meaning easily) to reuse also the tests to test both the sync and the async version... :rolling_eyes: On Fri, Jun 9, 2017 at 9:52 PM Yarko Tymciurak wrote: > ...so I really am enjoying the conversation. 
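Coming back to the gist linked above: heavily simplified, one possible shape for a method that works both with and without await is a wrapper along these lines. This is only a sketch of the general idea, not necessarily what the gist actually does, and it leans on loop.is_running(), with the caveat Ben raised earlier about that kind of check:

import asyncio
import functools

def sync_or_async(coro_fn):
    # Expose one coroutine method so async callers can await it, while
    # sync callers get the result driven on the thread's default loop.
    @functools.wraps(coro_fn)
    def wrapper(*args, **kwargs):
        coro = coro_fn(*args, **kwargs)
        loop = asyncio.get_event_loop()
        if loop.is_running():
            return coro                        # caller will await this
        return loop.run_until_complete(coro)   # plain synchronous call
    return wrapper

class Client:
    @sync_or_async
    async def get(self, key):
        await asyncio.sleep(0)    # stand-in for real async I/O
        return key

# sync use:   Client().get("k")        -> "k"
# async use:  await Client().get("k")  -> "k", when already inside the loop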
> > Guido - re: "vision too far out": yes, for people trying to struggle w/ > async support in their libraries, now... but that is also part of my > motivation. Python 5? Sure... (I may have to watch it come to use from > the grave, but hopefully not... ;-) ). Anyway, from back-porting and > tactical "implement now" concerns, to plans for next release, to plans for > next version of python, to brainstorming much less concrete future versions > - all are an interesting continuum. > > Re: GIL... sure, sort of, and sort of not. I was thinking "as long as > major changes are going on... think about additional structural > changes..." More to the point: as I see it, people have a hard time > thinking about async in the cooperative-multitasking (CMT) sense, and thus > disappointments happen around blocking (missed, or unexpects, e.g. hardware > failures). Cory (in his reply - and, yeah: nice writeup!) hints to what I > generally structurally like: > > "...we?d ideally treat asyncio as the first-class citizen and retrofit on > the threaded support, rather than the other way around" > > Structurally, async is light-weight overhead compared to threads, which > are lightweight compared to processes, and so a sort of natural app flow > seems from lightest-weight, on out. To me, this seems practical for making > life easier for developers, because you can imagine "promoting" an async > task caught unexpectedly blocking, to a thread, while still having the > lightest-weight loop have control over it (promotion out, as well as > cancellation while promoted). > > As for multiple task loops, or loops off in a thread, I haven't thought > about it too much, but this seems like nothing new nor unreasonable. I'm > thinking of the base-stations we talk over in our mobile connections, which > are multiple diskless servers, and hot-promote to "master" server status on > hardware failure (or live capacity upgrade, i.e. inserting processors). > This pattern seems both reasonable and useful in this context, i.e. the > concept of a master loop (which implies communication/control channels - a > complication). With some thought, some reasonable ground rules and > simplifications, and I would expect much can be done. > > Appreciate the discussions! > > - Yarko > On Fri, Jun 9, 2017 at 1:23 PM, Guido van Rossum wrote: > >> Great write-up! I actually find the async nature of HTTP (both versions) >> a compelling reason to switch to asyncio. For HTTP/1.1 this sounds mostly >> like it would make the implementation easier; for HTTP/2 it sounds like it >> would just be better for the user-side as well (if the user just wants one >> resource they can safely continue to use the synchronous HTTP/1.1 version >> of the API.) >> >> On Fri, Jun 9, 2017 at 9:55 AM, Cory Benfield wrote: >> >>> >>> On 9 Jun 2017, at 17:28, Guido van Rossum wrote: >>> >>> At least one of us is still confused. The one-event-loop-per-thread >>> model is supported in asyncio without passing the loop around explicitly. >>> The get_event_loop() implementation stores all its state in thread-locals >>> instance, so it returns the thread's event loop. (Because this is an >>> "advanced" model, you have to explicitly create the event loop with >>> new_event_loop() and make it the default loop for the thread with >>> set_event_loop().) >>> >>> >>> Aha, ok, so the confused one is me. I did not know this. =) That >>> definitely works a lot better. 
It admittedly works less well if someone is >>> doing their own custom event loop stuff, but that?s probably an acceptable >>> limitation up until the time that Python 2 goes quietly into the night. >>> >>> All in all, I'm a bit curious why you would need to use asyncio at all >>> when you've got a thread per request anyway. >>> >>> >>> Yeah, so this is a bit of a diversion from the original topic of this >>> thread but I think it?s an idea worth discussing in this space. I want to >>> reframe the question a bit if you don?t mind, so shout if you think I?m not >>> responding to quite what you were asking. In my understanding, the question >>> you?re implicitly asking is this: >>> >>> "If you have a thread-safe library today (that is, one that allows users >>> to do threaded I/O with appropriate resource pooling and management), why >>> move to a model built on asyncio?? >>> >>> There are many answers to this question that differ for different >>> libraries with different uses, but for HTTP libraries like urllib3 here are >>> our reasons. >>> >>> The first is that it turns out that even for HTTP/1.1 you need to write >>> something that amounts to a partial event loop to properly handle the >>> protocol. Good HTTP clients need to watch for responses while they?re >>> uploading body data because if a response arrives during that process body >>> upload should be terminated immediately. This is also required for sensibly >>> handling things like Expect: 100-continue, as well as spotting other >>> intermediate responses and connection teardowns sensibly and without >>> throwing exceptions. >>> >>> Today urllib3 does not do this, and it has caused us pain, so our v2 >>> branch includes a backport of the Python 3 selectors module and a >>> hand-written partially-complete event loop that only handles the specific >>> cases we need. This is an extra thing for us to debug and maintain, and >>> ultimately it?d be easier to just delegate the whole thing to event loops >>> written by others who promise to maintain them and make them efficient. >>> >>> The second answer is that I believe good asyncio support in libraries is >>> a vital part of the future of this language, and ?good? asyncio support IMO >>> does as little as possible to block the main event loop. Running all of the >>> complex protocol parsing and state manipulation of the Requests stack on a >>> background thread is not cheap, and involves a lot of GIL swapping around. >>> We have found several bug reports complaining about using Requests with >>> largish-numbers of threads, indicating that our big stack of Python code >>> really does cause contention on the GIL if used heavily. In general, having >>> to defer to a thread to run *Python* code in asyncio is IMO a nasty >>> anti-pattern that should be avoided where possible. It is much less bad to >>> defer to a thread to then block on a syscall (e.g. to get an ?async? >>> getaddrinfo), but doing so to run a big big stack of Python code is vastly >>> less pleasant for the main event loop. >>> >>> For this reason, we?d ideally treat asyncio as the first-class citizen >>> and retrofit on the threaded support, rather than the other way around. >>> This goes doubly so when you consider the other reasons for wanting to use >>> asyncio. >>> >>> The third answer is that HTTP/2 makes all of this much harder. HTTP/2 is >>> a *highly* concurrent protocol. 
Connections send a lot of control frames >>> back and forth that are invisible to the user working at the semantic HTTP >>> level but that nonetheless need relatively low-latency turnaround (e.g. >>> PING frames). It turns out that in the traditional synchronous HTTP model >>> urllib3 only gets access to the socket to do work when the user calls into >>> our code. If the user goes a ?long? time without calling into urllib3, we >>> take a long time to process any data off the connection. In the best case >>> this causes latency spikes as we process all the data that queued up in the >>> socket. In the worst case, this causes us to lose connections we should >>> have been able to keep because we failed to respond to a PING frame in a >>> timely manner. >>> >>> My experience is that purely synchronous libraries handling HTTP/2 >>> simply cannot provide a positive user experience. HTTP/2 flat-out >>> *requires* either an event loop or a dedicated background thread, and in >>> practice in your dedicated background thread you?d also just end up writing >>> an event loop (see answer 1 again). For this reason, it is basically >>> mandatory for HTTP/2 support in Python to either use an event loop or to >>> spawn out a dedicated C thread that does not hold the GIL to do the I/O (as >>> this thread will be regularly woken up to handle I/O events). >>> >>> Hopefully this (admittedly horrifyingly long) response helps illuminate >>> why we?re interested in asyncio support. It should be noted that if we find >>> ourselves unable to get it in the short term we may simply resort to >>> offering an ?async? API that involves us doing the rough equivalent of >>> running in a thread-pool executor, but I won?t be thrilled about it. ;) >>> >>> Cory >>> >> >> >> >> -- >> --Guido van Rossum (python.org/~guido) >> >> _______________________________________________ >> Async-sig mailing list >> Async-sig at python.org >> https://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From pfreixes at gmail.com Thu Jun 15 17:40:04 2017 From: pfreixes at gmail.com (Pau Freixes) Date: Thu, 15 Jun 2017 23:40:04 +0200 Subject: [Async-sig] Asyncio, implementing a new fair scheduling reactor Message-ID: Hi guys, recently I've been trying to implement a POC of the default event loop implemented by asyncio but using a fair scheduling reactor. At the moment is just a POC [1], something to test the rationale and pending to be evolved in someone more mature, but before of that I would prefer to share my thoughts and get all of the comments from you. The current implementation is based on a FIFO queue that is filled with all of the callbacks that have to be executed, these callbacks can stand for: 1) Run tasks, either to be started or resumed 2) Run future callbacks. 3) Run scheduled callbacks. 4) Run file descriptors callbacks. Worth mentioning that most of them internally are chained in somehow, perhaps a future callback can wake up a resumed task. Also, have in mind that the API published by asyncio to schedule callbacks can be used by asyncio itself or by the user, callsoon for example. The usual flow the reactor is the following one: 1) Check the file descriptors with events, and stack the handlers into the reactor. 2) Pop all outdated scheduled callbacks and push them into the reactor. 
3) Iterate for N first elements at the queue, where N stands for the number of the handles stacked at that moment. Future handles stacked during that iteration won't be handled, they must wait until next whole iteration 4) Go to the point 1. As you can observe here, the IO is only made once per loop and should wait until all handles that are in a specific moment are executed. This implements in somehow a natural backpressure, the read and also the accept the new connections will rely on the buffers run by the operating system. That implementation can be seen as simple, but it stands on a solid strategy and follows KISS design that helps to scare the bugs. Why fair scheduling? Not all code that is written in the same module, in terms of loop sharing, has the same requirements. Some part might need N and other parts M. When this implementation cant be decoupled, and it means that the cost of placing them into a separated pieces inside of your architecture are too expensive, in that scenario the developer cant express this difference to make the underlying implementation aware of that. For example, an API with a regular endpoint accessed by the user and another one with the health-check of the system, which has completely different requirements in terms of IO. Not only due to the nature of the resources accessed, also because of the frequency of use. Meanwhile, the healthcheck is accessed to a known a frequency at X seconds, the other endpoint has a variable frequency of use. Do you believe that asyncio will be able to preserve the health-check frequency at any moment? Absolutely not. Therefore, the idea of implementing a fair scheduling reactor is based on the needed of address these kind of situations, giving to the developer an interface to isolate different resources. Basic principles The basic principles of the implementation are: - The cost of the scheduling has to be the same of the current implementation, no overhead - The design has to follow the current one, having the implicit backpressure that was commented. I will focus in the second principle, taking into account that the first one is a matter of implementation. To achieve the same behavior, the new implementation only split the resources - handles, schedules, file descriptors - in isolated partitions to then implement for each partition the same algorithm than the current one. The developer can create a new partition using a new function called `spawn`, this function takes as an argument a coroutine, the task wrapped to that coroutine and all of the resources created inside this coroutine will belong to that partition. For example: >>> async def background_task(): >>> task = [ fetch() for i in range(1000)] >>> return (await asyncio.gather(*t)) >>> >>> async def foo(): >>> return (await asyncio.spawn(background_task())) All resources created inside the scope of the `background_tasks` are isolated to one partition. The 1000 sockets will schedule callbacks that will be stacked in the same queue. The partition is by default identified with the hash of the task that warps the `background_task`, but the user can pass an alternative value. >>> async def foo(): >>> return (await asyncio.spawn(healtheck(), partition='healthcheck')) Internally the implementation has a default ROOT partition that is used for all of these resources that are not executed inside of the scope of a spawn function. As you can guess, if you don't use the spawn method the reactor will run exactly as the current implementation. 
Having all the resources in the same queue. Round robin between partitions. The differents partitions that exist at some moment share the CPU resource using a round robin strategy. It gives the same chance to all partitions to run the same amount of handles, but with one particularity. Each time that a partition runs out of handles, the loop is restarted again to handle the file descriptors and the delayed calls but only for that specific partition that runs out of handles. The side effect is clear, have the same backpressure mechanism. But, per partition. The black hole of the current implementation. There is always a but, at least I've found a situation where this strategy can perform in the same way as the current one, without applying any fair scheduling. Although the code uses the spawn method. Have a look at the following snippet: >>> async def healtheck(request): >>> await request.resp() >>> >>> async def view(request): >>> return (await asyncio.spawn(healthcheck(request))) The task that wraps the healtcheck coroutine that is being isolated in a partition, won't be scheduled until the data from the file descriptor that is read by a callback that is in fact executed inside of the ROOT partition. Therefore, in the worst case scenario, the fair scheduling will become a simple FIFO scheduling. IMHO there is not an easy way to solve that issue, or at least without changing the current picture. And try to solve it, might end up having a messy implementation and a buggy code. Although, I believe that this still worth it, having in mind the benefits that it will bring us for all of those cases where the user needs to isolate resources. Thoughts, comments, and others will be welcomed. [1] https://github.com/pfreixes/qloop -- --pau From njs at pobox.com Thu Jun 15 18:13:43 2017 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 15 Jun 2017 15:13:43 -0700 Subject: [Async-sig] Asyncio, implementing a new fair scheduling reactor In-Reply-To: References: Message-ID: A few quick thoughts: You might find these notes interesting: https://github.com/python-trio/trio/issues/32 It sounds like a more precise description of your scheduler would be "hierarchical FIFO"? i.e., there's a top-level scheduler that selects between "child" schedulers in round-robin/FIFO fashion, and each child scheduler is round-robin/FIFO within its schedulable entities? [1] Generally I would think of a "fair" scheduler as one that notices when e.g. one task is blocking the event loop for a long time when it runs, and penalizing it by not letting it run as often. For your motivating example of a health-check: it sounds like another way to express your goal would be by attaching a static "priority level" to the health-check task(s), such that they get to run first whenever they're ready. Have you considered that as an alternative approach? But also... isn't part of the point of a healthcheck that it *should* get slow if the system is overloaded? -n [1] http://intronetworks.cs.luc.edu/current/html/queuing.html#hierarchical-queuing On Thu, Jun 15, 2017 at 2:40 PM, Pau Freixes wrote: > Hi guys, recently I've been trying to implement a POC of the default > event loop implemented by asyncio but using a fair scheduling reactor. > At the moment is just > a POC [1], something to test the rationale and pending to be evolved > in someone more mature, but before of that I would prefer to share my > thoughts and get all of the comments from you. 
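For what it's worth, a toy model of that "hierarchical FIFO" reading of the proposal, stripped of all I/O, could look like this (hypothetical class, not part of asyncio or of the qloop POC):

from collections import deque

class HierarchicalFIFO:
    # Top level: round robin over named partitions.
    # Inside each partition: plain FIFO, exactly like the current loop.
    def __init__(self):
        self._partitions = {}    # partition name -> deque of callbacks

    def call_soon(self, callback, partition="ROOT"):
        self._partitions.setdefault(partition, deque()).append(callback)

    def run_pass(self):
        # One scheduling pass: each partition runs only the callbacks it
        # had queued when the pass started, so late arrivals wait a turn.
        for queue in list(self._partitions.values()):
            for _ in range(len(queue)):
                queue.popleft()()

sched = HierarchicalFIFO()
sched.call_soon(lambda: print("user request"), partition="ROOT")
sched.call_soon(lambda: print("healthcheck"), partition="healthcheck")
sched.run_pass()

A static priority level for the healthcheck partition would instead mean always draining that queue first, rather than giving every partition an equal turn.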
> > The current implementation is based on a FIFO queue that is filled > with all of the callbacks that have to be executed, these callbacks > can stand for: > > 1) Run tasks, either to be started or resumed > 2) Run future callbacks. > 3) Run scheduled callbacks. > 4) Run file descriptors callbacks. > > Worth mentioning that most of them internally are chained in somehow, > perhaps a future callback can wake up a resumed task. Also, have in > mind that the API published by asyncio to schedule callbacks can be > used by asyncio itself or by the user, callsoon for example. > > The usual flow the reactor is the following one: > > 1) Check the file descriptors with events, and stack the handlers into > the reactor. > 2) Pop all outdated scheduled callbacks and push them into the reactor. > 3) Iterate for N first elements at the queue, where N stands for the number of > the handles stacked at that moment. Future handles stacked during that > iteration won't be handled, they must wait until next whole iteration > 4) Go to the point 1. > > As you can observe here, the IO is only made once per loop and should > wait until all handles that are in a specific moment are executed. > > This implements in somehow a natural backpressure, the read and also > the accept the new connections will rely on the buffers run by the > operating system. > > That implementation can be seen as simple, but it stands on a solid > strategy and follows KISS design that helps to scare the bugs. > > Why fair scheduling? > > Not all code that is written in the same module, in terms of loop > sharing, has the same requirements. Some part might need N and other > parts M. When this implementation cant be decoupled, and it means that > the cost of placing them into a separated pieces inside of your > architecture are too expensive, in that scenario the developer cant > express this difference to make the underlying implementation aware of > that. > > For example, an API with a regular endpoint accessed by the user and > another one with the health-check of the system, which has completely > different requirements in terms of IO. Not only due to the nature of > the resources accessed, also because of the frequency of use. > Meanwhile, the healthcheck is accessed to a known a frequency at X > seconds, the other endpoint has a variable frequency of use. > > Do you believe that asyncio will be able to preserve the health-check > frequency at any moment? Absolutely not. > > Therefore, the idea of implementing a fair scheduling reactor is based > on the needed of address these kind of situations, giving to the > developer an interface to isolate different resources. > > Basic principles > > The basic principles of the implementation are: > > - The cost of the scheduling has to be the same of the current > implementation, no overhead > - The design has to follow the current one, having the implicit > backpressure that was commented. > > I will focus in the second principle, taking into account that the > first one is a matter of implementation. > > To achieve the same behavior, the new implementation only split the > resources - handles, schedules, file descriptors - in isolated > partitions to then implement for each partition the same algorithm > than the current one. The developer can create a new partition using a > new function called `spawn`, this function takes as an argument a > coroutine, the task wrapped to that coroutine and all of the resources > created inside this coroutine will belong to that partition. 
For > example: > >>>> async def background_task(): >>>> task = [ fetch() for i in range(1000)] >>>> return (await asyncio.gather(*t)) >>>> >>>> async def foo(): >>>> return (await asyncio.spawn(background_task())) > > > All resources created inside the scope of the `background_tasks` are > isolated to one partition. The 1000 sockets will schedule callbacks > that will be stacked in the same queue. > > The partition is by default identified with the hash of the task that > warps the `background_task`, but the user can pass an alternative > value. > >>>> async def foo(): >>>> return (await asyncio.spawn(healtheck(), partition='healthcheck')) > > Internally the implementation has a default ROOT partition that is > used for all of these resources that are not executed inside of the > scope of a spawn function. As you can guess, if you don't use the > spawn method the reactor will run exactly as the current > implementation. Having all the resources in the same queue. > > > Round robin between partitions. > > The differents partitions that exist at some moment share the CPU > resource using a round robin strategy. It gives the same chance to all > partitions to run the same amount of handles, but with one > particularity. Each time that a partition runs out of handles, the > loop is restarted again to handle the file descriptors and the delayed > calls but only for that specific partition that runs out of handles. > > The side effect is clear, have the same backpressure mechanism. But, > per partition. > > > The black hole of the current implementation. > > There is always a but, at least I've found a situation where this > strategy can perform in the same way as the current one, without > applying any fair scheduling. Although the code uses the spawn method. > > Have a look at the following snippet: > >>>> async def healtheck(request): >>>> await request.resp() >>>> >>>> async def view(request): >>>> return (await asyncio.spawn(healthcheck(request))) > > > The task that wraps the healtcheck coroutine that is being isolated in > a partition, won't be scheduled until the data from the file > descriptor that is read by a callback that is in fact executed inside > of the ROOT partition. Therefore, in the worst case scenario, the fair > scheduling will become a simple FIFO scheduling. > > IMHO there is not an easy way to solve that issue, or at least without > changing the current picture. And try to solve it, might end up having > a messy implementation and a buggy code. > > Although, I believe that this still worth it, having in mind the > benefits that it will bring us for all of those cases where the user > needs to isolate resources. > > Thoughts, comments, and others will be welcomed. > > [1] https://github.com/pfreixes/qloop > > -- > --pau > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ -- Nathaniel J. Smith -- https://vorpus.org From dimaqq at gmail.com Fri Jun 16 07:47:03 2017 From: dimaqq at gmail.com (Dima Tisnek) Date: Fri, 16 Jun 2017 13:47:03 +0200 Subject: [Async-sig] Asyncio, implementing a new fair scheduling reactor In-Reply-To: References: Message-ID: Just a couple of thoughts: 1. A good place to start is Linux BFS (simple, correct, good performance). Typical test case is "make -j4", perhaps there's a way to simulate something similar? 2. 
There's a space to research async/await scheduling (in academic sense), older research may have been focused on .net, and current research probably on browsers and node, in any case javascript. d. On 15 June 2017 at 23:40, Pau Freixes wrote: > Hi guys, recently I've been trying to implement a POC of the default > event loop implemented by asyncio but using a fair scheduling reactor. > At the moment is just > a POC [1], something to test the rationale and pending to be evolved > in someone more mature, but before of that I would prefer to share my > thoughts and get all of the comments from you. > > The current implementation is based on a FIFO queue that is filled > with all of the callbacks that have to be executed, these callbacks > can stand for: > > 1) Run tasks, either to be started or resumed > 2) Run future callbacks. > 3) Run scheduled callbacks. > 4) Run file descriptors callbacks. > > Worth mentioning that most of them internally are chained in somehow, > perhaps a future callback can wake up a resumed task. Also, have in > mind that the API published by asyncio to schedule callbacks can be > used by asyncio itself or by the user, callsoon for example. > > The usual flow the reactor is the following one: > > 1) Check the file descriptors with events, and stack the handlers into > the reactor. > 2) Pop all outdated scheduled callbacks and push them into the reactor. > 3) Iterate for N first elements at the queue, where N stands for the number of > the handles stacked at that moment. Future handles stacked during that > iteration won't be handled, they must wait until next whole iteration > 4) Go to the point 1. > > As you can observe here, the IO is only made once per loop and should > wait until all handles that are in a specific moment are executed. > > This implements in somehow a natural backpressure, the read and also > the accept the new connections will rely on the buffers run by the > operating system. > > That implementation can be seen as simple, but it stands on a solid > strategy and follows KISS design that helps to scare the bugs. > > Why fair scheduling? > > Not all code that is written in the same module, in terms of loop > sharing, has the same requirements. Some part might need N and other > parts M. When this implementation cant be decoupled, and it means that > the cost of placing them into a separated pieces inside of your > architecture are too expensive, in that scenario the developer cant > express this difference to make the underlying implementation aware of > that. > > For example, an API with a regular endpoint accessed by the user and > another one with the health-check of the system, which has completely > different requirements in terms of IO. Not only due to the nature of > the resources accessed, also because of the frequency of use. > Meanwhile, the healthcheck is accessed to a known a frequency at X > seconds, the other endpoint has a variable frequency of use. > > Do you believe that asyncio will be able to preserve the health-check > frequency at any moment? Absolutely not. > > Therefore, the idea of implementing a fair scheduling reactor is based > on the needed of address these kind of situations, giving to the > developer an interface to isolate different resources. > > Basic principles > > The basic principles of the implementation are: > > - The cost of the scheduling has to be the same of the current > implementation, no overhead > - The design has to follow the current one, having the implicit > backpressure that was commented. 
> > I will focus in the second principle, taking into account that the > first one is a matter of implementation. > > To achieve the same behavior, the new implementation only split the > resources - handles, schedules, file descriptors - in isolated > partitions to then implement for each partition the same algorithm > than the current one. The developer can create a new partition using a > new function called `spawn`, this function takes as an argument a > coroutine, the task wrapped to that coroutine and all of the resources > created inside this coroutine will belong to that partition. For > example: > >>>> async def background_task(): >>>> task = [ fetch() for i in range(1000)] >>>> return (await asyncio.gather(*t)) >>>> >>>> async def foo(): >>>> return (await asyncio.spawn(background_task())) > > > All resources created inside the scope of the `background_tasks` are > isolated to one partition. The 1000 sockets will schedule callbacks > that will be stacked in the same queue. > > The partition is by default identified with the hash of the task that > warps the `background_task`, but the user can pass an alternative > value. > >>>> async def foo(): >>>> return (await asyncio.spawn(healtheck(), partition='healthcheck')) > > Internally the implementation has a default ROOT partition that is > used for all of these resources that are not executed inside of the > scope of a spawn function. As you can guess, if you don't use the > spawn method the reactor will run exactly as the current > implementation. Having all the resources in the same queue. > > > Round robin between partitions. > > The differents partitions that exist at some moment share the CPU > resource using a round robin strategy. It gives the same chance to all > partitions to run the same amount of handles, but with one > particularity. Each time that a partition runs out of handles, the > loop is restarted again to handle the file descriptors and the delayed > calls but only for that specific partition that runs out of handles. > > The side effect is clear, have the same backpressure mechanism. But, > per partition. > > > The black hole of the current implementation. > > There is always a but, at least I've found a situation where this > strategy can perform in the same way as the current one, without > applying any fair scheduling. Although the code uses the spawn method. > > Have a look at the following snippet: > >>>> async def healtheck(request): >>>> await request.resp() >>>> >>>> async def view(request): >>>> return (await asyncio.spawn(healthcheck(request))) > > > The task that wraps the healtcheck coroutine that is being isolated in > a partition, won't be scheduled until the data from the file > descriptor that is read by a callback that is in fact executed inside > of the ROOT partition. Therefore, in the worst case scenario, the fair > scheduling will become a simple FIFO scheduling. > > IMHO there is not an easy way to solve that issue, or at least without > changing the current picture. And try to solve it, might end up having > a messy implementation and a buggy code. > > Although, I believe that this still worth it, having in mind the > benefits that it will bring us for all of those cases where the user > needs to isolate resources. > > Thoughts, comments, and others will be welcomed. 
> > [1] https://github.com/pfreixes/qloop > > -- > --pau > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ From pfreixes at gmail.com Sat Jun 17 05:50:25 2017 From: pfreixes at gmail.com (Pau Freixes) Date: Sat, 17 Jun 2017 11:50:25 +0200 Subject: [Async-sig] Asyncio, implementing a new fair scheduling reactor In-Reply-To: References: Message-ID: Hi Nathaniel ... Let me share my thoughts regarding your response > It sounds like a more precise description of your scheduler would be > "hierarchical FIFO"? i.e., there's a top-level scheduler that selects > between "child" schedulers in round-robin/FIFO fashion, and each child > scheduler is round-robin/FIFO within its schedulable entities? [1] > Generally I would think of a "fair" scheduler as one that notices when > e.g. one task is blocking the event loop for a long time when it runs, > and penalizing it by not letting it run as often. Agree with the controversial meaning of fair scheduling, I would rename my title as Fair Queuing [1]. Regarding your notes, pretty interesting, and your thoughts about how the current lack could be tackled in the near future. But, I would say that these notes are more about flow control at the network level, where mine was more oriented on business logic. I'm more inclined to think that both works at different level and different granularity, and are not excluding. The draft that I presented is keen on allow the user - in a way of pattern - to split and isolate resources from top to bottom. Yes, I was thinking about different strategies that might give more granularity and fine control to the user such as weighted partitions, or even have just two queues LOW and HIGH. But these last ones didn't appeal to me for the following reasons: - Give to the user the chance to mix partitions and weights might end up having some code difficult to understand. - The LOW and HIGH priority was too much restrictive. Give control to the user to configure what is LOW and what is HIGH might end up also having an overall performance worst than it was expected. The idea behind the Fair queuing is: Give enough control to the user but without taking the chance to screw up the whole thing. > But also... isn't part of the point of a healthcheck that it *should* > get slow if the system is overloaded? Not really, the response of a health-check is dichotomic: Yes or No. The problem on sharing resources between the health-check and the user flow is when the last one impacts on the first one having, as a result, false negatives. The dynamic allocation of resources to scale up horizontally to suit more traffic and reduce the pressure to the main flow is run by other actors, that of course can rely on metrics that are sent out by the user flow. [1] https://en.wikipedia.org/wiki/Fair_queuing -- --pau From mehaase at gmail.com Wed Jun 21 13:50:57 2017 From: mehaase at gmail.com (Mark E. Haase) Date: Wed, 21 Jun 2017 13:50:57 -0400 Subject: [Async-sig] Cancelling SSL connection Message-ID: (I'm not sure if this is a newbie question or a bug report or something in between. I apologize in advance if its off-topic. Let me know if I should post this somewhere else.) If a task is cancelled while SSL is being negotiated, then an SSLError is raised, but there's no way (as far as I can tell) for the caller to catch it. 
(The example below is pretty contrived, but in an application I'm working on, the user can cancel downloads at any time.) Here's an example: import asyncio, random, ssl async def download(host): ssl_context = ssl.create_default_context() reader, writer = await asyncio.open_connection(host, 443, ssl=ssl_context) request = f'HEAD / HTTP/1.1\r\nHost: {host}\r\n\r\n' writer.write(request.encode('ascii')) lines = list() while True: newdata = await reader.readline() if newdata == b'\r\n': break else: lines.append(newdata.decode('utf8').rstrip('\r\n')) return lines[0] async def main(): while True: task = asyncio.Task(download('www.python.org')) await asyncio.sleep(random.uniform(0.0, 0.5)) task.cancel() try: response = await task print(response) except asyncio.CancelledError: print('request cancelled!') except ssl.SSLError: print('caught SSL error') await asyncio.sleep(1) loop = asyncio.get_event_loop() loop.run_until_complete(main()) loop.close() Running this script yields the following output: HTTP/1.1 200 OK request cancelled! HTTP/1.1 200 OK HTTP/1.1 200 OK : SSL handshake failed Traceback (most recent call last): File "/usr/lib/python3.6/asyncio/base_events.py", line 803, in _create_connection_transport yield from waiter File "/usr/lib/python3.6/asyncio/tasks.py", line 304, in _wakeup future.result() concurrent.futures._base.CancelledError During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/usr/lib/python3.6/asyncio/sslproto.py", line 577, in _on_handshake_complete raise handshake_exc File "/usr/lib/python3.6/asyncio/sslproto.py", line 638, in _process_write_backlog ssldata = self._sslpipe.shutdown(self._finalize) File "/usr/lib/python3.6/asyncio/sslproto.py", line 155, in shutdown ssldata, appdata = self.feed_ssldata(b'') File "/usr/lib/python3.6/asyncio/sslproto.py", line 219, in feed_ssldata self._sslobj.unwrap() File "/usr/lib/python3.6/ssl.py", line 692, in unwrap return self._sslobj.shutdown() ssl.SSLError: [SSL] shutdown while in init (_ssl.c:2299) Is this a bug that I should file, or is there some reason that it's intended to work this way? I can work around it with asyncio.shield(), but I think I would prefer for the asyncio/sslproto.py to catch the SSLError and ignore it. Maybe I'm being short sighted. Thanks, Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From dimaqq at gmail.com Wed Jun 21 15:49:27 2017 From: dimaqq at gmail.com (Dima Tisnek) Date: Wed, 21 Jun 2017 21:49:27 +0200 Subject: [Async-sig] Cancelling SSL connection In-Reply-To: References: Message-ID: Looks like a bug in the `ssl` module, not `asyncio`. Refer to https://github.com/openssl/openssl/issues/710 IMO `ssl` module should be prepared for this. I'd say post a bug to cpython and see what core devs have to say about it :) Please note exact versions of python and openssl ofc. my 2c: openssl has been a moving target every so often, it's quite possible that this change in the API escaped the devs. On 21 June 2017 at 19:50, Mark E. Haase wrote: > (I'm not sure if this is a newbie question or a bug report or something in > between. I apologize in advance if its off-topic. Let me know if I should > post this somewhere else.) > > If a task is cancelled while SSL is being negotiated, then an SSLError is > raised, but there's no way (as far as I can tell) for the caller to catch > it. (The example below is pretty contrived, but in an application I'm > working on, the user can cancel downloads at any time.) 
Here's an example: > > import asyncio, random, ssl > > async def download(host): > ssl_context = ssl.create_default_context() > reader, writer = await asyncio.open_connection(host, 443, > ssl=ssl_context) > request = f'HEAD / HTTP/1.1\r\nHost: {host}\r\n\r\n' > writer.write(request.encode('ascii')) > lines = list() > while True: > newdata = await reader.readline() > if newdata == b'\r\n': > break > else: > lines.append(newdata.decode('utf8').rstrip('\r\n')) > return lines[0] > > async def main(): > while True: > task = asyncio.Task(download('www.python.org')) > await asyncio.sleep(random.uniform(0.0, 0.5)) > task.cancel() > try: > response = await task > print(response) > except asyncio.CancelledError: > print('request cancelled!') > except ssl.SSLError: > print('caught SSL error') > await asyncio.sleep(1) > > loop = asyncio.get_event_loop() > loop.run_until_complete(main()) > loop.close() > > Running this script yields the following output: > > HTTP/1.1 200 OK > request cancelled! > HTTP/1.1 200 OK > HTTP/1.1 200 OK > : SSL handshake > failed > Traceback (most recent call last): > File "/usr/lib/python3.6/asyncio/base_events.py", line 803, in > _create_connection_transport > yield from waiter > File "/usr/lib/python3.6/asyncio/tasks.py", line 304, in _wakeup > future.result() > concurrent.futures._base.CancelledError > > During handling of the above exception, another exception occurred: > > Traceback (most recent call last): > File "/usr/lib/python3.6/asyncio/sslproto.py", line 577, in > _on_handshake_complete > raise handshake_exc > File "/usr/lib/python3.6/asyncio/sslproto.py", line 638, in > _process_write_backlog > ssldata = self._sslpipe.shutdown(self._finalize) > File "/usr/lib/python3.6/asyncio/sslproto.py", line 155, in shutdown > ssldata, appdata = self.feed_ssldata(b'') > File "/usr/lib/python3.6/asyncio/sslproto.py", line 219, in > feed_ssldata > self._sslobj.unwrap() > File "/usr/lib/python3.6/ssl.py", line 692, in unwrap > return self._sslobj.shutdown() > ssl.SSLError: [SSL] shutdown while in init (_ssl.c:2299) > > Is this a bug that I should file, or is there some reason that it's intended > to work this way? I can work around it with asyncio.shield(), but I think I > would prefer for the asyncio/sslproto.py to catch the SSLError and ignore > it. Maybe I'm being short sighted. > > Thanks, > Mark > > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > From njs at pobox.com Wed Jun 21 18:47:04 2017 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 21 Jun 2017 15:47:04 -0700 Subject: [Async-sig] Cancelling SSL connection In-Reply-To: References: Message-ID: SSLObject.unwrap has the contract that if it finishes successfully, then the SSL connection has been cleanly shut down and both sides remain in sync, and can continue to use the socket in unencrypted mode. When asyncio calls unwrap before the handshake has completed, then this contract is impossible to fulfill, and raising an error is the right thing to do. So imo the ssl module is correct here, and this is a (minor) bug in asyncio. On Jun 21, 2017 12:49 PM, "Dima Tisnek" wrote: > Looks like a bug in the `ssl` module, not `asyncio`. > > Refer to https://github.com/openssl/openssl/issues/710 > IMO `ssl` module should be prepared for this. 
> > I'd say post a bug to cpython and see what core devs have to say about it > :) > Please note exact versions of python and openssl ofc. > > my 2c: openssl has been a moving target every so often, it's quite > possible that this change in the API escaped the devs. > > On 21 June 2017 at 19:50, Mark E. Haase wrote: > > (I'm not sure if this is a newbie question or a bug report or something > in > > between. I apologize in advance if its off-topic. Let me know if I should > > post this somewhere else.) > > > > If a task is cancelled while SSL is being negotiated, then an SSLError is > > raised, but there's no way (as far as I can tell) for the caller to catch > > it. (The example below is pretty contrived, but in an application I'm > > working on, the user can cancel downloads at any time.) Here's an > example: > > > > import asyncio, random, ssl > > > > async def download(host): > > ssl_context = ssl.create_default_context() > > reader, writer = await asyncio.open_connection(host, 443, > > ssl=ssl_context) > > request = f'HEAD / HTTP/1.1\r\nHost: {host}\r\n\r\n' > > writer.write(request.encode('ascii')) > > lines = list() > > while True: > > newdata = await reader.readline() > > if newdata == b'\r\n': > > break > > else: > > lines.append(newdata.decode('utf8').rstrip('\r\n')) > > return lines[0] > > > > async def main(): > > while True: > > task = asyncio.Task(download('www.python.org')) > > await asyncio.sleep(random.uniform(0.0, 0.5)) > > task.cancel() > > try: > > response = await task > > print(response) > > except asyncio.CancelledError: > > print('request cancelled!') > > except ssl.SSLError: > > print('caught SSL error') > > await asyncio.sleep(1) > > > > loop = asyncio.get_event_loop() > > loop.run_until_complete(main()) > > loop.close() > > > > Running this script yields the following output: > > > > HTTP/1.1 200 OK > > request cancelled! > > HTTP/1.1 200 OK > > HTTP/1.1 200 OK > > : SSL > handshake > > failed > > Traceback (most recent call last): > > File "/usr/lib/python3.6/asyncio/base_events.py", line 803, in > > _create_connection_transport > > yield from waiter > > File "/usr/lib/python3.6/asyncio/tasks.py", line 304, in _wakeup > > future.result() > > concurrent.futures._base.CancelledError > > > > During handling of the above exception, another exception occurred: > > > > Traceback (most recent call last): > > File "/usr/lib/python3.6/asyncio/sslproto.py", line 577, in > > _on_handshake_complete > > raise handshake_exc > > File "/usr/lib/python3.6/asyncio/sslproto.py", line 638, in > > _process_write_backlog > > ssldata = self._sslpipe.shutdown(self._finalize) > > File "/usr/lib/python3.6/asyncio/sslproto.py", line 155, in > shutdown > > ssldata, appdata = self.feed_ssldata(b'') > > File "/usr/lib/python3.6/asyncio/sslproto.py", line 219, in > > feed_ssldata > > self._sslobj.unwrap() > > File "/usr/lib/python3.6/ssl.py", line 692, in unwrap > > return self._sslobj.shutdown() > > ssl.SSLError: [SSL] shutdown while in init (_ssl.c:2299) > > > > Is this a bug that I should file, or is there some reason that it's > intended > > to work this way? I can work around it with asyncio.shield(), but I > think I > > would prefer for the asyncio/sslproto.py to catch the SSLError and ignore > > it. Maybe I'm being short sighted. 
> > > > Thanks, > > Mark > > > > _______________________________________________ > > Async-sig mailing list > > Async-sig at python.org > > https://mail.python.org/mailman/listinfo/async-sig > > Code of Conduct: https://www.python.org/psf/codeofconduct/ > > > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mehaase at gmail.com Fri Jun 23 10:11:10 2017 From: mehaase at gmail.com (Mark E. Haase) Date: Fri, 23 Jun 2017 10:11:10 -0400 Subject: [Async-sig] Cancelling SSL connection In-Reply-To: References: Message-ID: Thanks Dima & Nathaniel. I opened an asyncio bug. ( http://bugs.python.org/issue30740) Cheers, Mark On Wed, Jun 21, 2017 at 6:47 PM, Nathaniel Smith wrote: > SSLObject.unwrap has the contract that if it finishes successfully, then > the SSL connection has been cleanly shut down and both sides remain in > sync, and can continue to use the socket in unencrypted mode. When asyncio > calls unwrap before the handshake has completed, then this contract is > impossible to fulfill, and raising an error is the right thing to do. So > imo the ssl module is correct here, and this is a (minor) bug in asyncio. > > On Jun 21, 2017 12:49 PM, "Dima Tisnek" wrote: > >> Looks like a bug in the `ssl` module, not `asyncio`. >> >> Refer to https://github.com/openssl/openssl/issues/710 >> IMO `ssl` module should be prepared for this. >> >> I'd say post a bug to cpython and see what core devs have to say about it >> :) >> Please note exact versions of python and openssl ofc. >> >> my 2c: openssl has been a moving target every so often, it's quite >> possible that this change in the API escaped the devs. >> >> On 21 June 2017 at 19:50, Mark E. Haase wrote: >> > (I'm not sure if this is a newbie question or a bug report or something >> in >> > between. I apologize in advance if its off-topic. Let me know if I >> should >> > post this somewhere else.) >> > >> > If a task is cancelled while SSL is being negotiated, then an SSLError >> is >> > raised, but there's no way (as far as I can tell) for the caller to >> catch >> > it. (The example below is pretty contrived, but in an application I'm >> > working on, the user can cancel downloads at any time.) Here's an >> example: >> > >> > import asyncio, random, ssl >> > >> > async def download(host): >> > ssl_context = ssl.create_default_context() >> > reader, writer = await asyncio.open_connection(host, 443, >> > ssl=ssl_context) >> > request = f'HEAD / HTTP/1.1\r\nHost: {host}\r\n\r\n' >> > writer.write(request.encode('ascii')) >> > lines = list() >> > while True: >> > newdata = await reader.readline() >> > if newdata == b'\r\n': >> > break >> > else: >> > lines.append(newdata.decode('utf8').rstrip('\r\n')) >> > return lines[0] >> > >> > async def main(): >> > while True: >> > task = asyncio.Task(download('www.python.org')) >> > await asyncio.sleep(random.uniform(0.0, 0.5)) >> > task.cancel() >> > try: >> > response = await task >> > print(response) >> > except asyncio.CancelledError: >> > print('request cancelled!') >> > except ssl.SSLError: >> > print('caught SSL error') >> > await asyncio.sleep(1) >> > >> > loop = asyncio.get_event_loop() >> > loop.run_until_complete(main()) >> > loop.close() >> > >> > Running this script yields the following output: >> > >> > HTTP/1.1 200 OK >> > request cancelled! 
>> > HTTP/1.1 200 OK >> > HTTP/1.1 200 OK >> > : SSL >> handshake >> > failed >> > Traceback (most recent call last): >> > File "/usr/lib/python3.6/asyncio/base_events.py", line 803, in >> > _create_connection_transport >> > yield from waiter >> > File "/usr/lib/python3.6/asyncio/tasks.py", line 304, in _wakeup >> > future.result() >> > concurrent.futures._base.CancelledError >> > >> > During handling of the above exception, another exception occurred: >> > >> > Traceback (most recent call last): >> > File "/usr/lib/python3.6/asyncio/sslproto.py", line 577, in >> > _on_handshake_complete >> > raise handshake_exc >> > File "/usr/lib/python3.6/asyncio/sslproto.py", line 638, in >> > _process_write_backlog >> > ssldata = self._sslpipe.shutdown(self._finalize) >> > File "/usr/lib/python3.6/asyncio/sslproto.py", line 155, in >> shutdown >> > ssldata, appdata = self.feed_ssldata(b'') >> > File "/usr/lib/python3.6/asyncio/sslproto.py", line 219, in >> > feed_ssldata >> > self._sslobj.unwrap() >> > File "/usr/lib/python3.6/ssl.py", line 692, in unwrap >> > return self._sslobj.shutdown() >> > ssl.SSLError: [SSL] shutdown while in init (_ssl.c:2299) >> > >> > Is this a bug that I should file, or is there some reason that it's >> intended >> > to work this way? I can work around it with asyncio.shield(), but I >> think I >> > would prefer for the asyncio/sslproto.py to catch the SSLError and >> ignore >> > it. Maybe I'm being short sighted. >> > >> > Thanks, >> > Mark >> > >> > _______________________________________________ >> > Async-sig mailing list >> > Async-sig at python.org >> > https://mail.python.org/mailman/listinfo/async-sig >> > Code of Conduct: https://www.python.org/psf/codeofconduct/ >> > >> _______________________________________________ >> Async-sig mailing list >> Async-sig at python.org >> https://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.jerdonek at gmail.com Sun Jun 25 17:13:12 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Sun, 25 Jun 2017 14:13:12 -0700 Subject: [Async-sig] "read-write" synchronization Message-ID: I'm relatively new to async programming in Python and am thinking through possibilities for doing "read-write" synchronization. I'm using asyncio, and the synchronization primitives that asyncio exposes are relatively simple [1]. Have options for async read-write synchronization already been discussed in any detail? I'm interested in designs where "readers" don't need to acquire a lock -- only writers. It seems like one way to deal with the main race condition I see that comes up would be to use loop.time(). Does that ring a bell, or might there be a much simpler way? Thanks, --Chris [1] https://docs.python.org/3/library/asyncio-sync.html From andrew.svetlov at gmail.com Sun Jun 25 17:16:33 2017 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Sun, 25 Jun 2017 21:16:33 +0000 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: There is https://github.com/aio-libs/aiorwlock On Mon, Jun 26, 2017 at 12:13 AM Chris Jerdonek wrote: > I'm relatively new to async programming in Python and am thinking > through possibilities for doing "read-write" synchronization. > > I'm using asyncio, and the synchronization primitives that asyncio > exposes are relatively simple [1]. Have options for async read-write > synchronization already been discussed in any detail? 
> > I'm interested in designs where "readers" don't need to acquire a lock > -- only writers. It seems like one way to deal with the main race > condition I see that comes up would be to use loop.time(). Does that > ring a bell, or might there be a much simpler way? > > Thanks, > --Chris > > > [1] https://docs.python.org/3/library/asyncio-sync.html > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -- Thanks, Andrew Svetlov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.jerdonek at gmail.com Sun Jun 25 17:24:50 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Sun, 25 Jun 2017 14:24:50 -0700 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: Thank you. I had seen that, but it seems heavier weight than needed. And it also requires locking on reading. --Chris On Sun, Jun 25, 2017 at 2:16 PM, Andrew Svetlov wrote: > There is https://github.com/aio-libs/aiorwlock > > On Mon, Jun 26, 2017 at 12:13 AM Chris Jerdonek > wrote: >> >> I'm relatively new to async programming in Python and am thinking >> through possibilities for doing "read-write" synchronization. >> >> I'm using asyncio, and the synchronization primitives that asyncio >> exposes are relatively simple [1]. Have options for async read-write >> synchronization already been discussed in any detail? >> >> I'm interested in designs where "readers" don't need to acquire a lock >> -- only writers. It seems like one way to deal with the main race >> condition I see that comes up would be to use loop.time(). Does that >> ring a bell, or might there be a much simpler way? >> >> Thanks, >> --Chris >> >> >> [1] https://docs.python.org/3/library/asyncio-sync.html >> _______________________________________________ >> Async-sig mailing list >> Async-sig at python.org >> https://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct: https://www.python.org/psf/codeofconduct/ > > -- > Thanks, > Andrew Svetlov From chris.jerdonek at gmail.com Sun Jun 25 17:54:44 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Sun, 25 Jun 2017 14:54:44 -0700 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: The read-write operations I'm protecting will have coroutines inside that need to be awaited on, so I don't think I'll be able to take advantage to that extreme. But I think I might be able to use your point to simplify the logic a little. (To rephrase, you're reminding me that context switches can't happen at arbitrary lines of code. I only need to be prepared for the cases where there's an await / yield from.) --Chris On Sun, Jun 25, 2017 at 2:30 PM, Guido van Rossum wrote: > The secret is that as long as you don't yield no other task will run so you > don't need locks at all. > > On Jun 25, 2017 2:24 PM, "Chris Jerdonek" wrote: >> >> Thank you. I had seen that, but it seems heavier weight than needed. >> And it also requires locking on reading. >> >> --Chris >> >> On Sun, Jun 25, 2017 at 2:16 PM, Andrew Svetlov >> wrote: >> > There is https://github.com/aio-libs/aiorwlock >> > >> > On Mon, Jun 26, 2017 at 12:13 AM Chris Jerdonek >> > >> > wrote: >> >> >> >> I'm relatively new to async programming in Python and am thinking >> >> through possibilities for doing "read-write" synchronization. 
>> >> >> >> I'm using asyncio, and the synchronization primitives that asyncio >> >> exposes are relatively simple [1]. Have options for async read-write >> >> synchronization already been discussed in any detail? >> >> >> >> I'm interested in designs where "readers" don't need to acquire a lock >> >> -- only writers. It seems like one way to deal with the main race >> >> condition I see that comes up would be to use loop.time(). Does that >> >> ring a bell, or might there be a much simpler way? >> >> >> >> Thanks, >> >> --Chris >> >> >> >> >> >> [1] https://docs.python.org/3/library/asyncio-sync.html >> >> _______________________________________________ >> >> Async-sig mailing list >> >> Async-sig at python.org >> >> https://mail.python.org/mailman/listinfo/async-sig >> >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >> > >> > -- >> > Thanks, >> > Andrew Svetlov >> _______________________________________________ >> Async-sig mailing list >> Async-sig at python.org >> https://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct: https://www.python.org/psf/codeofconduct/ From gvanrossum at gmail.com Sun Jun 25 17:30:39 2017 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sun, 25 Jun 2017 14:30:39 -0700 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: The secret is that as long as you don't yield no other task will run so you don't need locks at all. On Jun 25, 2017 2:24 PM, "Chris Jerdonek" wrote: > Thank you. I had seen that, but it seems heavier weight than needed. > And it also requires locking on reading. > > --Chris > > On Sun, Jun 25, 2017 at 2:16 PM, Andrew Svetlov > wrote: > > There is https://github.com/aio-libs/aiorwlock > > > > On Mon, Jun 26, 2017 at 12:13 AM Chris Jerdonek < > chris.jerdonek at gmail.com> > > wrote: > >> > >> I'm relatively new to async programming in Python and am thinking > >> through possibilities for doing "read-write" synchronization. > >> > >> I'm using asyncio, and the synchronization primitives that asyncio > >> exposes are relatively simple [1]. Have options for async read-write > >> synchronization already been discussed in any detail? > >> > >> I'm interested in designs where "readers" don't need to acquire a lock > >> -- only writers. It seems like one way to deal with the main race > >> condition I see that comes up would be to use loop.time(). Does that > >> ring a bell, or might there be a much simpler way? > >> > >> Thanks, > >> --Chris > >> > >> > >> [1] https://docs.python.org/3/library/asyncio-sync.html > >> _______________________________________________ > >> Async-sig mailing list > >> Async-sig at python.org > >> https://mail.python.org/mailman/listinfo/async-sig > >> Code of Conduct: https://www.python.org/psf/codeofconduct/ > > > > -- > > Thanks, > > Andrew Svetlov > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sun Jun 25 18:09:16 2017 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 25 Jun 2017 15:09:16 -0700 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: On Sun, Jun 25, 2017 at 2:13 PM, Chris Jerdonek wrote: > I'm relatively new to async programming in Python and am thinking > through possibilities for doing "read-write" synchronization. 
> > I'm using asyncio, and the synchronization primitives that asyncio > exposes are relatively simple [1]. Have options for async read-write > synchronization already been discussed in any detail? As a general comment: I used to think rwlocks were a simple extension to regular locks, but it turns out there's actually this huge increase in design complexity. Do you want your lock to be read-biased, write-biased, task-fair, phase-fair? Can you acquire a write lock if you already hold one (i.e., are write locks reentrant)? What about acquiring a read lock if you already hold the write lock? Can you atomically upgrade/downgrade a lock? This makes it much harder to come up with a one-size-fits-all design suitable for adding to something like the python stdlib. -n -- Nathaniel J. Smith -- https://vorpus.org From chris.jerdonek at gmail.com Sun Jun 25 18:27:38 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Sun, 25 Jun 2017 15:27:38 -0700 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: On Sun, Jun 25, 2017 at 3:09 PM, Nathaniel Smith wrote: > On Sun, Jun 25, 2017 at 2:13 PM, Chris Jerdonek > wrote: >> I'm using asyncio, and the synchronization primitives that asyncio >> exposes are relatively simple [1]. Have options for async read-write >> synchronization already been discussed in any detail? > > As a general comment: I used to think rwlocks were a simple extension > to regular locks, but it turns out there's actually this huge increase > in design complexity. Do you want your lock to be read-biased, > write-biased, task-fair, phase-fair? Can you acquire a write lock if > you already hold one (i.e., are write locks reentrant)? What about > acquiring a read lock if you already hold the write lock? Can you > atomically upgrade/downgrade a lock? This makes it much harder to come > up with a one-size-fits-all design suitable for adding to something > like the python stdlib. I agree. And my point about asyncio's primitives wasn't a criticism or request that more be added. I was asking more if there has been any discussion of general approaches and patterns that take advantage of the event loop's single thread, etc. Maybe what I'll do is briefly write up the approach I have in mind, and people can let me know if I'm on the right track. :) --Chris From yarkot1 at gmail.com Sun Jun 25 18:38:41 2017 From: yarkot1 at gmail.com (Yarko Tymciurak) Date: Sun, 25 Jun 2017 22:38:41 +0000 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: On Sun, Jun 25, 2017 at 4:54 PM Chris Jerdonek wrote: > The read-write operations I'm protecting will have coroutines inside > that need to be awaited on, so I don't think I'll be able to take > advantage to that extreme. > > But I think I might be able to use your point to simplify the logic a > little. (To rephrase, you're reminding me that context switches can't > happen at arbitrary lines of code. I only need to be prepared for the > cases where there's an await / yield from.) The "secret" Guido refers to we should pull out front and center, explicitly at all times - asynchronous programming is nothing more than cooperative multitasking. Patterns suited for preemptive multi-tasking (executive-based, interrupt based, etc.) are suspect, potentially misplaced when they show up in a cooperative multitasking context. 
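To make the cooperative-multitasking point concrete, here is a minimal
illustration (the counter and function names are made up for this example;
everything in it is plain asyncio): a shared counter can be updated by many
tasks without a lock as long as there is no await between the read and the
write, while putting an await in the middle lets other tasks interleave and
clobber each other's updates.

    import asyncio

    counter = 0

    async def safe_increment():
        # No await between the read and the write, so the event loop
        # cannot switch to another task in the middle: no lock is needed.
        global counter
        counter += 1

    async def racy_increment():
        # Awaiting between the read and the write yields to the event
        # loop; other tasks can run in the gap, and the assignment below
        # then overwrites whatever they did.
        global counter
        current = counter
        await asyncio.sleep(0)  # suspension point
        counter = current + 1

    async def main():
        global counter
        counter = 0
        await asyncio.gather(*[safe_increment() for _ in range(1000)])
        print(counter)  # 1000

        counter = 0
        await asyncio.gather(*[racy_increment() for _ in range(1000)])
        print(counter)  # much less than 1000: every task read the
                        # counter before any task wrote it back

    asyncio.get_event_loop().run_until_complete(main())

In other words, the only places a reader/writer scheme has to defend itself
are the awaits.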
To be a well-behaved (capable of effective cooperation) task in such a
system, you should guard against getting embroiled in potentially blocking
I/O tasks whose latency you are not able to control (within facilities
available in a cooperative multitasking context). That raises a couple of
questions: to be well-behaved, simple control flow is desirable (i.e. not
nested layers of yields, except perhaps for a pipeline case); and
"read/write" control from memory space w/in the process (since external I/O
is generally not for async) begs the question: what for? Eliminate
globals, encapsulate and limit access as needed through usual programming
methods.

I'm sure someone will find an edge case to challenge my above rule-of-thumb,
but as you're new to this, I think this is a pretty good place to start.
Ask yourself if what you're trying to do w/ async is suited for async.

Cheers,
Yarko

>
>
> --Chris
>
>
> On Sun, Jun 25, 2017 at 2:30 PM, Guido van Rossum
> wrote:
> > The secret is that as long as you don't yield no other task will run so
> you
> > don't need locks at all.
> >
> > On Jun 25, 2017 2:24 PM, "Chris Jerdonek"
> wrote:
> >>
> >> Thank you. I had seen that, but it seems heavier weight than needed.
> >> And it also requires locking on reading.
> >>
> >> --Chris
> >>
> >> On Sun, Jun 25, 2017 at 2:16 PM, Andrew Svetlov
> >> wrote:
> >> > There is https://github.com/aio-libs/aiorwlock
> >> >
> >> > On Mon, Jun 26, 2017 at 12:13 AM Chris Jerdonek
> >> >
> >> > wrote:
> >> >>
> >> >> I'm relatively new to async programming in Python and am thinking
> >> >> through possibilities for doing "read-write" synchronization.
> >> >>
> >> >> I'm using asyncio, and the synchronization primitives that asyncio
> >> >> exposes are relatively simple [1]. Have options for async read-write
> >> >> synchronization already been discussed in any detail?
> >> >>
> >> >> I'm interested in designs where "readers" don't need to acquire a
> lock
> >> >> -- only writers. It seems like one way to deal with the main race
> >> >> condition I see that comes up would be to use loop.time(). Does that
> >> >> ring a bell, or might there be a much simpler way?
> >> >>
> >> >> Thanks,
> >> >> --Chris
> >> >>
> >> >>
> >> >> [1] https://docs.python.org/3/library/asyncio-sync.html
> >> >> _______________________________________________
> >> >> Async-sig mailing list
> >> >> Async-sig at python.org
> >> >> https://mail.python.org/mailman/listinfo/async-sig
> >> >> Code of Conduct: https://www.python.org/psf/codeofconduct/
> >> >
> >> > --
> >> > Thanks,
> >> > Andrew Svetlov
> >> _______________________________________________
> >> Async-sig mailing list
> >> Async-sig at python.org
> >> https://mail.python.org/mailman/listinfo/async-sig
> >> Code of Conduct: https://www.python.org/psf/codeofconduct/
> _______________________________________________
> Async-sig mailing list
> Async-sig at python.org
> https://mail.python.org/mailman/listinfo/async-sig
> Code of Conduct: https://www.python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From gvanrossum at gmail.com Sun Jun 25 23:33:57 2017 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sun, 25 Jun 2017 20:33:57 -0700 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: On Sun, Jun 25, 2017 at 3:38 PM, Yarko Tymciurak wrote: > To be a well-behaved (capable of effective cooperation) task in such a > system, you should guard against getting embroiled in potentially blocking > I/O tasks whose latency you are not able to control (within facilities > available in a cooperative multitasking context). The raises a couple of > questions: to be well-behaved, simple control flow is desireable (i.e. not > nested layers of yields, except perhaps for a pipeline case); and > "read/write" control from memory space w/in the process (since external I/O > is generally not for async) begs the question: what for? Eliminate > globals, encapsulate and limit access as needed through usual programming > methods. > Before anyone takes this paragraph too seriously, there seem to be a bunch of misunderstandings underlying this paragraph. - *All* blocking I/O is wrong in an async task, regardless of whether you can control its latency. (The only safe way to do I/O is using a primitive that works with `await`.) - There's nothing wrong with `yield` itself. (You shouldn't do I/O in a generator used in an async task -- but that's just due to the general ban on I/O.) - Using async tasks don't make globals more risky than regular code (in fact they are safer here than in traditional multi-threaded code). - What on earth is "read/write" control from memory space w/in the process? -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.jerdonek at gmail.com Sun Jun 25 23:34:50 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Sun, 25 Jun 2017 20:34:50 -0700 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: So here's one approach I'm thinking about for implementing readers-writer synchronization. Does this seem reasonable as a starting point, or am I missing something much simpler? I know there are various things you can prioritize for (readers vs. writers, etc), but I'm less concerned about those for now. The global state is-- * reader_count: an integer count of the active (reading) readers * writer_lock: an asyncio Lock object * no_readers_event: an asyncio Event object signaling no active readers * no_writer_event: an asyncio Event object signaling no active writer Untested pseudo-code for a writer-- async with writer_lock: no_writer_event.clear() # Wait for the readers to finish. await no_readers_event.wait() # Do the write. await write() # Awaken waiting readers. no_writer_event.set() Untested pseudo-code for a reader-- while True: await no_writer_event.wait() # Check the writer_lock again in case a new writer has # started writing. if not writer_lock.locked(): # Then we can do the read. break reader_count += 1 if reader_count == 1: no_readers_event.clear() # Do the read. await read() reader_count -= 1 if reader_count == 0: # Awaken any waiting writer. no_readers_event.set() One thing I'm not clear about is when the writer_lock is released and the no_writer_event set, are there any guarantees about what coroutine will be awakened first -- a writer waiting on the lock or the readers waiting on the no_writer_event? 
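For concreteness, the pieces above could be assembled into something like
the following untested sketch. The class name, the do_read/do_write
callables, the try/finally blocks, and the initial .set() calls on both
events are additions for illustration only; it inherits the same open
questions about wake-up order and fairness.

    import asyncio

    class ReadWriteSync:
        # Illustrative assembly of the pseudo-code above, not a vetted design.

        def __init__(self):
            self.reader_count = 0
            self.writer_lock = asyncio.Lock()
            self.no_readers_event = asyncio.Event()
            self.no_writer_event = asyncio.Event()
            # Initially there are no active readers and no active writer.
            self.no_readers_event.set()
            self.no_writer_event.set()

        async def write(self, do_write):
            async with self.writer_lock:
                self.no_writer_event.clear()
                # Wait for the readers to finish.
                await self.no_readers_event.wait()
                try:
                    # Do the write.
                    return await do_write()
                finally:
                    # Awaken waiting readers.
                    self.no_writer_event.set()

        async def read(self, do_read):
            while True:
                await self.no_writer_event.wait()
                # Check the writer_lock again in case a new writer has
                # started writing.
                if not self.writer_lock.locked():
                    break
            self.reader_count += 1
            if self.reader_count == 1:
                self.no_readers_event.clear()
            try:
                # Do the read.
                return await do_read()
            finally:
                self.reader_count -= 1
                if self.reader_count == 0:
                    # Awaken any waiting writer.
                    self.no_readers_event.set()

    async def main():
        rw = ReadWriteSync()

        async def read_op():
            await asyncio.sleep(0.1)
            return 'read result'

        async def write_op():
            await asyncio.sleep(0.1)

        results = await asyncio.gather(
            rw.read(read_op), rw.read(read_op), rw.write(write_op))
        print(results)

    asyncio.get_event_loop().run_until_complete(main())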
Similarly, is there a way to avoid having to have readers check the writer_lock again when a reader waiting on no_writer_event is awakened? --Chris On Sun, Jun 25, 2017 at 3:27 PM, Chris Jerdonek wrote: > On Sun, Jun 25, 2017 at 3:09 PM, Nathaniel Smith wrote: >> On Sun, Jun 25, 2017 at 2:13 PM, Chris Jerdonek >> wrote: >>> I'm using asyncio, and the synchronization primitives that asyncio >>> exposes are relatively simple [1]. Have options for async read-write >>> synchronization already been discussed in any detail? >> >> As a general comment: I used to think rwlocks were a simple extension >> to regular locks, but it turns out there's actually this huge increase >> in design complexity. Do you want your lock to be read-biased, >> write-biased, task-fair, phase-fair? Can you acquire a write lock if >> you already hold one (i.e., are write locks reentrant)? What about >> acquiring a read lock if you already hold the write lock? Can you >> atomically upgrade/downgrade a lock? This makes it much harder to come >> up with a one-size-fits-all design suitable for adding to something >> like the python stdlib. > > I agree. And my point about asyncio's primitives wasn't a criticism or > request that more be added. I was asking more if there has been any > discussion of general approaches and patterns that take advantage of > the event loop's single thread, etc. > > Maybe what I'll do is briefly write up the approach I have in mind, > and people can let me know if I'm on the right track. :) > > --Chris From yarkot1 at gmail.com Mon Jun 26 00:01:32 2017 From: yarkot1 at gmail.com (Yarko Tymciurak) Date: Sun, 25 Jun 2017 23:01:32 -0500 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: On Sun, Jun 25, 2017 at 10:46 PM, Yarko Tymciurak wrote: > > > On Sun, Jun 25, 2017 at 10:33 PM, Guido van Rossum > wrote: > >> On Sun, Jun 25, 2017 at 3:38 PM, Yarko Tymciurak >> wrote: >> >>> To be a well-behaved (capable of effective cooperation) task in such a >>> system, you should guard against getting embroiled in potentially blocking >>> I/O tasks whose latency you are not able to control (within facilities >>> available in a cooperative multitasking context). The raises a couple of >>> questions: to be well-behaved, simple control flow is desireable (i.e. not >>> nested layers of yields, except perhaps for a pipeline case); and >>> "read/write" control from memory space w/in the process (since external I/O >>> is generally not for async) begs the question: what for? Eliminate >>> globals, encapsulate and limit access as needed through usual programming >>> methods. >>> >> >> Before anyone takes this paragraph too seriously, there seem to be a >> bunch of misunderstandings underlying this paragraph. >> > > yes - thanks for the clarifications... I'm speaking from the perspective > of an ECE, and thinking in the small-scale (embedded) of things like when > in general is cooperative multitasking (very light-weight) more performant > than pre-emptive... so from that space: > >> >> - *All* blocking I/O is wrong in an async task, regardless of whether you >> can control its latency. (The only safe way to do I/O is using a primitive >> that works with `await`.) >> > yes, and from ECE perspective the only I/O is "local" device (e.g. RAM, which itself has rather deterministic setup and write times...), etc. 
my more general point (sorry - should have made it explicit) is that if you call a library routine, you may not expect it's calling external I/O, so that requires either care (or defensively guarding against it, e.g. with timers ... another story). This in particular is an error which I saw in OpenStack swift project - they depended on fast local storage device I/O. Except when devices started failing. Then they mistakenly assumed this was python's fault - missing the programming error of doing async (gevent - but same issue) I/O (which might be ok, within limits, but was not guarded against - was done in an unreliable way). So - whether intentionally doing such "risky" but seemingly reliable and "ok" I/O and failing to put in place guards, as must be in cooperative multitasking, or if you just get surprised that some library you thought was inoccuous is somewhere doing some surprise I/O (logging? anything...).... in cooperative multi-tasking, you can get away with some things, but it is _all_ your responsibility to guard against problems. That was my point here. >> - There's nothing wrong with `yield` itself. (You shouldn't do I/O in a >> generator used in an async task -- but that's just due to the general ban >> on I/O.) >> > Yes; as above. But I'm calling local variables (strictly speaking) I/O too. And you might consider REDIS as "to RAM, so how different is that?" --- well, it's through another process, and ... up to a preemptive scheduler, and all sorts of things. So, sure - you _can_ do it, if you put in guards. But don't. Or at least, have very specific good reasons, and understand the coding cost of trying to do so. In other words - don't. > >> - Using async tasks don't make globals more risky than regular code (in >> fact they are safer here than in traditional multi-threaded code). >> >> - What on earth is "read/write" control from memory space w/in the >> process? >> > Sorry - these last two were a bit of a joke on my part. The silly: only valid I/O is to variables. But you don't need that, because you have normal variable scoping/encapsulation rules. So (I suppose my joke continued), the only reason to have "read/write controls left is against (!) global variables. Answer - don't; and you don'' need R/W controls, because you have normal encapsulation controls of variables from the language. So - in cooperative multitasking, my argument goes, there can be (!) no reasonable motivation for R/W controls. -- Yarko > >> >> -- >> --Guido van Rossum (python.org/~guido) >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yarkot1 at gmail.com Sun Jun 25 23:46:02 2017 From: yarkot1 at gmail.com (Yarko Tymciurak) Date: Sun, 25 Jun 2017 22:46:02 -0500 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: On Sun, Jun 25, 2017 at 10:33 PM, Guido van Rossum wrote: > On Sun, Jun 25, 2017 at 3:38 PM, Yarko Tymciurak > wrote: > >> To be a well-behaved (capable of effective cooperation) task in such a >> system, you should guard against getting embroiled in potentially blocking >> I/O tasks whose latency you are not able to control (within facilities >> available in a cooperative multitasking context). The raises a couple of >> questions: to be well-behaved, simple control flow is desireable (i.e. not >> nested layers of yields, except perhaps for a pipeline case); and >> "read/write" control from memory space w/in the process (since external I/O >> is generally not for async) begs the question: what for? 
Eliminate >> globals, encapsulate and limit access as needed through usual programming >> methods. >> > > Before anyone takes this paragraph too seriously, there seem to be a bunch > of misunderstandings underlying this paragraph. > yes - thanks for the clarifications... I'm speaking from the perspective of an ECE, and thinking in the small-scale (embedded) of things like when in general is cooperative multitasking (very light-weight) more performant than pre-emptive... so from that space: > > - *All* blocking I/O is wrong in an async task, regardless of whether you > can control its latency. (The only safe way to do I/O is using a primitive > that works with `await`.) > > - There's nothing wrong with `yield` itself. (You shouldn't do I/O in a > generator used in an async task -- but that's just due to the general ban > on I/O.) > > - Using async tasks don't make globals more risky than regular code (in > fact they are safer here than in traditional multi-threaded code). > > - What on earth is "read/write" control from memory space w/in the process? > > -- > --Guido van Rossum (python.org/~guido) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dimaqq at gmail.com Mon Jun 26 04:43:41 2017 From: dimaqq at gmail.com (Dima Tisnek) Date: Mon, 26 Jun 2017 10:43:41 +0200 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: Chris, this led to an interesting discussion, which then went pretty far from the original concern. Perhaps you can share your use-case, both as pseudo-code and a link to real code. I'm specifically interested to see why/where you'd like to use a read-write async lock, to evaluate if this is something common or specific, and if, perhaps, some other paradigm (like queue, worker pool, ...) may be more useful in general case. I'm also curious if a full set of async sync primitives may one day lead to async monitors. Granted, simple use of async monitor is really a future/promise, but perhaps there are complex use cases in the UI/react domain with its promise/stream dichotomy. Cheers, d. On 25 June 2017 at 23:13, Chris Jerdonek wrote: > I'm relatively new to async programming in Python and am thinking > through possibilities for doing "read-write" synchronization. > > I'm using asyncio, and the synchronization primitives that asyncio > exposes are relatively simple [1]. Have options for async read-write > synchronization already been discussed in any detail? > > I'm interested in designs where "readers" don't need to acquire a lock > -- only writers. It seems like one way to deal with the main race > condition I see that comes up would be to use loop.time(). Does that > ring a bell, or might there be a much simpler way? > > Thanks, > --Chris > > > [1] https://docs.python.org/3/library/asyncio-sync.html > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ From chris.jerdonek at gmail.com Mon Jun 26 05:28:26 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Mon, 26 Jun 2017 02:28:26 -0700 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: On Mon, Jun 26, 2017 at 1:43 AM, Dima Tisnek wrote: > Perhaps you can share your use-case, both as pseudo-code and a link to > real code. 
> > I'm specifically interested to see why/where you'd like to use a > read-write async lock, to evaluate if this is something common or > specific, and if, perhaps, some other paradigm (like queue, worker > pool, ...) may be more useful in general case. > > I'm also curious if a full set of async sync primitives may one day > lead to async monitors. Granted, simple use of async monitor is really > a future/promise, but perhaps there are complex use cases in the > UI/react domain with its promise/stream dichotomy. Thank you, Dima. In my last email I shared pseudo-code for an approach to read-write synchronization that is independent of use case. [1] For the use case, my original purpose in mind was to synchronize many small file operations on disk like creating and removing directories that possibly share intermediate segments. The real code isn't public. But these would be operations like os.makedirs() and os.removedirs() that would be wrapped by loop.run_in_executor() to be non-blocking. The directory removal using os.removedirs() is the operation I thought should require exclusive access, so as not to interfere with directory creations in progress. Perhaps a simpler, dirtier approach would be not to synchronize at all and simply retry directory creations that fail until they succeed. That could be enough to handle rare cases where simultaneous creation and removal causes an error. You could view this an EAFP approach. Either way, I think the process of thinking through patterns for read-write synchronization is helpful for getting a better general feel and understanding of async. --Chris > > Cheers, > d. > > On 25 June 2017 at 23:13, Chris Jerdonek wrote: >> I'm relatively new to async programming in Python and am thinking >> through possibilities for doing "read-write" synchronization. >> >> I'm using asyncio, and the synchronization primitives that asyncio >> exposes are relatively simple [1]. Have options for async read-write >> synchronization already been discussed in any detail? >> >> I'm interested in designs where "readers" don't need to acquire a lock >> -- only writers. It seems like one way to deal with the main race >> condition I see that comes up would be to use loop.time(). Does that >> ring a bell, or might there be a much simpler way? >> >> Thanks, >> --Chris >> >> >> [1] https://docs.python.org/3/library/asyncio-sync.html >> _______________________________________________ >> Async-sig mailing list >> Async-sig at python.org >> https://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct: https://www.python.org/psf/codeofconduct/ From dimaqq at gmail.com Mon Jun 26 12:25:49 2017 From: dimaqq at gmail.com (Dima Tisnek) Date: Mon, 26 Jun 2017 18:25:49 +0200 Subject: [Async-sig] async generator confusion or bug? Message-ID: Hi group, I'm trying to cross-use an sync generator across several async functions. Is it allowed or a completely bad idea? (if so, why?) Here's MRE: import asyncio async def generator(): while True: x = yield print("received", x) await asyncio.sleep(0.1) async def user(name, g): print("sending", name) await g.asend(name) async def helper(): g = generator() await g.asend(None) await asyncio.gather(*[user(f"user-{x}", g) for x in range(3)]) if __name__ == "__main__": asyncio.get_event_loop().run_until_complete(helper()) And the output it produces when ran (py3.6.1): sending user-1 received user-1 sending user-2 sending user-0 received None received None Where are those None's coming from in the end? Where did "user-0" and "user-1" data go? 
Is this a bug, or am I hopelessly confused? Thanks! From dimaqq at gmail.com Mon Jun 26 13:02:17 2017 From: dimaqq at gmail.com (Dima Tisnek) Date: Mon, 26 Jun 2017 19:02:17 +0200 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: A little epiphany on my part: In threaded world, a lock (etc.) can be used for 2 distinct purposes: *1 synchronise [access to resource in the] library implementation, and *2 synchronise users of a library It's easy since taken lock has an owner (thread). Both library and user stack frames belong to either this thread or some other. In the async world, users are opaque to library implementation (technically own async threads). Therefore only use case #1 is valid. Moreover, it occurs to me that lock/unlock pair must be confined to same async function. Going beyond that restriction is bug-prone like crazy (even for me). Chris, coming back to your use-case. Do you want to synchronise side-effect creation/deletion for the sanity of side-effects only? Or do you imply that callers' actions are synchronised too? In other words, do your callers use those directories out of band? P.S./O.T. when it comes to directories, you probably want hierarchical locks rather than RW. On 26 June 2017 at 11:28, Chris Jerdonek wrote: > On Mon, Jun 26, 2017 at 1:43 AM, Dima Tisnek wrote: >> Perhaps you can share your use-case, both as pseudo-code and a link to >> real code. >> >> I'm specifically interested to see why/where you'd like to use a >> read-write async lock, to evaluate if this is something common or >> specific, and if, perhaps, some other paradigm (like queue, worker >> pool, ...) may be more useful in general case. >> >> I'm also curious if a full set of async sync primitives may one day >> lead to async monitors. Granted, simple use of async monitor is really >> a future/promise, but perhaps there are complex use cases in the >> UI/react domain with its promise/stream dichotomy. > > Thank you, Dima. In my last email I shared pseudo-code for an approach > to read-write synchronization that is independent of use case. [1] > > For the use case, my original purpose in mind was to synchronize many > small file operations on disk like creating and removing directories > that possibly share intermediate segments. The real code isn't public. > But these would be operations like os.makedirs() and os.removedirs() > that would be wrapped by loop.run_in_executor() to be non-blocking. > The directory removal using os.removedirs() is the operation I thought > should require exclusive access, so as not to interfere with directory > creations in progress. > > Perhaps a simpler, dirtier approach would be not to synchronize at all > and simply retry directory creations that fail until they succeed. > That could be enough to handle rare cases where simultaneous creation > and removal causes an error. You could view this an EAFP approach. > > Either way, I think the process of thinking through patterns for > read-write synchronization is helpful for getting a better general > feel and understanding of async. > > --Chris > > >> >> Cheers, >> d. >> >> On 25 June 2017 at 23:13, Chris Jerdonek wrote: >>> I'm relatively new to async programming in Python and am thinking >>> through possibilities for doing "read-write" synchronization. >>> >>> I'm using asyncio, and the synchronization primitives that asyncio >>> exposes are relatively simple [1]. Have options for async read-write >>> synchronization already been discussed in any detail? 
>>> >>> I'm interested in designs where "readers" don't need to acquire a lock >>> -- only writers. It seems like one way to deal with the main race >>> condition I see that comes up would be to use loop.time(). Does that >>> ring a bell, or might there be a much simpler way? >>> >>> Thanks, >>> --Chris >>> >>> >>> [1] https://docs.python.org/3/library/asyncio-sync.html >>> _______________________________________________ >>> Async-sig mailing list >>> Async-sig at python.org >>> https://mail.python.org/mailman/listinfo/async-sig >>> Code of Conduct: https://www.python.org/psf/codeofconduct/ From yselivanov at gmail.com Mon Jun 26 13:48:37 2017 From: yselivanov at gmail.com (Yury Selivanov) Date: Mon, 26 Jun 2017 13:48:37 -0400 Subject: [Async-sig] async generator confusion or bug? In-Reply-To: References: Message-ID: Hi Dima, > On Jun 26, 2017, at 12:25 PM, Dima Tisnek wrote: > > Hi group, > > I'm trying to cross-use an sync generator across several async functions. > Is it allowed or a completely bad idea? (if so, why?) It is allowed, but leads to complex code. > > Here's MRE: > > import asyncio > > > async def generator(): > while True: > x = yield > print("received", x) > await asyncio.sleep(0.1) > > > async def user(name, g): > print("sending", name) > await g.asend(name) > > > async def helper(): > g = generator() > await g.asend(None) > > await asyncio.gather(*[user(f"user-{x}", g) for x in range(3)]) > > > if __name__ == "__main__": > asyncio.get_event_loop().run_until_complete(helper()) > > > And the output it produces when ran (py3.6.1): > > sending user-1 > received user-1 > sending user-2 > sending user-0 > received None > received None > > > Where are those None's coming from in the end? > Where did "user-0" and "user-1" data go? Interesting. If I replace "gather" with three consecutive awaits of "asend", everything works as expected. So there's some weird interaction of asend/gather, or maybe you did find a bug. Need more time to investigate. Would you mind to open an issue on bugs.python? Thanks, Yury From andrew.svetlov at gmail.com Mon Jun 26 13:53:25 2017 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Mon, 26 Jun 2017 17:53:25 +0000 Subject: [Async-sig] async generator confusion or bug? In-Reply-To: References: Message-ID: IIRC gather collects coroutines in arbitrary order, maybe it's the source of misunderstanding? On Mon, Jun 26, 2017 at 8:48 PM Yury Selivanov wrote: > Hi Dima, > > > On Jun 26, 2017, at 12:25 PM, Dima Tisnek wrote: > > > > Hi group, > > > > I'm trying to cross-use an sync generator across several async functions. > > Is it allowed or a completely bad idea? (if so, why?) > > It is allowed, but leads to complex code. > > > > > Here's MRE: > > > > import asyncio > > > > > > async def generator(): > > while True: > > x = yield > > print("received", x) > > await asyncio.sleep(0.1) > > > > > > async def user(name, g): > > print("sending", name) > > await g.asend(name) > > > > > > async def helper(): > > g = generator() > > await g.asend(None) > > > > await asyncio.gather(*[user(f"user-{x}", g) for x in range(3)]) > > > > > > if __name__ == "__main__": > > asyncio.get_event_loop().run_until_complete(helper()) > > > > > > And the output it produces when ran (py3.6.1): > > > > sending user-1 > > received user-1 > > sending user-2 > > sending user-0 > > received None > > received None > > > > > > Where are those None's coming from in the end? > > Where did "user-0" and "user-1" data go? > > > Interesting. 
If I replace "gather" with three consecutive awaits of > "asend", everything works as expected. So there's some weird interaction > of asend/gather, or maybe you did find a bug. Need more time to > investigate. > > Would you mind to open an issue on bugs.python? > > Thanks, > Yury > > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -- Thanks, Andrew Svetlov -------------- next part -------------- An HTML attachment was scrubbed... URL: From yselivanov at gmail.com Mon Jun 26 13:55:03 2017 From: yselivanov at gmail.com (Yury Selivanov) Date: Mon, 26 Jun 2017 13:55:03 -0400 Subject: [Async-sig] async generator confusion or bug? In-Reply-To: References: Message-ID: <27F15750-EA96-4A24-8B21-83A9D6B71F7A@gmail.com> > On Jun 26, 2017, at 1:53 PM, Andrew Svetlov wrote: > > IIRC gather collects coroutines in arbitrary order, maybe it's the source of misunderstanding? Yes, but that does not explain "receiving None" messages. Let's move this discussion to the bug tracker. Yury From chris.jerdonek at gmail.com Mon Jun 26 14:21:47 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Mon, 26 Jun 2017 11:21:47 -0700 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: On Mon, Jun 26, 2017 at 10:02 AM, Dima Tisnek wrote: > Chris, coming back to your use-case. > Do you want to synchronise side-effect creation/deletion for the > sanity of side-effects only? > Or do you imply that callers' actions are synchronised too? > In other words, do your callers use those directories out of band? If I understand your question, the former. The callers aren't / need not be synchronized, and they aren't aware of the underlying synchronization happening inside the higher-level create() and delete() functions they would be using. (These are the two higher-level functions described in my pseudocode.) The synchronization is needed inside these create() and delete() functions since the low-level directory operations occur in different threads (because they are wrapped by run_in_executor()). --Chris > > > P.S./O.T. when it comes to directories, you probably want hierarchical > locks rather than RW. > > > On 26 June 2017 at 11:28, Chris Jerdonek wrote: >> On Mon, Jun 26, 2017 at 1:43 AM, Dima Tisnek wrote: >>> Perhaps you can share your use-case, both as pseudo-code and a link to >>> real code. >>> >>> I'm specifically interested to see why/where you'd like to use a >>> read-write async lock, to evaluate if this is something common or >>> specific, and if, perhaps, some other paradigm (like queue, worker >>> pool, ...) may be more useful in general case. >>> >>> I'm also curious if a full set of async sync primitives may one day >>> lead to async monitors. Granted, simple use of async monitor is really >>> a future/promise, but perhaps there are complex use cases in the >>> UI/react domain with its promise/stream dichotomy. >> >> Thank you, Dima. In my last email I shared pseudo-code for an approach >> to read-write synchronization that is independent of use case. [1] >> >> For the use case, my original purpose in mind was to synchronize many >> small file operations on disk like creating and removing directories >> that possibly share intermediate segments. The real code isn't public. 
>> But these would be operations like os.makedirs() and os.removedirs() >> that would be wrapped by loop.run_in_executor() to be non-blocking. >> The directory removal using os.removedirs() is the operation I thought >> should require exclusive access, so as not to interfere with directory >> creations in progress. >> >> Perhaps a simpler, dirtier approach would be not to synchronize at all >> and simply retry directory creations that fail until they succeed. >> That could be enough to handle rare cases where simultaneous creation >> and removal causes an error. You could view this an EAFP approach. >> >> Either way, I think the process of thinking through patterns for >> read-write synchronization is helpful for getting a better general >> feel and understanding of async. >> >> --Chris >> >> >>> >>> Cheers, >>> d. >>> >>> On 25 June 2017 at 23:13, Chris Jerdonek wrote: >>>> I'm relatively new to async programming in Python and am thinking >>>> through possibilities for doing "read-write" synchronization. >>>> >>>> I'm using asyncio, and the synchronization primitives that asyncio >>>> exposes are relatively simple [1]. Have options for async read-write >>>> synchronization already been discussed in any detail? >>>> >>>> I'm interested in designs where "readers" don't need to acquire a lock >>>> -- only writers. It seems like one way to deal with the main race >>>> condition I see that comes up would be to use loop.time(). Does that >>>> ring a bell, or might there be a much simpler way? >>>> >>>> Thanks, >>>> --Chris >>>> >>>> >>>> [1] https://docs.python.org/3/library/asyncio-sync.html >>>> _______________________________________________ >>>> Async-sig mailing list >>>> Async-sig at python.org >>>> https://mail.python.org/mailman/listinfo/async-sig >>>> Code of Conduct: https://www.python.org/psf/codeofconduct/ From dimaqq at gmail.com Mon Jun 26 14:56:39 2017 From: dimaqq at gmail.com (Dima Tisnek) Date: Mon, 26 Jun 2017 20:56:39 +0200 Subject: [Async-sig] async generator confusion or bug? In-Reply-To: <27F15750-EA96-4A24-8B21-83A9D6B71F7A@gmail.com> References: <27F15750-EA96-4A24-8B21-83A9D6B71F7A@gmail.com> Message-ID: Thanks Yuri for quick reply. http://bugs.python.org/issue30773 created :) On 26 June 2017 at 19:55, Yury Selivanov wrote: > >> On Jun 26, 2017, at 1:53 PM, Andrew Svetlov wrote: >> >> IIRC gather collects coroutines in arbitrary order, maybe it's the source of misunderstanding? > > Yes, but that does not explain "receiving None" messages. Let's move this discussion to the bug tracker. 
> > Yury > From dimaqq at gmail.com Mon Jun 26 15:37:19 2017 From: dimaqq at gmail.com (Dima Tisnek) Date: Mon, 26 Jun 2017 21:37:19 +0200 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: Chris, here's a simple RWLock implementation and analysis: ``` import asyncio class RWLock: def __init__(self): self.cond = asyncio.Condition() self.readers = 0 self.writer = False async def lock(self, write=False): async with self.cond: # write requested: there cannot be readers or writers # read requested: there can be other readers but not writers while self.readers and write or self.writer: self.cond.wait() if write: self.writer = True else: self.readers += 1 # self.cond.notifyAll() would be good taste # however no waiters can be unblocked by this state change async def unlock(self, write=False): async with self.cond: if write: self.writer = False else: self.readers -= 1 self.cond.notifyAll() # notify (one) could be used `if not write:` ``` Note that `.unlock` cannot validate that it's called by same coroutine as `.lock` was. That's because there's no concept for "current_thread" for coroutines -- there can be many waiting on each other in the stack. Obv., this code could be nicer: * separate context managers for read and write cases * .unlock can be automatic (if self.writer: unlock_for_write()) at the cost of opening doors wide open to bugs * policy can be introduced if `.lock` identified itself (by an object(), since there's no thread id) in shared state * notifyAll() makes real life use O(N^2) for N being number of simultaneous write lock requests Feel free to use it :) On 26 June 2017 at 20:21, Chris Jerdonek wrote: > On Mon, Jun 26, 2017 at 10:02 AM, Dima Tisnek wrote: >> Chris, coming back to your use-case. >> Do you want to synchronise side-effect creation/deletion for the >> sanity of side-effects only? >> Or do you imply that callers' actions are synchronised too? >> In other words, do your callers use those directories out of band? > > If I understand your question, the former. The callers aren't / need > not be synchronized, and they aren't aware of the underlying > synchronization happening inside the higher-level create() and > delete() functions they would be using. (These are the two > higher-level functions described in my pseudocode.) > > The synchronization is needed inside these create() and delete() > functions since the low-level directory operations occur in different > threads (because they are wrapped by run_in_executor()). > > --Chris > >> >> >> P.S./O.T. when it comes to directories, you probably want hierarchical >> locks rather than RW. >> >> >> On 26 June 2017 at 11:28, Chris Jerdonek wrote: >>> On Mon, Jun 26, 2017 at 1:43 AM, Dima Tisnek wrote: >>>> Perhaps you can share your use-case, both as pseudo-code and a link to >>>> real code. >>>> >>>> I'm specifically interested to see why/where you'd like to use a >>>> read-write async lock, to evaluate if this is something common or >>>> specific, and if, perhaps, some other paradigm (like queue, worker >>>> pool, ...) may be more useful in general case. >>>> >>>> I'm also curious if a full set of async sync primitives may one day >>>> lead to async monitors. Granted, simple use of async monitor is really >>>> a future/promise, but perhaps there are complex use cases in the >>>> UI/react domain with its promise/stream dichotomy. >>> >>> Thank you, Dima. In my last email I shared pseudo-code for an approach >>> to read-write synchronization that is independent of use case. 
[1] >>> >>> For the use case, my original purpose in mind was to synchronize many >>> small file operations on disk like creating and removing directories >>> that possibly share intermediate segments. The real code isn't public. >>> But these would be operations like os.makedirs() and os.removedirs() >>> that would be wrapped by loop.run_in_executor() to be non-blocking. >>> The directory removal using os.removedirs() is the operation I thought >>> should require exclusive access, so as not to interfere with directory >>> creations in progress. >>> >>> Perhaps a simpler, dirtier approach would be not to synchronize at all >>> and simply retry directory creations that fail until they succeed. >>> That could be enough to handle rare cases where simultaneous creation >>> and removal causes an error. You could view this an EAFP approach. >>> >>> Either way, I think the process of thinking through patterns for >>> read-write synchronization is helpful for getting a better general >>> feel and understanding of async. >>> >>> --Chris >>> >>> >>>> >>>> Cheers, >>>> d. >>>> >>>> On 25 June 2017 at 23:13, Chris Jerdonek wrote: >>>>> I'm relatively new to async programming in Python and am thinking >>>>> through possibilities for doing "read-write" synchronization. >>>>> >>>>> I'm using asyncio, and the synchronization primitives that asyncio >>>>> exposes are relatively simple [1]. Have options for async read-write >>>>> synchronization already been discussed in any detail? >>>>> >>>>> I'm interested in designs where "readers" don't need to acquire a lock >>>>> -- only writers. It seems like one way to deal with the main race >>>>> condition I see that comes up would be to use loop.time(). Does that >>>>> ring a bell, or might there be a much simpler way? >>>>> >>>>> Thanks, >>>>> --Chris >>>>> >>>>> >>>>> [1] https://docs.python.org/3/library/asyncio-sync.html >>>>> _______________________________________________ >>>>> Async-sig mailing list >>>>> Async-sig at python.org >>>>> https://mail.python.org/mailman/listinfo/async-sig >>>>> Code of Conduct: https://www.python.org/psf/codeofconduct/ From dimaqq at gmail.com Mon Jun 26 15:38:40 2017 From: dimaqq at gmail.com (Dima Tisnek) Date: Mon, 26 Jun 2017 21:38:40 +0200 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: - self.cond.wait() + await self.cond.wait() I've no tests for this :P On 26 June 2017 at 21:37, Dima Tisnek wrote: > Chris, here's a simple RWLock implementation and analysis: > > ``` > import asyncio > > > class RWLock: > def __init__(self): > self.cond = asyncio.Condition() > self.readers = 0 > self.writer = False > > async def lock(self, write=False): > async with self.cond: > # write requested: there cannot be readers or writers > # read requested: there can be other readers but not writers > while self.readers and write or self.writer: > self.cond.wait() > if write: self.writer = True > else: self.readers += 1 > # self.cond.notifyAll() would be good taste > # however no waiters can be unblocked by this state change > > async def unlock(self, write=False): > async with self.cond: > if write: self.writer = False > else: self.readers -= 1 > self.cond.notifyAll() # notify (one) could be used `if not write:` > ``` > > Note that `.unlock` cannot validate that it's called by same coroutine > as `.lock` was. > That's because there's no concept for "current_thread" for coroutines > -- there can be many waiting on each other in the stack. 
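For illustration, usage of the RWLock quoted above would look roughly like
this (assuming the class is in scope and the `await self.cond.wait()` fix
from this message is applied; an untested sketch, not part of asyncio):

```
rwlock = RWLock()   # the class quoted above

async def reader():
    await rwlock.lock()              # write=False by default: shared/read acquisition
    try:
        ...                          # read shared state
    finally:
        await rwlock.unlock()        # release from the same coroutine, by convention

async def writer():
    await rwlock.lock(write=True)    # exclusive acquisition
    try:
        ...                          # mutate shared state
    finally:
        await rwlock.unlock(write=True)
```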
> > Obv., this code could be nicer: > * separate context managers for read and write cases > * .unlock can be automatic (if self.writer: unlock_for_write()) at the > cost of opening doors wide open to bugs > * policy can be introduced if `.lock` identified itself (by an > object(), since there's no thread id) in shared state > * notifyAll() makes real life use O(N^2) for N being number of > simultaneous write lock requests > > Feel free to use it :) > > > > On 26 June 2017 at 20:21, Chris Jerdonek wrote: >> On Mon, Jun 26, 2017 at 10:02 AM, Dima Tisnek wrote: >>> Chris, coming back to your use-case. >>> Do you want to synchronise side-effect creation/deletion for the >>> sanity of side-effects only? >>> Or do you imply that callers' actions are synchronised too? >>> In other words, do your callers use those directories out of band? >> >> If I understand your question, the former. The callers aren't / need >> not be synchronized, and they aren't aware of the underlying >> synchronization happening inside the higher-level create() and >> delete() functions they would be using. (These are the two >> higher-level functions described in my pseudocode.) >> >> The synchronization is needed inside these create() and delete() >> functions since the low-level directory operations occur in different >> threads (because they are wrapped by run_in_executor()). >> >> --Chris >> >>> >>> >>> P.S./O.T. when it comes to directories, you probably want hierarchical >>> locks rather than RW. >>> >>> >>> On 26 June 2017 at 11:28, Chris Jerdonek wrote: >>>> On Mon, Jun 26, 2017 at 1:43 AM, Dima Tisnek wrote: >>>>> Perhaps you can share your use-case, both as pseudo-code and a link to >>>>> real code. >>>>> >>>>> I'm specifically interested to see why/where you'd like to use a >>>>> read-write async lock, to evaluate if this is something common or >>>>> specific, and if, perhaps, some other paradigm (like queue, worker >>>>> pool, ...) may be more useful in general case. >>>>> >>>>> I'm also curious if a full set of async sync primitives may one day >>>>> lead to async monitors. Granted, simple use of async monitor is really >>>>> a future/promise, but perhaps there are complex use cases in the >>>>> UI/react domain with its promise/stream dichotomy. >>>> >>>> Thank you, Dima. In my last email I shared pseudo-code for an approach >>>> to read-write synchronization that is independent of use case. [1] >>>> >>>> For the use case, my original purpose in mind was to synchronize many >>>> small file operations on disk like creating and removing directories >>>> that possibly share intermediate segments. The real code isn't public. >>>> But these would be operations like os.makedirs() and os.removedirs() >>>> that would be wrapped by loop.run_in_executor() to be non-blocking. >>>> The directory removal using os.removedirs() is the operation I thought >>>> should require exclusive access, so as not to interfere with directory >>>> creations in progress. >>>> >>>> Perhaps a simpler, dirtier approach would be not to synchronize at all >>>> and simply retry directory creations that fail until they succeed. >>>> That could be enough to handle rare cases where simultaneous creation >>>> and removal causes an error. You could view this an EAFP approach. >>>> >>>> Either way, I think the process of thinking through patterns for >>>> read-write synchronization is helpful for getting a better general >>>> feel and understanding of async. >>>> >>>> --Chris >>>> >>>> >>>>> >>>>> Cheers, >>>>> d. 
>>>>> >>>>> On 25 June 2017 at 23:13, Chris Jerdonek wrote: >>>>>> I'm relatively new to async programming in Python and am thinking >>>>>> through possibilities for doing "read-write" synchronization. >>>>>> >>>>>> I'm using asyncio, and the synchronization primitives that asyncio >>>>>> exposes are relatively simple [1]. Have options for async read-write >>>>>> synchronization already been discussed in any detail? >>>>>> >>>>>> I'm interested in designs where "readers" don't need to acquire a lock >>>>>> -- only writers. It seems like one way to deal with the main race >>>>>> condition I see that comes up would be to use loop.time(). Does that >>>>>> ring a bell, or might there be a much simpler way? >>>>>> >>>>>> Thanks, >>>>>> --Chris >>>>>> >>>>>> >>>>>> [1] https://docs.python.org/3/library/asyncio-sync.html >>>>>> _______________________________________________ >>>>>> Async-sig mailing list >>>>>> Async-sig at python.org >>>>>> https://mail.python.org/mailman/listinfo/async-sig >>>>>> Code of Conduct: https://www.python.org/psf/codeofconduct/ From njs at pobox.com Mon Jun 26 18:22:47 2017 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 26 Jun 2017 15:22:47 -0700 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: On Mon, Jun 26, 2017 at 12:37 PM, Dima Tisnek wrote: > Note that `.unlock` cannot validate that it's called by same coroutine > as `.lock` was. > That's because there's no concept for "current_thread" for coroutines > -- there can be many waiting on each other in the stack. This is also a surprisingly complex design question. Your async RWLock actually matches how Python's threading.Lock works: you're explicitly allowed to acquire it in one thread and then release it from another. People sometimes find this surprising, and it prevents some kinds of error-checking. For example, this code *probably* deadlocks: lock = threading.Lock() lock.acquire() # probably deadlocks lock.acquire() but the interpreter can't detect this and raise an error, because in theory some other thread might come along and call lock.release(). On the other hand, it is sometimes useful to be able to acquire a lock in one thread and then "hand it off" to e.g. a child thread. (Reentrant locks, OTOH, do have an implicit concept of ownership -- they kind of have to, if you think about it -- so even if you don't need reentrancy they can be useful because they'll raise a noisy error if you accidentally try to release a lock from the wrong thread.) In trio we do have a current_task() concept, and the basic trio.Lock [1] does track ownership, and I even have a Semaphore-equivalent that tracks ownership as well [2]. The motivation here is that I want to provide nice debugging tools to detect things like deadlocks, which is only possible when your primitives have some kind of ownership tracking. So far this just means that we detect and error on these kinds of simple cases: lock = trio.Lock() await lock.acquire() # raises an error await lock.acquire() But I have ambitions to do more [3] :-). However, this raises some tricky design questions around how and whether to support the "passing ownership" cases. Of course you can always fall back on something like a raw Semaphore, but it turns out that trio.run_in_worker_thread (our equivalent of asyncio's run_in_executor) actually wants to do something like pass ownership from the calling task into the spawned thread. 
So far I've handled this by adding acquire_on_behalf_of/release_on_behalf_of methods to the primitive that run_in_worker_thread uses, but this isn't really fully baked yet. -n [1] https://trio.readthedocs.io/en/latest/reference-core.html#trio.Lock [2] https://trio.readthedocs.io/en/latest/reference-core.html#trio.CapacityLimiter [3] https://github.com/python-trio/trio/issues/182 -- Nathaniel J. Smith -- https://vorpus.org From chris.jerdonek at gmail.com Mon Jun 26 21:41:16 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Mon, 26 Jun 2017 18:41:16 -0700 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: On Mon, Jun 26, 2017 at 12:37 PM, Dima Tisnek wrote: > Chris, here's a simple RWLock implementation and analysis: > ... > Obv., this code could be nicer: > * separate context managers for read and write cases > * .unlock can be automatic (if self.writer: unlock_for_write()) at the > cost of opening doors wide open to bugs > * policy can be introduced if `.lock` identified itself (by an > object(), since there's no thread id) in shared state > * notifyAll() makes real life use O(N^2) for N being number of > simultaneous write lock requests > > Feel free to use it :) Thanks, Dima. However, as I said in my earlier posts, I'm actually more interested in exploring approaches to synchronizing readers and writers in async code that don't require locking on reads. (This is also why I've always been saying RW "synchronization" instead of RW "locking.") I'm interested in this because I think the single-threadedness of the event loop might be what makes this simplification possible over the traditional multi-threaded approach (along the lines Guido was mentioning). It also makes the "fast path" faster. Lastly, the API for the callers is just to call read() or write(), so there is no need for a general RWLock construct or to work through RWLock semantics of the sort Nathaniel mentioned. I coded up a working version of the pseudo-code I included in an earlier email so people can see how it works. I included it at the bottom of this email and also in this gist: https://gist.github.com/cjerdonek/858e1467f768ee045849ea81ddb47901 --Chris import asyncio import random NO_READERS_EVENT = asyncio.Event() NO_WRITERS_EVENT = asyncio.Event() WRITE_LOCK = asyncio.Lock() class State: reader_count = 0 mock_file_data = 'initial' async def read_file(): data = State.mock_file_data print(f'read: {data}') async def write_file(data): print(f'writing: {data}') State.mock_file_data = data await asyncio.sleep(0.5) async def write(data): async with WRITE_LOCK: NO_WRITERS_EVENT.clear() # Wait for the readers to finish. await NO_READERS_EVENT.wait() # Do the file write. await write_file(data) # Awaken waiting readers. NO_WRITERS_EVENT.set() async def read(): while True: await NO_WRITERS_EVENT.wait() # Check the writer_lock again in case a new writer has # started writing. if WRITE_LOCK.locked(): print(f'cannot read: still writing: {State.mock_file_data!r}') else: # Otherwise, we can do the read. break State.reader_count += 1 if State.reader_count == 1: NO_READERS_EVENT.clear() # Do the file read. await read_file() State.reader_count -= 1 if State.reader_count == 0: # Awaken any waiting writer. 
NO_READERS_EVENT.set() async def delayed(coro): await asyncio.sleep(random.random()) await coro async def test_synchronization(): NO_READERS_EVENT.set() NO_WRITERS_EVENT.set() coros = [ read(), read(), read(), read(), read(), read(), write('apple'), write('banana'), ] # Add a delay before each coroutine for variety. coros = [delayed(coro) for coro in coros] await asyncio.gather(*coros) if __name__ == '__main__': asyncio.get_event_loop().run_until_complete(test_synchronization()) # Sample output: # # read: initial # read: initial # read: initial # read: initial # writing: banana # writing: apple # cannot read: still writing: 'apple' # cannot read: still writing: 'apple' # read: apple # read: apple From yselivanov at gmail.com Mon Jun 26 21:54:50 2017 From: yselivanov at gmail.com (Yury Selivanov) Date: Mon, 26 Jun 2017 21:54:50 -0400 Subject: [Async-sig] async generator confusion or bug? In-Reply-To: References: Message-ID: (Posting here, rather than to the issue, because I think this actually needs more exposure). I looked at the code (genobject.c) and I think I know what's going on here. Normally, when you work with an asynchronous generator (AG) you interact with it through "asend" or "athrow" *coroutines*. Each AG has its own private state, and when you await on "asend" coroutine you are changing that state. The state changes on each "asend.send" or "asend.throw" call. The normal relation between AGs and asends is 1 to 1. AG - asend However, in your example you change that to 1 to many: asend / AG - asend \ asend Both 'ensure_future' and 'gather' will wrap each asend coroutine into an 'asyncio.Task'. And each Task will call "asend.send(None)" right in its '__init__', which changes the underlying *shared* AG instance completely out of order. I don't see how this can be fixed (or that it even needs to be fixed), so I propose to simply raise an exception if an AG has more than one asends changing it state *at the same time*. Thoughts? Yury > On Jun 26, 2017, at 12:25 PM, Dima Tisnek wrote: > > Hi group, > > I'm trying to cross-use an sync generator across several async functions. > Is it allowed or a completely bad idea? (if so, why?) > > Here's MRE: > > import asyncio > > > async def generator(): > while True: > x = yield > print("received", x) > await asyncio.sleep(0.1) > > > async def user(name, g): > print("sending", name) > await g.asend(name) > > > async def helper(): > g = generator() > await g.asend(None) > > await asyncio.gather(*[user(f"user-{x}", g) for x in range(3)]) > > > if __name__ == "__main__": > asyncio.get_event_loop().run_until_complete(helper()) > > > And the output it produces when ran (py3.6.1): > > sending user-1 > received user-1 > sending user-2 > sending user-0 > received None > received None > > > Where are those None's coming from in the end? > Where did "user-0" and "user-1" data go? > > Is this a bug, or am I hopelessly confused? > Thanks! > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ From guido at python.org Mon Jun 26 22:46:53 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 26 Jun 2017 19:46:53 -0700 Subject: [Async-sig] async generator confusion or bug? In-Reply-To: References: Message-ID: It does look complicated. 
The crux of the problem seems to be that helper() is essentially async def helper(): g = generator() await g.asend(None) await asyncio.gather(user("user-0", g), user("user-1", g), user("user-2", g)) which means that it is attempting to wait for three calls to user() concurrently. This feels to me similar to three threads attempting to call next() or send() on the same generator in parallel, which AFAIR is explicitly guarded against somewhere. So detecting disallowing a similar situation for async generators makes sense (and can be considered a bugfix). On Mon, Jun 26, 2017 at 6:54 PM, Yury Selivanov wrote: > (Posting here, rather than to the issue, because I think this actually > needs more exposure). > > I looked at the code (genobject.c) and I think I know what's going on > here. Normally, when you work with an asynchronous generator (AG) you > interact with it through "asend" or "athrow" *coroutines*. > > Each AG has its own private state, and when you await on "asend" coroutine > you are changing that state. The state changes on each "asend.send" or > "asend.throw" call. The normal relation between AGs and asends is 1 to 1. > > AG - asend > > However, in your example you change that to 1 to many: > > asend > / > AG - asend > \ > asend > > Both 'ensure_future' and 'gather' will wrap each asend coroutine into an > 'asyncio.Task'. And each Task will call "asend.send(None)" right in its > '__init__', which changes the underlying *shared* AG instance completely > out of order. > > I don't see how this can be fixed (or that it even needs to be fixed), so > I propose to simply raise an exception if an AG has more than one asends > changing it state *at the same time*. > > Thoughts? > > Yury > > > On Jun 26, 2017, at 12:25 PM, Dima Tisnek wrote: > > > > Hi group, > > > > I'm trying to cross-use an sync generator across several async functions. > > Is it allowed or a completely bad idea? (if so, why?) > > > > Here's MRE: > > > > import asyncio > > > > > > async def generator(): > > while True: > > x = yield > > print("received", x) > > await asyncio.sleep(0.1) > > > > > > async def user(name, g): > > print("sending", name) > > await g.asend(name) > > > > > > async def helper(): > > g = generator() > > await g.asend(None) > > > > await asyncio.gather(*[user(f"user-{x}", g) for x in range(3)]) > > > > > > if __name__ == "__main__": > > asyncio.get_event_loop().run_until_complete(helper()) > > > > > > And the output it produces when ran (py3.6.1): > > > > sending user-1 > > received user-1 > > sending user-2 > > sending user-0 > > received None > > received None > > > > > > Where are those None's coming from in the end? > > Where did "user-0" and "user-1" data go? > > > > Is this a bug, or am I hopelessly confused? > > Thanks! > > _______________________________________________ > > Async-sig mailing list > > Async-sig at python.org > > https://mail.python.org/mailman/listinfo/async-sig > > Code of Conduct: https://www.python.org/psf/codeofconduct/ > > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Jun 26 23:19:18 2017 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 26 Jun 2017 20:19:18 -0700 Subject: [Async-sig] async generator confusion or bug? 
In-Reply-To: References: Message-ID: I actually thought that async generators already guarded against this using their ag_running attribute. If I try running Dima's example with async_generator, I get: sending user-1 received user-1 sending user-2 sending user-0 Traceback (most recent call last): [...] ValueError: async generator already executing The relevant code is here: https://github.com/njsmith/async_generator/blob/e303e077c9dcb5880c0ce9930d560b282f8288ec/async_generator/impl.py#L273-L279 But I added this in the first place because I thought it was needed for compatibility with native async generators :-) -n On Jun 26, 2017 6:54 PM, "Yury Selivanov" wrote: > (Posting here, rather than to the issue, because I think this actually > needs more exposure). > > I looked at the code (genobject.c) and I think I know what's going on > here. Normally, when you work with an asynchronous generator (AG) you > interact with it through "asend" or "athrow" *coroutines*. > > Each AG has its own private state, and when you await on "asend" coroutine > you are changing that state. The state changes on each "asend.send" or > "asend.throw" call. The normal relation between AGs and asends is 1 to 1. > > AG - asend > > However, in your example you change that to 1 to many: > > asend > / > AG - asend > \ > asend > > Both 'ensure_future' and 'gather' will wrap each asend coroutine into an > 'asyncio.Task'. And each Task will call "asend.send(None)" right in its > '__init__', which changes the underlying *shared* AG instance completely > out of order. > > I don't see how this can be fixed (or that it even needs to be fixed), so > I propose to simply raise an exception if an AG has more than one asends > changing it state *at the same time*. > > Thoughts? > > Yury > > > On Jun 26, 2017, at 12:25 PM, Dima Tisnek wrote: > > > > Hi group, > > > > I'm trying to cross-use an sync generator across several async functions. > > Is it allowed or a completely bad idea? (if so, why?) > > > > Here's MRE: > > > > import asyncio > > > > > > async def generator(): > > while True: > > x = yield > > print("received", x) > > await asyncio.sleep(0.1) > > > > > > async def user(name, g): > > print("sending", name) > > await g.asend(name) > > > > > > async def helper(): > > g = generator() > > await g.asend(None) > > > > await asyncio.gather(*[user(f"user-{x}", g) for x in range(3)]) > > > > > > if __name__ == "__main__": > > asyncio.get_event_loop().run_until_complete(helper()) > > > > > > And the output it produces when ran (py3.6.1): > > > > sending user-1 > > received user-1 > > sending user-2 > > sending user-0 > > received None > > received None > > > > > > Where are those None's coming from in the end? > > Where did "user-0" and "user-1" data go? > > > > Is this a bug, or am I hopelessly confused? > > Thanks! > > _______________________________________________ > > Async-sig mailing list > > Async-sig at python.org > > https://mail.python.org/mailman/listinfo/async-sig > > Code of Conduct: https://www.python.org/psf/codeofconduct/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yselivanov at gmail.com Tue Jun 27 00:13:05 2017 From: yselivanov at gmail.com (Yury Selivanov) Date: Tue, 27 Jun 2017 00:13:05 -0400 Subject: [Async-sig] async generator confusion or bug? In-Reply-To: References: Message-ID: <5DD12A9F-18FF-4F15-A373-79FB3341D04F@gmail.com> Thanks Guido and Nathaniel. I'll work on a fix. 
Yury > On Jun 26, 2017, at 11:19 PM, Nathaniel Smith wrote: > > I actually thought that async generators already guarded against this using their ag_running attribute. If I try running Dima's example with async_generator, I get: > > sending user-1 > received user-1 > sending user-2 > sending user-0 > Traceback (most recent call last): > [...] > ValueError: async generator already executing > > The relevant code is here: > https://github.com/njsmith/async_generator/blob/e303e077c9dcb5880c0ce9930d560b282f8288ec/async_generator/impl.py#L273-L279 > > But I added this in the first place because I thought it was needed for compatibility with native async generators :-) > > -n > > On Jun 26, 2017 6:54 PM, "Yury Selivanov" wrote: > (Posting here, rather than to the issue, because I think this actually needs more exposure). > > I looked at the code (genobject.c) and I think I know what's going on here. Normally, when you work with an asynchronous generator (AG) you interact with it through "asend" or "athrow" *coroutines*. > > Each AG has its own private state, and when you await on "asend" coroutine you are changing that state. The state changes on each "asend.send" or "asend.throw" call. The normal relation between AGs and asends is 1 to 1. > > AG - asend > > However, in your example you change that to 1 to many: > > asend > / > AG - asend > \ > asend > > Both 'ensure_future' and 'gather' will wrap each asend coroutine into an 'asyncio.Task'. And each Task will call "asend.send(None)" right in its '__init__', which changes the underlying *shared* AG instance completely out of order. > > I don't see how this can be fixed (or that it even needs to be fixed), so I propose to simply raise an exception if an AG has more than one asends changing it state *at the same time*. > > Thoughts? > > Yury > > > On Jun 26, 2017, at 12:25 PM, Dima Tisnek wrote: > > > > Hi group, > > > > I'm trying to cross-use an sync generator across several async functions. > > Is it allowed or a completely bad idea? (if so, why?) > > > > Here's MRE: > > > > import asyncio > > > > > > async def generator(): > > while True: > > x = yield > > print("received", x) > > await asyncio.sleep(0.1) > > > > > > async def user(name, g): > > print("sending", name) > > await g.asend(name) > > > > > > async def helper(): > > g = generator() > > await g.asend(None) > > > > await asyncio.gather(*[user(f"user-{x}", g) for x in range(3)]) > > > > > > if __name__ == "__main__": > > asyncio.get_event_loop().run_until_complete(helper()) > > > > > > And the output it produces when ran (py3.6.1): > > > > sending user-1 > > received user-1 > > sending user-2 > > sending user-0 > > received None > > received None > > > > > > Where are those None's coming from in the end? > > Where did "user-0" and "user-1" data go? > > > > Is this a bug, or am I hopelessly confused? > > Thanks! > > _______________________________________________ > > Async-sig mailing list > > Async-sig at python.org > > https://mail.python.org/mailman/listinfo/async-sig > > Code of Conduct: https://www.python.org/psf/codeofconduct/ > From chris.jerdonek at gmail.com Tue Jun 27 03:29:10 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Tue, 27 Jun 2017 00:29:10 -0700 Subject: [Async-sig] question re: asyncio.Condition lock acquisition order Message-ID: I have a couple questions about asyncio's synchronization primitives. Say a coroutine acquires an asyncio Condition's underlying lock, calls notify() (or notify_all()), and then releases the lock. 
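Concretely, the scenario is something like this (a minimal sketch):

```
import asyncio

cond = asyncio.Condition()

async def notifier():
    async with cond:          # acquires the Condition's underlying lock
        cond.notify_all()     # wakes coroutines blocked in cond.wait()
    # the underlying lock is released here

async def waits_on_condition():
    async with cond:
        await cond.wait()     # a coroutine waiting on the Condition itself

async def waits_on_lock():
    async with cond:          # a coroutine waiting to acquire the underlying lock
        ...
```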
In terms of which coroutines will acquire the lock next, is any preference given between (1) coroutines waiting to acquire the underlying lock, and (2) coroutines waiting on the Condition object itself? The documentation doesn't seem to say anything about this. Also, more generally (and I'm sure this question gets asked a lot), does asyncio provide any guarantees about the order in which awaiting coroutines are awakened? For example, for synchronization primitives, does each primitive maintain a FIFO queue of who will be awakened next, or are there no guarantees about the order? Thanks a lot, --Chris From andrew.svetlov at gmail.com Tue Jun 27 03:48:58 2017 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Tue, 27 Jun 2017 07:48:58 +0000 Subject: [Async-sig] question re: asyncio.Condition lock acquisition order In-Reply-To: References: Message-ID: AFAIK No any guarantee On Tue, Jun 27, 2017 at 10:29 AM Chris Jerdonek wrote: > I have a couple questions about asyncio's synchronization primitives. > > Say a coroutine acquires an asyncio Condition's underlying lock, calls > notify() (or notify_all()), and then releases the lock. In terms of > which coroutines will acquire the lock next, is any preference given > between (1) coroutines waiting to acquire the underlying lock, and (2) > coroutines waiting on the Condition object itself? The documentation > doesn't seem to say anything about this. > > Also, more generally (and I'm sure this question gets asked a lot), > does asyncio provide any guarantees about the order in which awaiting > coroutines are awakened? For example, for synchronization primitives, > does each primitive maintain a FIFO queue of who will be awakened > next, or are there no guarantees about the order? > > Thanks a lot, > --Chris > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -- Thanks, Andrew Svetlov -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Jun 27 06:29:23 2017 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 27 Jun 2017 03:29:23 -0700 Subject: [Async-sig] question re: asyncio.Condition lock acquisition order In-Reply-To: References: Message-ID: On Tue, Jun 27, 2017 at 12:29 AM, Chris Jerdonek wrote: > I have a couple questions about asyncio's synchronization primitives. > > Say a coroutine acquires an asyncio Condition's underlying lock, calls > notify() (or notify_all()), and then releases the lock. In terms of > which coroutines will acquire the lock next, is any preference given > between (1) coroutines waiting to acquire the underlying lock, and (2) > coroutines waiting on the Condition object itself? The documentation > doesn't seem to say anything about this. > > Also, more generally (and I'm sure this question gets asked a lot), > does asyncio provide any guarantees about the order in which awaiting > coroutines are awakened? For example, for synchronization primitives, > does each primitive maintain a FIFO queue of who will be awakened > next, or are there no guarantees about the order? In fact asyncio.Lock's implementation is careful to maintain strict FIFO fairness, i.e. whoever calls acquire() first is guaranteed to get the lock first. Whether this is something you feel you can depend on I'll leave to your conscience :-). 
Though the docs do say "only one coroutine proceeds when a release() call resets the state to unlocked; first coroutine which is blocked in acquire() is being processed", which I think might be intended to say that they're FIFO-fair? asyncio.Condition internally maintains a FIFO list so that notify(1) is guaranteed to wake up the task that called wait() first. But if you notify multiple tasks at once, then I don't think there's any guarantee that they'll get the lock in FIFO order -- basically notify{,_all} just wakes them up, and then the next time they run they try to call lock.acquire(), so it depends on the underlying scheduler to decide who gets to run first. There's also an edge condition where if a task blocked in wait() gets cancelled, then... well, it's complicated. If notify has not been called yet, then it wakes up, reacquires the lock, and then raises CancelledError. If it's already been notified and is waiting to acquire the lock, then I think it goes to the back of the line of tasks waiting for the lock, but otherwise swallows the CancelledError. And then returns None, which is not a documented return value. In case it's interesting for comparison -- hopefully these comments aren't getting annoying -- trio does provide documented fairness guarantees for all its synchronization primitives: https://trio.readthedocs.io/en/latest/reference-core.html#fairness There's some question about whether this is a great idea or what the best definition of "fairness" is, so it also provides trio.StrictFIFOLock for cases where FIFO fairness is actually a requirement for correctness and you want to document this in the code: https://trio.readthedocs.io/en/latest/reference-core.html#trio.StrictFIFOLock And trio.Condition.notify moves tasks from the Condition wait queue directly to the Lock wait queue while preserving FIFO order. (The trade-off is that this means that trio.Condition can only be used with trio.Lock exactly, while asyncio.Condition works with any object that provides the asyncio.Lock interface.) Also, it has a similar edge case around cancellation, because cancellation and condition variables are very tricky :-). Though I guess trio's version arguably a little less quirky because it acts the same regardless of whether it's in the wait-for-notify or wait-for-lock phase, it will only ever drop to the back of the line once, and cancellation in trio is level-triggered rather than edge-triggered so discarding the notification isn't a big deal. -n -- Nathaniel J. Smith -- https://vorpus.org From chris.jerdonek at gmail.com Tue Jun 27 07:15:28 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Tue, 27 Jun 2017 04:15:28 -0700 Subject: [Async-sig] question re: asyncio.Condition lock acquisition order In-Reply-To: References: Message-ID: On Tue, Jun 27, 2017 at 3:29 AM, Nathaniel Smith wrote: > In fact asyncio.Lock's implementation is careful to maintain strict > FIFO fairness, i.e. whoever calls acquire() first is guaranteed to get > the lock first. Whether this is something you feel you can depend on > I'll leave to your conscience :-). Though the docs do say "only one > coroutine proceeds when a release() call resets the state to unlocked; > first coroutine which is blocked in acquire() is being processed", > which I think might be intended to say that they're FIFO-fair? > ... Thanks. 
All that is really interesting, especially the issue you linked to in the Trio docs re: fairness: https://github.com/python-trio/trio/issues/54 Thinking through the requirements I want for my RW synchronization use case in more detail, I think I want the completion of any "write" to be followed by exhausting all "reads." I'm not sure if that qualifies as barging. Hopefully this will be implementable easily enough with the available primitives, given what you say. Can anything similar be said not about synchronization primitives, but about awakening coroutines in general? Do event loops maintain strict FIFO queues when it comes to deciding which awaiting coroutine to awaken? (I hope that question makes sense!) --Chris From njs at pobox.com Tue Jun 27 18:48:40 2017 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 27 Jun 2017 15:48:40 -0700 Subject: [Async-sig] question re: asyncio.Condition lock acquisition order In-Reply-To: References: Message-ID: On Tue, Jun 27, 2017 at 4:15 AM, Chris Jerdonek wrote: > On Tue, Jun 27, 2017 at 3:29 AM, Nathaniel Smith wrote: >> In fact asyncio.Lock's implementation is careful to maintain strict >> FIFO fairness, i.e. whoever calls acquire() first is guaranteed to get >> the lock first. Whether this is something you feel you can depend on >> I'll leave to your conscience :-). Though the docs do say "only one >> coroutine proceeds when a release() call resets the state to unlocked; >> first coroutine which is blocked in acquire() is being processed", >> which I think might be intended to say that they're FIFO-fair? >> ... > > Thanks. All that is really interesting, especially the issue you > linked to in the Trio docs re: fairness: > https://github.com/python-trio/trio/issues/54 > > Thinking through the requirements I want for my RW synchronization use > case in more detail, I think I want the completion of any "write" to > be followed by exhausting all "reads." I'm not sure if that qualifies > as barging. Hopefully this will be implementable easily enough with > the available primitives, given what you say. I've only seen the term "barging" used in discussions of regular locks, though I'm not an expert, just someone with eclectic reading habits. But RWLocks have some extra subtleties that "barging" vs "non-barging" don't really capture. Say you have the following sequence: task w0 acquires for write task r1 attempts to acquire for read (and blocks) task r2 attempts to acquire for read (and blocks) task w1 attempts to acquire for write (and blocks) task r3 attempts to acquire for read (and blocks) task w0 releases the write lock task r4 attempts to acquire for read What happens? If r1+r2+r3+r4 are able to take the lock, then you're "read-biased" (which is essentially the same as barging for readers, but it can be extra dangerous for RWLocks, because if you have a heavy read load it's very easy for readers to starve writers). If tasks r1+r2 wake up, but r3+r4 have to wait, then you're "task-fair" (the equivalent of FIFO fairness for RWLocks). If r1+r2+r3 wake up, but r4 has to wait, then you're "phase fair". There are some notes here that are poorly organized but perhaps retain some small interest: https://github.com/python-trio/trio/blob/master/trio/_core/_parking_lot.py If I ever implement one of these it'll probably be phase-fair, because (a) it has some nice theoretical properties, and (b) it happens to be particularly easy to implement using my existing wait-queue primitive, and task-fair isn't :-). 
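To spell that out with the sequence above, here is who could end up holding
the lock immediately after w0 releases, under each policy (just a summary of
the cases described, not an implementation):

```
outcomes = {
    "read-biased": {"r1", "r2", "r3", "r4"},  # every reader barges in; w1 can starve
    "task-fair":   {"r1", "r2"},              # FIFO: w1 goes next, then r3 and r4
    "phase-fair":  {"r1", "r2", "r3"},        # r4 waits for the next read phase
}
```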
> Can anything similar be said not about synchronization primitives, but > about awakening coroutines in general? Do event loops maintain strict > FIFO queues when it comes to deciding which awaiting coroutine to > awaken? (I hope that question makes sense!) Something like that. There's some complication because there are two ways that a task can become runnable: directly by another piece of code in the system (e.g., releasing a lock), or via some I/O (e.g., bytes arriving on a socket). If you really wanted to ensure that tasks ran exactly in the order that they became runnable, then you need to check for I/O constantly, but this is inefficient. So usually what cooperative scheduling systems guarantee is a kind of "batched FIFO": they do a poll for I/O (a which point they may discover some new runnable tasks), and then take a snapshot of all the runnable tasks, and then run all of the tasks in their snapshot once before considering any new tasks. So this isn't quite strict FIFO, but it's fair-like-FIFO (the discrepancy between when each task should run under strict FIFO, and when it actually runs, is bounded by the number of active tasks; there's no possibility of a runnable task being left unscheduled for an arbitrary amount of time). Curio used to allow woken-by-code tasks to starve out woken-by-I/O tasks, and you might be interested in the discussion in the PR that changed that: https://github.com/dabeaz/curio/pull/127 In trio I actually randomize the order within each batch because I don't want people to accidentally encode assumptions about the scheduler (e.g. in their test suites). This is because I have hopes of eventually doing something fancier :-): https://github.com/python-trio/trio/issues/32 ("If you liked issue #54, you'll love #32!"). Many systems are not this paranoid though, and actually are strict-FIFO for tasks that are woken-by-code - but this is definitely one of those features where depending on it is dubious. In asyncio for example the event loop is pluggable and the scheduling policy is a feature of the event loop, so even if the implementation in the stdlib is strict FIFO you don't know about third-party ones. -n -- Nathaniel J. Smith -- https://vorpus.org From njs at pobox.com Tue Jun 27 18:52:48 2017 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 27 Jun 2017 15:52:48 -0700 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: On Mon, Jun 26, 2017 at 6:41 PM, Chris Jerdonek wrote: > On Mon, Jun 26, 2017 at 12:37 PM, Dima Tisnek wrote: >> Chris, here's a simple RWLock implementation and analysis: >> ... >> Obv., this code could be nicer: >> * separate context managers for read and write cases >> * .unlock can be automatic (if self.writer: unlock_for_write()) at the >> cost of opening doors wide open to bugs >> * policy can be introduced if `.lock` identified itself (by an >> object(), since there's no thread id) in shared state >> * notifyAll() makes real life use O(N^2) for N being number of >> simultaneous write lock requests >> >> Feel free to use it :) > > Thanks, Dima. However, as I said in my earlier posts, I'm actually > more interested in exploring approaches to synchronizing readers and > writers in async code that don't require locking on reads. 
> (This is also why I've always been saying RW "synchronization" instead
> of RW "locking.")
>
> I'm interested in this because I think the single-threadedness of the
> event loop might be what makes this simplification possible over the
> traditional multi-threaded approach (along the lines Guido was
> mentioning). It also makes the "fast path" faster. Lastly, the API for
> the callers is just to call read() or write(), so there is no need for
> a general RWLock construct or to work through RWLock semantics of the
> sort Nathaniel mentioned.
>
> I coded up a working version of the pseudo-code I included in an
> earlier email so people can see how it works. I included it at the
> bottom of this email and also in this gist:
> https://gist.github.com/cjerdonek/858e1467f768ee045849ea81ddb47901

FWIW, to me this just looks like an implementation of an async RWLock?
It's common for async synchronization primitives to be simpler
internally than threading primitives because the async ones don't need
to worry about being pre-empted at arbitrary points, but from the
caller's point of view you still have basically a blocking acquire()
method, and then you do your stuff (potentially blocking while you're
at it), and then you call a non-blocking release(), just like every
other async lock.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

From chris.jerdonek at gmail.com Tue Jun 27 19:39:02 2017
From: chris.jerdonek at gmail.com (Chris Jerdonek)
Date: Tue, 27 Jun 2017 16:39:02 -0700
Subject: [Async-sig] "read-write" synchronization
In-Reply-To:
References:
Message-ID:

On Tue, Jun 27, 2017 at 3:52 PM, Nathaniel Smith wrote:
> On Mon, Jun 26, 2017 at 6:41 PM, Chris Jerdonek
> wrote:
>> I coded up a working version of the pseudo-code I included in an
>> earlier email so people can see how it works. I included it at the
>> bottom of this email and also in this gist:
>> https://gist.github.com/cjerdonek/858e1467f768ee045849ea81ddb47901
>
> FWIW, to me this just looks like an implementation of an async RWLock?
> It's common for async synchronization primitives to be simpler
> internally than threading primitives because the async ones don't need
> to worry about being pre-empted at arbitrary points, but from the
> caller's point of view you still have basically a blocking acquire()
> method, and then you do your stuff (potentially blocking while you're
> at it), and then you call a non-blocking release(), just like every
> other async lock.

Yes and no, I think. Internally, the implementation does just amount to
applying an async RWLock. But the difference I was getting at is that
the use case doesn't require exposing the RWLock in the API (e.g.
underlying acquire() and release() methods). This means you can avoid
having to think about some of the tricky design questions you started
discussing in an earlier email of yours:

> This is also a surprisingly complex design question. Your async RWLock
> actually matches how Python's threading.Lock works: you're explicitly
> allowed to acquire it in one thread and then release it from another.
> People sometimes find this surprising, and it prevents some kinds of
> error-checking. For example, this code *probably* deadlocks:
> ...

So my point was just that if the API is narrowed to exposing only
"read" and "write" operations (to support the easier task of
synchronizing reads and writes) and the RWLock kept private, you can
avoid having to think through and support full-blown RWLock design and
use cases, like with issues around passing ownership, etc.
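For illustration, a narrowed interface of this kind might look roughly
as follows (the class and method names are invented here and are not
taken from the gist; cancellation handling is omitted):

import asyncio


class ReadWriteSynchronizer:
    """Illustrative sketch: callers only ever see read() and write()."""

    def __init__(self):
        self._active_reads = 0
        self._no_reads = asyncio.Event()   # set while no read is in progress
        self._no_reads.set()
        self._write_lock = asyncio.Lock()  # serializes writes

    async def read(self, coro_func, *args):
        # Reads run concurrently with each other; they only wait for an
        # in-progress write to finish (registering briefly takes the lock).
        async with self._write_lock:
            self._active_reads += 1
            self._no_reads.clear()
        try:
            return await coro_func(*args)
        finally:
            self._active_reads -= 1
            if self._active_reads == 0:
                self._no_reads.set()

    async def write(self, coro_func, *args):
        # A write needs exclusive access: no other write running, and all
        # previously started reads drained.
        async with self._write_lock:
            await self._no_reads.wait()
            return await coro_func(*args)

A caller would then just do `await sync.read(fetch, key)` or
`await sync.write(cleanup)`, and the underlying lock never escapes the
class.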
The API restricts how the RWLock is ever used, so it needn't be a
complete RWLock implementation.

--Chris

From chris.jerdonek at gmail.com Wed Jun 28 19:32:46 2017
From: chris.jerdonek at gmail.com (Chris Jerdonek)
Date: Wed, 28 Jun 2017 16:32:46 -0700
Subject: [Async-sig] question re: asyncio.Condition lock acquisition order
In-Reply-To:
References:
Message-ID:

On Tue, Jun 27, 2017 at 3:48 PM, Nathaniel Smith wrote:
> On Tue, Jun 27, 2017 at 4:15 AM, Chris Jerdonek
>> Thinking through the requirements I want for my RW synchronization use
>> case in more detail, I think I want the completion of any "write" to
>> be followed by exhausting all "reads." I'm not sure if that qualifies
>> as barging. Hopefully this will be implementable easily enough with
>> the available primitives, given what you say.
>
> I've only seen the term "barging" used in discussions of regular
> locks, though I'm not an expert, just someone with eclectic reading
> habits. But RWLocks have some extra subtleties that "barging" vs
> "non-barging" don't really capture. Say you have the following
> sequence:
>
> task w0 acquires for write
> task r1 attempts to acquire for read (and blocks)
> task r2 attempts to acquire for read (and blocks)
> task w1 attempts to acquire for write (and blocks)
> task r3 attempts to acquire for read (and blocks)
> task w0 releases the write lock
> task r4 attempts to acquire for read
>
> What happens? If r1+r2+r3+r4 are able to take the lock, then you're
> "read-biased" (which is essentially the same as barging for readers,
> but it can be extra dangerous for RWLocks, because if you have a heavy
> read load it's very easy for readers to starve writers).

All really interesting and informative again. Thank you, Nathaniel.

Regarding the above, in my case the "writes" will be a background
cleanup task that can happen as time is available. So it will be okay
if it is starved.

--Chris

From dimaqq at gmail.com Fri Jun 30 06:11:46 2017
From: dimaqq at gmail.com (Dima Tisnek)
Date: Fri, 30 Jun 2017 12:11:46 +0200
Subject: [Async-sig] async documentation methods
Message-ID:

Hi all,

I'm working to improve async docs, and I wonder if/how async methods
ought to be marked in the documentation, for example
library/async-sync.rst:

""" ... It [lock] has two basic methods, `acquire()` and `release()`. ... """

In fact, these methods are not symmetric, the former is asynchronous
and the latter synchronous:

Definitions are `async def acquire()` and `def release()`.
Likewise the user is expected to call `await .acquire()` and `.release()`.

This is user-facing documentation, IMO it should be clearer.
Although there are examples for this specific case, I'm concerned with
general documentation best practice.

Should this example read, e.g.:
* two methods, `async acquire()` and `release()`
or perhaps
* two methods, used `await x.acquire()` and `x.release()`
or something else?

If there's a good example already in Python docs or in some 3rd party
docs, please tell.

Likewise, should there be marks on iterators? async generators? things
that ought to be used as context managers?

Cheers,
d.

From andrew.svetlov at gmail.com Fri Jun 30 06:28:45 2017
From: andrew.svetlov at gmail.com (Andrew Svetlov)
Date: Fri, 30 Jun 2017 10:28:45 +0000
Subject: [Async-sig] async documentation methods
In-Reply-To:
References:
Message-ID:

I like "two methods, `async acquire()` and `release()`"

Regarding extra markup -- I created the sphinxcontrib-asyncio [1]
library for it.
Hmm, README is pretty empty but we do use the library for documenting
aio-libs and aiohttp [2] itself.

We use ".. comethod:: connect(request)" for methods and "cofunction"
for top-level functions.

Additional markup for methods that could be used as async context
managers:

.. comethod:: delete(url, **kwargs)
   :async-with:
   :coroutine:

and `:async-for:` for async iterators.

1. https://github.com/aio-libs/sphinxcontrib-asyncio
2. https://github.com/aio-libs/aiohttp

On Fri, Jun 30, 2017 at 1:11 PM Dima Tisnek wrote:
> Hi all,
>
> I'm working to improve async docs, and I wonder if/how async methods
> ought to be marked in the documentation, for example
> library/async-sync.rst:
>
> """ ... It [lock] has two basic methods, `acquire()` and `release()`. ...
> """
>
> In fact, these methods are not symmetric, the former is asynchronous
> and the latter synchronous:
>
> Definitions are `async def acquire()` and `def release()`.
> Likewise the user is expected to call `await .acquire()` and `.release()`.
>
> This is user-facing documentation, IMO it should be clearer.
> Although there are examples for this specific case, I'm concerned with
> general documentation best practice.
>
> Should this example read, e.g.:
> * two methods, `async acquire()` and `release()`
> or perhaps
> * two methods, used `await x.acquire()` and `x.release()`
> or something else?
>
> If there's a good example already in Python docs or in some 3rd party
> docs, please tell.
>
> Likewise, should there be marks on iterators? async generators? things
> that ought to be used as context managers?
>
> Cheers,
> d.
> _______________________________________________
> Async-sig mailing list
> Async-sig at python.org
> https://mail.python.org/mailman/listinfo/async-sig
> Code of Conduct: https://www.python.org/psf/codeofconduct/
>
-- 
Thanks,
Andrew Svetlov
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From brett at python.org Fri Jun 30 11:31:52 2017
From: brett at python.org (Brett Cannon)
Date: Fri, 30 Jun 2017 15:31:52 +0000
Subject: [Async-sig] async documentation methods
In-Reply-To:
References:
Message-ID:

Curio uses `.. asyncfunction:: acquire` and it renders as
`await acquire()` at least in the function definition.

On Fri, 30 Jun 2017 at 03:36 Andrew Svetlov wrote:
> I like "two methods, `async acquire()` and `release()`"
>
> Regarding extra markup -- I created the sphinxcontrib-asyncio [1]
> library for it. Hmm, README is pretty empty but we do use the library
> for documenting aio-libs and aiohttp [2] itself.
>
> We use ".. comethod:: connect(request)" for methods and "cofunction"
> for top-level functions.
>
> Additional markup for methods that could be used as async context
> managers:
>
> .. comethod:: delete(url, **kwargs)
>    :async-with:
>    :coroutine:
>
> and `:async-for:` for async iterators.
>
>
> 1. https://github.com/aio-libs/sphinxcontrib-asyncio
> 2. https://github.com/aio-libs/aiohttp
>
> On Fri, Jun 30, 2017 at 1:11 PM Dima Tisnek wrote:
>
>> Hi all,
>>
>> I'm working to improve async docs, and I wonder if/how async methods
>> ought to be marked in the documentation, for example
>> library/async-sync.rst:
>>
>> """ ... It [lock] has two basic methods, `acquire()` and `release()`. ...
>> """
>>
>> In fact, these methods are not symmetric, the former is asynchronous
>> and the latter synchronous:
>>
>> Definitions are `async def acquire()` and `def release()`.
>> Likewise the user is expected to call `await .acquire()` and `.release()`.
>>
>> This is user-facing documentation, IMO it should be clearer.
>> Although there are examples for this specific case, I'm concerned with
>> general documentation best practice.
>>
>> Should this example read, e.g.:
>> * two methods, `async acquire()` and `release()`
>> or perhaps
>> * two methods, used `await x.acquire()` and `x.release()`
>> or something else?
>>
>> If there's a good example already in Python docs or in some 3rd party
>> docs, please tell.
>>
>> Likewise, should there be marks on iterators? async generators? things
>> that ought to be used as context managers?
>>
>> Cheers,
>> d.
>> _______________________________________________
>> Async-sig mailing list
>> Async-sig at python.org
>> https://mail.python.org/mailman/listinfo/async-sig
>> Code of Conduct: https://www.python.org/psf/codeofconduct/
>>
> --
> Thanks,
> Andrew Svetlov
> _______________________________________________
> Async-sig mailing list
> Async-sig at python.org
> https://mail.python.org/mailman/listinfo/async-sig
> Code of Conduct: https://www.python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From yselivanov at gmail.com Fri Jun 30 14:41:33 2017
From: yselivanov at gmail.com (Yury Selivanov)
Date: Fri, 30 Jun 2017 14:41:33 -0400
Subject: [Async-sig] async documentation methods
In-Reply-To:
References:
Message-ID: <6b4085a6-6783-4ec8-b40d-2d3328dec926@Spark>

Hi Dima,

Have you seen https://github.com/asyncio-docs? I'm trying to get some
work going there to improve asyncio docs in 3.7. Will start committing
more of my time there soon.

Thanks,
Yury

On Jun 30, 2017, 6:11 AM -0400, Dima Tisnek wrote:
> Hi all,
>
> I'm working to improve async docs, and I wonder if/how async methods
> ought to be marked in the documentation, for example
> library/async-sync.rst:
>
> """ ... It [lock] has two basic methods, `acquire()` and `release()`. ... """
>
> In fact, these methods are not symmetric, the former is asynchronous
> and the latter synchronous:
>
> Definitions are `async def acquire()` and `def release()`.
> Likewise the user is expected to call `await .acquire()` and `.release()`.
>
> This is user-facing documentation, IMO it should be clearer.
> Although there are examples for this specific case, I'm concerned with
> general documentation best practice.
>
> Should this example read, e.g.:
> * two methods, `async acquire()` and `release()`
> or perhaps
> * two methods, used `await x.acquire()` and `x.release()`
> or something else?
>
> If there's a good example already in Python docs or in some 3rd party
> docs, please tell.
>
> Likewise, should there be marks on iterators? async generators? things
> that ought to be used as context managers?
>
> Cheers,
> d.
> _______________________________________________
> Async-sig mailing list
> Async-sig at python.org
> https://mail.python.org/mailman/listinfo/async-sig
> Code of Conduct: https://www.python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
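For reference, the asymmetry Dima describes can be seen directly with
asyncio.Lock: acquire() is a coroutine while release() is a plain
method. A minimal, standalone illustration:

import asyncio


async def main():
    lock = asyncio.Lock()

    await lock.acquire()    # coroutine: docs would show `await lock.acquire()`
    try:
        print("locked?", lock.locked())   # -> locked? True
    finally:
        lock.release()      # plain method: docs would show `lock.release()`

    # The same asymmetry is hidden behind the async context manager form:
    async with lock:
        print("locked?", lock.locked())   # -> locked? True


if __name__ == "__main__":
    asyncio.get_event_loop().run_until_complete(main())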