From mehaase at gmail.com  Fri Apr 13 11:55:38 2018
From: mehaase at gmail.com (Mark E. Haase)
Date: Fri, 13 Apr 2018 11:55:38 -0400
Subject: [Async-sig] Simplifying stack traces for tasks?
In-Reply-To:
References:
Message-ID:

Thanks for the feedback! I put this aside for a while but I'm coming back to
it now and cleaning it up. The approach used in this first post was obviously
very clumsy. In my latest version I am using the module instance directly (as
shown in Nathaniel's reply) and using the qualified package name (as suggested
by Roger). I created an explicit blacklist (incomplete--still needs more
testing) of functions to hide in my custom backtraces and refactored a bit so
I can write tests for it. Code below.

One interesting thing I learned while working on this is that the backtraces
change depending on the asyncio debug mode, because in debug mode coroutines
are wrapped in CoroWrapper [1], which adds a frame every time a coroutine
sends, throws, etc. So I am now thinking that my custom excepthook is probably
most useful in debug mode, but probably not good to enable in production.

I'm working on a more general asyncio task group library that will include
this excepthook. I'll release the whole thing on PyPI when it's done.

[1] https://github.com/python/cpython/blob/23ab5ee667a9b29014f6f7f01797c611f63ff743/Lib/asyncio/coroutines.py#L25

---

def _async_excepthook(type_, exc, tb):
    '''
    An ``excepthook`` that hides event loop internals and displays task
    group information.

    :param type type_: the exception type
    :param Exception exc: the exception itself
    :param traceback tb: a traceback of the exception
    '''
    print(_async_excepthook_format(type_, exc, tb))


def _async_excepthook_format(type_, exc, tb):
    '''
    This helper function is used for testing.

    :param type type_: the exception type
    :param Exception exc: the exception itself
    :param traceback tb: a traceback of the exception
    :return: the formatted traceback as a string
    '''
    format_str = ''
    cause_exc = None
    cause_str = None

    if exc.__cause__ is not None:
        cause_exc = exc.__cause__
        cause_str = 'The above exception was the direct cause ' \
                    'of the following exception:'
    elif exc.__context__ is not None and not exc.__suppress_context__:
        cause_exc = exc.__context__
        cause_str = '\nDuring handling of the above exception, ' \
                    'another exception occurred:'

    if cause_exc:
        format_str += _async_excepthook_format(type(cause_exc), cause_exc,
            cause_exc.__traceback__)

    if cause_str:
        format_str += '\n{}\n\n'.format(cause_str)

    format_str += 'Async Traceback (most recent call last):\n'

    # Need file, line, function, text
    for frame, line_no in traceback.walk_tb(tb):
        if _async_excepthook_exclude(frame):
            format_str += '  ---\n'
        else:
            code = frame.f_code
            filename = code.co_filename
            line = linecache.getline(filename, line_no).strip()
            format_str += '  File "{}", line {}, in {}\n' \
                .format(filename, line_no, code.co_name)
            format_str += '    {}\n'.format(line)

    format_str += '{}: {}'.format(type_.__name__, exc)
    return format_str


_ASYNC_EXCEPTHOOK_BLACKLIST = {
    'asyncio.base_events': ('_run_once', 'call_later', 'call_soon'),
    'asyncio.coroutines': ('__next__', 'send', 'throw'),
    'asyncio.events': ('__init__', '_run'),
    'asyncio.tasks': ('_step', '_wakeup'),
    'traceback': ('extract', 'extract_stack'),
}


def _async_excepthook_exclude(frame):
    '''
    Return True if ``frame`` should be excluded from tracebacks.
''' module = frame.f_globals['__name__'] function = frame.f_code.co_name return module in _ASYNC_EXCEPTHOOK_BLACKLIST and \ function in _ASYNC_EXCEPTHOOK_BLACKLIST[module] On Tue, Nov 14, 2017 at 7:15 PM, Nathaniel Smith wrote: > On Tue, Nov 14, 2017 at 2:00 PM, Roger Pate wrote: > > On Tue, Nov 14, 2017 at 9:54 AM, Mark E. Haase > wrote: > > ... > >> print('Async Traceback (most recent call last):') > >> for frame in traceback.extract_tb(tb): > >> head, tail = os.path.split(frame.filename) > >> if (head.endswith('asyncio') or tail == 'traceback.py') and > \ > >> frame.name.startswith('_'): > > ... > >> The meat of it is towards the bottom, "if head.endswith('asyncio')..." > There > >> are a lot of debatable details and this implementation is pretty hacky > and > >> clumsy, but I have found it valuable in my own usage, and I haven't yet > >> missed the omitted stack frames. > > > > It would be better to determine if the qualified module name is > > "traceback" or starts with "asyncio." (or topmost package is > > "asyncio", etc.) rather than allow false positives for > > random_package.asyncio.module._any_function or > > random_package.traceback._any_function. I don't see an easy way to > > get the module object at this point in your hook; however: > > You can't get the module from the cooked data that extract_tb returns, > but it's there in the tb object itself. This walks the traceback and > prints each frame's module: > > current = tb > while current is not None: > print("Next module", current.tb_frame.f_globals.get("__name__")) > current = current.tb_next > > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmludo at gmail.com Mon Apr 16 18:05:16 2018 From: gmludo at gmail.com (Ludovic Gasc) Date: Tue, 17 Apr 2018 00:05:16 +0200 Subject: [Async-sig] asyncio.Lock equivalent for multiple processes Message-ID: Hi, I'm looking for a equivalent of asyncio.Lock ( https://docs.python.org/3/library/asyncio-sync.html#asyncio.Lock) but shared between several processes on the same server, because I'm migrating a daemon from mono-worker to multi-worker pattern. For now, the closest solution in term of API seems aioredlock: https://github.com/joanvila/aioredlock#aioredlock But I'm not a big fan to use polling nor with a timeout because the lock I need is very critical, I prefer to block the code than unlock with timeout. Do I miss a new awesome library or do you have an easier approach ? Thanks for your responses. -- Ludovic Gasc (GMLudo) -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmludo at gmail.com Tue Apr 17 06:01:21 2018 From: gmludo at gmail.com (Ludovic Gasc) Date: Tue, 17 Apr 2018 12:01:21 +0200 Subject: [Async-sig] [python-tulip] asyncio.Lock equivalent for multiple processes In-Reply-To: References: Message-ID: Hi Roberto, Thanks for the pointer, it's exactly the type of feedbacks I'm looking for: Ideas that are out-of-box of my confort zone. However, in our use case, we are using gunicorn, that uses forks instead of multiprocessing to my knowledge, I can't use multiprocessing without to remove gunicorn. If somebody is using aioredlock in his project, I'm interested by feedbacks. Have a nice week. -- Ludovic Gasc (GMLudo) 2018-04-17 7:19 GMT+02:00 Roberto Mart?nez : > > Hi, > > I don't know if there is a third party solution for this. 
> > I think the closest you can get today using the standard library is using > a multiprocessing.manager().Lock (which can be shared among processes) and > call the lock.acquire() function with asyncio.run_in_executor(), using a > ThreadedPoolExecutor to avoid blocking the asyncio event loop. > > Best regards, > Roberto > > > El mar., 17 abr. 2018 a las 0:05, Ludovic Gasc () > escribi?: > >> Hi, >> >> I'm looking for a equivalent of asyncio.Lock (https://docs.python.org/3/ >> library/asyncio-sync.html#asyncio.Lock) but shared between several >> processes on the same server, because I'm migrating a daemon from >> mono-worker to multi-worker pattern. >> >> For now, the closest solution in term of API seems aioredlock: >> https://github.com/joanvila/aioredlock#aioredlock >> But I'm not a big fan to use polling nor with a timeout because the lock >> I need is very critical, I prefer to block the code than unlock with >> timeout. >> >> Do I miss a new awesome library or do you have an easier approach ? >> >> Thanks for your responses. >> -- >> Ludovic Gasc (GMLudo) >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nickolainovik at gmail.com Tue Apr 17 06:46:10 2018 From: nickolainovik at gmail.com (Nickolai Novik) Date: Tue, 17 Apr 2018 10:46:10 +0000 Subject: [Async-sig] [python-tulip] asyncio.Lock equivalent for multiple processes In-Reply-To: References: Message-ID: Hi, redis lock has own limitations and depending on your use case it may or may not be suitable [1]. If possible I would redefine problem and also considered: 1) create worker per specific resource type to avoid locking 2) optimistic locking 3) File system lock like in twisted, but not sure about performance and edge cases there 4) make operation on resource idempotent [1] http://martin.kleppmann.com/2016/02/08/how-to-do-distributed-locking.html [2] https://github.com/twisted/twisted/blob/e38cc25a67747899c6984d6ebaa8d3d134799415/src/twisted/python/lockfile.py On Tue, 17 Apr 2018 at 13:01 Ludovic Gasc wrote: > Hi Roberto, > > Thanks for the pointer, it's exactly the type of feedbacks I'm looking > for: Ideas that are out-of-box of my confort zone. > However, in our use case, we are using gunicorn, that uses forks instead > of multiprocessing to my knowledge, I can't use multiprocessing without to > remove gunicorn. > > If somebody is using aioredlock in his project, I'm interested by > feedbacks. > > Have a nice week. > > > -- > Ludovic Gasc (GMLudo) > > 2018-04-17 7:19 GMT+02:00 Roberto Mart?nez : > >> >> Hi, >> >> I don't know if there is a third party solution for this. >> >> I think the closest you can get today using the standard library is using >> a multiprocessing.manager().Lock (which can be shared among processes) and >> call the lock.acquire() function with asyncio.run_in_executor(), using a >> ThreadedPoolExecutor to avoid blocking the asyncio event loop. >> >> Best regards, >> Roberto >> >> >> El mar., 17 abr. 2018 a las 0:05, Ludovic Gasc () >> escribi?: >> >>> Hi, >>> >>> I'm looking for a equivalent of asyncio.Lock ( >>> https://docs.python.org/3/library/asyncio-sync.html#asyncio.Lock) but >>> shared between several processes on the same server, because I'm migrating >>> a daemon from mono-worker to multi-worker pattern. 
>>> >>> For now, the closest solution in term of API seems aioredlock: >>> https://github.com/joanvila/aioredlock#aioredlock >>> But I'm not a big fan to use polling nor with a timeout because the lock >>> I need is very critical, I prefer to block the code than unlock with >>> timeout. >>> >>> Do I miss a new awesome library or do you have an easier approach ? >>> >>> Thanks for your responses. >>> -- >>> Ludovic Gasc (GMLudo) >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmludo at gmail.com Tue Apr 17 07:34:47 2018 From: gmludo at gmail.com (Ludovic Gasc) Date: Tue, 17 Apr 2018 13:34:47 +0200 Subject: [Async-sig] [python-tulip] asyncio.Lock equivalent for multiple processes In-Reply-To: References: Message-ID: Hi Nickolai, Thanks for your suggestions, especially for the file system lock: We don't have often locks, but we must be sure it's locked. For 1) and 4) suggestions, in fact we have several systems to sync and also a PostgreSQL transaction, the request must be treated by the same worker from beginning to end and the other systems aren't idempotent at all, it's "old-school" proprietary systems, good luck to change that ;-) Regards. -- Ludovic Gasc (GMLudo) 2018-04-17 12:46 GMT+02:00 Nickolai Novik : > Hi, redis lock has own limitations and depending on your use case it may > or may not be suitable [1]. If possible I would redefine problem and also > considered: > 1) create worker per specific resource type to avoid locking > 2) optimistic locking > 3) File system lock like in twisted, but not sure about performance and > edge cases there > 4) make operation on resource idempotent > > [1] http://martin.kleppmann.com/2016/02/08/how-to-do- > distributed-locking.html > [2] https://github.com/twisted/twisted/blob/e38cc25a67747899c6984d6ebaa8d3 > d134799415/src/twisted/python/lockfile.py > > On Tue, 17 Apr 2018 at 13:01 Ludovic Gasc wrote: > >> Hi Roberto, >> >> Thanks for the pointer, it's exactly the type of feedbacks I'm looking >> for: Ideas that are out-of-box of my confort zone. >> However, in our use case, we are using gunicorn, that uses forks instead >> of multiprocessing to my knowledge, I can't use multiprocessing without to >> remove gunicorn. >> >> If somebody is using aioredlock in his project, I'm interested by >> feedbacks. >> >> Have a nice week. >> >> >> -- >> Ludovic Gasc (GMLudo) >> >> 2018-04-17 7:19 GMT+02:00 Roberto Mart?nez : >> >>> >>> Hi, >>> >>> I don't know if there is a third party solution for this. >>> >>> I think the closest you can get today using the standard library is >>> using a multiprocessing.manager().Lock (which can be shared among >>> processes) and call the lock.acquire() function with >>> asyncio.run_in_executor(), using a ThreadedPoolExecutor to avoid blocking >>> the asyncio event loop. >>> >>> Best regards, >>> Roberto >>> >>> >>> El mar., 17 abr. 2018 a las 0:05, Ludovic Gasc () >>> escribi?: >>> >>>> Hi, >>>> >>>> I'm looking for a equivalent of asyncio.Lock ( >>>> https://docs.python.org/3/library/asyncio-sync.html#asyncio.Lock) but >>>> shared between several processes on the same server, because I'm migrating >>>> a daemon from mono-worker to multi-worker pattern. >>>> >>>> For now, the closest solution in term of API seems aioredlock: >>>> https://github.com/joanvila/aioredlock#aioredlock >>>> But I'm not a big fan to use polling nor with a timeout because the >>>> lock I need is very critical, I prefer to block the code than unlock with >>>> timeout. 
>>>> >>>> Do I miss a new awesome library or do you have an easier approach ? >>>> >>>> Thanks for your responses. >>>> -- >>>> Ludovic Gasc (GMLudo) >>>> >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.jerdonek at gmail.com Tue Apr 17 07:45:08 2018 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Tue, 17 Apr 2018 04:45:08 -0700 Subject: [Async-sig] [python-tulip] asyncio.Lock equivalent for multiple processes In-Reply-To: References: Message-ID: If you're already using PostgreSQL, you might also look at "advisory locks": https://www.postgresql.org/docs/current/static/explicit-locking.html#ADVISORY-LOCKS --Chris On Tue, Apr 17, 2018 at 4:34 AM, Ludovic Gasc wrote: > Hi Nickolai, > > Thanks for your suggestions, especially for the file system lock: We don't > have often locks, but we must be sure it's locked. > > For 1) and 4) suggestions, in fact we have several systems to sync and also > a PostgreSQL transaction, the request must be treated by the same worker > from beginning to end and the other systems aren't idempotent at all, it's > "old-school" proprietary systems, good luck to change that ;-) > > Regards. > -- > Ludovic Gasc (GMLudo) > > 2018-04-17 12:46 GMT+02:00 Nickolai Novik : >> >> Hi, redis lock has own limitations and depending on your use case it may >> or may not be suitable [1]. If possible I would redefine problem and also >> considered: >> 1) create worker per specific resource type to avoid locking >> 2) optimistic locking >> 3) File system lock like in twisted, but not sure about performance and >> edge cases there >> 4) make operation on resource idempotent >> >> [1] >> http://martin.kleppmann.com/2016/02/08/how-to-do-distributed-locking.html >> [2] >> https://github.com/twisted/twisted/blob/e38cc25a67747899c6984d6ebaa8d3d134799415/src/twisted/python/lockfile.py >> >> On Tue, 17 Apr 2018 at 13:01 Ludovic Gasc wrote: >>> >>> Hi Roberto, >>> >>> Thanks for the pointer, it's exactly the type of feedbacks I'm looking >>> for: Ideas that are out-of-box of my confort zone. >>> However, in our use case, we are using gunicorn, that uses forks instead >>> of multiprocessing to my knowledge, I can't use multiprocessing without to >>> remove gunicorn. >>> >>> If somebody is using aioredlock in his project, I'm interested by >>> feedbacks. >>> >>> Have a nice week. >>> >>> >>> -- >>> Ludovic Gasc (GMLudo) >>> >>> 2018-04-17 7:19 GMT+02:00 Roberto Mart?nez : >>>> >>>> >>>> Hi, >>>> >>>> I don't know if there is a third party solution for this. >>>> >>>> I think the closest you can get today using the standard library is >>>> using a multiprocessing.manager().Lock (which can be shared among processes) >>>> and call the lock.acquire() function with asyncio.run_in_executor(), using a >>>> ThreadedPoolExecutor to avoid blocking the asyncio event loop. >>>> >>>> Best regards, >>>> Roberto >>>> >>>> >>>> El mar., 17 abr. 2018 a las 0:05, Ludovic Gasc () >>>> escribi?: >>>>> >>>>> Hi, >>>>> >>>>> I'm looking for a equivalent of asyncio.Lock >>>>> (https://docs.python.org/3/library/asyncio-sync.html#asyncio.Lock) but >>>>> shared between several processes on the same server, because I'm migrating a >>>>> daemon from mono-worker to multi-worker pattern. 
>>>>> >>>>> For now, the closest solution in term of API seems aioredlock: >>>>> https://github.com/joanvila/aioredlock#aioredlock >>>>> But I'm not a big fan to use polling nor with a timeout because the >>>>> lock I need is very critical, I prefer to block the code than unlock with >>>>> timeout. >>>>> >>>>> Do I miss a new awesome library or do you have an easier approach ? >>>>> >>>>> Thanks for your responses. >>>>> -- >>>>> Ludovic Gasc (GMLudo) >>> >>> > > > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > From dimaqq at gmail.com Tue Apr 17 08:17:55 2018 From: dimaqq at gmail.com (Dima Tisnek) Date: Tue, 17 Apr 2018 20:17:55 +0800 Subject: [Async-sig] asyncio.Lock equivalent for multiple processes In-Reply-To: References: Message-ID: Hi Ludovic, I believe it's relatively straightforward to implement the core functionality, if you can at first reduce it to: * allow only one coro to wait on lock at a given time (i.e. one user per process / event loop) * decide explicitly if you want other coros to continue (I assume so, as blocking entire process would be trivial) * don't care about performance too much :) Once that's done, you can allow multiple users per event loop by wrapping your inter-process lock in a regular async lock. Wrt. performance, you can start with a simple client-server implementation, for example where: * single-threaded server listens on some port, accepts 1 connection at a time, writes something on the connection and waits for connection to be closed * each client connects (not informative due to listen backlog) and waits for data, when client gets the data, it has the lock * when client wants to release the lock, it closes the connection, which unblocks the server * socket communication is relatively easy to marry to the event loop :) If you want high performance (i.e. low latency), you'd probably want to go with futex, but that may prove hard to marry to asyncio internals. I guess locking can always be proxied through a thread, at some cost to performance. If performance is important, I'd suggest starting with a thread proxy from the start. It could go like this: Each named lock gets own thread (in each process / event loop), a sync lock and condition variable. When a coro want to take the lock, it creates an empty Future, ephemerally takes the sync lock, adds this future to waiters, and signals on the condition variable and awaits this Future. Thread wakes up, validates there's someone in the queue under sync lock, tries to take classical inter-process lock (sysv or file or whatever), and when that succeeds, resolves the future using loop.call_soon_threadsafe(). I'm omitting implementation details, like what if Future is leaked (discarded before it's resolved), how release is orchestrated, etc. The key point is that offloading locking to a dedicated thread allows to reduce original problem to synchronous interprocess locking problem. Cheers! On 17 April 2018 at 06:05, Ludovic Gasc wrote: > Hi, > > I'm looking for a equivalent of asyncio.Lock > (https://docs.python.org/3/library/asyncio-sync.html#asyncio.Lock) but > shared between several processes on the same server, because I'm migrating a > daemon from mono-worker to multi-worker pattern. 
> > For now, the closest solution in term of API seems aioredlock: > https://github.com/joanvila/aioredlock#aioredlock > But I'm not a big fan to use polling nor with a timeout because the lock I > need is very critical, I prefer to block the code than unlock with timeout. > > Do I miss a new awesome library or do you have an easier approach ? > > Thanks for your responses. > -- > Ludovic Gasc (GMLudo) > > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > From gmludo at gmail.com Tue Apr 17 09:04:37 2018 From: gmludo at gmail.com (Ludovic Gasc) Date: Tue, 17 Apr 2018 15:04:37 +0200 Subject: [Async-sig] [python-tulip] Re: asyncio.Lock equivalent for multiple processes In-Reply-To: <20180417134100.2fff0e3a@fsol> References: <20180417134100.2fff0e3a@fsol> Message-ID: Hi Antoine & Chris, Thanks a lot for the advisory lock, I didn't know this feature in PostgreSQL. Indeed, it seems to fit my problem. The small latest problem I have is that we have string names for locks, but advisory locks accept only integers. Nevertheless, it isn't a problem, I will do a mapping between names and integers. Yours. -- Ludovic Gasc (GMLudo) 2018-04-17 13:41 GMT+02:00 Antoine Pitrou : > On Tue, 17 Apr 2018 13:34:47 +0200 > Ludovic Gasc wrote: > > Hi Nickolai, > > > > Thanks for your suggestions, especially for the file system lock: We > don't > > have often locks, but we must be sure it's locked. > > > > For 1) and 4) suggestions, in fact we have several systems to sync and > also > > a PostgreSQL transaction, the request must be treated by the same worker > > from beginning to end and the other systems aren't idempotent at all, > it's > > "old-school" proprietary systems, good luck to change that ;-) > > If you already have a PostgreSQL connection, can't you use a PostgreSQL > lock? e.g. an "advisory lock" as described in > https://www.postgresql.org/docs/9.1/static/explicit-locking.html > > Regards > > Antoine. > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmludo at gmail.com Tue Apr 17 09:08:13 2018 From: gmludo at gmail.com (Ludovic Gasc) Date: Tue, 17 Apr 2018 15:08:13 +0200 Subject: [Async-sig] asyncio.Lock equivalent for multiple processes In-Reply-To: References: Message-ID: Hi Dima, Thanks for your time and explanations :-) However, I have the intuition that it will take me more time to implement your idea compare to the builtin feature of PostgreSQL. Nevertheless, I keep your idea in mind in case of I have problems with PostgreSQL. Have a nice day. -- Ludovic Gasc (GMLudo) 2018-04-17 14:17 GMT+02:00 Dima Tisnek : > Hi Ludovic, > > I believe it's relatively straightforward to implement the core > functionality, if you can at first reduce it to: > * allow only one coro to wait on lock at a given time (i.e. one user > per process / event loop) > * decide explicitly if you want other coros to continue (I assume so, > as blocking entire process would be trivial) > * don't care about performance too much :) > > Once that's done, you can allow multiple users per event loop by > wrapping your inter-process lock in a regular async lock. > > Wrt. 
performance, you can start with a simple client-server > implementation, for example where: > * single-threaded server listens on some port, accepts 1 connection at > a time, writes something on the connection and waits for connection to > be closed > * each client connects (not informative due to listen backlog) and > waits for data, when client gets the data, it has the lock > * when client wants to release the lock, it closes the connection, > which unblocks the server > * socket communication is relatively easy to marry to the event loop :) > > If you want high performance (i.e. low latency), you'd probably want > to go with futex, but that may prove hard to marry to asyncio > internals. > I guess locking can always be proxied through a thread, at some cost > to performance. > > > If performance is important, I'd suggest starting with a thread proxy > from the start. It could go like this: > Each named lock gets own thread (in each process / event loop), a sync > lock and condition variable. > When a coro want to take the lock, it creates an empty Future, > ephemerally takes the sync lock, adds this future to waiters, and > signals on the condition variable and awaits this Future. > Thread wakes up, validates there's someone in the queue under sync > lock, tries to take classical inter-process lock (sysv or file or > whatever), and when that succeeds, resolves the future using > loop.call_soon_threadsafe(). > I'm omitting implementation details, like what if Future is leaked > (discarded before it's resolved), how release is orchestrated, etc. > The key point is that offloading locking to a dedicated thread allows > to reduce original problem to synchronous interprocess locking > problem. > > > Cheers! > > > On 17 April 2018 at 06:05, Ludovic Gasc wrote: > > Hi, > > > > I'm looking for a equivalent of asyncio.Lock > > (https://docs.python.org/3/library/asyncio-sync.html#asyncio.Lock) but > > shared between several processes on the same server, because I'm > migrating a > > daemon from mono-worker to multi-worker pattern. > > > > For now, the closest solution in term of API seems aioredlock: > > https://github.com/joanvila/aioredlock#aioredlock > > But I'm not a big fan to use polling nor with a timeout because the lock > I > > need is very critical, I prefer to block the code than unlock with > timeout. > > > > Do I miss a new awesome library or do you have an easier approach ? > > > > Thanks for your responses. > > -- > > Ludovic Gasc (GMLudo) > > > > _______________________________________________ > > Async-sig mailing list > > Async-sig at python.org > > https://mail.python.org/mailman/listinfo/async-sig > > Code of Conduct: https://www.python.org/psf/codeofconduct/ > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Tue Apr 17 09:16:54 2018 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 17 Apr 2018 15:16:54 +0200 Subject: [Async-sig] asyncio.Lock equivalent for multiple processes References: <20180417134100.2fff0e3a@fsol> Message-ID: <20180417151654.31f22050@fsol> You could simply use something like the first 64 bits of sha1("myapp:") Regards Antoine. On Tue, 17 Apr 2018 15:04:37 +0200 Ludovic Gasc wrote: > Hi Antoine & Chris, > > Thanks a lot for the advisory lock, I didn't know this feature in > PostgreSQL. > Indeed, it seems to fit my problem. > > The small latest problem I have is that we have string names for locks, > but advisory locks accept only integers. 
> Nevertheless, it isn't a problem, I will do a mapping between names and > integers. > > Yours. > > -- > Ludovic Gasc (GMLudo) > > 2018-04-17 13:41 GMT+02:00 Antoine Pitrou : > > > On Tue, 17 Apr 2018 13:34:47 +0200 > > Ludovic Gasc wrote: > > > Hi Nickolai, > > > > > > Thanks for your suggestions, especially for the file system lock: We > > don't > > > have often locks, but we must be sure it's locked. > > > > > > For 1) and 4) suggestions, in fact we have several systems to sync and > > also > > > a PostgreSQL transaction, the request must be treated by the same worker > > > from beginning to end and the other systems aren't idempotent at all, > > it's > > > "old-school" proprietary systems, good luck to change that ;-) > > > > If you already have a PostgreSQL connection, can't you use a PostgreSQL > > lock? e.g. an "advisory lock" as described in > > https://www.postgresql.org/docs/9.1/static/explicit-locking.html > > > > Regards > > > > Antoine. > > > > > > > From robertomartinezp at gmail.com Tue Apr 17 01:19:21 2018 From: robertomartinezp at gmail.com (=?UTF-8?Q?Roberto_Mart=C3=ADnez?=) Date: Tue, 17 Apr 2018 05:19:21 +0000 Subject: [Async-sig] [python-tulip] asyncio.Lock equivalent for multiple processes In-Reply-To: References: Message-ID: Hi, I don't know if there is a third party solution for this. I think the closest you can get today using the standard library is using a multiprocessing.manager().Lock (which can be shared among processes) and call the lock.acquire() function with asyncio.run_in_executor(), using a ThreadedPoolExecutor to avoid blocking the asyncio event loop. Best regards, Roberto El mar., 17 abr. 2018 a las 0:05, Ludovic Gasc () escribi?: > Hi, > > I'm looking for a equivalent of asyncio.Lock ( > https://docs.python.org/3/library/asyncio-sync.html#asyncio.Lock) but > shared between several processes on the same server, because I'm migrating > a daemon from mono-worker to multi-worker pattern. > > For now, the closest solution in term of API seems aioredlock: > https://github.com/joanvila/aioredlock#aioredlock > But I'm not a big fan to use polling nor with a timeout because the lock I > need is very critical, I prefer to block the code than unlock with timeout. > > Do I miss a new awesome library or do you have an easier approach ? > > Thanks for your responses. > -- > Ludovic Gasc (GMLudo) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmludo at gmail.com Tue Apr 17 17:39:27 2018 From: gmludo at gmail.com (Ludovic Gasc) Date: Tue, 17 Apr 2018 23:39:27 +0200 Subject: [Async-sig] asyncio.Lock equivalent for multiple processes In-Reply-To: <20180417151654.31f22050@fsol> References: <20180417134100.2fff0e3a@fsol> <20180417151654.31f22050@fsol> Message-ID: 2018-04-17 15:16 GMT+02:00 Antoine Pitrou : > > > You could simply use something like the first 64 bits of > sha1("myapp:") > I have followed your idea, except I used hashtext directly, it's an internal postgresql function that generates an integer directly. For now, it seems to work pretty well but I didn't yet finished all tests. 
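(As an aside, the pure-Python mapping Antoine described, hashing the lock name
down to a signed 64-bit key on the client side instead of calling hashtext,
might look roughly like the sketch below. It is untested on my side and
advisory_lock_key is just an illustrative name; pg_advisory_lock() takes a
signed bigint, hence signed=True.)

import hashlib

def advisory_lock_key(category, name):
    # Map a string lock name to a signed 64-bit key for pg_advisory_lock().
    digest = hashlib.sha1('{}.{}'.format(category, name).encode('utf-8')).digest()
    return int.from_bytes(digest[:8], 'big', signed=True)
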
The final result is literally 3 lines of Python inside an async contextmanager, I like this solution ;-) : @asynccontextmanager async def lock(env, category='global', name='global'): # Alternative lock id with 'mytable'::regclass::integer OID await env['aiopg']['cursor'].execute("SELECT pg_advisory_lock( hashtext(%(lock_name)s) );", {'lock_name': '%s.%s' % (category, name)}) yield None await env['aiopg']['cursor'].execute("SELECT pg_advisory_unlock( hashtext(%(lock_name)s) );", {'lock_name': '%s.%s' % (category, name)}) > > Regards > > Antoine. > > > On Tue, 17 Apr 2018 15:04:37 +0200 > Ludovic Gasc wrote: > > Hi Antoine & Chris, > > > > Thanks a lot for the advisory lock, I didn't know this feature in > > PostgreSQL. > > Indeed, it seems to fit my problem. > > > > The small latest problem I have is that we have string names for locks, > > but advisory locks accept only integers. > > Nevertheless, it isn't a problem, I will do a mapping between names and > > integers. > > > > Yours. > > > > -- > > Ludovic Gasc (GMLudo) > > > > 2018-04-17 13:41 GMT+02:00 Antoine Pitrou : > > > > > On Tue, 17 Apr 2018 13:34:47 +0200 > > > Ludovic Gasc wrote: > > > > Hi Nickolai, > > > > > > > > Thanks for your suggestions, especially for the file system lock: > We > > > don't > > > > have often locks, but we must be sure it's locked. > > > > > > > > For 1) and 4) suggestions, in fact we have several systems to sync > and > > > also > > > > a PostgreSQL transaction, the request must be treated by the same > worker > > > > from beginning to end and the other systems aren't idempotent at > all, > > > it's > > > > "old-school" proprietary systems, good luck to change that ;-) > > > > > > If you already have a PostgreSQL connection, can't you use a PostgreSQL > > > lock? e.g. an "advisory lock" as described in > > > https://www.postgresql.org/docs/9.1/static/explicit-locking.html > > > > > > Regards > > > > > > Antoine. > > > > > > > > > > > > > > > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Apr 17 19:21:31 2018 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 17 Apr 2018 23:21:31 +0000 Subject: [Async-sig] asyncio.Lock equivalent for multiple processes In-Reply-To: References: <20180417134100.2fff0e3a@fsol> <20180417151654.31f22050@fsol> Message-ID: Pretty sure you want to add a try/finally around that yield, so you release the lock on errors. On Tue, Apr 17, 2018, 14:39 Ludovic Gasc wrote: > 2018-04-17 15:16 GMT+02:00 Antoine Pitrou : > >> >> >> You could simply use something like the first 64 bits of >> sha1("myapp:") >> > > I have followed your idea, except I used hashtext directly, it's an > internal postgresql function that generates an integer directly. > > For now, it seems to work pretty well but I didn't yet finished all tests. 
> The final result is literally 3 lines of Python inside an async > contextmanager, I like this solution ;-) : > > @asynccontextmanager > async def lock(env, category='global', name='global'): > # Alternative lock id with 'mytable'::regclass::integer OID > await env['aiopg']['cursor'].execute("SELECT pg_advisory_lock( > hashtext(%(lock_name)s) );", {'lock_name': '%s.%s' % (category, name)}) > > yield None > > await env['aiopg']['cursor'].execute("SELECT pg_advisory_unlock( > hashtext(%(lock_name)s) );", {'lock_name': '%s.%s' % (category, name)}) > > > >> >> Regards >> >> Antoine. >> >> >> On Tue, 17 Apr 2018 15:04:37 +0200 >> Ludovic Gasc wrote: >> > Hi Antoine & Chris, >> > >> > Thanks a lot for the advisory lock, I didn't know this feature in >> > PostgreSQL. >> > Indeed, it seems to fit my problem. >> > >> > The small latest problem I have is that we have string names for locks, >> > but advisory locks accept only integers. >> > Nevertheless, it isn't a problem, I will do a mapping between names and >> > integers. >> > >> > Yours. >> > >> > -- >> > Ludovic Gasc (GMLudo) >> > >> > 2018-04-17 13:41 GMT+02:00 Antoine Pitrou : >> > >> > > On Tue, 17 Apr 2018 13:34:47 +0200 >> > > Ludovic Gasc wrote: >> > > > Hi Nickolai, >> > > > >> > > > Thanks for your suggestions, especially for the file system lock: >> We >> > > don't >> > > > have often locks, but we must be sure it's locked. >> > > > >> > > > For 1) and 4) suggestions, in fact we have several systems to sync >> and >> > > also >> > > > a PostgreSQL transaction, the request must be treated by the same >> worker >> > > > from beginning to end and the other systems aren't idempotent at >> all, >> > > it's >> > > > "old-school" proprietary systems, good luck to change that ;-) >> > > >> > > If you already have a PostgreSQL connection, can't you use a >> PostgreSQL >> > > lock? e.g. an "advisory lock" as described in >> > > https://www.postgresql.org/docs/9.1/static/explicit-locking.html >> > > >> > > Regards >> > > >> > > Antoine. >> > > >> > > >> > > >> > >> >> >> >> _______________________________________________ >> Async-sig mailing list >> Async-sig at python.org >> https://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >> > > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmludo at gmail.com Wed Apr 18 01:09:46 2018 From: gmludo at gmail.com (Ludovic Gasc) Date: Wed, 18 Apr 2018 05:09:46 +0000 Subject: [Async-sig] asyncio.Lock equivalent for multiple processes In-Reply-To: References: <20180417134100.2fff0e3a@fsol> <20180417151654.31f22050@fsol> Message-ID: Indeed, thanks for the suggestion :-) Le mer. 18 avr. 2018 ? 01:21, Nathaniel Smith a ?crit : > Pretty sure you want to add a try/finally around that yield, so you > release the lock on errors. > > On Tue, Apr 17, 2018, 14:39 Ludovic Gasc wrote: > >> 2018-04-17 15:16 GMT+02:00 Antoine Pitrou : >> >>> >>> >>> You could simply use something like the first 64 bits of >>> sha1("myapp:") >>> >> >> I have followed your idea, except I used hashtext directly, it's an >> internal postgresql function that generates an integer directly. >> >> For now, it seems to work pretty well but I didn't yet finished all tests. 
>> The final result is literally 3 lines of Python inside an async >> contextmanager, I like this solution ;-) : >> >> @asynccontextmanager >> async def lock(env, category='global', name='global'): >> # Alternative lock id with 'mytable'::regclass::integer OID >> await env['aiopg']['cursor'].execute("SELECT pg_advisory_lock( >> hashtext(%(lock_name)s) );", {'lock_name': '%s.%s' % (category, name)}) >> >> yield None >> >> await env['aiopg']['cursor'].execute("SELECT pg_advisory_unlock( >> hashtext(%(lock_name)s) );", {'lock_name': '%s.%s' % (category, name)}) >> >> >> >>> >>> Regards >>> >>> Antoine. >>> >>> >>> On Tue, 17 Apr 2018 15:04:37 +0200 >>> Ludovic Gasc wrote: >>> > Hi Antoine & Chris, >>> > >>> > Thanks a lot for the advisory lock, I didn't know this feature in >>> > PostgreSQL. >>> > Indeed, it seems to fit my problem. >>> > >>> > The small latest problem I have is that we have string names for locks, >>> > but advisory locks accept only integers. >>> > Nevertheless, it isn't a problem, I will do a mapping between names and >>> > integers. >>> > >>> > Yours. >>> > >>> > -- >>> > Ludovic Gasc (GMLudo) >>> > >>> > 2018-04-17 13:41 GMT+02:00 Antoine Pitrou : >>> > >>> > > On Tue, 17 Apr 2018 13:34:47 +0200 >>> > > Ludovic Gasc wrote: >>> > > > Hi Nickolai, >>> > > > >>> > > > Thanks for your suggestions, especially for the file system lock: >>> We >>> > > don't >>> > > > have often locks, but we must be sure it's locked. >>> > > > >>> > > > For 1) and 4) suggestions, in fact we have several systems to sync >>> and >>> > > also >>> > > > a PostgreSQL transaction, the request must be treated by the same >>> worker >>> > > > from beginning to end and the other systems aren't idempotent at >>> all, >>> > > it's >>> > > > "old-school" proprietary systems, good luck to change that ;-) >>> > > >>> > > If you already have a PostgreSQL connection, can't you use a >>> PostgreSQL >>> > > lock? e.g. an "advisory lock" as described in >>> > > https://www.postgresql.org/docs/9.1/static/explicit-locking.html >>> > > >>> > > Regards >>> > > >>> > > Antoine. >>> > > >>> > > >>> > > >>> > >>> >>> >>> >>> _______________________________________________ >>> Async-sig mailing list >>> Async-sig at python.org >>> https://mail.python.org/mailman/listinfo/async-sig >>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>> >> >> _______________________________________________ >> Async-sig mailing list >> Async-sig at python.org >> https://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mehaase at gmail.com Tue Apr 24 17:25:46 2018 From: mehaase at gmail.com (Mark E. Haase) Date: Tue, 24 Apr 2018 17:25:46 -0400 Subject: [Async-sig] Cancelling a coroutine from a signal handler? Message-ID: I am trying to understand some unexpected behavior in asyncio. My goal is to use a custom signal handler to cleanly unwind an asyncio program that has many different tasks running. 
Here's a simplified test case: 1 import asyncio, logging, random, signal, sys 2 3 logging.basicConfig(level=logging.DEBUG) 4 logger = logging.getLogger() 5 6 async def main(): 7 try: 8 await asyncio.Event().wait() 9 except asyncio.CancelledError: 10 logger.info('cancelled main()') 11 # cleanup logic 12 13 def handle_sigint(signal, frame): 14 global sigint_count 15 global main_coro 16 sigint_count += 1 17 if sigint_count == 1: 18 logger.warn('Received interrupt: shutting down...') 19 main_coro.throw(asyncio.CancelledError()) 20 # missing event loop logic? 21 else: 22 logger.warn('Received 2nd interrupt: exiting!') 23 main_coro.throw(SystemExit(1)) 24 25 sigint_count = 0 26 signal.signal(signal.SIGINT, handle_sigint) 27 loop = asyncio.get_event_loop() 28 main_coro = main() 29 try: 30 loop.run_until_complete(main_coro) 31 except StopIteration: 32 logger.info('run_until_complete() finished') The main() function is a placeholder that represents some long running task, e.g. a server that is waiting for new connections. The handle_sigint() function is supposed to attempt to cancel main() so that it can gracefully exit, but if it receives a second interrupt, then the process exits immediately. Here's an example running the program and then typing Ctrl+C. $ python test.py DEBUG:asyncio:Using selector: EpollSelector ^CWARNING:root:Received interrupt: shutting down... INFO:root:cancelled main() INFO:root:run_until_complete() finished This works as I expect it to. Of course my cleanup logic (line 10) isn't actually doing anything. In a real server, I might want to send goodbye messages to connected clients. To mock this behavior, I'll modify line 11: 11 await asyncio.sleep(0) Surprisingly, now my cleanup code hangs: $ python test.py DEBUG:asyncio:Using selector: EpollSelector ^CWARNING:root:Received interrupt: shutting down... INFO:root:cancelled main() ^CWARNING:root:Received 2nd interrupt: exiting! Notice that the program doesn't exit after the first interrupt. It enters the exception handler and appears to hang on the await expression on line 11. I have to interrupt it a second time, which throws SystemExit instead. I puzzled over this for quite some time until I realized that I can force main() to resume by changing line 20: 20 main_coro.send(None) With this change, the interrupt causes the cleanup logic to run to completion and the program exits normally. Of course, if I add a second await expression: 11 await asyncio.sleep(0); await asyncio.sleep(0) Then I also have to step twice: 20 main_coro.send(None); main_coro.send(None) My mental model of how the event loop works is pretty poor, but I roughly understand that the event loop is responsible for driving coroutines. It appears here that the event loop has stopped driving my main() coroutine, and so the only way to force it to complete is to call send() from my code. Can somebody explain *why* the event loop is not driving my coroutine? Is this a bug or am I missing something conceptually? More broadly, handling KeyboardInterrupt in async code seems very tricky, but I also cannot figure out how to make this interrupt approach work. Is one of these better than the other? What is the best practice here? Would it be terrible to add `while True: main_coro.send(None)` to my signal handler? Thanks, Mark -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From njs at pobox.com Tue Apr 24 21:54:53 2018 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 24 Apr 2018 18:54:53 -0700 Subject: [Async-sig] Cancelling a coroutine from a signal handler? In-Reply-To: References: Message-ID: On Tue, Apr 24, 2018 at 2:25 PM, Mark E. Haase wrote: > My mental model of how the event loop works is pretty poor, but I roughly > understand that the event loop is responsible for driving coroutines. It > appears > here that the event loop has stopped driving my main() coroutine, and so the > only way to force it to complete is to call send() from my code. It hasn't stopped driving your main() coroutine ? as far as it knows, main() is still waiting for the Event.wait() call to complete, and as soon as it does it will start iterating the coroutine again. You really, really, definitely should not be trying to manually iterate a coroutine object associate with a Task. > More broadly, handling KeyboardInterrupt in async code seems very tricky, > but I also cannot figure out how to make this interrupt approach work. Is > one > of these better than the other? What is the best practice here? Would it be > terrible to add `while True: main_coro.send(None)` to my signal handler? Yes, it would be terrible :-). Instead of trying to throw exceptions manually, you should call the cancel() method on the Task object. (Of if you want to abort immediately because the previous control-C was ignored, use something like os._exit() or os.abort().) The other complication is that doing *anything* from a signal handler is fraught with peril, because of reentrancy issues. I actually don't think there are *any* functions in asyncio that are guaranteed to be safe to call from a signal handler. Looking at the code for Task.cancel, I definitely don't trust that it's safe to call from a signal handler. The simplest solution would be to use asyncio's native signal handler support instead of the signal module: https://docs.python.org/3/library/asyncio-eventloop.html#unix-signals However, there are some trade-offs: - it's not implemented on Windows - it relies on the event loop running. In particular, if the event loop is stalled (e.g. because some task got stuck in an infinite loop), then your signal handler will never be called, so your "emergency abort" code won't work. Alternatively, you can define a handler using signal.signal, and then arrange to re-enter the asyncio main loop yourself before calling Task.cancel. I believe that the only guaranteed-to-be-safe way to do this is: - in your signal handler, spawn a new thread (!) - from the new thread, call loop.call_soon_threadsafe(your_main_task.cancel) (Trio's version of call_soon_threadsafe *is* guaranteed to be both thread- and signal-safe, but asyncio's isn't, and in asyncio there are multiple event loop implementations so even if one happens to be signal-safe by chance you don't know about the others... also Trio handles control-C automatically so you don't need to worry about this in the first place. But I don't know how to port Trio's generic solution to asyncio :-(.) -n -- Nathaniel J. Smith -- https://vorpus.org From dimaqq at gmail.com Tue Apr 24 22:33:17 2018 From: dimaqq at gmail.com (Dima Tisnek) Date: Wed, 25 Apr 2018 10:33:17 +0800 Subject: [Async-sig] Cancelling a coroutine from a signal handler? In-Reply-To: References: Message-ID: Perhaps it's good to distinguish between graceful shutdown signal (cancel all head/logical tasks, or even all tasks, let finally blocks run) and hard stop signal. 
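To make that concrete, a rough sketch of such a two-level handler on top of
asyncio's own signal support could look like the following (Unix only, per
Nathaniel's caveats above; main() and the SIGINT-only policy are placeholder
choices for illustration, not a drop-in recipe):

import asyncio
import os
import signal

async def main():
    try:
        await asyncio.Event().wait()   # stand-in for the real long-running work
    finally:
        pass  # graceful cleanup goes here (close connections, flush logs, ...)

def run():
    loop = asyncio.get_event_loop()
    main_task = loop.create_task(main())
    interrupts = 0

    def on_sigint():
        nonlocal interrupts
        interrupts += 1
        if interrupts == 1:
            # Graceful: CancelledError is delivered into main(), so its
            # finally blocks get a chance to run.
            main_task.cancel()
        else:
            # Hard stop: a second SIGINT aborts the process immediately.
            os._exit(1)

    loop.add_signal_handler(signal.SIGINT, on_sigint)
    try:
        loop.run_until_complete(main_task)
    except asyncio.CancelledError:
        pass
    finally:
        loop.remove_signal_handler(signal.SIGINT)

If add_signal_handler isn't an option (e.g. on Windows), the same structure
works with signal.signal, but then the cancel() call should be handed back to
the event loop, e.g. via loop.call_soon_threadsafe(main_task.cancel) from a
helper thread, rather than invoked directly inside the handler.
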
In the past, synchronous code, I've used following paradigm: def custom_signal(): alarm(5) raise KeyboardInterrupt() Keyboard interrupt was chosen so that manual execution is stopped with ^C in the same way server process, this makes testing much easier :) Also it inherits from BaseException, which is nice. I think that something similar can be done for you asynchronous case -- graceful shutdown using asyncio builtin signal handling and hard stop using signal.SIG_DFL and signal number where that means termination. On 25 April 2018 at 09:54, Nathaniel Smith wrote: > On Tue, Apr 24, 2018 at 2:25 PM, Mark E. Haase wrote: >> My mental model of how the event loop works is pretty poor, but I roughly >> understand that the event loop is responsible for driving coroutines. It >> appears >> here that the event loop has stopped driving my main() coroutine, and so the >> only way to force it to complete is to call send() from my code. > > It hasn't stopped driving your main() coroutine ? as far as it knows, > main() is still waiting for the Event.wait() call to complete, and as > soon as it does it will start iterating the coroutine again. > > You really, really, definitely should not be trying to manually > iterate a coroutine object associate with a Task. > >> More broadly, handling KeyboardInterrupt in async code seems very tricky, >> but I also cannot figure out how to make this interrupt approach work. Is >> one >> of these better than the other? What is the best practice here? Would it be >> terrible to add `while True: main_coro.send(None)` to my signal handler? > > Yes, it would be terrible :-). > > Instead of trying to throw exceptions manually, you should call the > cancel() method on the Task object. (Of if you want to abort > immediately because the previous control-C was ignored, use something > like os._exit() or os.abort().) > > The other complication is that doing *anything* from a signal handler > is fraught with peril, because of reentrancy issues. I actually don't > think there are *any* functions in asyncio that are guaranteed to be > safe to call from a signal handler. Looking at the code for > Task.cancel, I definitely don't trust that it's safe to call from a > signal handler. > > The simplest solution would be to use asyncio's native signal handler > support instead of the signal module: > https://docs.python.org/3/library/asyncio-eventloop.html#unix-signals > However, there are some trade-offs: > - it's not implemented on Windows > - it relies on the event loop running. In particular, if the event > loop is stalled (e.g. because some task got stuck in an infinite > loop), then your signal handler will never be called, so your > "emergency abort" code won't work. > > Alternatively, you can define a handler using signal.signal, and then > arrange to re-enter the asyncio main loop yourself before calling > Task.cancel. I believe that the only guaranteed-to-be-safe way to do > this is: > > - in your signal handler, spawn a new thread (!) > - from the new thread, call loop.call_soon_threadsafe(your_main_task.cancel) > > (Trio's version of call_soon_threadsafe *is* guaranteed to be both > thread- and signal-safe, but asyncio's isn't, and in asyncio there are > multiple event loop implementations so even if one happens to be > signal-safe by chance you don't know about the others... also Trio > handles control-C automatically so you don't need to worry about this > in the first place. But I don't know how to port Trio's generic > solution to asyncio :-(.) 
> > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ From njs at pobox.com Wed Apr 25 05:24:15 2018 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 25 Apr 2018 02:24:15 -0700 Subject: [Async-sig] New blog post: Notes on structured concurrency, or: Go statement considered harmful Message-ID: Hi all, I just posted another essay on concurrent API design: https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/ This is the one that finally gets at the core reasons why Trio exists; I've been trying to figure out how to write it for at least a year now. I hope you like it. (Guido: this is the one you should read :-). Or if it's too much, you can jump to the conclusion [1], and I'm happy to come find you somewhere with a whiteboard, if that'd be helpful!) -n [1] https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/#conclusion -- Nathaniel J. Smith -- https://vorpus.org From andrew.svetlov at gmail.com Wed Apr 25 06:03:29 2018 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Wed, 25 Apr 2018 10:03:29 +0000 Subject: [Async-sig] New blog post: Notes on structured concurrency, or: Go statement considered harmful In-Reply-To: References: Message-ID: Interesting, thanks On Wed, Apr 25, 2018 at 12:24 PM Nathaniel Smith wrote: > Hi all, > > I just posted another essay on concurrent API design: > > > https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/ > > This is the one that finally gets at the core reasons why Trio exists; > I've been trying to figure out how to write it for at least a year > now. I hope you like it. > > (Guido: this is the one you should read :-). Or if it's too much, you > can jump to the conclusion [1], and I'm happy to come find you > somewhere with a whiteboard, if that'd be helpful!) > > -n > > [1] > https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/#conclusion > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -- Thanks, Andrew Svetlov -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Wed Apr 25 06:17:25 2018 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 25 Apr 2018 12:17:25 +0200 Subject: [Async-sig] New blog post: Notes on structured concurrency, or: Go statement considered harmful References: Message-ID: <20180425121725.2e511740@fsol> On Wed, 25 Apr 2018 02:24:15 -0700 Nathaniel Smith wrote: > Hi all, > > I just posted another essay on concurrent API design: > > https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/ > > This is the one that finally gets at the core reasons why Trio exists; > I've been trying to figure out how to write it for at least a year > now. I hope you like it. My experience is indeed that something like the nursery construct would make concurrent programming much more robust in complex cases. This is a great explanation why. 
API note: I would expect to be able to use it this way: class MyEndpoint: def __init__(self): self._nursery = open_nursery() # Lots of behaviour methods that can put new tasks in the nursery def close(self): self._nursery.close() Also perhaps more finegrained shutdown routines such as: * Nursery.join(cancel_after=None): wait for all tasks to join, cancel the remaining ones after the given timeout Regards Antoine. From mehaase at gmail.com Wed Apr 25 09:33:42 2018 From: mehaase at gmail.com (Mark E. Haase) Date: Wed, 25 Apr 2018 09:33:42 -0400 Subject: [Async-sig] Cancelling a coroutine from a signal handler? In-Reply-To: References: Message-ID: On Tue, Apr 24, 2018 at 9:54 PM, Nathaniel Smith wrote: > > The simplest solution would be to use asyncio's native signal handler > support instead of the signal module: > https://docs.python.org/3/library/asyncio-eventloop.html#unix-signals > Ahh, wow, I don't know how I missed this. I've been obsessing over coroutines and event loops for hours, now I realize that I misunderstood the voodoo in the signal module. Thank you for pointing me in this direction! Alternatively, you can define a handler using signal.signal, and then > arrange to re-enter the asyncio main loop yourself before calling > Task.cancel. I believe that the only guaranteed-to-be-safe way to do > this is: This is also an interesting approach that I will experiment with. I guess this solves problem #1 (works on Windows) but not #2 (task stuck in loop), right? (The latter is a feature of all cooperative multitasking systems, yeah?) Great blog post today! I really enjoy your writing style and Trio is really exciting. Cheers, Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Apr 25 11:08:14 2018 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 25 Apr 2018 15:08:14 +0000 Subject: [Async-sig] Cancelling a coroutine from a signal handler? In-Reply-To: References: Message-ID: On Wed, Apr 25, 2018, 06:34 Mark E. Haase wrote: > > This is also an interesting approach that I will experiment with. I guess > this solves problem #1 (works on Windows) but not #2 (task stuck in loop), > right? (The latter is a feature of all cooperative multitasking systems, > yeah?) > If a task is hogging the loop, then you won't be able to shut down politely using Task.cancel or similar. But if you're using signal.signal directly then it would mean that your signal handler would still *run* while the loop was blocked, so you'd at least have the option of escalating to os._exit or similar. I'm not sure I *really* advocate spawning a thread from your signal handler just to call one loop method, but, hey, at least you know your options :-). > Great blog post today! I really enjoy your writing style and Trio is > really exciting. > Thanks! -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Thu Apr 26 00:43:31 2018 From: guido at python.org (Guido van Rossum) Date: Wed, 25 Apr 2018 21:43:31 -0700 Subject: [Async-sig] New blog post: Notes on structured concurrency, or: Go statement considered harmful In-Reply-To: References: Message-ID: Now there's a PEP I'd like to see. 
On Wed, Apr 25, 2018 at 2:24 AM, Nathaniel Smith wrote: > Hi all, > > I just posted another essay on concurrent API design: > > https://vorpus.org/blog/notes-on-structured-concurrency-or- > go-statement-considered-harmful/ > > This is the one that finally gets at the core reasons why Trio exists; > I've been trying to figure out how to write it for at least a year > now. I hope you like it. > > (Guido: this is the one you should read :-). Or if it's too much, you > can jump to the conclusion [1], and I'm happy to come find you > somewhere with a whiteboard, if that'd be helpful!) > > -n > > [1] https://vorpus.org/blog/notes-on-structured-concurrency-or- > go-statement-considered-harmful/#conclusion > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From dimaqq at gmail.com Thu Apr 26 22:55:21 2018 From: dimaqq at gmail.com (Dima Tisnek) Date: Fri, 27 Apr 2018 10:55:21 +0800 Subject: [Async-sig] New blog post: Notes on structured concurrency, or: Go statement considered harmful In-Reply-To: References: Message-ID: My 2c after careful reading: restarting tasks automatically (custom nursery example) is quite questionable: * it's unexpected * it's not generally safe (argument reuse, side effects) * user's coroutine can be decorated to achieve same effect I'd say just remove this, it's not relevant to your thesis. It's very nice to have the escape hatch of posting tasks to "someone else's" nursery. I feel there are more caveats to posting a task to parent's or global nursery though. Consider that local tasks typically await on other local tasks. What happens when N1-task1 waits on N2-task2 and N2-task9 encounters an error? My guess is N2-task2 is cancelled, which by default cancels N1-task1 too, right? That kinda break the abstraction, doesn't it? If the escape hatch is available, how about allowing tasks to be moved between nurseries? Is dependency inversion allowed? (as in given parent N1 and child N1.N2, can N1.N2.t2 await on N1.t1 ?) If that's the case, I guess it's not a "tree of tasks", as in the graph is arbitrary, not DAG. I've seen [proprietary] strict DAG task frameworks. while they are useful to e.g. perform sub-requests in parallel, they are not general enough to be useful at large. Thus I'm assuming trio does not enforce DAG... Finally, slob programmers like me occasionally want fire-and-forget tasks, aka daemonic threads. Some are long-lived, e.g. "battery status poller", others short-lived, e.g. "tail part of low-latency logging". Obv., a careful programmer would keep track of those, but we want things simple :) Perhaps in line with batteries included principle, trio could include a standard way to accomplish that? Thanks again for the great post! I think you could publish an article on this, it would be good to have wider discussion, academic, ES6, etc. d. On 25 April 2018 at 17:24, Nathaniel Smith wrote: > Hi all, > > I just posted another essay on concurrent API design: > > https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/ > > This is the one that finally gets at the core reasons why Trio exists; > I've been trying to figure out how to write it for at least a year > now. I hope you like it. 
>
> (Guido: this is the one you should read :-). Or if it's too much, you
> can jump to the conclusion [1], and I'm happy to come find you
> somewhere with a whiteboard, if that'd be helpful!)
>
> -n
>
> [1] https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/#conclusion
>
> --
> Nathaniel J. Smith -- https://vorpus.org
> _______________________________________________
> Async-sig mailing list
> Async-sig at python.org
> https://mail.python.org/mailman/listinfo/async-sig
> Code of Conduct: https://www.python.org/psf/codeofconduct/

From njs at pobox.com Fri Apr 27 00:44:05 2018
From: njs at pobox.com (Nathaniel Smith)
Date: Thu, 26 Apr 2018 21:44:05 -0700
Subject: [Async-sig] New blog post: Notes on structured concurrency, or: Go statement considered harmful
In-Reply-To:
References:
Message-ID:

On Thu, Apr 26, 2018 at 7:55 PM, Dima Tisnek wrote:
> My 2c after careful reading:
>
> restarting tasks automatically (custom nursery example) is quite questionable:
> * it's unexpected
> * it's not generally safe (argument reuse, side effects)
> * user's coroutine can be decorated to achieve same effect

It's an example of something that a user could implement. I guess if you
go to the trouble of implementing this behavior, then it is no longer
unexpected and you can also cope with handling the edge cases :-). There
may be some reason why it turns out to be a bad idea specifically in the
context of Python, but it's one of the features that's famously helpful
for making Erlang work so well, so it seemed worth mentioning.

> It's very nice to have the escape hatch of posting tasks to "someone
> else's" nursery.
> I feel there are more caveats to posting a task to parent's or global
> nursery though.
> Consider that local tasks typically await on other local tasks.
> What happens when N1-task1 waits on N2-task2 and N2-task9 encounters an error?
> My guess is N2-task2 is cancelled, which by default cancels N1-task1 too, right?
> That kinda breaks the abstraction, doesn't it?

"Await on a task" is not a verb that Trio has. (We don't even have task
objects, except in some low-level plumbing/introspection APIs.) You can
do 'await queue.get()' to wait for another task to send you something,
but if the other task gets cancelled then the data will just... never
arrive. There is some discussion here of moving from a queue.Queue-like
model to a model with separate send- and receive-channels:
https://github.com/python-trio/trio/issues/497

If we do this (which I suspect we will), then probably the task that gets
cancelled was holding the only reference to the send-channel (or even
better, did 'with send_channel: ...'), so the channel will get closed,
and then the call to get() will raise an error which it can handle or
not...

But yes, you do need to spend some time thinking about what kind of task
tree topology makes sense for your problem. Trio can give you tools but
it's not a replacement for thoughtful design :-).

> If the escape hatch is available, how about allowing tasks to be moved
> between nurseries?

That would be possible (and in fact there's one special case internally
where we do it!), but I haven't seen a good reason yet to implement it as
a standard feature. If someone shows up with use cases then we could talk
about it :-).

> Is dependency inversion allowed?
> (as in given parent N1 and child N1.N2, can N1.N2.t2 await on N1.t1 ?)
> If that's the case, I guess it's not a "tree of tasks", as in the
> graph is arbitrary, not DAG.

See above re: not having "wait on a task" as a verb.

> I've seen [proprietary] strict DAG task frameworks.
> while they are useful to e.g. perform sub-requests in parallel,
> they are not general enough to be useful at large.
> Thus I'm assuming trio does not enforce DAG...

The task tree itself is in fact a tree, not a DAG. But that tree doesn't
control which tasks can talk to each other. It's just used for exception
propagation, and for enforcing that all children have to finish before
the parent can continue. (Just like how in a regular function call, the
caller stops while the callee is running.) Does that help?

> Finally, slob programmers like me occasionally want fire-and-forget
> tasks, aka daemonic threads.
> Some are long-lived, e.g. "battery status poller", others short-lived,
> e.g. "tail part of low-latency logging".
> Obv., a careful programmer would keep track of those, but we want
> things simple :)
> Perhaps in line with batteries included principle, trio could include
> a standard way to accomplish that?

Well, what semantics do you want? If the battery status poller crashes,
what should happen? If the "tail part of low-latency logging" command is
still running when you go to shut down, do you want to wait a bit for it
to finish, or cancel it, or ...?

You can certainly implement some helper like:

async with open_throwaway_nursery() as throwaway_nursery:
    # If this crashes, we ignore the problem, maybe log it or something
    throwaway_nursery.start_soon(some_fn)
    ...
# When we exit the with block, it gets cancelled

... if that's what you want. Before adding anything like this to trio
itself though I'd like to see some evidence of how it's being used in
real-ish projects.

> Thanks again for the great post!
> I think you could publish an article on this, it would be good to have
> wider discussion, academic, ES6, etc.

Thanks for the vote of confidence :-). And, we'll see...

-n

--
Nathaniel J. Smith -- https://vorpus.org
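
(For reference, one possible way to spell the open_throwaway_nursery() helper
sketched above. The semantics chosen here, log and discard ordinary crashes and
cancel whatever is still running when the block exits, are only one answer to
the questions raised in that message; the helper is hypothetical and not part
of trio.)

import logging
from contextlib import asynccontextmanager  # or the async_generator backport on older Pythons

import trio

logger = logging.getLogger(__name__)

class ThrowawayNursery:
    def __init__(self, nursery):
        self._nursery = nursery

    def start_soon(self, async_fn, *args):
        # Wrap each task so that an ordinary exception is logged, not propagated.
        self._nursery.start_soon(self._run_quietly, async_fn, args)

    @staticmethod
    async def _run_quietly(async_fn, args):
        try:
            await async_fn(*args)
        except Exception:
            # trio.Cancelled is a BaseException, so cancellation still propagates.
            logger.exception("fire-and-forget task crashed; ignoring")

@asynccontextmanager
async def open_throwaway_nursery():
    async with trio.open_nursery() as nursery:
        try:
            yield ThrowawayNursery(nursery)
        finally:
            # Leaving the block cancels anything that is still running.
            nursery.cancel_scope.cancel()

(Used exactly as in the snippet quoted above: start fire-and-forget work with
throwaway_nursery.start_soon(...), and everything is cancelled when the
'async with' block exits.)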

From njs at pobox.com Fri Apr 27 01:03:04 2018
From: njs at pobox.com (Nathaniel Smith)
Date: Thu, 26 Apr 2018 22:03:04 -0700
Subject: [Async-sig] New blog post: Notes on structured concurrency, or: Go statement considered harmful
In-Reply-To:
References:
Message-ID:

On Wed, Apr 25, 2018 at 9:43 PM, Guido van Rossum wrote:
> Now there's a PEP I'd like to see.

Which part?

-n

--
Nathaniel J. Smith -- https://vorpus.org

From njs at pobox.com Fri Apr 27 01:08:33 2018
From: njs at pobox.com (Nathaniel Smith)
Date: Thu, 26 Apr 2018 22:08:33 -0700
Subject: [Async-sig] New blog post: Notes on structured concurrency, or: Go statement considered harmful
In-Reply-To: <20180425121725.2e511740@fsol>
References: <20180425121725.2e511740@fsol>
Message-ID:

On Wed, Apr 25, 2018 at 3:17 AM, Antoine Pitrou wrote:
> On Wed, 25 Apr 2018 02:24:15 -0700
> Nathaniel Smith wrote:
>> Hi all,
>>
>> I just posted another essay on concurrent API design:
>>
>> https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/
>>
>> This is the one that finally gets at the core reasons why Trio exists;
>> I've been trying to figure out how to write it for at least a year
>> now. I hope you like it.
>
> My experience is indeed that something like the nursery construct would
> make concurrent programming much more robust in complex cases.
> This is a great explanation why. Thanks!
>
> API note: I would expect to be able to use it this way:
>
> class MyEndpoint:
>
>     def __init__(self):
>         self._nursery = open_nursery()
>
>     # Lots of behaviour methods that can put new tasks in the nursery
>
>     def close(self):
>         self._nursery.close()

You might expect to be able to use it that way, but you can't! The 'async
with' part of 'async with open_nursery()' is mandatory. This is what I
mean about it forcing you to rethink things, and why I think there is
room for genuine controversy :-). (Just like there was about goto -- it's
weird to think that it could have turned out differently in hindsight,
but people really did have valid concerns...)

I think the pattern we're settling on for this particular case is:

class MyEndpoint:
    def __init__(self, nursery, ...):
        self._nursery = nursery

    # methods here that use nursery

@asynccontextmanager
async def open_my_endpoint(...):
    async with trio.open_nursery() as nursery:
        yield MyEndpoint(nursery, ...)

Then most end-users do 'async with open_my_endpoint() as endpoint:' and
then use the 'endpoint' object inside the block; or if you have some
special reason why you need to have multiple endpoints in the same
nursery (e.g. you have an unbounded number of endpoints and don't want
to have to somehow write an unbounded number of 'async with' blocks in
your source code), then you can call MyEndpoint() directly and pass an
explicit nursery. A little bit of extra fuss, but not too much.

So that's how you handle it. Why do we make you jump through these hoops?

The problem is, we want to enforce that each nursery object's lifetime is
bound to the lifetime of a calling frame. The point of the 'async with'
in 'async with open_nursery()' is to perform this binding. To reduce
errors, open_nursery() doesn't even return a nursery object -- only
open_nursery().__aenter__() does that. Otherwise, if a task in the
nursery has an unhandled error, we have nowhere to report it (among
other issues).

Of course this is Python, so you can always do gross hacks like calling
__aenter__ yourself, but then you're responsible for making sure the
context manager semantics are respected. In most systems you'd expect
this kind of thing to be syntactically enforced as part of the language;
it's actually pretty amazing that Trio is able to make things work as
well as it can as a "mere library". It's really a testament to how much
thought has been put into Python -- other languages don't really have
any equivalent to with or Python's generator-based async/await.

> Also perhaps more finegrained shutdown routines such as:
>
> * Nursery.join(cancel_after=None):
>
>   wait for all tasks to join, cancel the remaining ones
>   after the given timeout

Hmm, I've never needed that particular pattern, but it's actually pretty
easy to express. I didn't go into it in this writeup, but: because
nurseries need to be able to cancel their contents in order to unwind
the stack during exception propagation, they need to enclose their
contents in a cancel scope. And since they have this cancel scope anyway,
we expose it on the nursery object. And cancel scopes allow you to adjust
their deadline. So if you write:

async with trio.open_nursery() as nursery:
    ... blah blah ...
    # Last line before exiting the block and triggering the implicit join():
    nursery.cancel_scope.deadline = trio.current_time() + TIMEOUT

then it'll give you the semantics you're asking about. There could be
more sugar for this if it turns out to be useful. Maybe a .timeout
attribute on cancel scopes that's a magic property always equal to
(self.deadline - trio.current_time()), so you could do
'nursery.cancel_scope.timeout = TIMEOUT'?

-n

--
Nathaniel J. Smith -- https://vorpus.org
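
(The '.timeout' sugar mused about above can already be approximated with a pair
of helper functions. These are hypothetical conveniences for illustration, not
part of trio's API.)

import trio

def get_timeout(cancel_scope):
    # Time remaining until the scope's deadline, measured from now.
    return cancel_scope.deadline - trio.current_time()

def set_timeout(cancel_scope, timeout):
    # Spelled-out version of the hypothetical 'nursery.cancel_scope.timeout = TIMEOUT'.
    cancel_scope.deadline = trio.current_time() + timeout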
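
(To give the "nurseries for asyncio" idea some shape, here is a deliberately
rough sketch of what such a construct might look like on top of asyncio as it
existed at the time. It only illustrates the semantics under discussion,
children bound to a with-block, one failure taking the siblings down, nothing
outliving the block, and is not a proposal for an actual stdlib API; a real
implementation would also need to handle multiple simultaneous exceptions,
tasks started after a failure, and so on.)

import asyncio

class Nursery:
    def __init__(self):
        self._tasks = []

    def start_soon(self, coro):
        # Children can only be started while the block is open.
        task = asyncio.ensure_future(coro)
        self._tasks.append(task)
        return task

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        if exc is not None:
            # The body of the 'async with' failed: cancel the children, wait
            # for them to unwind, then let the original exception propagate.
            for task in self._tasks:
                task.cancel()
            await asyncio.gather(*self._tasks, return_exceptions=True)
            return False
        try:
            # Body finished cleanly: wait for every child; the first child
            # failure propagates out of the block.
            await asyncio.gather(*self._tasks)
        except BaseException:
            # A child failed (or we were cancelled from outside): take the
            # siblings down too, and wait for them as well.
            for task in self._tasks:
                task.cancel()
            await asyncio.gather(*self._tasks, return_exceptions=True)
            raise
        return False

async def demo():
    async with Nursery() as nursery:
        nursery.start_soon(asyncio.sleep(1))
        nursery.start_soon(asyncio.sleep(2))
    # Both children are guaranteed to be finished (or cancelled) here.

asyncio.get_event_loop().run_until_complete(demo())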