From chris.jerdonek at gmail.com Mon Feb 12 23:10:32 2018
From: chris.jerdonek at gmail.com (Chris Jerdonek)
Date: Mon, 12 Feb 2018 20:10:32 -0800
Subject: [Async-sig] avoiding accidentally calling blocking code

Hi,

This is a general Python async question I've had that I haven't seen a discussion of. Do people know of any techniques other than manually inspecting code line by line to find out if you're accidentally making a raw call to a blocking function in a coroutine?

Related to this, again, outside of inspecting the external source code, is there any way to know if a function you're calling (e.g. in someone else's library or in the standard lib) has the potential of blocking?

What would be the way of detecting something like this if it made its way into "production"? What would be some of the symptoms?

--Chris

From dimaqq at gmail.com Tue Feb 13 00:07:24 2018
From: dimaqq at gmail.com (Dima Tisnek)
Date: Tue, 13 Feb 2018 13:07:24 +0800
Subject: [Async-sig] avoiding accidentally calling blocking code

Let me try to answer the question behind the question.

Like any code validation, it's a healthy mix of:
* unit tests [perhaps with mock.patch_all_known_blocking_calls(side_effect=Exception)]
* good judgement [open("foo").read() technically blocks, but only a problem on network filesystems]
* monitoring prod [at least to ensure that tests correspond to prod]
* peer review

Specific issues you've raised:

re: detecting blocking calls ahead of time.
That's a hard problem (insert quip about Turing completeness)
Perhaps it can be alleviated by function annotations, but that's dangerously close to Java way...

re: detecting blocking calls at run time.
Let's say there's a per-thread global "blocking lock" that's managed by event loop executor and it's in "allowed" state when executor has no runnables and wishes to select/poll/epoll and it's in "denied" state when user code is running.
Some calls can be caught (e.g.
check fd fcntl bits before any recv/send/...)
Some calls can never be caught (anything via ctypes or extensions in general)
Case in point: what do you propose to do about socket.gethostbyname()?

re: after the fact.
hack1. run a separate thread/process that inspects event loop thread's `wchan` value (Linux)
hack2. same via taskstats
hack3. detect event loop lag (false positives are cpu-bound tasks and overloaded system, but I think you'd like to detect those too)

_______________________________________________
Async-sig mailing list
Async-sig at python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/

From andrew.svetlov at gmail.com Tue Feb 13 02:24:36 2018
From: andrew.svetlov at gmail.com (Andrew Svetlov)
Date: Tue, 13 Feb 2018 07:24:36 +0000
Subject: [Async-sig] avoiding accidentally calling blocking code

loop.slow_callback_duration + PYTHONASYNCIODEBUG=1
Does it work for you?
Do you need extra info?

--
Thanks,
Andrew Svetlov

From chris.jerdonek at gmail.com Wed Feb 14 03:42:20 2018
From: chris.jerdonek at gmail.com (Chris Jerdonek)
Date: Wed, 14 Feb 2018 00:42:20 -0800
Subject: [Async-sig] avoiding accidentally calling blocking code

Thanks, Dima and Andrew, for your suggestions.

Re: loop.slow_callback_duration, that's interesting. I didn't know about that. However, it doesn't seem like that alone would work to catch many operations that are normally fast, but strictly speaking blocking. For example, it seems like simple disk I/O operations wouldn't be caught even with slow_callback_duration set to a small value. Dima suggests that such calls are okay. Is there a consensus on that?

Dima's mock.patch_all_known_blocking_calls() is an interesting idea and seems like it would work for the case I mentioned. Has anyone started writing such a method (e.g. for certain standard lib modules)?

--Chris

From andrew.svetlov at gmail.com Wed Feb 14 04:41:15 2018
From: andrew.svetlov at gmail.com (Andrew Svetlov)
Date: Wed, 14 Feb 2018 09:41:15 +0000
Subject: [Async-sig] avoiding accidentally calling blocking code

Mocking everything sounds promising, but the problem is in the word *everything*.
slow_callback_duration is a viable compromise.
Personally I don't care about fast blocking IO as long as it is really very fast.

--
Thanks,
Andrew Svetlov

From njs at pobox.com Wed Feb 14 05:18:33 2018
From: njs at pobox.com (Nathaniel Smith)
Date: Wed, 14 Feb 2018 02:18:33 -0800
Subject: [Async-sig] avoiding accidentally calling blocking code

On Wed, Feb 14, 2018 at 12:42 AM, Chris Jerdonek wrote:
> Thanks, Dima and Andrew, for your suggestions.
>
> Re: loop.slow_callback_duration, that's interesting. I didn't know
> about that. However, it doesn't seem like that alone would work to
> catch many operations that are normally fast, but strictly speaking
> blocking. For example, it seems like simple disk I/O operations
> wouldn't be caught even with slow_callback_duration set to a small
> value. Dima suggests that such calls are okay. Is there a consensus on
> that?

It's not generally possible to avoid occasional arbitrary blocking, e.g. due to the GC running, the OS scheduler, page faults, etc. Basically the problem caused by blocking is when it means that other tasks are stuck waiting when they could be getting useful work done. If callbacks are finishing quickly then this isn't happening, so slow_callback_duration is checking for exactly the right thing.

Where it might fall down is for operations that are only occasionally slow, so they slip past your testing. E.g. if your disk is fast when testing on your developer machine, but then in production you run on some high-occupancy cloud host and a noisy neighbor starts pounding the disk and suddenly your disk latencies shoot up.

> Dima's mock.patch_all_known_blocking_calls() is an interesting idea
> and seems like it would work for the case I mentioned. Has anyone
> started writing such a method (e.g. for certain standard lib modules)?

The implementation of gevent.monkey.patch_all() is probably not too far from what you want.

-n

--
Nathaniel J. Smith -- https://vorpus.org
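
For reference, the loop.slow_callback_duration + debug mode approach discussed in this thread can be sketched roughly as below. This is only an illustration, not code from any of the posters. The names ListHandler, handler_with_blocking_call and collect_slow_callback_warnings are made up for the example, time.sleep() stands in for whatever blocking call slipped into a coroutine, and asyncio.run() / asyncio.get_running_loop() assume Python 3.7 or later (newer than the thread itself).

```python
import asyncio
import logging
import time


class ListHandler(logging.Handler):
    """Collects formatted log messages so they can be inspected afterwards."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


async def handler_with_blocking_call():
    # Hypothetical coroutine: time.sleep() stands in for any accidental
    # blocking call (sync HTTP client, DNS lookup, slow disk read, ...).
    time.sleep(0.2)


async def main():
    loop = asyncio.get_running_loop()
    # Report any callback or task step that holds the loop for 50 ms or more.
    # The default threshold is 0.1 seconds.
    loop.slow_callback_duration = 0.05
    await handler_with_blocking_call()


def collect_slow_callback_warnings():
    """Run main() in asyncio debug mode and return the asyncio log messages."""
    collector = ListHandler()
    logger = logging.getLogger("asyncio")
    logger.addHandler(collector)
    try:
        # debug=True turns on the same asyncio debug mode that
        # PYTHONASYNCIODEBUG=1 enables from the environment.
        asyncio.run(main(), debug=True)
    finally:
        logger.removeHandler(collector)
    return collector.messages


if __name__ == "__main__":
    for message in collect_slow_callback_warnings():
        print(message)
```

In debug mode the loop times each callback and logs a warning of the form "Executing ... took N seconds" on the "asyncio" logger when the threshold is exceeded. As noted above, this only flags calls that are actually slow in the environment where they run, so a call that is fast on a developer machine can still go unreported until it slows down in production.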