[Python-ideas] Protecting finally clauses of interruptions
Yury Selivanov
yselivanov.ml at gmail.com
Wed Apr 4 18:59:57 CEST 2012
On 2012-04-04, at 4:04 AM, Paul Colomiets wrote:
> Hi,
>
> On Wed, Apr 4, 2012 at 4:23 AM, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
>> On 2012-04-03, at 3:22 PM, Paul Colomiets wrote:
>>> (Although, I don't know how `yield from` changes working with
>>> yield-based coroutines, maybe its behavior is quite different)
>>>
>>> For greenlets the situation is a bit different, as Python knows the
>>> stack there, but you still need to traverse it (or, as Andrew
>>> mentioned, you can just propagate a flag).
>>
>> Why traverse? Why propagate? As I explained in my previous posts
>> here, you need to protect only the top-of-stack coroutines in the
>> timeouts or trampoline execution queues. You should illustrate
>> your logic with a clearer example - say, three or four coroutines
>> that call each other + a glimpse of how your trampoline works.
>> But I'm not sure that is really necessary.
>>
>
> Here is the previous example in more detail (although still simplified):
>
> @coroutine
> def add_money(user_id, money):
>     yield redis_lock(user_id)
>     try:
>         yield redis_incr('user:' + str(user_id) + ':money', money)
>     finally:
>         yield redis_unlock(user_id)
>
> # this one is crucial to show the point of the discussion
> # the other functions are similar:
> @coroutine
> def redis_unlock(lock):
>     yield redis_socket.wait_write()   # yields back when the socket is ready for writing
>     cmd = ('DEL user:' + str(lock) + '\n').encode('ascii')
>     redis_socket.write(cmd)           # should be a loop here, actually
>     yield redis_socket.wait_read()
>     result = redis_socket.read(1024)  # here a loop too
>     assert result == b'OK\n'
>
> When the trampoline gets a coroutine from the `next()` or `send()` method,
> it puts it on top of the stack and doesn't dispatch the original one until
> the topmost one has exited.
>
> The point is that if a timeout arrives inside the `redis_unlock` function, we
> must wait until the finally clause of `add_money` has finished.
How can it "arrive" inside "redis_unlock"? Let's assume you called
"add_money" like this:

    yield add_money(1, 10).with_timeout(10)

Then it's the 'add_money' coroutine that should be in the timeouts queue/tree!
'add_money' specifically is what should be interrupted when your 10s timeout
is reached. And if 'add_money' is in its 'finally' statement - you simply postpone
its interruption, meaning that 'redis_unlock' will finish its execution nicely.
Again, I'm not sure how exactly you manage your timeouts. The way I do it,
simplified: I have a timeouts heapq with pointers to those coroutines
that were *explicitly* executed with a timeout. So I'm protecting only
the coroutines in that queue, because only they can be interrupted. And
the coroutines they call are protected *automatically*.
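Roughly like this (a simplified sketch, not our real scheduler; 'Scheduler',
'run_with_timeout' and the 'in_finally()' check are names I'm inventing here
for illustration):

    import heapq
    import itertools
    import time

    class TimeoutInterrupt(Exception):
        """Thrown into a coroutine whose explicit timeout has expired."""

    class Scheduler:
        def __init__(self):
            self._timeouts = []             # heap of (deadline, tiebreaker, coroutine)
            self._counter = itertools.count()

        def run_with_timeout(self, coro, timeout):
            # Only coroutines registered here can ever be interrupted;
            # whatever they call never appears in this heap, so it is
            # protected automatically.
            heapq.heappush(self._timeouts,
                           (time.time() + timeout, next(self._counter), coro))

        def check_timeouts(self):
            now = time.time()
            while self._timeouts and self._timeouts[0][0] <= now:
                _, _, coro = heapq.heappop(self._timeouts)
                if coro.in_finally():       # hypothetical "is it protected?" check
                    # postpone: re-check a bit later, once 'finally' has finished
                    heapq.heappush(self._timeouts,
                                   (now + 0.1, next(self._counter), coro))
                else:
                    coro.throw(TimeoutInterrupt())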
If you do it differently, can you please elaborate on how your scheduler
is actually designed?
>>>
>>> The whole intention of using coroutine library is to not to
>>> have thread pool. Could you describe your use case
>>> with more details?
>>
>> Well, our company has been using coroutines for about 2.5 years
>> now (the framework is not yet open-sourced). And in our practice
>> a threadpool is really handy, as it allows you to:
>>
>> - use non-asynchronous libraries, which you don't want to
>> monkeypatch with greensockets (or are even unable to monkeypatch)
>>
>
> And we rewrite them in Python. It seems to be more useful.
Sometimes you can't afford the luxury ;)
>
>> - wrap some functions that are usually very fast, but once in
>> a while may take some time. And sometimes you don't want to
>> offload them to a separate process
>>
>
> Ack.
>
>> - and yes, do DNS lookups if you don't have a compiled CPython
>> extension that wraps c-ares or something similar.
>>
>
> Maybe we should propose an asynchronous DNS library for Python?
> We have the same problem, although we do not resolve hosts at
> runtime (only at startup), so a synchronous one is good enough
> for our purposes.
>
>> Please let's avoid shifting further discussion to proving or
>> disproving the necessity of threadpools.
>
> Agreed.
>
>> They are being actively used and there is a demand for
>> (more or less) graceful threads interruption or abortion.
>>
>
> Given these use cases, what stops you from making explicit
> interruption points?
>
>>
>> Please write a PEP and we'll continue discussion from that
>> point. Hopefully, it will get more attention than this thread.
>>
>
> I don't see the point in writing a PEP until I have an idea of
> what the PEP should propose. If you have one, you can do it. Again
OK, point taken. Please give me a couple of days to at least
come up with a summary document. I still don't like your
solution because it works directly with frames. With the
upcoming PyPy support of Python 3, I don't think I want
to lose the JIT support.
I also want to take a look at the new PyPy continuations.
Ideally, as I proposed earlier, we should introduce some
sort of interruption protocol -- a method 'interrupt()', with
perhaps a callback.
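Just to sketch the shape of what I mean (this is not a concrete
proposal yet; the class and argument names are invented):

    class Interruptible:
        """Hypothetical interruption protocol (illustration only)."""

        def interrupt(self, exc, callback=None):
            """Ask this object to raise 'exc' in itself as soon as it is safe.

            If the object is currently inside a protected section (a
            'finally' block, for instance), the request is deferred;
            'callback' is invoked once the exception has actually been
            delivered.
            """
            raise NotImplementedError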
> you want to implement thread interruption, and that's not
> my point, there is another thread for that.
We have two requests: the ability to safely interrupt a Python
function or generator (1); and the ability to safely interrupt
Python threads (2). Both (1) and (2) share the same
requirement of safe 'finally' statements. To me, both
features are similar enough to warrant a single
solution, rather than inventing different approaches.
> On Wed, Apr 4, 2012 at 3:03 AM, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>>
>> I don't think a frame flag on its own is quite enough.
>> You don't just want to prevent interruptions while in
>> a finally block, you want to defer them until the finally
>> counter gets back to zero. Making the interrupter sleep
>> and try again in that situation is rather ugly.
That's the second reason I don't like your proposal.
def foo():
    try:
        ..
    finally:
        yield unlock()
    # <--- the ideal point to interrupt foo
    f = open('a', 'w')
    # what if we interrupt it here?
    try:
        ..
    finally:
        f.close()
>> So perhaps there could also be a callback that gets
>> invoked when the counter goes down to zero.
>
> Do you mean putting a callback in a frame, which gets
> executed at the next bytecode just like a signal handler,
> except it waits until the finally clause is executed?
>
> It would work, except it may have a slight performance
> impact on each bytecode. But I'm not sure if it will
> be noticeable.
That's essentially the way we currently do it. We transform the
coroutine's __code__ object, going from:
def a():
    try:
        # code1
    finally:
        # code2
to:
def a():
    __self__ = __get_current_coroutine()
    try:
        # code1
    finally:
        __self__.enter_finally()
        try:
            # code2
        finally:
            __self__.exit_finally()
'enter_finally' and 'exit_finally' maintain an internal counter of
finally blocks. If a coroutine needs to be interrupted, we check
that counter. If it is 0, we throw in a special exception. If not, we
wait till it becomes 0 and throw the exception in 'exit_finally'.
This works flawlessly, but at the high cost of having to patch
__code__ objects.
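In rough pseudo-code the helpers behave like this (a simplified sketch,
not our actual implementation; the class name and attributes are
invented for illustration):

    class Coroutine:
        def __init__(self, gen):
            self._gen = gen
            self._finally_count = 0
            self._pending_exc = None           # interruption requested while protected

        def enter_finally(self):
            # injected at the start of every 'finally' block
            self._finally_count += 1

        def exit_finally(self):
            # injected at the end of every 'finally' block
            self._finally_count -= 1
            if self._finally_count == 0 and self._pending_exc is not None:
                exc, self._pending_exc = self._pending_exc, None
                raise exc                      # runs inside the coroutine's own frame

        def interrupt(self, exc):
            # called by the scheduler while the coroutine is suspended at a yield
            if self._finally_count == 0:
                self._gen.throw(exc)           # safe: no 'finally' is active
            else:
                self._pending_exc = exc        # postpone until the counter drops to 0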
-
Yury