[Python-ideas] The async API of the future: yield-from

Ben Darnell ben at bendarnell.com
Mon Oct 15 00:09:10 CEST 2012


On Sun, Oct 14, 2012 at 7:36 AM, Guido van Rossum <guido at python.org> wrote:
>> So it would look something like
>>
>> Yield-from:
>>    task1 = subtask1(args1)
>>    task2 = subtask2(args2)
>>    res1, res2 = yield from par(task1, task2)
>>
>> where the implementation of par() is left as an exercise for
>> the reader.
>
> So, can par() be as simple as
>
> def par(*args):
>   results = []
>   for task in args:
>     result = yield from task
>     results.append(result)
>   return results
>
> ???
>
> Or does it need to interact with the scheduler to ensure fairness?
> (Not having built one of these, my intuition for how the primitives
> fit together is still lacking, so excuse me for asking naive
> questions.)

It's not just fairness: par() needs to interact with the scheduler to
get any parallelism at all if the sub-generators have more than one step.
Consider:

def task1():
  print("1A")
  yield
  print("1B")
  yield
  print("1C")
  # and so on...

def task2():
  print("2A")
  yield
  print("2B")
  yield
  print("2C")

def outer():
  yield from par(task1(), task2())

Both generators are created up front, but a plain generator runs no
code until it is first iterated, and it only advances while something
is yielding from it.  So with this version of par() you get 1A, 1B,
1C..., 2A, 2B, 2C (or 1A, 2A, 1B, 1C..., 2B, 2C if each task is run to
its first yield when it is created): no interleaving either way.  To
get parallelism I think you have to schedule each sub-generator
separately instead of just yielding from them (which negates some of
the benefits of yield from, like easy error handling).
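
To make the ordering concrete, here is a minimal sketch, assuming
Python 3.3+ and plain generators (which run nothing until first
iterated, so with them even 1A and 2A are not adjacent).  The helper
names make_task, drive and run_round_robin are hypothetical and exist
only for this illustration; the tasks record their steps in a list
instead of printing.  run_round_robin is a toy stand-in for "schedule
each sub-generator separately", not a real scheduler.

```python
from collections import deque


def make_task(name, log):
    # Same shape as task1/task2 above, but each step is appended to `log`.
    def task():
        log.append(name + "A")
        yield
        log.append(name + "B")
        yield
        log.append(name + "C")
        return name
    return task()


def par(*tasks):
    # The simple par() from the quoted message: one task at a time.
    results = []
    for task in tasks:
        result = yield from task
        results.append(result)
    return results


def drive(gen):
    # Hypothetical helper: step a generator to completion, return its value.
    while True:
        try:
            next(gen)
        except StopIteration as stop:
            return stop.value


def run_round_robin(*tasks):
    # Toy scheduler (an assumption for this sketch): every task gets one
    # step per turn, so their steps interleave.
    results = [None] * len(tasks)
    ready = deque(enumerate(tasks))
    while ready:
        index, task = ready.popleft()
        try:
            next(task)
        except StopIteration as stop:
            results[index] = stop.value
        else:
            ready.append((index, task))
    return results


sequential_log = []
sequential_results = drive(
    par(make_task("1", sequential_log), make_task("2", sequential_log)))

interleaved_log = []
interleaved_results = run_round_robin(
    make_task("1", interleaved_log), make_task("2", interleaved_log))

print(sequential_log)   # no interleaving
print(interleaved_log)  # steps alternate between the two tasks
```

The cost of the second approach is visible even in the toy: the
round-robin loop, not the caller, sees each StopIteration, so error
handling no longer falls out of yield from for free.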

Even if there is a clever version of par() that works more like yield
from, you'd need to go back to explicit scheduling if you wanted
parallel execution without forcing everything to finish at the same
time (which is simple with Futures).
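
A rough sketch of what that looks like with Futures, under several
assumptions: the Scheduler class, its spawn() and run() methods, and
the worker/outer functions are all hypothetical names for this
example, and concurrent.futures.Future is used only as a plain result
holder.  The scheduler polls blocked tasks rather than using
callbacks, which is wasteful but keeps the sketch short.

```python
from collections import deque
from concurrent.futures import Future


class Scheduler:
    # Toy scheduler (hypothetical): a task may yield None to give up its
    # turn, or yield a Future to wait until that Future has a result.
    def __init__(self):
        self.ready = deque()

    def spawn(self, gen):
        # Schedule `gen` separately; the Future receives its return value.
        fut = Future()
        self.ready.append((gen, fut, None))
        return fut

    def run(self):
        while self.ready:
            gen, fut, waiting_on = self.ready.popleft()
            if waiting_on is not None and not waiting_on.done():
                # Still blocked: put it back and poll again later.
                self.ready.append((gen, fut, waiting_on))
                continue
            value = None if waiting_on is None else waiting_on.result()
            try:
                yielded = gen.send(value)
            except StopIteration as stop:
                fut.set_result(stop.value)
            else:
                self.ready.append((gen, fut, yielded))


log = []


def worker(name, steps):
    for i in range(steps):
        log.append(name + str(i))
        yield
    return name + " done"


def outer(sched):
    # Both workers are scheduled before either is waited on.
    fast = sched.spawn(worker("fast", 1))
    slow = sched.spawn(worker("slow", 3))
    r1 = yield fast          # resumes as soon as `fast` finishes
    log.append(r1)
    r2 = yield slow          # `slow` kept running in the meantime
    log.append(r2)
    return r1, r2


sched = Scheduler()
outer_future = sched.spawn(outer(sched))
sched.run()
outer_result = outer_future.result()
print(log)
```

In the recorded log, "fast done" appears before "slow2": the outer
function picks up the first result while the second task is still
running, which is the "not everything finishes at the same time"
case.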


>
> Of course there's the question of what to do when one of the tasks
> raises an error -- I haven't quite figured that out in NDB either, it
> runs all the tasks to completion but the caller only sees the first
> exception. I briefly considered having a "multi-exception" but it
> felt too weird -- though I'm not married to that decision.

In general for this kind of parallel operation I think it's fine to
say that one (unspecified) exception is raised in the outer function
and the rest are hidden.  With futures, "(r1, r2) = yield (f1, f2)" is
just shorthand for "r1 = yield f1; r2 = yield f2", so separating the
yields to have separate try/except blocks is no problem.  With yield
from it's not as good because the second operation can't proceed while
the outer function is waiting for the first.
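
A small sketch of the separate try/except pattern with futures,
assuming Python 3.3+ and a thread pool as the source of futures.  The
run() driver here is a hypothetical, blocking stand-in for an event
loop (it simply waits on each yielded Future), and fails/succeeds are
made-up operations; the point is only that both operations are
submitted before the first yield, so the second proceeds while the
outer function waits on the first.

```python
from concurrent.futures import ThreadPoolExecutor


def run(gen):
    # Hypothetical toy driver: wait for each yielded Future, then either
    # send its result into the generator or throw its exception there.
    to_send, to_throw = None, None
    while True:
        try:
            if to_throw is not None:
                fut = gen.throw(to_throw)
            else:
                fut = gen.send(to_send)
        except StopIteration as stop:
            return stop.value
        to_send, to_throw = None, None
        exc = fut.exception()      # blocks until the Future is done
        if exc is not None:
            to_throw = exc
        else:
            to_send = fut.result()


def fails():
    raise ValueError("first operation failed")


def succeeds():
    return 42


def outer(executor):
    # Both operations start here, before either one is waited on.
    f1 = executor.submit(fails)
    f2 = executor.submit(succeeds)
    try:
        r1 = yield f1
    except ValueError as exc:
        r1 = "recovered: " + str(exc)
    try:
        r2 = yield f2
    except ValueError:
        r2 = None
    return r1, r2


with ThreadPoolExecutor(max_workers=2) as executor:
    results = run(outer(executor))

print(results)
```

Each yield gets its own handler, so a failure in the first operation
is dealt with locally and the result of the second is still
delivered.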

-Ben


