From manu.mirandad at gmail.com Thu Jun 8 18:32:06 2017 From: manu.mirandad at gmail.com (manuel miranda) Date: Thu, 08 Jun 2017 22:32:06 +0000 Subject: [Async-sig] async/sync library reusage Message-ID: Hello everyone, After using asyncio for a while, I'm struggling to find information about how to support both synchronous and asynchronous use cases for the same library. I.e. imagine you have a package for http requests and you want to give the user the choice to use a synchronous or an asynchronous interface. Right now the approach the community is following is creating separate libraries one for each version. This is far from ideal for several reasons, some I can think of: - Code duplication, most of the functionality is the same in both libraries, only difference is the sync/async behaviors - Some new async libraries lack functionality compared to their sync siblings. Others will introduce bugs that the sync version already solved long ago, etc. - Different interfaces for the user for the same exact functionality. In summary, in some cases it looks like reinventing the wheel. So now comes the question, is there any documentation, guide on what would be best practice supporting this kind of duality? I've been playing a bit with that on my own but I really don't know if I'm doing something stupid or not. Simple example: """ import asyncio class MyConnector: @classmethod async def get(cls, key): return key class AsyncClient: async def get(self, key): return await MyConnector.get(key) class SyncClient: def __init__(self): self.loop = asyncio.get_event_loop() def get(self, key): return self.loop.run_until_complete(MyConnector.get(key)) def sync_call(): client = SyncClient() print(client.get("sync_key")) async def async_call(): client = AsyncClient() print(await client.get("async_key")) if __name__ == "__main__": loop = asyncio.get_event_loop() loop.run_until_complete(async_call()) sync_call() """ This is in case the underlying connector is asynchronous already. If its synchronous and you want to support both modes, you have to rewrite the IO interactions of MyConnector into a new AsyncMyConnector to support asyncio and then use one or the other accordingly in the upper classes. Am I doing it right or there is another better/alternative way? Thanks for your time, Manuel -------------- next part -------------- An HTML attachment was scrubbed... URL: From luciano at ramalho.org Thu Jun 8 20:07:22 2017 From: luciano at ramalho.org (Luciano Ramalho) Date: Fri, 09 Jun 2017 00:07:22 +0000 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: Message-ID: Hello, Manuel. The answer to your problem is to refactor the libraries in the "sans I/O" style. Take a look here: http://sans-io.readthedocs.io/ On Thu, 8 Jun 2017 at 19:32 manuel miranda wrote: > Hello everyone, > > After using asyncio for a while, I'm struggling to find information about > how to support both synchronous and asynchronous use cases for the same > library. > > I.e. imagine you have a package for http requests and you want to give the > user the choice to use a synchronous or an asynchronous interface. Right > now the approach the community is following is creating separate libraries > one for each version. This is far from ideal for several reasons, some I > can think of: > > - Code duplication, most of the functionality is the same in both > libraries, only difference is the sync/async behaviors > - Some new async libraries lack functionality compared to their sync > siblings. 
Others will introduce bugs that the sync version already solved > long ago, etc. > - Different interfaces for the user for the same exact functionality. > > In summary, in some cases it looks like reinventing the wheel. So now > comes the question, is there any documentation, guide on what would be best > practice supporting this kind of duality? I've been playing a bit with that > on my own but I really don't know if I'm doing something stupid or not. > Simple example: > > """ > import asyncio > > > class MyConnector: > > @classmethod > async def get(cls, key): > return key > > > class AsyncClient: > > async def get(self, key): > return await MyConnector.get(key) > > > class SyncClient: > > def __init__(self): > self.loop = asyncio.get_event_loop() > > def get(self, key): > return self.loop.run_until_complete(MyConnector.get(key)) > > > def sync_call(): > client = SyncClient() > print(client.get("sync_key")) > > > async def async_call(): > client = AsyncClient() > print(await client.get("async_key")) > > > if __name__ == "__main__": > loop = asyncio.get_event_loop() > loop.run_until_complete(async_call()) > sync_call() > """ > > This is in case the underlying connector is asynchronous already. If its > synchronous and you want to support both modes, you have to rewrite the IO > interactions of MyConnector into a new AsyncMyConnector to support asyncio > and then use one or the other accordingly in the upper classes. > > Am I doing it right or there is another better/alternative way? > > Thanks for your time, > > Manuel > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -- Luciano Ramalho | Author of Fluent Python (O'Reilly, 2015) | http://shop.oreilly.com/product/0636920032519.do | Technical Principal at ThoughtWorks | Twitter: @ramalhoorg -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Jun 9 01:48:05 2017 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 8 Jun 2017 22:48:05 -0700 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: Message-ID: On Thu, Jun 8, 2017 at 3:32 PM, manuel miranda wrote: > Hello everyone, > > After using asyncio for a while, I'm struggling to find information about > how to support both synchronous and asynchronous use cases for the same > library. > > I.e. imagine you have a package for http requests and you want to give the > user the choice to use a synchronous or an asynchronous interface. Right now > the approach the community is following is creating separate libraries one > for each version. This is far from ideal for several reasons, some I can > think of: > > - Code duplication, most of the functionality is the same in both libraries, > only difference is the sync/async behaviors > - Some new async libraries lack functionality compared to their sync > siblings. Others will introduce bugs that the sync version already solved > long ago, etc. > - Different interfaces for the user for the same exact functionality. > > In summary, in some cases it looks like reinventing the wheel. So now comes > the question, is there any documentation, guide on what would be best > practice supporting this kind of duality? I would say that this is something that we as a community are still figuring out. 
I really like the Sans-IO approach, and it's a really valuable piece of the solution, but it doesn't solve the whole problem by itself - you still need to actually do I/O, and this means things like error handling and timeouts that aren't obviously a natural fit to the Sans-IO approach, and this means you may still have some tricky code that can end up duplicated. (Or maybe the Sans-IO approach can be extended to handle these things too?) There are active discussions happening in projects like urllib3 [1] and packaging [2] about what the best strategy to take is. And the options vary a lot depending on whether you need to support python 2 etc. If you figure out a good approach I think everyone would be interested to hear it :-) -n [1] https://github.com/shazow/urllib3/pull/1068#issuecomment-294422348 [2] Here's the same API implemented three different ways: Using deferreds: https://github.com/pypa/packaging/pull/87 "traditional" sans-IO: https://github.com/pypa/packaging/pull/88 Using the "effect" library: https://github.com/dstufft/packaging/pull/1 -- Nathaniel J. Smith -- https://vorpus.org From yarkot1 at gmail.com Fri Jun 9 02:19:51 2017 From: yarkot1 at gmail.com (Yarko Tymciurak) Date: Fri, 09 Jun 2017 06:19:51 +0000 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: Message-ID: On Fri, Jun 9, 2017 at 12:48 AM Nathaniel Smith wrote: > On Thu, Jun 8, 2017 at 3:32 PM, manuel miranda > wrote: > > Hello everyone, > > > > After using asyncio for a while, I'm struggling to find information about > > how to support both synchronous and asynchronous use cases for the same > > library. > > > > I.e. imagine you have a package for http requests and you want to give > the > > user the choice to use a synchronous or an asynchronous interface. Right > now > > the approach the community is following is creating separate libraries > one > > for each version. This is far from ideal for several reasons, some I can > > think of: > > > > - Code duplication, most of the functionality is the same in both > libraries, > > only difference is the sync/async behaviors > > - Some new async libraries lack functionality compared to their sync > > siblings. Others will introduce bugs that the sync version already solved > > long ago, etc. > > - Different interfaces for the user for the same exact functionality. > > > > In summary, in some cases it looks like reinventing the wheel. So now > comes > > the question, is there any documentation, guide on what would be best > > practice supporting this kind of duality? > > I would say that this is something that we as a community are still > figuring out. I really like the Sans-IO approach, and it's a really > valuable piece of the solution, but it doesn't solve the whole problem > by itself - you still need to actually do I/O, and this means things > like error handling and timeouts that aren't obviously a natural fit > to the Sans-IO approach, and this means you may still have some tricky > code that can end up duplicated. (Or maybe the Sans-IO approach can be > extended to handle these things too?) There are active discussions > happening in projects like urllib3 [1] and packaging [2] about what > the best strategy to take is. And the options vary a lot depending on > whether you need to support python 2 etc. 
> > If you figure out a good approach I think everyone would be interested > to hear it :-) > Just to leave this breadcrumb here - I've said this before, but not thought in depth about it a lot, but pretty sure that in something like Python4, async needs to become "first class citizen," that is from the inside out, right in the bowels of the repl loop. If async is the default, and synchronous calls just a special case (e.g. single-task async), then I'd expect two things (at least): developers would have an easier time, make fewer mistakes in async programming (the language would handle more), and libraries would be unified as async & sync would be the same. Maybe there's something that would make this not make sense, but I'd be really surprised. Larry's gil removal work intuitively seems an enabler for this kind of (potential) work... -y > -n > > [1] https://github.com/shazow/urllib3/pull/1068#issuecomment-294422348 > > [2] Here's the same API implemented three different ways: > Using deferreds: https://github.com/pypa/packaging/pull/87 > "traditional" sans-IO: https://github.com/pypa/packaging/pull/88 > Using the "effect" library: https://github.com/dstufft/packaging/pull/1 > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alex.gronholm at nextday.fi Fri Jun 9 04:05:21 2017 From: alex.gronholm at nextday.fi (=?UTF-8?Q?Alex_Gr=c3=b6nholm?=) Date: Fri, 9 Jun 2017 11:05:21 +0300 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: Message-ID: <43a11605-c6a5-c9c2-f8d0-610be4850c63@nextday.fi> Yarko Tymciurak kirjoitti 09.06.2017 klo 09:19: > > On Fri, Jun 9, 2017 at 12:48 AM Nathaniel Smith > wrote: > > On Thu, Jun 8, 2017 at 3:32 PM, manuel miranda > > wrote: > > Hello everyone, > > > > After using asyncio for a while, I'm struggling to find > information about > > how to support both synchronous and asynchronous use cases for > the same > > library. > > > > I.e. imagine you have a package for http requests and you want > to give the > > user the choice to use a synchronous or an asynchronous > interface. Right now > > the approach the community is following is creating separate > libraries one > > for each version. This is far from ideal for several reasons, > some I can > > think of: > > > > - Code duplication, most of the functionality is the same in > both libraries, > > only difference is the sync/async behaviors > > - Some new async libraries lack functionality compared to their sync > > siblings. Others will introduce bugs that the sync version > already solved > > long ago, etc. > > - Different interfaces for the user for the same exact > functionality. > > > > In summary, in some cases it looks like reinventing the wheel. > So now comes > > the question, is there any documentation, guide on what would be > best > > practice supporting this kind of duality? > > I would say that this is something that we as a community are still > figuring out. 
I really like the Sans-IO approach, and it's a really > valuable piece of the solution, but it doesn't solve the whole problem > by itself - you still need to actually do I/O, and this means things > like error handling and timeouts that aren't obviously a natural fit > to the Sans-IO approach, and this means you may still have some tricky > code that can end up duplicated. (Or maybe the Sans-IO approach can be > extended to handle these things too?) There are active discussions > happening in projects like urllib3 [1] and packaging [2] about what > the best strategy to take is. And the options vary a lot depending on > whether you need to support python 2 etc. > > If you figure out a good approach I think everyone would be interested > to hear it :-) > > > Just to leave this breadcrumb here - I've said this before, but not > thought in depth about it a lot, but pretty sure that in something > like Python4, async needs to become "first class citizen," that is > from the inside out, right in the bowels of the repl loop. > Python 4 will be nothing more than the next minor release after 3.9. Because Guido hates double digit minor versions :) > If async is the default, and synchronous calls just a special case > (e.g. single-task async), then I'd expect two things (at least): > developers would have an easier time, make fewer mistakes in async > programming (the language would handle more), and libraries would be > unified as async & sync would be the same. Are you suggesting the removal of the "await", "async with" and "async for" structures? Those were added deliberately so developers can spot the yield points in a coroutine function. Not having them would give us something like gevent where you can never tell when your task is going to be adjourned in favor of another. > > Maybe there's something that would make this not make sense, but I'd > be really surprised. Larry's gil removal work intuitively seems an > enabler for this kind of (potential) work... > > -y > > > > -n > > [1] https://github.com/shazow/urllib3/pull/1068#issuecomment-294422348 > > [2] Here's the same API implemented three different ways: > Using deferreds: https://github.com/pypa/packaging/pull/87 > "traditional" sans-IO: https://github.com/pypa/packaging/pull/88 > Using the "effect" library: > https://github.com/dstufft/packaging/pull/1 > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > > > > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... 
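To make Alex's point about explicit yield points concrete, here is a minimal, self-contained sketch (not from the original thread; the Pipe class and transfer() coroutine are made-up names) showing how async/await marks every place a task can be suspended, in contrast to gevent-style implicit switching:

import asyncio


class Pipe:
    """Toy stand-in for an async transport, for illustration only."""

    def __init__(self):
        self.buffer = b""

    async def read(self):
        await asyncio.sleep(0)   # pretend network I/O; a real transport would await a socket
        return b"payload"

    async def write(self, data):
        await asyncio.sleep(0)
        self.buffer += data


async def transfer(source, dest):
    data = await source.read()   # explicit yield point: the task may be suspended here
    length = len(data)           # ordinary call: no task switch can happen here
    await dest.write(data)       # explicit yield point again
    return length


if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    print(loop.run_until_complete(transfer(Pipe(), Pipe())))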
URL: From yarkot1 at gmail.com Fri Jun 9 04:49:00 2017 From: yarkot1 at gmail.com (Yarko Tymciurak) Date: Fri, 09 Jun 2017 08:49:00 +0000 Subject: [Async-sig] async/sync library reusage In-Reply-To: <43a11605-c6a5-c9c2-f8d0-610be4850c63@nextday.fi> References: <43a11605-c6a5-c9c2-f8d0-610be4850c63@nextday.fi> Message-ID: On Fri, Jun 9, 2017 at 3:05 AM Alex Gr?nholm wrote: > Yarko Tymciurak kirjoitti 09.06.2017 klo 09:19: > > > On Fri, Jun 9, 2017 at 12:48 AM Nathaniel Smith wrote: > >> On Thu, Jun 8, 2017 at 3:32 PM, manuel miranda >> wrote: >> > Hello everyone, >> > >> > After using asyncio for a while, I'm struggling to find information >> about >> > how to support both synchronous and asynchronous use cases for the same >> > library. >> > >> > I.e. imagine you have a package for http requests and you want to give >> the >> > user the choice to use a synchronous or an asynchronous interface. >> Right now >> > the approach the community is following is creating separate libraries >> one >> > for each version. This is far from ideal for several reasons, some I can >> > think of: >> > >> > - Code duplication, most of the functionality is the same in both >> libraries, >> > only difference is the sync/async behaviors >> > - Some new async libraries lack functionality compared to their sync >> > siblings. Others will introduce bugs that the sync version already >> solved >> > long ago, etc. >> > - Different interfaces for the user for the same exact functionality. >> > >> > In summary, in some cases it looks like reinventing the wheel. So now >> comes >> > the question, is there any documentation, guide on what would be best >> > practice supporting this kind of duality? >> >> I would say that this is something that we as a community are still >> figuring out. I really like the Sans-IO approach, and it's a really >> valuable piece of the solution, but it doesn't solve the whole problem >> by itself - you still need to actually do I/O, and this means things >> like error handling and timeouts that aren't obviously a natural fit >> to the Sans-IO approach, and this means you may still have some tricky >> code that can end up duplicated. (Or maybe the Sans-IO approach can be >> extended to handle these things too?) There are active discussions >> happening in projects like urllib3 [1] and packaging [2] about what >> the best strategy to take is. And the options vary a lot depending on >> whether you need to support python 2 etc. >> >> If you figure out a good approach I think everyone would be interested >> to hear it :-) >> > > Just to leave this breadcrumb here - I've said this before, but not > thought in depth about it a lot, but pretty sure that in something like > Python4, async needs to become "first class citizen," that is from the > inside out, right in the bowels of the repl loop. > > Python 4 will be nothing more than the next minor release after 3.9. > Because Guido hates double digit minor versions :) > > If async is the default, and synchronous calls just a special case (e.g. > single-task async), then I'd expect two things (at least): developers would > have an easier time, make fewer mistakes in async programming (the language > would handle more), and libraries would be unified as async & sync would be > the same. > > Are you suggesting the removal of the "await", "async with" and "async > for" structures? Those were added deliberately so developers can spot the > yield points in a coroutine function. 
Not having them would give us > something like gevent where you can never tell when your task is going to > be adjourned in favor of another. > actually I was bot thinking of that... but I was thinking of processing in the language, rather than a library... In any case, I don't have answers, only a vision which keeps coming up. My interest is not in providing "a solution", rather generating a reasoned discussion... > > Maybe there's something that would make this not make sense, but I'd be > really surprised. Larry's gil removal work intuitively seems an enabler > for this kind of (potential) work... > > -y > > > >> -n >> >> [1] https://github.com/shazow/urllib3/pull/1068#issuecomment-294422348 >> >> [2] Here's the same API implemented three different ways: >> Using deferreds: https://github.com/pypa/packaging/pull/87 >> "traditional" sans-IO: https://github.com/pypa/packaging/pull/88 >> Using the "effect" library: https://github.com/dstufft/packaging/pull/1 >> >> -- >> Nathaniel J. Smith -- https://vorpus.org >> _______________________________________________ >> Async-sig mailing list >> Async-sig at python.org >> https://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >> > > > _______________________________________________ > Async-sig mailing listAsync-sig at python.orghttps://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > > > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alex.gronholm at nextday.fi Fri Jun 9 04:57:46 2017 From: alex.gronholm at nextday.fi (=?UTF-8?Q?Alex_Gr=c3=b6nholm?=) Date: Fri, 9 Jun 2017 11:57:46 +0300 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <43a11605-c6a5-c9c2-f8d0-610be4850c63@nextday.fi> Message-ID: Yarko Tymciurak kirjoitti 09.06.2017 klo 11:49: > > On Fri, Jun 9, 2017 at 3:05 AM Alex Gr?nholm > wrote: > > Yarko Tymciurak kirjoitti 09.06.2017 klo 09:19: >> >> On Fri, Jun 9, 2017 at 12:48 AM Nathaniel Smith > > wrote: >> >> On Thu, Jun 8, 2017 at 3:32 PM, manuel miranda >> > wrote: >> > Hello everyone, >> > >> > After using asyncio for a while, I'm struggling to find >> information about >> > how to support both synchronous and asynchronous use cases >> for the same >> > library. >> > >> > I.e. imagine you have a package for http requests and you >> want to give the >> > user the choice to use a synchronous or an asynchronous >> interface. Right now >> > the approach the community is following is creating >> separate libraries one >> > for each version. This is far from ideal for several >> reasons, some I can >> > think of: >> > >> > - Code duplication, most of the functionality is the same >> in both libraries, >> > only difference is the sync/async behaviors >> > - Some new async libraries lack functionality compared to >> their sync >> > siblings. Others will introduce bugs that the sync version >> already solved >> > long ago, etc. >> > - Different interfaces for the user for the same exact >> functionality. >> > >> > In summary, in some cases it looks like reinventing the >> wheel. So now comes >> > the question, is there any documentation, guide on what >> would be best >> > practice supporting this kind of duality? 
>> >> I would say that this is something that we as a community are >> still >> figuring out. I really like the Sans-IO approach, and it's a >> really >> valuable piece of the solution, but it doesn't solve the >> whole problem >> by itself - you still need to actually do I/O, and this means >> things >> like error handling and timeouts that aren't obviously a >> natural fit >> to the Sans-IO approach, and this means you may still have >> some tricky >> code that can end up duplicated. (Or maybe the Sans-IO >> approach can be >> extended to handle these things too?) There are active >> discussions >> happening in projects like urllib3 [1] and packaging [2] >> about what >> the best strategy to take is. And the options vary a lot >> depending on >> whether you need to support python 2 etc. >> >> If you figure out a good approach I think everyone would be >> interested >> to hear it :-) >> >> >> Just to leave this breadcrumb here - I've said this before, but >> not thought in depth about it a lot, but pretty sure that in >> something like Python4, async needs to become "first class >> citizen," that is from the inside out, right in the bowels of the >> repl loop. >> > Python 4 will be nothing more than the next minor release after > 3.9. Because Guido hates double digit minor versions :) > >> If async is the default, and synchronous calls just a special >> case (e.g. single-task async), then I'd expect two things (at >> least): developers would have an easier time, make fewer mistakes >> in async programming (the language would handle more), and >> libraries would be unified as async & sync would be the same. > Are you suggesting the removal of the "await", "async with" and > "async for" structures? Those were added deliberately so > developers can spot the yield points in a coroutine function. Not > having them would give us something like gevent where you can > never tell when your task is going to be adjourned in favor of > another. > > > actually I was bot thinking of that... but I was thinking of > processing in the language, rather than a library... > > In any case, I don't have answers, only a vision which keeps coming > up. My interest is not in providing "a solution", rather generating a > reasoned discussion... Then explain what you mean by making async a first class citizen in Python. In my mind it already is, by courtesy of having the "async def", "await" et al added to the language syntax itself and the inclusion of the asyncio module in the standard library. The only other thing that could've been done is to tie the language syntax to a single event loop implementation but that was deliberately left out. > > > >> >> Maybe there's something that would make this not make sense, but >> I'd be really surprised. Larry's gil removal work intuitively >> seems an enabler for this kind of (potential) work... >> >> -y >> >> >> >> -n >> >> [1] >> https://github.com/shazow/urllib3/pull/1068#issuecomment-294422348 >> >> [2] Here's the same API implemented three different ways: >> Using deferreds: https://github.com/pypa/packaging/pull/87 >> "traditional" sans-IO: https://github.com/pypa/packaging/pull/88 >> Using the "effect" library: >> https://github.com/dstufft/packaging/pull/1 >> >> -- >> Nathaniel J. 
Smith -- https://vorpus.org >> _______________________________________________ >> Async-sig mailing list >> Async-sig at python.org >> https://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >> >> >> >> _______________________________________________ >> Async-sig mailing list >> Async-sig at python.org >> https://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct:https://www.python.org/psf/codeofconduct/ > > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cory at lukasa.co.uk Fri Jun 9 05:06:39 2017 From: cory at lukasa.co.uk (Cory Benfield) Date: Fri, 9 Jun 2017 10:06:39 +0100 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: Message-ID: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> > On 9 Jun 2017, at 06:48, Nathaniel Smith wrote: > > I would say that this is something that we as a community are still > figuring out. I really like the Sans-IO approach, and it's a really > valuable piece of the solution, but it doesn't solve the whole problem > by itself - you still need to actually do I/O, and this means things > like error handling and timeouts that aren't obviously a natural fit > to the Sans-IO approach, and this means you may still have some tricky > code that can end up duplicated. (Or maybe the Sans-IO approach can be > extended to handle these things too?) There are active discussions > happening in projects like urllib3 [1] and packaging [2] about what > the best strategy to take is. And the options vary a lot depending on > whether you need to support python 2 etc. Let me take a moment to elaborate on some of the thinking that has gone on for urllib3/Requests. We have an unusual set of constraints that are worth understanding, and so I?ll throw out all the ideas we had and why they were rejected (and indeed, why you may not want to reject them). 1. Implement the core library in asyncio, add a synchronous shim on top of it in terms of asyncio.run_until_complete(). This works great in many ways: you get a nice async-based library implementation, you correctly prioritise people using the async case over those using the synchronous one, and you can expect wide support and interop thanks to asyncio?s role as the common event loop implementation. However, you don?t support more novel async paradigms like those used by curio and trio. More damningly for urllib3/Requests, this also limits your supported Python versions to 3.5 and later. There are also some efficiency concerns. Finally, unless you?re willing to only support 3.7 you end up needing to pass loop arguments around which is pretty gross. 2. Have an abstract low-level I/O interface and ?bleach? it (remove the keywords async/await) on Python 2. This would require you write all your code in terms of a small number of abstract I/O operations with ?async? in front of their name, e.g. ?async def send?, ?async def recv?, and so-on. You can then implement these across multiple I/O backends, and also provide a synchronous one that still has ?async? in front of it and just doesn?t ever use the word ?await?. You can then provide a code transformation at install time on Python 2 that transforms that codebase, removing all the words ?async? and ?await? 
and leaving behind a synchronous-only codebase. The advantages here are better support for novel async paradigms (e.g. curio and trio), the ability to write more native backends for non-asyncio I/O models (e.g. Twisted/Tornado), and having a single codebase that handles sync and async. There are many myriad disadvantages. The first is the most obvious: the code your users run is not the same as the code you shipped. While the transformation is small and pretty easy to understand, that doesn?t remove its risks. It also makes debugging harder and more painful. On top of that, your Python 3 synchronous code looks pretty ugly because you have to write the word ?await? around it even though it is not in fact asynchronous (technically you *don?t* have to do that but I guarantee IDEs will get mad). More subtly, this causes problems for backpressure and task management on event loops. It turns out defining your low-level I/O primitives is not trivial. In urllib3?s case, one of the things we?d need is either the equivalent of ?async def select()? or ?async def new_task?. In the first case, to write this would require a careful management of futures/deferreds and various bits of state in order to correctly suspect execution on event loops. In the second case, the synchronous version of this is called ?threading.Thread? and that has a number of issues. I?d say that if you?re going to use threads you may as well just always use threads, but more importantly it has substantially different semantics to all async task management which make it difficult to reason about and to ensure that the code is sensible. This approach is also entirely untested, at any scale. It?s simply not clear that it works yet. All the tooling would need to be written. 3. Just use Twisted/Tornado. This variation on number (1) turns out to get you surprisingly close to our actual goal. Twisted and Tornado support Python 2 and Python 3, when async/await are present they integrate fairly nicely with them, and they give you the added advantage of allowing your Python 2 users to do asynchronous code so long as they buy into the relevant async ecosystem. It also means that you can use the run_until_complete model for your Python 2 synchronous code. However, these also have some downsides. Twisted, the library I know better, doesn?t yet integrate as cleanly with async/await as we?d like: that?s coming sometime this year, probably with the landing of 3.7. Additionally, Twisted has no equivalent of asyncio.run_until_complete(), which would mean that someone would have to add the relevant Twisted support (either restartable or instantiable reactors, neither of which Twisted has yet). This also adds a potentially sizeable external dependency, which isn?t necessarily all that fun. 4. ??? Who knows. Right now there is no clarity about what we?re going to do. It?s possible that the answer will end up being ?nothing at the moment? and that we?ll wait for the ecosystem to progress for a while before making the change. Either way, it?s clear that there is no easy answer to this problem. Cory -------------- next part -------------- An HTML attachment was scrubbed... 
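For readers who want to see what Cory's first option looks like in practice, here is a rough sketch of an asyncio core with a synchronous shim built on run_until_complete(). The AsyncHTTPClient/HTTPClient names and the fetch() method are hypothetical and only stand in for a real library's API:

import asyncio


class AsyncHTTPClient:
    """Hypothetical async core: all I/O lives in coroutines."""

    async def fetch(self, url):
        # A real implementation would open a connection and speak HTTP here;
        # sleep(0) merely stands in for awaiting network I/O.
        await asyncio.sleep(0)
        return b"response for " + url.encode()


class HTTPClient:
    """Synchronous facade with the same API, driven by run_until_complete()."""

    def __init__(self, loop=None):
        self._loop = loop or asyncio.get_event_loop()
        self._async_client = AsyncHTTPClient()

    def fetch(self, url):
        return self._loop.run_until_complete(self._async_client.fetch(url))


if __name__ == "__main__":
    # Synchronous caller goes through the shim.
    print(HTTPClient().fetch("https://example.com"))

    # Asynchronous caller uses the core directly.
    async def main():
        print(await AsyncHTTPClient().fetch("https://example.com"))

    asyncio.get_event_loop().run_until_complete(main())

As written this sketch needs Python 3.5+ and does not cover curio, trio, or the Python 2 constraint that Cory raises; it only illustrates the shape of option 1.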
URL: From yarkot1 at gmail.com Fri Jun 9 05:08:04 2017 From: yarkot1 at gmail.com (Yarko Tymciurak) Date: Fri, 09 Jun 2017 09:08:04 +0000 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <43a11605-c6a5-c9c2-f8d0-610be4850c63@nextday.fi> Message-ID: On Fri, Jun 9, 2017 at 3:57 AM Alex Gr?nholm wrote: > Yarko Tymciurak kirjoitti 09.06.2017 klo 11:49: > > > On Fri, Jun 9, 2017 at 3:05 AM Alex Gr?nholm > wrote: > >> Yarko Tymciurak kirjoitti 09.06.2017 klo 09:19: >> >> >> On Fri, Jun 9, 2017 at 12:48 AM Nathaniel Smith wrote: >> >>> On Thu, Jun 8, 2017 at 3:32 PM, manuel miranda >>> wrote: >>> > Hello everyone, >>> > >>> > After using asyncio for a while, I'm struggling to find information >>> about >>> > how to support both synchronous and asynchronous use cases for the same >>> > library. >>> > >>> > I.e. imagine you have a package for http requests and you want to give >>> the >>> > user the choice to use a synchronous or an asynchronous interface. >>> Right now >>> > the approach the community is following is creating separate libraries >>> one >>> > for each version. This is far from ideal for several reasons, some I >>> can >>> > think of: >>> > >>> > - Code duplication, most of the functionality is the same in both >>> libraries, >>> > only difference is the sync/async behaviors >>> > - Some new async libraries lack functionality compared to their sync >>> > siblings. Others will introduce bugs that the sync version already >>> solved >>> > long ago, etc. >>> > - Different interfaces for the user for the same exact functionality. >>> > >>> > In summary, in some cases it looks like reinventing the wheel. So now >>> comes >>> > the question, is there any documentation, guide on what would be best >>> > practice supporting this kind of duality? >>> >>> I would say that this is something that we as a community are still >>> figuring out. I really like the Sans-IO approach, and it's a really >>> valuable piece of the solution, but it doesn't solve the whole problem >>> by itself - you still need to actually do I/O, and this means things >>> like error handling and timeouts that aren't obviously a natural fit >>> to the Sans-IO approach, and this means you may still have some tricky >>> code that can end up duplicated. (Or maybe the Sans-IO approach can be >>> extended to handle these things too?) There are active discussions >>> happening in projects like urllib3 [1] and packaging [2] about what >>> the best strategy to take is. And the options vary a lot depending on >>> whether you need to support python 2 etc. >>> >>> If you figure out a good approach I think everyone would be interested >>> to hear it :-) >>> >> >> Just to leave this breadcrumb here - I've said this before, but not >> thought in depth about it a lot, but pretty sure that in something like >> Python4, async needs to become "first class citizen," that is from the >> inside out, right in the bowels of the repl loop. >> >> Python 4 will be nothing more than the next minor release after 3.9. >> Because Guido hates double digit minor versions :) >> >> If async is the default, and synchronous calls just a special case (e.g. >> single-task async), then I'd expect two things (at least): developers would >> have an easier time, make fewer mistakes in async programming (the language >> would handle more), and libraries would be unified as async & sync would be >> the same. >> >> Are you suggesting the removal of the "await", "async with" and "async >> for" structures? 
Those were added deliberately so developers can spot the >> yield points in a coroutine function. Not having them would give us >> something like gevent where you can never tell when your task is going to >> be adjourned in favor of another. >> > > actually I was bot thinking of that... but I was thinking of processing > in the language, rather than a library... > > In any case, I don't have answers, only a vision which keeps coming up. > My interest is not in providing "a solution", rather generating a reasoned > discussion... > > Then explain what you mean by making async a first class citizen in > Python. In my mind it already is, by courtesy of having the "async def", > "await" et al added to the language syntax itself and the inclusion of the > asyncio module in the standard library. The only other thing that could've > been done is to tie the language syntax to a single event loop > implementation but that was deliberately left out. > > i'm sorry - I thought that was clear by saying it would be in the repl loop itself and not in a library. and those it wouldn't require two versions of every library. That's what I meant. that is right now it's coming from the outside in, that is to say from applications, closer in, to an attempt at a common library. i'm suggesting it start from the inside of the language out so that all things have that support and that it is not just a library thus any code can take advantage of either single or multiple async tasks, goal being that there only need be on version of libraries. at least that's the discussion I'm calling for. does that help? > > >> >> Maybe there's something that would make this not make sense, but I'd be >> really surprised. Larry's gil removal work intuitively seems an enabler >> for this kind of (potential) work... >> >> -y >> >> >> >>> -n >>> >>> [1] https://github.com/shazow/urllib3/pull/1068#issuecomment-294422348 >>> >>> [2] Here's the same API implemented three different ways: >>> Using deferreds: https://github.com/pypa/packaging/pull/87 >>> "traditional" sans-IO: https://github.com/pypa/packaging/pull/88 >>> Using the "effect" library: https://github.com/dstufft/packaging/pull/1 >>> >>> -- >>> Nathaniel J. Smith -- https://vorpus.org >>> _______________________________________________ >>> Async-sig mailing list >>> Async-sig at python.org >>> https://mail.python.org/mailman/listinfo/async-sig >>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>> >> >> >> _______________________________________________ >> Async-sig mailing listAsync-sig at python.orghttps://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >> >> >> _______________________________________________ >> Async-sig mailing list >> Async-sig at python.org >> https://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Fri Jun 9 11:33:31 2017 From: guido at python.org (Guido van Rossum) Date: Fri, 9 Jun 2017 08:33:31 -0700 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <43a11605-c6a5-c9c2-f8d0-610be4850c63@nextday.fi> Message-ID: Yarko, I think your vision is too far out. Maybe something like that could become a reality in Python 5 -- it would require all extensions to become aware of the async stuff (adding it to Python doesn't automatically add it to C!). 
Also the GIL has nothing to do with this, async tasks all run in the same thread, and if there was no GIL it would not be any different (else two cooperating tasks could be run on different threads and you'd be back on pre-emptive scheduling and the ensuing race conditions). (Note that I refer to Python 4 as Python after the Gilectomy -- it needs to be a new major version since the C API changes dramatically as C extensions will no longer have the protection of the GIL.) --Guido On Fri, Jun 9, 2017 at 2:08 AM, Yarko Tymciurak wrote: > > On Fri, Jun 9, 2017 at 3:57 AM Alex Gr?nholm > wrote: > >> Yarko Tymciurak kirjoitti 09.06.2017 klo 11:49: >> >> >> On Fri, Jun 9, 2017 at 3:05 AM Alex Gr?nholm >> wrote: >> >>> Yarko Tymciurak kirjoitti 09.06.2017 klo 09:19: >>> >>> >>> On Fri, Jun 9, 2017 at 12:48 AM Nathaniel Smith wrote: >>> >>>> On Thu, Jun 8, 2017 at 3:32 PM, manuel miranda >>>> wrote: >>>> > Hello everyone, >>>> > >>>> > After using asyncio for a while, I'm struggling to find information >>>> about >>>> > how to support both synchronous and asynchronous use cases for the >>>> same >>>> > library. >>>> > >>>> > I.e. imagine you have a package for http requests and you want to >>>> give the >>>> > user the choice to use a synchronous or an asynchronous interface. >>>> Right now >>>> > the approach the community is following is creating separate >>>> libraries one >>>> > for each version. This is far from ideal for several reasons, some I >>>> can >>>> > think of: >>>> > >>>> > - Code duplication, most of the functionality is the same in both >>>> libraries, >>>> > only difference is the sync/async behaviors >>>> > - Some new async libraries lack functionality compared to their sync >>>> > siblings. Others will introduce bugs that the sync version already >>>> solved >>>> > long ago, etc. >>>> > - Different interfaces for the user for the same exact functionality. >>>> > >>>> > In summary, in some cases it looks like reinventing the wheel. So now >>>> comes >>>> > the question, is there any documentation, guide on what would be best >>>> > practice supporting this kind of duality? >>>> >>>> I would say that this is something that we as a community are still >>>> figuring out. I really like the Sans-IO approach, and it's a really >>>> valuable piece of the solution, but it doesn't solve the whole problem >>>> by itself - you still need to actually do I/O, and this means things >>>> like error handling and timeouts that aren't obviously a natural fit >>>> to the Sans-IO approach, and this means you may still have some tricky >>>> code that can end up duplicated. (Or maybe the Sans-IO approach can be >>>> extended to handle these things too?) There are active discussions >>>> happening in projects like urllib3 [1] and packaging [2] about what >>>> the best strategy to take is. And the options vary a lot depending on >>>> whether you need to support python 2 etc. >>>> >>>> If you figure out a good approach I think everyone would be interested >>>> to hear it :-) >>>> >>> >>> Just to leave this breadcrumb here - I've said this before, but not >>> thought in depth about it a lot, but pretty sure that in something like >>> Python4, async needs to become "first class citizen," that is from the >>> inside out, right in the bowels of the repl loop. >>> >>> Python 4 will be nothing more than the next minor release after 3.9. >>> Because Guido hates double digit minor versions :) >>> >>> If async is the default, and synchronous calls just a special case (e.g. 
>>> single-task async), then I'd expect two things (at least): developers would >>> have an easier time, make fewer mistakes in async programming (the language >>> would handle more), and libraries would be unified as async & sync would be >>> the same. >>> >>> Are you suggesting the removal of the "await", "async with" and "async >>> for" structures? Those were added deliberately so developers can spot the >>> yield points in a coroutine function. Not having them would give us >>> something like gevent where you can never tell when your task is going to >>> be adjourned in favor of another. >>> >> >> actually I was bot thinking of that... but I was thinking of processing >> in the language, rather than a library... >> >> In any case, I don't have answers, only a vision which keeps coming up. >> My interest is not in providing "a solution", rather generating a reasoned >> discussion... >> >> Then explain what you mean by making async a first class citizen in >> Python. In my mind it already is, by courtesy of having the "async def", >> "await" et al added to the language syntax itself and the inclusion of the >> asyncio module in the standard library. The only other thing that could've >> been done is to tie the language syntax to a single event loop >> implementation but that was deliberately left out. >> >> i'm sorry - I thought that was clear by saying it would be in the repl > loop itself and not in a library. > > and those it wouldn't require two versions of every library. That's what > I meant. > > that is right now it's coming from the outside in, that is to say from > applications, closer in, to an attempt at a common library. i'm > suggesting it start from the inside of the language out so that all things > have that support and that it is not just a library thus any code can take > advantage of either single or multiple async tasks, goal being that there > only need be on version of libraries. at least that's the discussion I'm > calling for. > > does that help? > > >> >> >>> >>> Maybe there's something that would make this not make sense, but I'd be >>> really surprised. Larry's gil removal work intuitively seems an enabler >>> for this kind of (potential) work... >>> >>> -y >>> >>> >>> >>>> -n >>>> >>>> [1] https://github.com/shazow/urllib3/pull/1068#issuecomment-294422348 >>>> >>>> [2] Here's the same API implemented three different ways: >>>> Using deferreds: https://github.com/pypa/packaging/pull/87 >>>> "traditional" sans-IO: https://github.com/pypa/packaging/pull/88 >>>> Using the "effect" library: https://github.com/dstufft/packaging/pull/1 >>>> >>>> -- >>>> Nathaniel J. 
Smith -- https://vorpus.org >>>> _______________________________________________ >>>> Async-sig mailing list >>>> Async-sig at python.org >>>> https://mail.python.org/mailman/listinfo/async-sig >>>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>> >>> >>> >>> _______________________________________________ >>> Async-sig mailing listAsync-sig at python.orghttps://mail.python.org/mailman/listinfo/async-sig >>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>> >>> >>> _______________________________________________ >>> Async-sig mailing list >>> Async-sig at python.org >>> https://mail.python.org/mailman/listinfo/async-sig >>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>> >> >> > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Fri Jun 9 11:40:16 2017 From: guido at python.org (Guido van Rossum) Date: Fri, 9 Jun 2017 08:40:16 -0700 Subject: [Async-sig] async/sync library reusage In-Reply-To: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: Cory, I really like your approach #1. You can make it work all the way back to Python 3.3 by using @coroutine and yield from. That's not pretty, but for libraries the goal shouldn't primarily be prettiness of the implementation -- prettiness of the API is much more important, and that's preserved by asyncio's compatibility (code you write that's compatible with Python 3.3 and the latest asyncio from PyPI should still run on Python 3.7 and provide a modern async/await-based API for applications written for 3.7). Also, I don't think the situation with explicitly passing loop= is so terrible as you seem to think. If you rely on the default event loop, you rely on there *being* a default event loop, but there will always be one unless an app goes out of its way to create an event loop and then make it not the default loop. Only the asyncio tests do that. There are a few things you can't do unless you pass an event loop (such as scheduling callbacks before the event loop is started) but other than that it's really not such a big deal as people seem to think it is. (You mostly see the pattern because asyncio itself uses that pattern, because it needs to be robust for the extreme use case where someone *does* hide the active event loop. But there will never be two active event loops.) --Guido On Fri, Jun 9, 2017 at 2:06 AM, Cory Benfield wrote: > > On 9 Jun 2017, at 06:48, Nathaniel Smith wrote: > > I would say that this is something that we as a community are still > figuring out. I really like the Sans-IO approach, and it's a really > valuable piece of the solution, but it doesn't solve the whole problem > by itself - you still need to actually do I/O, and this means things > like error handling and timeouts that aren't obviously a natural fit > to the Sans-IO approach, and this means you may still have some tricky > code that can end up duplicated. (Or maybe the Sans-IO approach can be > extended to handle these things too?) There are active discussions > happening in projects like urllib3 [1] and packaging [2] about what > the best strategy to take is. 
And the options vary a lot depending on > whether you need to support python 2 etc. > > > Let me take a moment to elaborate on some of the thinking that has gone on > for urllib3/Requests. We have an unusual set of constraints that are worth > understanding, and so I?ll throw out all the ideas we had and why they were > rejected (and indeed, why you may not want to reject them). > > 1. Implement the core library in asyncio, add a synchronous shim on top of > it in terms of asyncio.run_until_complete(). > > This works great in many ways: you get a nice async-based library > implementation, you correctly prioritise people using the async case over > those using the synchronous one, and you can expect wide support and > interop thanks to asyncio?s role as the common event loop implementation. > However, you don?t support more novel async paradigms like those used by > curio and trio. > > More damningly for urllib3/Requests, this also limits your supported > Python versions to 3.5 and later. There are also some efficiency concerns. > Finally, unless you?re willing to only support 3.7 you end up needing to > pass loop arguments around which is pretty gross. > > 2. Have an abstract low-level I/O interface and ?bleach? it (remove the > keywords async/await) on Python 2. > > This would require you write all your code in terms of a small number of > abstract I/O operations with ?async? in front of their name, e.g. ?async > def send?, ?async def recv?, and so-on. You can then implement these across > multiple I/O backends, and also provide a synchronous one that still has > ?async? in front of it and just doesn?t ever use the word ?await?. You can > then provide a code transformation at install time on Python 2 that > transforms that codebase, removing all the words ?async? and ?await? and > leaving behind a synchronous-only codebase. > > The advantages here are better support for novel async paradigms (e.g. > curio and trio), the ability to write more native backends for non-asyncio > I/O models (e.g. Twisted/Tornado), and having a single codebase that > handles sync and async. > > There are many myriad disadvantages. The first is the most obvious: the > code your users run is not the same as the code you shipped. While the > transformation is small and pretty easy to understand, that doesn?t remove > its risks. It also makes debugging harder and more painful. On top of that, > your Python 3 synchronous code looks pretty ugly because you have to write > the word ?await? around it even though it is not in fact asynchronous > (technically you *don?t* have to do that but I guarantee IDEs will get mad). > > More subtly, this causes problems for backpressure and task management on > event loops. It turns out defining your low-level I/O primitives is not > trivial. In urllib3?s case, one of the things we?d need is either the > equivalent of ?async def select()? or ?async def new_task?. In the first > case, to write this would require a careful management of futures/deferreds > and various bits of state in order to correctly suspect execution on event > loops. In the second case, the synchronous version of this is called > ?threading.Thread? and that has a number of issues. I?d say that if you?re > going to use threads you may as well just always use threads, but more > importantly it has substantially different semantics to all async task > management which make it difficult to reason about and to ensure that the > code is sensible. > > This approach is also entirely untested, at any scale. 
It?s simply not > clear that it works yet. All the tooling would need to be written. > > 3. Just use Twisted/Tornado. > > This variation on number (1) turns out to get you surprisingly close to > our actual goal. Twisted and Tornado support Python 2 and Python 3, when > async/await are present they integrate fairly nicely with them, and they > give you the added advantage of allowing your Python 2 users to do > asynchronous code so long as they buy into the relevant async ecosystem. It > also means that you can use the run_until_complete model for your Python 2 > synchronous code. > > However, these also have some downsides. Twisted, the library I know > better, doesn?t yet integrate as cleanly with async/await as we?d like: > that?s coming sometime this year, probably with the landing of 3.7. > Additionally, Twisted has no equivalent of asyncio.run_until_complete(), > which would mean that someone would have to add the relevant Twisted > support (either restartable or instantiable reactors, neither of which > Twisted has yet). > > This also adds a potentially sizeable external dependency, which isn?t > necessarily all that fun. > > 4. ??? Who knows. > > Right now there is no clarity about what we?re going to do. It?s possible > that the answer will end up being ?nothing at the moment? and that we?ll > wait for the ecosystem to progress for a while before making the change. > Either way, it?s clear that there is no easy answer to this problem. > > Cory > > > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From cory at lukasa.co.uk Fri Jun 9 11:51:16 2017 From: cory at lukasa.co.uk (Cory Benfield) Date: Fri, 9 Jun 2017 16:51:16 +0100 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: > On 9 Jun 2017, at 16:40, Guido van Rossum wrote: > > Also, I don't think the situation with explicitly passing loop= is so terrible as you seem to think. If you rely on the default event loop, you rely on there *being* a default event loop, but there will always be one unless an app goes out of its way to create an event loop and then make it not the default loop. Only the asyncio tests do that. There are a few things you can't do unless you pass an event loop (such as scheduling callbacks before the event loop is started) but other than that it's really not such a big deal as people seem to think it is. (You mostly see the pattern because asyncio itself uses that pattern, because it needs to be robust for the extreme use case where someone *does* hide the active event loop. But there will never be two active event loops.) My concern with multiple loops boils down to the fact that urllib3 supports being used in a multithreaded context where each thread can independently make forward progress on one request. To establish that with a synchronous codebase you either need one event loop per thread or you need to spawn a background thread on startup that owns the only event loop in the process. Generally speaking I?ve not had positive results with libraries spawning their own threads in Python. In my experience this has tended to lead to programs that deadlock mysteriously or that fail to terminate in the face of a Ctrl+C. 
So I tend to prefer to have users spawn their own threads, which would make me want a "one-event-loop-per-thread" model: hence, needing a loop parameter to pass around prior to 3.6. I admit that my concerns here regarding libraries spawning their own threads may be overblown: after my series of negative experiences I basically never went back to that model, and it may be that the problems were more user-error than anything else. However, I feel comfortable saying that libraries spawning their own Python threads is definitely subtle and hard to get right, at the very least. Cory From ben at bendarnell.com Fri Jun 9 12:07:51 2017 From: ben at bendarnell.com (Ben Darnell) Date: Fri, 09 Jun 2017 16:07:51 +0000 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: On Fri, Jun 9, 2017 at 11:51 AM Cory Benfield wrote: > > > My concern with multiple loops boils down to the fact that urllib3 > supports being used in a multithreaded context where each thread can > independently make forward progress on one request. To establish that with > a synchronous codebase you either need one event loop per thread or you > need to spawn a background thread on startup that owns the only event loop > in the process. > Yeah, one event loop per thread is probably the way to go for integration with synchronous codebases. A dedicated event loop thread may perform better but libraries that spawn threads are problematic. > > Generally speaking I've not had positive results with libraries spawning > their own threads in Python. In my experience this has tended to lead to > programs that deadlock mysteriously or that fail to terminate in the face > of a Ctrl+C. So I tend to prefer to have users spawn their own threads, > which would make me want a "one-event-loop-per-thread" model: hence, > needing a loop parameter to pass around prior to 3.6. > You can avoid the loop parameter on older versions of asyncio (at least as long as the default event loop policy is used) by manually setting your event loop as current before calling run_until_complete (and resetting it afterwards). Tornado's run_sync() method is equivalent to asyncio's run_until_complete(), and Tornado supports multiple IOLoops in this way. We use this to expose a synchronous version of our AsyncHTTPClient: https://github.com/tornadoweb/tornado/blob/62e47215ce12aee83f951758c96775a43e80475b/tornado/httpclient.py#L54 -Ben > > I admit that my concerns here regarding libraries spawning their own > threads may be overblown: after my series of negative experiences I > basically never went back to that model, and it may be that the problems > were more user-error than anything else. However, I feel comfortable saying > that libraries spawning their own Python threads is definitely subtle and > hard to get right, at the very least. > > Cory > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From guido at python.org Fri Jun 9 12:28:57 2017 From: guido at python.org (Guido van Rossum) Date: Fri, 9 Jun 2017 09:28:57 -0700 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: On Fri, Jun 9, 2017 at 8:51 AM, Cory Benfield wrote: > > On 9 Jun 2017, at 16:40, Guido van Rossum wrote: > > Also, I don't think the situation with explicitly passing loop= is so > terrible as you seem to think. If you rely on the default event loop, you > rely on there *being* a default event loop, but there will always be one > unless an app goes out of its way to create an event loop and then make it > not the default loop. Only the asyncio tests do that. There are a few > things you can't do unless you pass an event loop (such as scheduling > callbacks before the event loop is started) but other than that it's really > not such a big deal as people seem to think it is. (You mostly see the > pattern because asyncio itself uses that pattern, because it needs to be > robust for the extreme use case where someone *does* hide the active event > loop. But there will never be two active event loops.) > > > My concern with multiple loops boils down to the fact that urllib3 > supports being used in a multithreaded context where each thread can > independently make forward progress on one request. To establish that with > a synchronous codebase you either need one event loop per thread or you > need to spawn a background thread on startup that owns the only event loop > in the process. > > Generally speaking I've not had positive results with libraries spawning > their own threads in Python. In my experience this has tended to lead to > programs that deadlock mysteriously or that fail to terminate in the face > of a Ctrl+C. So I tend to prefer to have users spawn their own threads, > which would make me want a "one-event-loop-per-thread" model: hence, > needing a loop parameter to pass around prior to 3.6. > > I admit that my concerns here regarding libraries spawning their own > threads may be overblown: after my series of negative experiences I > basically never went back to that model, and it may be that the problems > were more user-error than anything else. However, I feel comfortable saying > that libraries spawning their own Python threads is definitely subtle and > hard to get right, at the very least. At least one of us is still confused. The one-event-loop-per-thread model is supported in asyncio without passing the loop around explicitly. The get_event_loop() implementation stores all its state in a thread-locals instance, so it returns the thread's event loop. (Because this is an "advanced" model, you have to explicitly create the event loop with new_event_loop() and make it the default loop for the thread with set_event_loop().) All in all, I'm a bit curious why you would need to use asyncio at all when you've got a thread per request anyway. I agree there are problems with threads that are hidden from an app. Hence asyncio allows you to set the executor where it runs things you pass to run_in_executor() (including some of its own, esp. getaddrinfo()). One note about the one-event-loop-per-thread model: threads should be very cautious touching each other's event loops. This should only be done using call_soon_threadsafe()! --Guido -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed...
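For reference, a minimal sketch of the pattern Ben and Guido describe above: one private loop per thread, installed as the thread's current loop only while a call runs. The run_sync() helper and the thread-local attribute are illustrative names, not an existing asyncio or urllib3 API:

"""
import asyncio
import threading

_local = threading.local()


def run_sync(coro):
    # Lazily create one event loop per calling thread.
    loop = getattr(_local, "loop", None)
    if loop is None:
        loop = _local.loop = asyncio.new_event_loop()
    # Make it the thread's default loop so code that calls
    # asyncio.get_event_loop() internally can find it...
    asyncio.set_event_loop(loop)
    try:
        return loop.run_until_complete(coro)
    finally:
        # ...and reset afterwards so no stale "current" loop is left behind.
        asyncio.set_event_loop(None)
"""

A synchronous facade can then wrap its internal coroutines in run_sync(...) from whatever thread the user happens to be on, without threading a loop= parameter through the whole stack.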
URL: From cory at lukasa.co.uk Fri Jun 9 12:55:35 2017 From: cory at lukasa.co.uk (Cory Benfield) Date: Fri, 9 Jun 2017 17:55:35 +0100 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: > On 9 Jun 2017, at 17:28, Guido van Rossum wrote: > > At least one of us is still confused. The one-event-loop-per-thread model is supported in asyncio without passing the loop around explicitly. The get_event_loop() implementation stores all its state in thread-locals instance, so it returns the thread's event loop. (Because this is an "advanced" model, you have to explicitly create the event loop with new_event_loop() and make it the default loop for the thread with set_event_loop().) Aha, ok, so the confused one is me. I did not know this. =) That definitely works a lot better. It admittedly works less well if someone is doing their own custom event loop stuff, but that?s probably an acceptable limitation up until the time that Python 2 goes quietly into the night. > All in all, I'm a bit curious why you would need to use asyncio at all when you've got a thread per request anyway. Yeah, so this is a bit of a diversion from the original topic of this thread but I think it?s an idea worth discussing in this space. I want to reframe the question a bit if you don?t mind, so shout if you think I?m not responding to quite what you were asking. In my understanding, the question you?re implicitly asking is this: "If you have a thread-safe library today (that is, one that allows users to do threaded I/O with appropriate resource pooling and management), why move to a model built on asyncio?? There are many answers to this question that differ for different libraries with different uses, but for HTTP libraries like urllib3 here are our reasons. The first is that it turns out that even for HTTP/1.1 you need to write something that amounts to a partial event loop to properly handle the protocol. Good HTTP clients need to watch for responses while they?re uploading body data because if a response arrives during that process body upload should be terminated immediately. This is also required for sensibly handling things like Expect: 100-continue, as well as spotting other intermediate responses and connection teardowns sensibly and without throwing exceptions. Today urllib3 does not do this, and it has caused us pain, so our v2 branch includes a backport of the Python 3 selectors module and a hand-written partially-complete event loop that only handles the specific cases we need. This is an extra thing for us to debug and maintain, and ultimately it?d be easier to just delegate the whole thing to event loops written by others who promise to maintain them and make them efficient. The second answer is that I believe good asyncio support in libraries is a vital part of the future of this language, and ?good? asyncio support IMO does as little as possible to block the main event loop. Running all of the complex protocol parsing and state manipulation of the Requests stack on a background thread is not cheap, and involves a lot of GIL swapping around. We have found several bug reports complaining about using Requests with largish-numbers of threads, indicating that our big stack of Python code really does cause contention on the GIL if used heavily. In general, having to defer to a thread to run *Python* code in asyncio is IMO a nasty anti-pattern that should be avoided where possible. 
It is much less bad to defer to a thread to then block on a syscall (e.g. to get an ?async? getaddrinfo), but doing so to run a big big stack of Python code is vastly less pleasant for the main event loop. For this reason, we?d ideally treat asyncio as the first-class citizen and retrofit on the threaded support, rather than the other way around. This goes doubly so when you consider the other reasons for wanting to use asyncio. The third answer is that HTTP/2 makes all of this much harder. HTTP/2 is a *highly* concurrent protocol. Connections send a lot of control frames back and forth that are invisible to the user working at the semantic HTTP level but that nonetheless need relatively low-latency turnaround (e.g. PING frames). It turns out that in the traditional synchronous HTTP model urllib3 only gets access to the socket to do work when the user calls into our code. If the user goes a ?long? time without calling into urllib3, we take a long time to process any data off the connection. In the best case this causes latency spikes as we process all the data that queued up in the socket. In the worst case, this causes us to lose connections we should have been able to keep because we failed to respond to a PING frame in a timely manner. My experience is that purely synchronous libraries handling HTTP/2 simply cannot provide a positive user experience. HTTP/2 flat-out *requires* either an event loop or a dedicated background thread, and in practice in your dedicated background thread you?d also just end up writing an event loop (see answer 1 again). For this reason, it is basically mandatory for HTTP/2 support in Python to either use an event loop or to spawn out a dedicated C thread that does not hold the GIL to do the I/O (as this thread will be regularly woken up to handle I/O events). Hopefully this (admittedly horrifyingly long) response helps illuminate why we?re interested in asyncio support. It should be noted that if we find ourselves unable to get it in the short term we may simply resort to offering an ?async? API that involves us doing the rough equivalent of running in a thread-pool executor, but I won?t be thrilled about it. ;) Cory -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Fri Jun 9 14:23:53 2017 From: guido at python.org (Guido van Rossum) Date: Fri, 9 Jun 2017 11:23:53 -0700 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: Great write-up! I actually find the async nature of HTTP (both versions) a compelling reason to switch to asyncio. For HTTP/1.1 this sounds mostly like it would make the implementation easier; for HTTP/2 it sounds like it would just be better for the user-side as well (if the user just wants one resource they can safely continue to use the synchronous HTTP/1.1 version of the API.) On Fri, Jun 9, 2017 at 9:55 AM, Cory Benfield wrote: > > On 9 Jun 2017, at 17:28, Guido van Rossum wrote: > > At least one of us is still confused. The one-event-loop-per-thread model > is supported in asyncio without passing the loop around explicitly. The > get_event_loop() implementation stores all its state in thread-locals > instance, so it returns the thread's event loop. (Because this is an > "advanced" model, you have to explicitly create the event loop with > new_event_loop() and make it the default loop for the thread with > set_event_loop().) > > > Aha, ok, so the confused one is me. I did not know this. 
=) That > definitely works a lot better. It admittedly works less well if someone is > doing their own custom event loop stuff, but that?s probably an acceptable > limitation up until the time that Python 2 goes quietly into the night. > > All in all, I'm a bit curious why you would need to use asyncio at all > when you've got a thread per request anyway. > > > Yeah, so this is a bit of a diversion from the original topic of this > thread but I think it?s an idea worth discussing in this space. I want to > reframe the question a bit if you don?t mind, so shout if you think I?m not > responding to quite what you were asking. In my understanding, the question > you?re implicitly asking is this: > > "If you have a thread-safe library today (that is, one that allows users > to do threaded I/O with appropriate resource pooling and management), why > move to a model built on asyncio?? > > There are many answers to this question that differ for different > libraries with different uses, but for HTTP libraries like urllib3 here are > our reasons. > > The first is that it turns out that even for HTTP/1.1 you need to write > something that amounts to a partial event loop to properly handle the > protocol. Good HTTP clients need to watch for responses while they?re > uploading body data because if a response arrives during that process body > upload should be terminated immediately. This is also required for sensibly > handling things like Expect: 100-continue, as well as spotting other > intermediate responses and connection teardowns sensibly and without > throwing exceptions. > > Today urllib3 does not do this, and it has caused us pain, so our v2 > branch includes a backport of the Python 3 selectors module and a > hand-written partially-complete event loop that only handles the specific > cases we need. This is an extra thing for us to debug and maintain, and > ultimately it?d be easier to just delegate the whole thing to event loops > written by others who promise to maintain them and make them efficient. > > The second answer is that I believe good asyncio support in libraries is a > vital part of the future of this language, and ?good? asyncio support IMO > does as little as possible to block the main event loop. Running all of the > complex protocol parsing and state manipulation of the Requests stack on a > background thread is not cheap, and involves a lot of GIL swapping around. > We have found several bug reports complaining about using Requests with > largish-numbers of threads, indicating that our big stack of Python code > really does cause contention on the GIL if used heavily. In general, having > to defer to a thread to run *Python* code in asyncio is IMO a nasty > anti-pattern that should be avoided where possible. It is much less bad to > defer to a thread to then block on a syscall (e.g. to get an ?async? > getaddrinfo), but doing so to run a big big stack of Python code is vastly > less pleasant for the main event loop. > > For this reason, we?d ideally treat asyncio as the first-class citizen and > retrofit on the threaded support, rather than the other way around. This > goes doubly so when you consider the other reasons for wanting to use > asyncio. > > The third answer is that HTTP/2 makes all of this much harder. HTTP/2 is a > *highly* concurrent protocol. Connections send a lot of control frames back > and forth that are invisible to the user working at the semantic HTTP level > but that nonetheless need relatively low-latency turnaround (e.g. PING > frames). 
It turns out that in the traditional synchronous HTTP model > urllib3 only gets access to the socket to do work when the user calls into > our code. If the user goes a ?long? time without calling into urllib3, we > take a long time to process any data off the connection. In the best case > this causes latency spikes as we process all the data that queued up in the > socket. In the worst case, this causes us to lose connections we should > have been able to keep because we failed to respond to a PING frame in a > timely manner. > > My experience is that purely synchronous libraries handling HTTP/2 simply > cannot provide a positive user experience. HTTP/2 flat-out *requires* > either an event loop or a dedicated background thread, and in practice in > your dedicated background thread you?d also just end up writing an event > loop (see answer 1 again). For this reason, it is basically mandatory for > HTTP/2 support in Python to either use an event loop or to spawn out a > dedicated C thread that does not hold the GIL to do the I/O (as this thread > will be regularly woken up to handle I/O events). > > Hopefully this (admittedly horrifyingly long) response helps illuminate > why we?re interested in asyncio support. It should be noted that if we find > ourselves unable to get it in the short term we may simply resort to > offering an ?async? API that involves us doing the rough equivalent of > running in a thread-pool executor, but I won?t be thrilled about it. ;) > > Cory > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From yarkot1 at gmail.com Fri Jun 9 15:52:36 2017 From: yarkot1 at gmail.com (Yarko Tymciurak) Date: Fri, 9 Jun 2017 14:52:36 -0500 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: ...so I really am enjoying the conversation. Guido - re: "vision too far out": yes, for people trying to struggle w/ async support in their libraries, now... but that is also part of my motivation. Python 5? Sure... (I may have to watch it come to use from the grave, but hopefully not... ;-) ). Anyway, from back-porting and tactical "implement now" concerns, to plans for next release, to plans for next version of python, to brainstorming much less concrete future versions - all are an interesting continuum. Re: GIL... sure, sort of, and sort of not. I was thinking "as long as major changes are going on... think about additional structural changes..." More to the point: as I see it, people have a hard time thinking about async in the cooperative-multitasking (CMT) sense, and thus disappointments happen around blocking (missed, or unexpects, e.g. hardware failures). Cory (in his reply - and, yeah: nice writeup!) hints to what I generally structurally like: "...we?d ideally treat asyncio as the first-class citizen and retrofit on the threaded support, rather than the other way around" Structurally, async is light-weight overhead compared to threads, which are lightweight compared to processes, and so a sort of natural app flow seems from lightest-weight, on out. To me, this seems practical for making life easier for developers, because you can imagine "promoting" an async task caught unexpectedly blocking, to a thread, while still having the lightest-weight loop have control over it (promotion out, as well as cancellation while promoted). 
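The closest existing mechanism to that kind of "promotion" is the run_in_executor() hook Guido mentioned earlier: the loop hands the blocking piece to a worker thread while keeping ownership of the awaiting task. A rough, purely illustrative sketch:

"""
import asyncio
import socket


async def resolve(host, port, loop=None):
    loop = loop or asyncio.get_event_loop()
    # The blocking syscall runs in the loop's default thread pool; the
    # awaiting task can still be cancelled from the loop's side, although
    # the worker thread itself will run to completion in the background.
    return await loop.run_in_executor(None, socket.getaddrinfo, host, port)
"""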
As for multiple task loops, or loops off in a thread, I haven't thought about it too much, but this seems like nothing new nor unreasonable. I'm thinking of the base-stations we talk over in our mobile connections, which are multiple diskless servers, and hot-promote to "master" server status on hardware failure (or live capacity upgrade, i.e. inserting processors). This pattern seems both reasonable and useful in this context, i.e. the concept of a master loop (which implies communication/control channels - a complication). With some thought, some reasonable ground rules and simplifications, and I would expect much can be done. Appreciate the discussions! - Yarko On Fri, Jun 9, 2017 at 1:23 PM, Guido van Rossum wrote: > Great write-up! I actually find the async nature of HTTP (both versions) a > compelling reason to switch to asyncio. For HTTP/1.1 this sounds mostly > like it would make the implementation easier; for HTTP/2 it sounds like it > would just be better for the user-side as well (if the user just wants one > resource they can safely continue to use the synchronous HTTP/1.1 version > of the API.) > > On Fri, Jun 9, 2017 at 9:55 AM, Cory Benfield wrote: > >> >> On 9 Jun 2017, at 17:28, Guido van Rossum wrote: >> >> At least one of us is still confused. The one-event-loop-per-thread model >> is supported in asyncio without passing the loop around explicitly. The >> get_event_loop() implementation stores all its state in thread-locals >> instance, so it returns the thread's event loop. (Because this is an >> "advanced" model, you have to explicitly create the event loop with >> new_event_loop() and make it the default loop for the thread with >> set_event_loop().) >> >> >> Aha, ok, so the confused one is me. I did not know this. =) That >> definitely works a lot better. It admittedly works less well if someone is >> doing their own custom event loop stuff, but that?s probably an acceptable >> limitation up until the time that Python 2 goes quietly into the night. >> >> All in all, I'm a bit curious why you would need to use asyncio at all >> when you've got a thread per request anyway. >> >> >> Yeah, so this is a bit of a diversion from the original topic of this >> thread but I think it?s an idea worth discussing in this space. I want to >> reframe the question a bit if you don?t mind, so shout if you think I?m not >> responding to quite what you were asking. In my understanding, the question >> you?re implicitly asking is this: >> >> "If you have a thread-safe library today (that is, one that allows users >> to do threaded I/O with appropriate resource pooling and management), why >> move to a model built on asyncio?? >> >> There are many answers to this question that differ for different >> libraries with different uses, but for HTTP libraries like urllib3 here are >> our reasons. >> >> The first is that it turns out that even for HTTP/1.1 you need to write >> something that amounts to a partial event loop to properly handle the >> protocol. Good HTTP clients need to watch for responses while they?re >> uploading body data because if a response arrives during that process body >> upload should be terminated immediately. This is also required for sensibly >> handling things like Expect: 100-continue, as well as spotting other >> intermediate responses and connection teardowns sensibly and without >> throwing exceptions. 
>> >> Today urllib3 does not do this, and it has caused us pain, so our v2 >> branch includes a backport of the Python 3 selectors module and a >> hand-written partially-complete event loop that only handles the specific >> cases we need. This is an extra thing for us to debug and maintain, and >> ultimately it?d be easier to just delegate the whole thing to event loops >> written by others who promise to maintain them and make them efficient. >> >> The second answer is that I believe good asyncio support in libraries is >> a vital part of the future of this language, and ?good? asyncio support IMO >> does as little as possible to block the main event loop. Running all of the >> complex protocol parsing and state manipulation of the Requests stack on a >> background thread is not cheap, and involves a lot of GIL swapping around. >> We have found several bug reports complaining about using Requests with >> largish-numbers of threads, indicating that our big stack of Python code >> really does cause contention on the GIL if used heavily. In general, having >> to defer to a thread to run *Python* code in asyncio is IMO a nasty >> anti-pattern that should be avoided where possible. It is much less bad to >> defer to a thread to then block on a syscall (e.g. to get an ?async? >> getaddrinfo), but doing so to run a big big stack of Python code is vastly >> less pleasant for the main event loop. >> >> For this reason, we?d ideally treat asyncio as the first-class citizen >> and retrofit on the threaded support, rather than the other way around. >> This goes doubly so when you consider the other reasons for wanting to use >> asyncio. >> >> The third answer is that HTTP/2 makes all of this much harder. HTTP/2 is >> a *highly* concurrent protocol. Connections send a lot of control frames >> back and forth that are invisible to the user working at the semantic HTTP >> level but that nonetheless need relatively low-latency turnaround (e.g. >> PING frames). It turns out that in the traditional synchronous HTTP model >> urllib3 only gets access to the socket to do work when the user calls into >> our code. If the user goes a ?long? time without calling into urllib3, we >> take a long time to process any data off the connection. In the best case >> this causes latency spikes as we process all the data that queued up in the >> socket. In the worst case, this causes us to lose connections we should >> have been able to keep because we failed to respond to a PING frame in a >> timely manner. >> >> My experience is that purely synchronous libraries handling HTTP/2 simply >> cannot provide a positive user experience. HTTP/2 flat-out *requires* >> either an event loop or a dedicated background thread, and in practice in >> your dedicated background thread you?d also just end up writing an event >> loop (see answer 1 again). For this reason, it is basically mandatory for >> HTTP/2 support in Python to either use an event loop or to spawn out a >> dedicated C thread that does not hold the GIL to do the I/O (as this thread >> will be regularly woken up to handle I/O events). >> >> Hopefully this (admittedly horrifyingly long) response helps illuminate >> why we?re interested in asyncio support. It should be noted that if we find >> ourselves unable to get it in the short term we may simply resort to >> offering an ?async? API that involves us doing the rough equivalent of >> running in a thread-pool executor, but I won?t be thrilled about it. 
;) >> >> Cory >> > > > > -- > --Guido van Rossum (python.org/~guido) > > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pfreixes at gmail.com Mon Jun 12 07:39:48 2017 From: pfreixes at gmail.com (Pau Freixes) Date: Mon, 12 Jun 2017 13:39:48 +0200 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: Sorry a bit of topic, but I would like to figure out why older python versions, prior this commit [1], the get_event_loop is not considered deterministic does anybody know the reason behind this change? [1] https://github.com/python/cpython/commit/600a349781bfa0a8239e1cb95fac29c7c4a3302e On Fri, Jun 9, 2017 at 6:07 PM, Ben Darnell wrote: > On Fri, Jun 9, 2017 at 11:51 AM Cory Benfield wrote: >> >> >> >> My concern with multiple loops boils down to the fact that urllib3 >> supports being used in a multithreaded context where each thread can >> independently make forward progress on one request. To establish that with a >> synchronous codebase you either need one event loop per thread or you need >> to spawn a background thread on startup that owns the only event loop in the >> process. > > > Yeah, one event loop per thread is probably the way to go for integration > with synchronous codebases. A dedicated event loop thread may perform better > but libraries that spawn threads are problematic. > >> >> >> Generally speaking I?ve not had positive results with libraries spawning >> their own threads in Python. In my experience this has tended to lead to >> programs that deadlock mysteriously or that fail to terminate in the face of >> a Ctrl+C. So I tend to prefer to have users spawn their own threads, which >> would make me want a ?one-event-loop-per-thread? model: hence, needing a >> loop parameter to pass around prior to 3.6. > > > You can avoid the loop parameter on older versions of asyncio (at least as > long as the default event loop policy is used) by manually setting your > event loop as current before calling run_until_complete (and resetting it > afterwards). > > Tornado's run_sync() method is equivalent to asyncio's run_until_complete(), > and Tornado supports multiple IOLoops in this way. We use this to expose a > synchronous version of our AsyncHTTPClient: > https://github.com/tornadoweb/tornado/blob/62e47215ce12aee83f951758c96775a43e80475b/tornado/httpclient.py#L54 > > -Ben > >> >> >> I admit that my concerns here regarding libraries spawning their own >> threads may be overblown: after my series of negative experiences I >> basically never went back to that model, and it may be that the problems >> were more user-error than anything else. However, I feel comfortable saying >> that libraries spawning their own Python threads is definitely subtle and >> hard to get right, at the very least. 
>> >> Cory >> _______________________________________________ >> Async-sig mailing list >> Async-sig at python.org >> https://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct: https://www.python.org/psf/codeofconduct/ > > > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -- --pau From guido at python.org Mon Jun 12 11:36:01 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 12 Jun 2017 08:36:01 -0700 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: In theory it's possible to create two event loops (using new_event_loop()), then set one as the default event loop (using set_event_loop()), then run the other one (using run_forever() or run_until_complete()). To tasks running in the latter event loop, get_event_loop() would nevertheless return the former. On Mon, Jun 12, 2017 at 4:39 AM, Pau Freixes wrote: > Sorry a bit of topic, but I would like to figure out why older python > versions, prior this commit [1], the get_event_loop is not considered > deterministic > > does anybody know the reason behind this change? > > > [1] https://github.com/python/cpython/commit/ > 600a349781bfa0a8239e1cb95fac29c7c4a3302e > > On Fri, Jun 9, 2017 at 6:07 PM, Ben Darnell wrote: > > On Fri, Jun 9, 2017 at 11:51 AM Cory Benfield wrote: > >> > >> > >> > >> My concern with multiple loops boils down to the fact that urllib3 > >> supports being used in a multithreaded context where each thread can > >> independently make forward progress on one request. To establish that > with a > >> synchronous codebase you either need one event loop per thread or you > need > >> to spawn a background thread on startup that owns the only event loop > in the > >> process. > > > > > > Yeah, one event loop per thread is probably the way to go for integration > > with synchronous codebases. A dedicated event loop thread may perform > better > > but libraries that spawn threads are problematic. > > > >> > >> > >> Generally speaking I?ve not had positive results with libraries spawning > >> their own threads in Python. In my experience this has tended to lead to > >> programs that deadlock mysteriously or that fail to terminate in the > face of > >> a Ctrl+C. So I tend to prefer to have users spawn their own threads, > which > >> would make me want a ?one-event-loop-per-thread? model: hence, needing a > >> loop parameter to pass around prior to 3.6. > > > > > > You can avoid the loop parameter on older versions of asyncio (at least > as > > long as the default event loop policy is used) by manually setting your > > event loop as current before calling run_until_complete (and resetting it > > afterwards). > > > > Tornado's run_sync() method is equivalent to asyncio's > run_until_complete(), > > and Tornado supports multiple IOLoops in this way. We use this to expose > a > > synchronous version of our AsyncHTTPClient: > > https://github.com/tornadoweb/tornado/blob/ > 62e47215ce12aee83f951758c96775a43e80475b/tornado/httpclient.py#L54 > > > > -Ben > > > >> > >> > >> I admit that my concerns here regarding libraries spawning their own > >> threads may be overblown: after my series of negative experiences I > >> basically never went back to that model, and it may be that the problems > >> were more user-error than anything else. 
However, I feel comfortable > saying > >> that libraries spawning their own Python threads is definitely subtle > and > >> hard to get right, at the very least. > >> > >> Cory > >> _______________________________________________ > >> Async-sig mailing list > >> Async-sig at python.org > >> https://mail.python.org/mailman/listinfo/async-sig > >> Code of Conduct: https://www.python.org/psf/codeofconduct/ > > > > > > _______________________________________________ > > Async-sig mailing list > > Async-sig at python.org > > https://mail.python.org/mailman/listinfo/async-sig > > Code of Conduct: https://www.python.org/psf/codeofconduct/ > > > > > > -- > --pau > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From pfreixes at gmail.com Mon Jun 12 11:49:41 2017 From: pfreixes at gmail.com (Pau Freixes) Date: Mon, 12 Jun 2017 17:49:41 +0200 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: And what about the rationale of having multiple loop instances in the same thread switching btw them. Im still trying to find out what patterns need this... Do you have an example? Btw thanks for the first explanation El 12/06/2017 17:36, "Guido van Rossum" escribi?: > In theory it's possible to create two event loops (using > new_event_loop()), then set one as the default event loop (using > set_event_loop()), then run the other one (using run_forever() or > run_until_complete()). To tasks running in the latter event loop, > get_event_loop() would nevertheless return the former. > > On Mon, Jun 12, 2017 at 4:39 AM, Pau Freixes wrote: > >> Sorry a bit of topic, but I would like to figure out why older python >> versions, prior this commit [1], the get_event_loop is not considered >> deterministic >> >> does anybody know the reason behind this change? >> >> >> [1] https://github.com/python/cpython/commit/600a349781bfa0a8239 >> e1cb95fac29c7c4a3302e >> >> On Fri, Jun 9, 2017 at 6:07 PM, Ben Darnell wrote: >> > On Fri, Jun 9, 2017 at 11:51 AM Cory Benfield >> wrote: >> >> >> >> >> >> >> >> My concern with multiple loops boils down to the fact that urllib3 >> >> supports being used in a multithreaded context where each thread can >> >> independently make forward progress on one request. To establish that >> with a >> >> synchronous codebase you either need one event loop per thread or you >> need >> >> to spawn a background thread on startup that owns the only event loop >> in the >> >> process. >> > >> > >> > Yeah, one event loop per thread is probably the way to go for >> integration >> > with synchronous codebases. A dedicated event loop thread may perform >> better >> > but libraries that spawn threads are problematic. >> > >> >> >> >> >> >> Generally speaking I?ve not had positive results with libraries >> spawning >> >> their own threads in Python. In my experience this has tended to lead >> to >> >> programs that deadlock mysteriously or that fail to terminate in the >> face of >> >> a Ctrl+C. So I tend to prefer to have users spawn their own threads, >> which >> >> would make me want a ?one-event-loop-per-thread? model: hence, needing >> a >> >> loop parameter to pass around prior to 3.6. 
>> > >> > >> > You can avoid the loop parameter on older versions of asyncio (at least >> as >> > long as the default event loop policy is used) by manually setting your >> > event loop as current before calling run_until_complete (and resetting >> it >> > afterwards). >> > >> > Tornado's run_sync() method is equivalent to asyncio's >> run_until_complete(), >> > and Tornado supports multiple IOLoops in this way. We use this to >> expose a >> > synchronous version of our AsyncHTTPClient: >> > https://github.com/tornadoweb/tornado/blob/62e47215ce12aee83 >> f951758c96775a43e80475b/tornado/httpclient.py#L54 >> > >> > -Ben >> > >> >> >> >> >> >> I admit that my concerns here regarding libraries spawning their own >> >> threads may be overblown: after my series of negative experiences I >> >> basically never went back to that model, and it may be that the >> problems >> >> were more user-error than anything else. However, I feel comfortable >> saying >> >> that libraries spawning their own Python threads is definitely subtle >> and >> >> hard to get right, at the very least. >> >> >> >> Cory >> >> _______________________________________________ >> >> Async-sig mailing list >> >> Async-sig at python.org >> >> https://mail.python.org/mailman/listinfo/async-sig >> >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >> > >> > >> > _______________________________________________ >> > Async-sig mailing list >> > Async-sig at python.org >> > https://mail.python.org/mailman/listinfo/async-sig >> > Code of Conduct: https://www.python.org/psf/codeofconduct/ >> > >> >> >> >> -- >> --pau >> _______________________________________________ >> Async-sig mailing list >> Async-sig at python.org >> https://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >> > > > > -- > --Guido van Rossum (python.org/~guido) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Jun 12 11:58:14 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 12 Jun 2017 08:58:14 -0700 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: Multiple loops in the same thread is purely theoretical -- the API allows it but there's no use case. It might be necessary if a platform has a UI-only event loop that cannot be extended to do I/O -- the only solution to do background I/O might be to alternate between two loops. (Though in that case I would still prefer a thread for the background I/O.) On Mon, Jun 12, 2017 at 8:49 AM, Pau Freixes wrote: > And what about the rationale of having multiple loop instances in the same > thread switching btw them. Im still trying to find out what patterns need > this... Do you have an example? > > Btw thanks for the first explanation > > El 12/06/2017 17:36, "Guido van Rossum" escribi?: > >> In theory it's possible to create two event loops (using >> new_event_loop()), then set one as the default event loop (using >> set_event_loop()), then run the other one (using run_forever() or >> run_until_complete()). To tasks running in the latter event loop, >> get_event_loop() would nevertheless return the former. >> >> On Mon, Jun 12, 2017 at 4:39 AM, Pau Freixes wrote: >> >>> Sorry a bit of topic, but I would like to figure out why older python >>> versions, prior this commit [1], the get_event_loop is not considered >>> deterministic >>> >>> does anybody know the reason behind this change? 
>>> >>> >>> [1] https://github.com/python/cpython/commit/600a349781bfa0a8239 >>> e1cb95fac29c7c4a3302e >>> >>> On Fri, Jun 9, 2017 at 6:07 PM, Ben Darnell wrote: >>> > On Fri, Jun 9, 2017 at 11:51 AM Cory Benfield >>> wrote: >>> >> >>> >> >>> >> >>> >> My concern with multiple loops boils down to the fact that urllib3 >>> >> supports being used in a multithreaded context where each thread can >>> >> independently make forward progress on one request. To establish that >>> with a >>> >> synchronous codebase you either need one event loop per thread or you >>> need >>> >> to spawn a background thread on startup that owns the only event loop >>> in the >>> >> process. >>> > >>> > >>> > Yeah, one event loop per thread is probably the way to go for >>> integration >>> > with synchronous codebases. A dedicated event loop thread may perform >>> better >>> > but libraries that spawn threads are problematic. >>> > >>> >> >>> >> >>> >> Generally speaking I?ve not had positive results with libraries >>> spawning >>> >> their own threads in Python. In my experience this has tended to lead >>> to >>> >> programs that deadlock mysteriously or that fail to terminate in the >>> face of >>> >> a Ctrl+C. So I tend to prefer to have users spawn their own threads, >>> which >>> >> would make me want a ?one-event-loop-per-thread? model: hence, >>> needing a >>> >> loop parameter to pass around prior to 3.6. >>> > >>> > >>> > You can avoid the loop parameter on older versions of asyncio (at >>> least as >>> > long as the default event loop policy is used) by manually setting your >>> > event loop as current before calling run_until_complete (and resetting >>> it >>> > afterwards). >>> > >>> > Tornado's run_sync() method is equivalent to asyncio's >>> run_until_complete(), >>> > and Tornado supports multiple IOLoops in this way. We use this to >>> expose a >>> > synchronous version of our AsyncHTTPClient: >>> > https://github.com/tornadoweb/tornado/blob/62e47215ce12aee83 >>> f951758c96775a43e80475b/tornado/httpclient.py#L54 >>> > >>> > -Ben >>> > >>> >> >>> >> >>> >> I admit that my concerns here regarding libraries spawning their own >>> >> threads may be overblown: after my series of negative experiences I >>> >> basically never went back to that model, and it may be that the >>> problems >>> >> were more user-error than anything else. However, I feel comfortable >>> saying >>> >> that libraries spawning their own Python threads is definitely subtle >>> and >>> >> hard to get right, at the very least. >>> >> >>> >> Cory >>> >> _______________________________________________ >>> >> Async-sig mailing list >>> >> Async-sig at python.org >>> >> https://mail.python.org/mailman/listinfo/async-sig >>> >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>> > >>> > >>> > _______________________________________________ >>> > Async-sig mailing list >>> > Async-sig at python.org >>> > https://mail.python.org/mailman/listinfo/async-sig >>> > Code of Conduct: https://www.python.org/psf/codeofconduct/ >>> > >>> >>> >>> >>> -- >>> --pau >>> _______________________________________________ >>> Async-sig mailing list >>> Async-sig at python.org >>> https://mail.python.org/mailman/listinfo/async-sig >>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>> >> >> >> >> -- >> --Guido van Rossum (python.org/~guido) >> > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
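The behaviour Guido describes, and the change in the commit Pau links, can be demonstrated in a few lines. This is illustrative only; the version split is noted in the comments:

"""
import asyncio


async def current():
    return asyncio.get_event_loop()

default_loop = asyncio.new_event_loop()
other_loop = asyncio.new_event_loop()
asyncio.set_event_loop(default_loop)   # the thread's default loop

seen = other_loop.run_until_complete(current())
# Before the linked commit (asyncio prior to 3.5.3/3.6): seen is default_loop,
# the thread's default, even though other_loop is the one actually running.
# After it: seen is other_loop, because get_event_loop() prefers the running loop.
"""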
URL: From andrew.svetlov at gmail.com Mon Jun 12 12:25:23 2017 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Mon, 12 Jun 2017 16:25:23 +0000 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: Unit tests at least. Running every test in own loop is crucial fro tests isolation. On Mon, Jun 12, 2017 at 7:04 PM Guido van Rossum wrote: > Multiple loops in the same thread is purely theoretical -- the API allows > it but there's no use case. It might be necessary if a platform has a > UI-only event loop that cannot be extended to do I/O -- the only solution > to do background I/O might be to alternate between two loops. (Though in > that case I would still prefer a thread for the background I/O.) > > On Mon, Jun 12, 2017 at 8:49 AM, Pau Freixes wrote: > >> And what about the rationale of having multiple loop instances in the >> same thread switching btw them. Im still trying to find out what patterns >> need this... Do you have an example? >> >> Btw thanks for the first explanation >> >> El 12/06/2017 17:36, "Guido van Rossum" escribi?: >> >>> In theory it's possible to create two event loops (using >>> new_event_loop()), then set one as the default event loop (using >>> set_event_loop()), then run the other one (using run_forever() or >>> run_until_complete()). To tasks running in the latter event loop, >>> get_event_loop() would nevertheless return the former. >>> >>> On Mon, Jun 12, 2017 at 4:39 AM, Pau Freixes wrote: >>> >>>> Sorry a bit of topic, but I would like to figure out why older python >>>> versions, prior this commit [1], the get_event_loop is not considered >>>> deterministic >>>> >>>> does anybody know the reason behind this change? >>>> >>>> >>>> [1] >>>> https://github.com/python/cpython/commit/600a349781bfa0a8239e1cb95fac29c7c4a3302e >>>> >>>> On Fri, Jun 9, 2017 at 6:07 PM, Ben Darnell wrote: >>>> > On Fri, Jun 9, 2017 at 11:51 AM Cory Benfield >>>> wrote: >>>> >> >>>> >> >>>> >> >>>> >> My concern with multiple loops boils down to the fact that urllib3 >>>> >> supports being used in a multithreaded context where each thread can >>>> >> independently make forward progress on one request. To establish >>>> that with a >>>> >> synchronous codebase you either need one event loop per thread or >>>> you need >>>> >> to spawn a background thread on startup that owns the only event >>>> loop in the >>>> >> process. >>>> > >>>> > >>>> > Yeah, one event loop per thread is probably the way to go for >>>> integration >>>> > with synchronous codebases. A dedicated event loop thread may perform >>>> better >>>> > but libraries that spawn threads are problematic. >>>> > >>>> >> >>>> >> >>>> >> Generally speaking I?ve not had positive results with libraries >>>> spawning >>>> >> their own threads in Python. In my experience this has tended to >>>> lead to >>>> >> programs that deadlock mysteriously or that fail to terminate in the >>>> face of >>>> >> a Ctrl+C. So I tend to prefer to have users spawn their own threads, >>>> which >>>> >> would make me want a ?one-event-loop-per-thread? model: hence, >>>> needing a >>>> >> loop parameter to pass around prior to 3.6. >>>> > >>>> > >>>> > You can avoid the loop parameter on older versions of asyncio (at >>>> least as >>>> > long as the default event loop policy is used) by manually setting >>>> your >>>> > event loop as current before calling run_until_complete (and >>>> resetting it >>>> > afterwards). 
>>>> > >>>> > Tornado's run_sync() method is equivalent to asyncio's >>>> run_until_complete(), >>>> > and Tornado supports multiple IOLoops in this way. We use this to >>>> expose a >>>> > synchronous version of our AsyncHTTPClient: >>>> > >>>> https://github.com/tornadoweb/tornado/blob/62e47215ce12aee83f951758c96775a43e80475b/tornado/httpclient.py#L54 >>>> > >>>> > -Ben >>>> > >>>> >> >>>> >> >>>> >> I admit that my concerns here regarding libraries spawning their own >>>> >> threads may be overblown: after my series of negative experiences I >>>> >> basically never went back to that model, and it may be that the >>>> problems >>>> >> were more user-error than anything else. However, I feel comfortable >>>> saying >>>> >> that libraries spawning their own Python threads is definitely >>>> subtle and >>>> >> hard to get right, at the very least. >>>> >> >>>> >> Cory >>>> >> _______________________________________________ >>>> >> Async-sig mailing list >>>> >> Async-sig at python.org >>>> >> https://mail.python.org/mailman/listinfo/async-sig >>>> >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>> > >>>> > >>>> > _______________________________________________ >>>> > Async-sig mailing list >>>> > Async-sig at python.org >>>> > https://mail.python.org/mailman/listinfo/async-sig >>>> > Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>> > >>>> >>>> >>>> >>>> -- >>>> --pau >>>> _______________________________________________ >>>> Async-sig mailing list >>>> Async-sig at python.org >>>> https://mail.python.org/mailman/listinfo/async-sig >>>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>> >>> >>> >>> >>> -- >>> --Guido van Rossum (python.org/~guido) >>> >> > > > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -- Thanks, Andrew Svetlov -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Jun 12 12:37:12 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 12 Jun 2017 09:37:12 -0700 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: Yes, but not co-existing, I hope! On Mon, Jun 12, 2017 at 9:25 AM, Andrew Svetlov wrote: > Unit tests at least. Running every test in own loop is crucial fro tests > isolation. > > On Mon, Jun 12, 2017 at 7:04 PM Guido van Rossum wrote: > >> Multiple loops in the same thread is purely theoretical -- the API allows >> it but there's no use case. It might be necessary if a platform has a >> UI-only event loop that cannot be extended to do I/O -- the only solution >> to do background I/O might be to alternate between two loops. (Though in >> that case I would still prefer a thread for the background I/O.) >> >> On Mon, Jun 12, 2017 at 8:49 AM, Pau Freixes wrote: >> >>> And what about the rationale of having multiple loop instances in the >>> same thread switching btw them. Im still trying to find out what patterns >>> need this... Do you have an example? 
>>> >>> Btw thanks for the first explanation >>> >>> El 12/06/2017 17:36, "Guido van Rossum" escribi?: >>> >>>> In theory it's possible to create two event loops (using >>>> new_event_loop()), then set one as the default event loop (using >>>> set_event_loop()), then run the other one (using run_forever() or >>>> run_until_complete()). To tasks running in the latter event loop, >>>> get_event_loop() would nevertheless return the former. >>>> >>>> On Mon, Jun 12, 2017 at 4:39 AM, Pau Freixes >>>> wrote: >>>> >>>>> Sorry a bit of topic, but I would like to figure out why older python >>>>> versions, prior this commit [1], the get_event_loop is not considered >>>>> deterministic >>>>> >>>>> does anybody know the reason behind this change? >>>>> >>>>> >>>>> [1] https://github.com/python/cpython/commit/ >>>>> 600a349781bfa0a8239e1cb95fac29c7c4a3302e >>>>> >>>>> On Fri, Jun 9, 2017 at 6:07 PM, Ben Darnell >>>>> wrote: >>>>> > On Fri, Jun 9, 2017 at 11:51 AM Cory Benfield >>>>> wrote: >>>>> >> >>>>> >> >>>>> >> >>>>> >> My concern with multiple loops boils down to the fact that urllib3 >>>>> >> supports being used in a multithreaded context where each thread can >>>>> >> independently make forward progress on one request. To establish >>>>> that with a >>>>> >> synchronous codebase you either need one event loop per thread or >>>>> you need >>>>> >> to spawn a background thread on startup that owns the only event >>>>> loop in the >>>>> >> process. >>>>> > >>>>> > >>>>> > Yeah, one event loop per thread is probably the way to go for >>>>> integration >>>>> > with synchronous codebases. A dedicated event loop thread may >>>>> perform better >>>>> > but libraries that spawn threads are problematic. >>>>> > >>>>> >> >>>>> >> >>>>> >> Generally speaking I?ve not had positive results with libraries >>>>> spawning >>>>> >> their own threads in Python. In my experience this has tended to >>>>> lead to >>>>> >> programs that deadlock mysteriously or that fail to terminate in >>>>> the face of >>>>> >> a Ctrl+C. So I tend to prefer to have users spawn their own >>>>> threads, which >>>>> >> would make me want a ?one-event-loop-per-thread? model: hence, >>>>> needing a >>>>> >> loop parameter to pass around prior to 3.6. >>>>> > >>>>> > >>>>> > You can avoid the loop parameter on older versions of asyncio (at >>>>> least as >>>>> > long as the default event loop policy is used) by manually setting >>>>> your >>>>> > event loop as current before calling run_until_complete (and >>>>> resetting it >>>>> > afterwards). >>>>> > >>>>> > Tornado's run_sync() method is equivalent to asyncio's >>>>> run_until_complete(), >>>>> > and Tornado supports multiple IOLoops in this way. We use this to >>>>> expose a >>>>> > synchronous version of our AsyncHTTPClient: >>>>> > https://github.com/tornadoweb/tornado/blob/ >>>>> 62e47215ce12aee83f951758c96775a43e80475b/tornado/httpclient.py#L54 >>>>> > >>>>> > -Ben >>>>> > >>>>> >> >>>>> >> >>>>> >> I admit that my concerns here regarding libraries spawning their own >>>>> >> threads may be overblown: after my series of negative experiences I >>>>> >> basically never went back to that model, and it may be that the >>>>> problems >>>>> >> were more user-error than anything else. However, I feel >>>>> comfortable saying >>>>> >> that libraries spawning their own Python threads is definitely >>>>> subtle and >>>>> >> hard to get right, at the very least. 
>>>>> >> >>>>> >> Cory >>>>> >> _______________________________________________ >>>>> >> Async-sig mailing list >>>>> >> Async-sig at python.org >>>>> >> https://mail.python.org/mailman/listinfo/async-sig >>>>> >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>> > >>>>> > >>>>> > _______________________________________________ >>>>> > Async-sig mailing list >>>>> > Async-sig at python.org >>>>> > https://mail.python.org/mailman/listinfo/async-sig >>>>> > Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>> > >>>>> >>>>> >>>>> >>>>> -- >>>>> --pau >>>>> _______________________________________________ >>>>> Async-sig mailing list >>>>> Async-sig at python.org >>>>> https://mail.python.org/mailman/listinfo/async-sig >>>>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>> >>>> >>>> >>>> >>>> -- >>>> --Guido van Rossum (python.org/~guido) >>>> >>> >> >> >> -- >> --Guido van Rossum (python.org/~guido) >> _______________________________________________ >> Async-sig mailing list >> Async-sig at python.org >> https://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >> > -- > Thanks, > Andrew Svetlov > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrew.svetlov at gmail.com Mon Jun 12 12:57:29 2017 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Mon, 12 Jun 2017 16:57:29 +0000 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: Yes, but with one exception: default event loop created on module import stage might co-exist with a loop created for test. It leads to mystic hangs, you know. Please recall code like: class A: mongodb = motor.motor_asyncio.AsyncIOMotorClient() On Mon, Jun 12, 2017 at 7:37 PM Guido van Rossum wrote: > Yes, but not co-existing, I hope! > > On Mon, Jun 12, 2017 at 9:25 AM, Andrew Svetlov > wrote: > >> Unit tests at least. Running every test in own loop is crucial fro tests >> isolation. >> >> On Mon, Jun 12, 2017 at 7:04 PM Guido van Rossum >> wrote: >> >>> Multiple loops in the same thread is purely theoretical -- the API >>> allows it but there's no use case. It might be necessary if a platform has >>> a UI-only event loop that cannot be extended to do I/O -- the only solution >>> to do background I/O might be to alternate between two loops. (Though in >>> that case I would still prefer a thread for the background I/O.) >>> >>> On Mon, Jun 12, 2017 at 8:49 AM, Pau Freixes wrote: >>> >>>> And what about the rationale of having multiple loop instances in the >>>> same thread switching btw them. Im still trying to find out what patterns >>>> need this... Do you have an example? >>>> >>>> Btw thanks for the first explanation >>>> >>>> El 12/06/2017 17:36, "Guido van Rossum" escribi?: >>>> >>>>> In theory it's possible to create two event loops (using >>>>> new_event_loop()), then set one as the default event loop (using >>>>> set_event_loop()), then run the other one (using run_forever() or >>>>> run_until_complete()). To tasks running in the latter event loop, >>>>> get_event_loop() would nevertheless return the former. 
>>>>> >>>>> On Mon, Jun 12, 2017 at 4:39 AM, Pau Freixes >>>>> wrote: >>>>> >>>>>> Sorry a bit of topic, but I would like to figure out why older python >>>>>> versions, prior this commit [1], the get_event_loop is not considered >>>>>> deterministic >>>>>> >>>>>> does anybody know the reason behind this change? >>>>>> >>>>>> >>>>>> [1] >>>>>> https://github.com/python/cpython/commit/600a349781bfa0a8239e1cb95fac29c7c4a3302e >>>>>> >>>>>> On Fri, Jun 9, 2017 at 6:07 PM, Ben Darnell >>>>>> wrote: >>>>>> > On Fri, Jun 9, 2017 at 11:51 AM Cory Benfield >>>>>> wrote: >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> >> My concern with multiple loops boils down to the fact that urllib3 >>>>>> >> supports being used in a multithreaded context where each thread >>>>>> can >>>>>> >> independently make forward progress on one request. To establish >>>>>> that with a >>>>>> >> synchronous codebase you either need one event loop per thread or >>>>>> you need >>>>>> >> to spawn a background thread on startup that owns the only event >>>>>> loop in the >>>>>> >> process. >>>>>> > >>>>>> > >>>>>> > Yeah, one event loop per thread is probably the way to go for >>>>>> integration >>>>>> > with synchronous codebases. A dedicated event loop thread may >>>>>> perform better >>>>>> > but libraries that spawn threads are problematic. >>>>>> > >>>>>> >> >>>>>> >> >>>>>> >> Generally speaking I?ve not had positive results with libraries >>>>>> spawning >>>>>> >> their own threads in Python. In my experience this has tended to >>>>>> lead to >>>>>> >> programs that deadlock mysteriously or that fail to terminate in >>>>>> the face of >>>>>> >> a Ctrl+C. So I tend to prefer to have users spawn their own >>>>>> threads, which >>>>>> >> would make me want a ?one-event-loop-per-thread? model: hence, >>>>>> needing a >>>>>> >> loop parameter to pass around prior to 3.6. >>>>>> > >>>>>> > >>>>>> > You can avoid the loop parameter on older versions of asyncio (at >>>>>> least as >>>>>> > long as the default event loop policy is used) by manually setting >>>>>> your >>>>>> > event loop as current before calling run_until_complete (and >>>>>> resetting it >>>>>> > afterwards). >>>>>> > >>>>>> > Tornado's run_sync() method is equivalent to asyncio's >>>>>> run_until_complete(), >>>>>> > and Tornado supports multiple IOLoops in this way. We use this to >>>>>> expose a >>>>>> > synchronous version of our AsyncHTTPClient: >>>>>> > >>>>>> https://github.com/tornadoweb/tornado/blob/62e47215ce12aee83f951758c96775a43e80475b/tornado/httpclient.py#L54 >>>>>> > >>>>>> > -Ben >>>>>> > >>>>>> >> >>>>>> >> >>>>>> >> I admit that my concerns here regarding libraries spawning their >>>>>> own >>>>>> >> threads may be overblown: after my series of negative experiences I >>>>>> >> basically never went back to that model, and it may be that the >>>>>> problems >>>>>> >> were more user-error than anything else. However, I feel >>>>>> comfortable saying >>>>>> >> that libraries spawning their own Python threads is definitely >>>>>> subtle and >>>>>> >> hard to get right, at the very least. 
>>>>>> >> >>>>>> >> Cory >>>>>> >> _______________________________________________ >>>>>> >> Async-sig mailing list >>>>>> >> Async-sig at python.org >>>>>> >> https://mail.python.org/mailman/listinfo/async-sig >>>>>> >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>>> > >>>>>> > >>>>>> > _______________________________________________ >>>>>> > Async-sig mailing list >>>>>> > Async-sig at python.org >>>>>> > https://mail.python.org/mailman/listinfo/async-sig >>>>>> > Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>>> > >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> --pau >>>>>> _______________________________________________ >>>>>> Async-sig mailing list >>>>>> Async-sig at python.org >>>>>> https://mail.python.org/mailman/listinfo/async-sig >>>>>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> --Guido van Rossum (python.org/~guido) >>>>> >>>> >>> >>> >>> -- >>> --Guido van Rossum (python.org/~guido) >>> _______________________________________________ >>> Async-sig mailing list >>> Async-sig at python.org >>> https://mail.python.org/mailman/listinfo/async-sig >>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>> >> -- >> Thanks, >> Andrew Svetlov >> > > > > -- > --Guido van Rossum (python.org/~guido) > -- Thanks, Andrew Svetlov -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben at bendarnell.com Mon Jun 12 12:14:08 2017 From: ben at bendarnell.com (Ben Darnell) Date: Mon, 12 Jun 2017 16:14:08 +0000 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: In Tornado this comes up sometimes in initialization scenarios: def main(): # Since main is synchronous, we need a synchronous HTTP client with tornado.httpclient.HTTPClient() as client: # HTTPClient creates its own event loop and runs it behind the scenes. # This is not the same as the event loop under which main() is running. resp = client.fetch(url) if __name__ == '__main__': IOLoop.current().add_callback(main) IOLoop.current().start() This is never an ideal scenario (it would be better to make main() a coroutine and use an async HTTP client), but it does sometimes come up as the most expedient option. This scenario is also why methods like EventLoop.is_running() tend to be misguided - the question of "can I use this event loop" is not directly related to "is this event loop running". -Ben On Mon, Jun 12, 2017 at 11:58 AM Guido van Rossum wrote: > Multiple loops in the same thread is purely theoretical -- the API allows > it but there's no use case. It might be necessary if a platform has a > UI-only event loop that cannot be extended to do I/O -- the only solution > to do background I/O might be to alternate between two loops. (Though in > that case I would still prefer a thread for the background I/O.) > > On Mon, Jun 12, 2017 at 8:49 AM, Pau Freixes wrote: > >> And what about the rationale of having multiple loop instances in the >> same thread switching btw them. Im still trying to find out what patterns >> need this... Do you have an example? >> >> Btw thanks for the first explanation >> >> El 12/06/2017 17:36, "Guido van Rossum" escribi?: >> >>> In theory it's possible to create two event loops (using >>> new_event_loop()), then set one as the default event loop (using >>> set_event_loop()), then run the other one (using run_forever() or >>> run_until_complete()). 
To tasks running in the latter event loop, >>> get_event_loop() would nevertheless return the former. >>> >>> On Mon, Jun 12, 2017 at 4:39 AM, Pau Freixes wrote: >>> >>>> Sorry a bit of topic, but I would like to figure out why older python >>>> versions, prior this commit [1], the get_event_loop is not considered >>>> deterministic >>>> >>>> does anybody know the reason behind this change? >>>> >>>> >>>> [1] >>>> https://github.com/python/cpython/commit/600a349781bfa0a8239e1cb95fac29c7c4a3302e >>>> >>>> On Fri, Jun 9, 2017 at 6:07 PM, Ben Darnell wrote: >>>> > On Fri, Jun 9, 2017 at 11:51 AM Cory Benfield >>>> wrote: >>>> >> >>>> >> >>>> >> >>>> >> My concern with multiple loops boils down to the fact that urllib3 >>>> >> supports being used in a multithreaded context where each thread can >>>> >> independently make forward progress on one request. To establish >>>> that with a >>>> >> synchronous codebase you either need one event loop per thread or >>>> you need >>>> >> to spawn a background thread on startup that owns the only event >>>> loop in the >>>> >> process. >>>> > >>>> > >>>> > Yeah, one event loop per thread is probably the way to go for >>>> integration >>>> > with synchronous codebases. A dedicated event loop thread may perform >>>> better >>>> > but libraries that spawn threads are problematic. >>>> > >>>> >> >>>> >> >>>> >> Generally speaking I?ve not had positive results with libraries >>>> spawning >>>> >> their own threads in Python. In my experience this has tended to >>>> lead to >>>> >> programs that deadlock mysteriously or that fail to terminate in the >>>> face of >>>> >> a Ctrl+C. So I tend to prefer to have users spawn their own threads, >>>> which >>>> >> would make me want a ?one-event-loop-per-thread? model: hence, >>>> needing a >>>> >> loop parameter to pass around prior to 3.6. >>>> > >>>> > >>>> > You can avoid the loop parameter on older versions of asyncio (at >>>> least as >>>> > long as the default event loop policy is used) by manually setting >>>> your >>>> > event loop as current before calling run_until_complete (and >>>> resetting it >>>> > afterwards). >>>> > >>>> > Tornado's run_sync() method is equivalent to asyncio's >>>> run_until_complete(), >>>> > and Tornado supports multiple IOLoops in this way. We use this to >>>> expose a >>>> > synchronous version of our AsyncHTTPClient: >>>> > >>>> https://github.com/tornadoweb/tornado/blob/62e47215ce12aee83f951758c96775a43e80475b/tornado/httpclient.py#L54 >>>> > >>>> > -Ben >>>> > >>>> >> >>>> >> >>>> >> I admit that my concerns here regarding libraries spawning their own >>>> >> threads may be overblown: after my series of negative experiences I >>>> >> basically never went back to that model, and it may be that the >>>> problems >>>> >> were more user-error than anything else. However, I feel comfortable >>>> saying >>>> >> that libraries spawning their own Python threads is definitely >>>> subtle and >>>> >> hard to get right, at the very least. 
>>>> >> >>>> >> Cory >>>> >> _______________________________________________ >>>> >> Async-sig mailing list >>>> >> Async-sig at python.org >>>> >> https://mail.python.org/mailman/listinfo/async-sig >>>> >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>> > >>>> > >>>> > _______________________________________________ >>>> > Async-sig mailing list >>>> > Async-sig at python.org >>>> > https://mail.python.org/mailman/listinfo/async-sig >>>> > Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>> > >>>> >>>> >>>> >>>> -- >>>> --pau >>>> _______________________________________________ >>>> Async-sig mailing list >>>> Async-sig at python.org >>>> https://mail.python.org/mailman/listinfo/async-sig >>>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>> >>> >>> >>> >>> -- >>> --Guido van Rossum (python.org/~guido) >>> >> > > > -- > --Guido van Rossum (python.org/~guido) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Jun 12 14:50:26 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 12 Jun 2017 11:50:26 -0700 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: Honestly I think we're in agreement. There's never a use for one loop running while another is the default. There are some rare use cases for multiple loops running but before the mentioned commit it was up to the app to ensure to switch the default loop when running a loop. The commit took the ability to screw up there out of the user's hand. On Mon, Jun 12, 2017 at 9:57 AM, Andrew Svetlov wrote: > Yes, but with one exception: default event loop created on module import > stage might co-exist with a loop created for test. > It leads to mystic hangs, you know. > Please recall code like: > class A: > mongodb = motor.motor_asyncio.AsyncIOMotorClient() > > On Mon, Jun 12, 2017 at 7:37 PM Guido van Rossum wrote: > >> Yes, but not co-existing, I hope! >> >> On Mon, Jun 12, 2017 at 9:25 AM, Andrew Svetlov > > wrote: >> >>> Unit tests at least. Running every test in own loop is crucial fro tests >>> isolation. >>> >>> On Mon, Jun 12, 2017 at 7:04 PM Guido van Rossum >>> wrote: >>> >>>> Multiple loops in the same thread is purely theoretical -- the API >>>> allows it but there's no use case. It might be necessary if a platform has >>>> a UI-only event loop that cannot be extended to do I/O -- the only solution >>>> to do background I/O might be to alternate between two loops. (Though in >>>> that case I would still prefer a thread for the background I/O.) >>>> >>>> On Mon, Jun 12, 2017 at 8:49 AM, Pau Freixes >>>> wrote: >>>> >>>>> And what about the rationale of having multiple loop instances in the >>>>> same thread switching btw them. Im still trying to find out what patterns >>>>> need this... Do you have an example? >>>>> >>>>> Btw thanks for the first explanation >>>>> >>>>> El 12/06/2017 17:36, "Guido van Rossum" escribi?: >>>>> >>>>>> In theory it's possible to create two event loops (using >>>>>> new_event_loop()), then set one as the default event loop (using >>>>>> set_event_loop()), then run the other one (using run_forever() or >>>>>> run_until_complete()). To tasks running in the latter event loop, >>>>>> get_event_loop() would nevertheless return the former. 
>>>>>> >>>>>> On Mon, Jun 12, 2017 at 4:39 AM, Pau Freixes >>>>>> wrote: >>>>>> >>>>>>> Sorry a bit of topic, but I would like to figure out why older python >>>>>>> versions, prior this commit [1], the get_event_loop is not considered >>>>>>> deterministic >>>>>>> >>>>>>> does anybody know the reason behind this change? >>>>>>> >>>>>>> >>>>>>> [1] https://github.com/python/cpython/commit/ >>>>>>> 600a349781bfa0a8239e1cb95fac29c7c4a3302e >>>>>>> >>>>>>> On Fri, Jun 9, 2017 at 6:07 PM, Ben Darnell >>>>>>> wrote: >>>>>>> > On Fri, Jun 9, 2017 at 11:51 AM Cory Benfield >>>>>>> wrote: >>>>>>> >> >>>>>>> >> >>>>>>> >> >>>>>>> >> My concern with multiple loops boils down to the fact that urllib3 >>>>>>> >> supports being used in a multithreaded context where each thread >>>>>>> can >>>>>>> >> independently make forward progress on one request. To establish >>>>>>> that with a >>>>>>> >> synchronous codebase you either need one event loop per thread or >>>>>>> you need >>>>>>> >> to spawn a background thread on startup that owns the only event >>>>>>> loop in the >>>>>>> >> process. >>>>>>> > >>>>>>> > >>>>>>> > Yeah, one event loop per thread is probably the way to go for >>>>>>> integration >>>>>>> > with synchronous codebases. A dedicated event loop thread may >>>>>>> perform better >>>>>>> > but libraries that spawn threads are problematic. >>>>>>> > >>>>>>> >> >>>>>>> >> >>>>>>> >> Generally speaking I?ve not had positive results with libraries >>>>>>> spawning >>>>>>> >> their own threads in Python. In my experience this has tended to >>>>>>> lead to >>>>>>> >> programs that deadlock mysteriously or that fail to terminate in >>>>>>> the face of >>>>>>> >> a Ctrl+C. So I tend to prefer to have users spawn their own >>>>>>> threads, which >>>>>>> >> would make me want a ?one-event-loop-per-thread? model: hence, >>>>>>> needing a >>>>>>> >> loop parameter to pass around prior to 3.6. >>>>>>> > >>>>>>> > >>>>>>> > You can avoid the loop parameter on older versions of asyncio (at >>>>>>> least as >>>>>>> > long as the default event loop policy is used) by manually setting >>>>>>> your >>>>>>> > event loop as current before calling run_until_complete (and >>>>>>> resetting it >>>>>>> > afterwards). >>>>>>> > >>>>>>> > Tornado's run_sync() method is equivalent to asyncio's >>>>>>> run_until_complete(), >>>>>>> > and Tornado supports multiple IOLoops in this way. We use this to >>>>>>> expose a >>>>>>> > synchronous version of our AsyncHTTPClient: >>>>>>> > https://github.com/tornadoweb/tornado/blob/ >>>>>>> 62e47215ce12aee83f951758c96775a43e80475b/tornado/httpclient.py#L54 >>>>>>> > >>>>>>> > -Ben >>>>>>> > >>>>>>> >> >>>>>>> >> >>>>>>> >> I admit that my concerns here regarding libraries spawning their >>>>>>> own >>>>>>> >> threads may be overblown: after my series of negative experiences >>>>>>> I >>>>>>> >> basically never went back to that model, and it may be that the >>>>>>> problems >>>>>>> >> were more user-error than anything else. However, I feel >>>>>>> comfortable saying >>>>>>> >> that libraries spawning their own Python threads is definitely >>>>>>> subtle and >>>>>>> >> hard to get right, at the very least. 
>>>>>>> >> >>>>>>> >> Cory >>>>>>> >> _______________________________________________ >>>>>>> >> Async-sig mailing list >>>>>>> >> Async-sig at python.org >>>>>>> >> https://mail.python.org/mailman/listinfo/async-sig >>>>>>> >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>>>> > >>>>>>> > >>>>>>> > _______________________________________________ >>>>>>> > Async-sig mailing list >>>>>>> > Async-sig at python.org >>>>>>> > https://mail.python.org/mailman/listinfo/async-sig >>>>>>> > Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>>>> > >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> --pau >>>>>>> _______________________________________________ >>>>>>> Async-sig mailing list >>>>>>> Async-sig at python.org >>>>>>> https://mail.python.org/mailman/listinfo/async-sig >>>>>>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> --Guido van Rossum (python.org/~guido) >>>>>> >>>>> >>>> >>>> >>>> -- >>>> --Guido van Rossum (python.org/~guido) >>>> _______________________________________________ >>>> Async-sig mailing list >>>> Async-sig at python.org >>>> https://mail.python.org/mailman/listinfo/async-sig >>>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>> >>> -- >>> Thanks, >>> Andrew Svetlov >>> >> >> >> >> -- >> --Guido van Rossum (python.org/~guido) >> > -- > Thanks, > Andrew Svetlov > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrew.svetlov at gmail.com Mon Jun 12 15:05:22 2017 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Mon, 12 Jun 2017 19:05:22 +0000 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: Agree in general but current asyncio still may shoot your leg. The solution (at least for my unittest example) might be in adding top level functions for running asyncio code (asyncio.run() and asyncio.run_forever() as Yury Selivanov proposed in https://github.com/python/asyncio/pull/465) After this we could raise a warning in `asyncio.get_event_loop()` if the loop was not set explicitly by `asyncio.set_event_loop()`. On Mon, Jun 12, 2017 at 9:50 PM Guido van Rossum wrote: > Honestly I think we're in agreement. There's never a use for one loop > running while another is the default. There are some rare use cases for > multiple loops running but before the mentioned commit it was up to the app > to ensure to switch the default loop when running a loop. The commit took > the ability to screw up there out of the user's hand. > > On Mon, Jun 12, 2017 at 9:57 AM, Andrew Svetlov > wrote: > >> Yes, but with one exception: default event loop created on module import >> stage might co-exist with a loop created for test. >> It leads to mystic hangs, you know. >> Please recall code like: >> class A: >> mongodb = motor.motor_asyncio.AsyncIOMotorClient() >> >> On Mon, Jun 12, 2017 at 7:37 PM Guido van Rossum >> wrote: >> >>> Yes, but not co-existing, I hope! >>> >>> On Mon, Jun 12, 2017 at 9:25 AM, Andrew Svetlov < >>> andrew.svetlov at gmail.com> wrote: >>> >>>> Unit tests at least. Running every test in own loop is crucial fro >>>> tests isolation. >>>> >>>> On Mon, Jun 12, 2017 at 7:04 PM Guido van Rossum >>>> wrote: >>>> >>>>> Multiple loops in the same thread is purely theoretical -- the API >>>>> allows it but there's no use case. 
It might be necessary if a platform has >>>>> a UI-only event loop that cannot be extended to do I/O -- the only solution >>>>> to do background I/O might be to alternate between two loops. (Though in >>>>> that case I would still prefer a thread for the background I/O.) >>>>> >>>>> On Mon, Jun 12, 2017 at 8:49 AM, Pau Freixes >>>>> wrote: >>>>> >>>>>> And what about the rationale of having multiple loop instances in the >>>>>> same thread switching btw them. Im still trying to find out what patterns >>>>>> need this... Do you have an example? >>>>>> >>>>>> Btw thanks for the first explanation >>>>>> >>>>>> El 12/06/2017 17:36, "Guido van Rossum" escribi?: >>>>>> >>>>>>> In theory it's possible to create two event loops (using >>>>>>> new_event_loop()), then set one as the default event loop (using >>>>>>> set_event_loop()), then run the other one (using run_forever() or >>>>>>> run_until_complete()). To tasks running in the latter event loop, >>>>>>> get_event_loop() would nevertheless return the former. >>>>>>> >>>>>>> On Mon, Jun 12, 2017 at 4:39 AM, Pau Freixes >>>>>>> wrote: >>>>>>> >>>>>>>> Sorry a bit of topic, but I would like to figure out why older >>>>>>>> python >>>>>>>> versions, prior this commit [1], the get_event_loop is not >>>>>>>> considered >>>>>>>> deterministic >>>>>>>> >>>>>>>> does anybody know the reason behind this change? >>>>>>>> >>>>>>>> >>>>>>>> [1] >>>>>>>> https://github.com/python/cpython/commit/600a349781bfa0a8239e1cb95fac29c7c4a3302e >>>>>>>> >>>>>>>> On Fri, Jun 9, 2017 at 6:07 PM, Ben Darnell >>>>>>>> wrote: >>>>>>>> > On Fri, Jun 9, 2017 at 11:51 AM Cory Benfield >>>>>>>> wrote: >>>>>>>> >> >>>>>>>> >> >>>>>>>> >> >>>>>>>> >> My concern with multiple loops boils down to the fact that >>>>>>>> urllib3 >>>>>>>> >> supports being used in a multithreaded context where each thread >>>>>>>> can >>>>>>>> >> independently make forward progress on one request. To establish >>>>>>>> that with a >>>>>>>> >> synchronous codebase you either need one event loop per thread >>>>>>>> or you need >>>>>>>> >> to spawn a background thread on startup that owns the only event >>>>>>>> loop in the >>>>>>>> >> process. >>>>>>>> > >>>>>>>> > >>>>>>>> > Yeah, one event loop per thread is probably the way to go for >>>>>>>> integration >>>>>>>> > with synchronous codebases. A dedicated event loop thread may >>>>>>>> perform better >>>>>>>> > but libraries that spawn threads are problematic. >>>>>>>> > >>>>>>>> >> >>>>>>>> >> >>>>>>>> >> Generally speaking I?ve not had positive results with libraries >>>>>>>> spawning >>>>>>>> >> their own threads in Python. In my experience this has tended to >>>>>>>> lead to >>>>>>>> >> programs that deadlock mysteriously or that fail to terminate in >>>>>>>> the face of >>>>>>>> >> a Ctrl+C. So I tend to prefer to have users spawn their own >>>>>>>> threads, which >>>>>>>> >> would make me want a ?one-event-loop-per-thread? model: hence, >>>>>>>> needing a >>>>>>>> >> loop parameter to pass around prior to 3.6. >>>>>>>> > >>>>>>>> > >>>>>>>> > You can avoid the loop parameter on older versions of asyncio (at >>>>>>>> least as >>>>>>>> > long as the default event loop policy is used) by manually >>>>>>>> setting your >>>>>>>> > event loop as current before calling run_until_complete (and >>>>>>>> resetting it >>>>>>>> > afterwards). >>>>>>>> > >>>>>>>> > Tornado's run_sync() method is equivalent to asyncio's >>>>>>>> run_until_complete(), >>>>>>>> > and Tornado supports multiple IOLoops in this way. 
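Spelled out, that "install as current, run, restore" pattern is roughly the following (a minimal sketch for pre-3.6 asyncio, not Tornado's actual implementation; it assumes the calling thread already has, or may lazily create, a default loop):

import asyncio

def run_sync(coro):
    # Drive a coroutine to completion from synchronous code on a private
    # loop, temporarily making that loop the thread's current one so that
    # anything relying on get_event_loop() picks it up.
    previous = asyncio.get_event_loop()
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        return loop.run_until_complete(coro)
    finally:
        asyncio.set_event_loop(previous)
        loop.close()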
We use this to >>>>>>>> expose a >>>>>>>> > synchronous version of our AsyncHTTPClient: >>>>>>>> > >>>>>>>> https://github.com/tornadoweb/tornado/blob/62e47215ce12aee83f951758c96775a43e80475b/tornado/httpclient.py#L54 >>>>>>>> > >>>>>>>> > -Ben >>>>>>>> > >>>>>>>> >> >>>>>>>> >> >>>>>>>> >> I admit that my concerns here regarding libraries spawning their >>>>>>>> own >>>>>>>> >> threads may be overblown: after my series of negative >>>>>>>> experiences I >>>>>>>> >> basically never went back to that model, and it may be that the >>>>>>>> problems >>>>>>>> >> were more user-error than anything else. However, I feel >>>>>>>> comfortable saying >>>>>>>> >> that libraries spawning their own Python threads is definitely >>>>>>>> subtle and >>>>>>>> >> hard to get right, at the very least. >>>>>>>> >> >>>>>>>> >> Cory >>>>>>>> >> _______________________________________________ >>>>>>>> >> Async-sig mailing list >>>>>>>> >> Async-sig at python.org >>>>>>>> >> https://mail.python.org/mailman/listinfo/async-sig >>>>>>>> >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>>>>> > >>>>>>>> > >>>>>>>> > _______________________________________________ >>>>>>>> > Async-sig mailing list >>>>>>>> > Async-sig at python.org >>>>>>>> > https://mail.python.org/mailman/listinfo/async-sig >>>>>>>> > Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>>>>> > >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> --pau >>>>>>>> _______________________________________________ >>>>>>>> Async-sig mailing list >>>>>>>> Async-sig at python.org >>>>>>>> https://mail.python.org/mailman/listinfo/async-sig >>>>>>>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> --Guido van Rossum (python.org/~guido) >>>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> --Guido van Rossum (python.org/~guido) >>>>> _______________________________________________ >>>>> Async-sig mailing list >>>>> Async-sig at python.org >>>>> https://mail.python.org/mailman/listinfo/async-sig >>>>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>> >>>> -- >>>> Thanks, >>>> Andrew Svetlov >>>> >>> >>> >>> >>> -- >>> --Guido van Rossum (python.org/~guido) >>> >> -- >> Thanks, >> Andrew Svetlov >> > > > > -- > --Guido van Rossum (python.org/~guido) > -- Thanks, Andrew Svetlov -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Jun 12 15:09:27 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 12 Jun 2017 12:09:27 -0700 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: I think we're getting way beyond the rationale Pau Freixes requested... On Mon, Jun 12, 2017 at 12:05 PM, Andrew Svetlov wrote: > Agree in general but current asyncio still may shoot your leg. > The solution (at least for my unittest example) might be in adding top > level functions for running asyncio code (asyncio.run() and > asyncio.run_forever() as Yury Selivanov proposed in > https://github.com/python/asyncio/pull/465) > After this we could raise a warning in `asyncio.get_event_loop()` if the > loop was not set explicitly by `asyncio.set_event_loop()`. > > On Mon, Jun 12, 2017 at 9:50 PM Guido van Rossum wrote: > >> Honestly I think we're in agreement. There's never a use for one loop >> running while another is the default. 
There are some rare use cases for >> multiple loops running but before the mentioned commit it was up to the app >> to ensure to switch the default loop when running a loop. The commit took >> the ability to screw up there out of the user's hand. >> >> On Mon, Jun 12, 2017 at 9:57 AM, Andrew Svetlov > > wrote: >> >>> Yes, but with one exception: default event loop created on module import >>> stage might co-exist with a loop created for test. >>> It leads to mystic hangs, you know. >>> Please recall code like: >>> class A: >>> mongodb = motor.motor_asyncio.AsyncIOMotorClient() >>> >>> On Mon, Jun 12, 2017 at 7:37 PM Guido van Rossum >>> wrote: >>> >>>> Yes, but not co-existing, I hope! >>>> >>>> On Mon, Jun 12, 2017 at 9:25 AM, Andrew Svetlov < >>>> andrew.svetlov at gmail.com> wrote: >>>> >>>>> Unit tests at least. Running every test in own loop is crucial fro >>>>> tests isolation. >>>>> >>>>> On Mon, Jun 12, 2017 at 7:04 PM Guido van Rossum >>>>> wrote: >>>>> >>>>>> Multiple loops in the same thread is purely theoretical -- the API >>>>>> allows it but there's no use case. It might be necessary if a platform has >>>>>> a UI-only event loop that cannot be extended to do I/O -- the only solution >>>>>> to do background I/O might be to alternate between two loops. (Though in >>>>>> that case I would still prefer a thread for the background I/O.) >>>>>> >>>>>> On Mon, Jun 12, 2017 at 8:49 AM, Pau Freixes >>>>>> wrote: >>>>>> >>>>>>> And what about the rationale of having multiple loop instances in >>>>>>> the same thread switching btw them. Im still trying to find out what >>>>>>> patterns need this... Do you have an example? >>>>>>> >>>>>>> Btw thanks for the first explanation >>>>>>> >>>>>>> El 12/06/2017 17:36, "Guido van Rossum" escribi?: >>>>>>> >>>>>>>> In theory it's possible to create two event loops (using >>>>>>>> new_event_loop()), then set one as the default event loop (using >>>>>>>> set_event_loop()), then run the other one (using run_forever() or >>>>>>>> run_until_complete()). To tasks running in the latter event loop, >>>>>>>> get_event_loop() would nevertheless return the former. >>>>>>>> >>>>>>>> On Mon, Jun 12, 2017 at 4:39 AM, Pau Freixes >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Sorry a bit of topic, but I would like to figure out why older >>>>>>>>> python >>>>>>>>> versions, prior this commit [1], the get_event_loop is not >>>>>>>>> considered >>>>>>>>> deterministic >>>>>>>>> >>>>>>>>> does anybody know the reason behind this change? >>>>>>>>> >>>>>>>>> >>>>>>>>> [1] https://github.com/python/cpython/commit/ >>>>>>>>> 600a349781bfa0a8239e1cb95fac29c7c4a3302e >>>>>>>>> >>>>>>>>> On Fri, Jun 9, 2017 at 6:07 PM, Ben Darnell >>>>>>>>> wrote: >>>>>>>>> > On Fri, Jun 9, 2017 at 11:51 AM Cory Benfield >>>>>>>>> wrote: >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> My concern with multiple loops boils down to the fact that >>>>>>>>> urllib3 >>>>>>>>> >> supports being used in a multithreaded context where each >>>>>>>>> thread can >>>>>>>>> >> independently make forward progress on one request. To >>>>>>>>> establish that with a >>>>>>>>> >> synchronous codebase you either need one event loop per thread >>>>>>>>> or you need >>>>>>>>> >> to spawn a background thread on startup that owns the only >>>>>>>>> event loop in the >>>>>>>>> >> process. >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > Yeah, one event loop per thread is probably the way to go for >>>>>>>>> integration >>>>>>>>> > with synchronous codebases. 
A dedicated event loop thread may >>>>>>>>> perform better >>>>>>>>> > but libraries that spawn threads are problematic. >>>>>>>>> > >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> Generally speaking I?ve not had positive results with libraries >>>>>>>>> spawning >>>>>>>>> >> their own threads in Python. In my experience this has tended >>>>>>>>> to lead to >>>>>>>>> >> programs that deadlock mysteriously or that fail to terminate >>>>>>>>> in the face of >>>>>>>>> >> a Ctrl+C. So I tend to prefer to have users spawn their own >>>>>>>>> threads, which >>>>>>>>> >> would make me want a ?one-event-loop-per-thread? model: hence, >>>>>>>>> needing a >>>>>>>>> >> loop parameter to pass around prior to 3.6. >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > You can avoid the loop parameter on older versions of asyncio >>>>>>>>> (at least as >>>>>>>>> > long as the default event loop policy is used) by manually >>>>>>>>> setting your >>>>>>>>> > event loop as current before calling run_until_complete (and >>>>>>>>> resetting it >>>>>>>>> > afterwards). >>>>>>>>> > >>>>>>>>> > Tornado's run_sync() method is equivalent to asyncio's >>>>>>>>> run_until_complete(), >>>>>>>>> > and Tornado supports multiple IOLoops in this way. We use this >>>>>>>>> to expose a >>>>>>>>> > synchronous version of our AsyncHTTPClient: >>>>>>>>> > https://github.com/tornadoweb/tornado/blob/ >>>>>>>>> 62e47215ce12aee83f951758c96775a43e80475b/tornado/httpclient.py#L54 >>>>>>>>> > >>>>>>>>> > -Ben >>>>>>>>> > >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> I admit that my concerns here regarding libraries spawning >>>>>>>>> their own >>>>>>>>> >> threads may be overblown: after my series of negative >>>>>>>>> experiences I >>>>>>>>> >> basically never went back to that model, and it may be that the >>>>>>>>> problems >>>>>>>>> >> were more user-error than anything else. However, I feel >>>>>>>>> comfortable saying >>>>>>>>> >> that libraries spawning their own Python threads is definitely >>>>>>>>> subtle and >>>>>>>>> >> hard to get right, at the very least. 
>>>>>>>>> >> >>>>>>>>> >> Cory >>>>>>>>> >> _______________________________________________ >>>>>>>>> >> Async-sig mailing list >>>>>>>>> >> Async-sig at python.org >>>>>>>>> >> https://mail.python.org/mailman/listinfo/async-sig >>>>>>>>> >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > _______________________________________________ >>>>>>>>> > Async-sig mailing list >>>>>>>>> > Async-sig at python.org >>>>>>>>> > https://mail.python.org/mailman/listinfo/async-sig >>>>>>>>> > Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>>>>>> > >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> --pau >>>>>>>>> _______________________________________________ >>>>>>>>> Async-sig mailing list >>>>>>>>> Async-sig at python.org >>>>>>>>> https://mail.python.org/mailman/listinfo/async-sig >>>>>>>>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> --Guido van Rossum (python.org/~guido) >>>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> --Guido van Rossum (python.org/~guido) >>>>>> _______________________________________________ >>>>>> Async-sig mailing list >>>>>> Async-sig at python.org >>>>>> https://mail.python.org/mailman/listinfo/async-sig >>>>>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >>>>>> >>>>> -- >>>>> Thanks, >>>>> Andrew Svetlov >>>>> >>>> >>>> >>>> >>>> -- >>>> --Guido van Rossum (python.org/~guido) >>>> >>> -- >>> Thanks, >>> Andrew Svetlov >>> >> >> >> >> -- >> --Guido van Rossum (python.org/~guido) >> > -- > Thanks, > Andrew Svetlov > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From manu.mirandad at gmail.com Mon Jun 12 17:20:17 2017 From: manu.mirandad at gmail.com (manuel miranda) Date: Mon, 12 Jun 2017 21:20:17 +0000 Subject: [Async-sig] async/sync library reusage In-Reply-To: References: <6D0F482B-C4F7-497E-BAF4-7AF67A663EAC@lukasa.co.uk> Message-ID: So, I've been playing a bit with the information I saw in this thread (thank you all for the responses) and I got something super simple working: https://gist.github.com/argaen/056a43b083a29f76ac6e2fa97b3e08d1 What I like about this (and that's what I was aiming for) is that the user uses the same class/interface no matter if its inside asyncio world or not. So both `await fn()` and `fn()` work producing the expected results. Now some cons (that in the case of my library are acceptable): - This aims only for asyncio compatibility, other async frameworks like trio, curio, etc. wouldn't work - No python2 compatibility (although Nathaniel's idea of bleaching could still be applied) - I guess it adds some overhead to both sync and async versions, I will do some benchmarking when I have time (actually this one will be the one deciding whether I do the integration or not) Pros: - User is agnostic to the async/sync implementation. If you are in asyncio world, just use `async fn()` and if not `fn()`. Both will work - There is compatibility between classes using this approach - No duplication of code I haven't thought yet about async context managers, iterations and so but I guess there is a way to fix that too (or not, I have no idea). One fun part of all this is if its possible (meaning easily) to reuse also the tests to test both the sync and the async version... :rolling_eyes: On Fri, Jun 9, 2017 at 9:52 PM Yarko Tymciurak wrote: > ...so I really am enjoying the conversation. 
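Coming back to the gist linked above: heavily simplified, one possible shape for a method that works both with and without await is a wrapper along these lines. This is only a sketch of the general idea, not necessarily what the gist actually does, and it leans on loop.is_running(), with the caveat Ben raised earlier about that kind of check:

import asyncio
import functools

def sync_or_async(coro_fn):
    # Expose one coroutine method so async callers can await it, while
    # sync callers get the result driven on the thread's default loop.
    @functools.wraps(coro_fn)
    def wrapper(*args, **kwargs):
        coro = coro_fn(*args, **kwargs)
        loop = asyncio.get_event_loop()
        if loop.is_running():
            return coro                        # caller will await this
        return loop.run_until_complete(coro)   # plain synchronous call
    return wrapper

class Client:
    @sync_or_async
    async def get(self, key):
        await asyncio.sleep(0)    # stand-in for real async I/O
        return key

# sync use:   Client().get("k")        -> "k"
# async use:  await Client().get("k")  -> "k", when already inside the loop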
> > Guido - re: "vision too far out": yes, for people trying to struggle w/ > async support in their libraries, now... but that is also part of my > motivation. Python 5? Sure... (I may have to watch it come to use from > the grave, but hopefully not... ;-) ). Anyway, from back-porting and > tactical "implement now" concerns, to plans for next release, to plans for > next version of python, to brainstorming much less concrete future versions > - all are an interesting continuum. > > Re: GIL... sure, sort of, and sort of not. I was thinking "as long as > major changes are going on... think about additional structural > changes..." More to the point: as I see it, people have a hard time > thinking about async in the cooperative-multitasking (CMT) sense, and thus > disappointments happen around blocking (missed, or unexpects, e.g. hardware > failures). Cory (in his reply - and, yeah: nice writeup!) hints to what I > generally structurally like: > > "...we?d ideally treat asyncio as the first-class citizen and retrofit on > the threaded support, rather than the other way around" > > Structurally, async is light-weight overhead compared to threads, which > are lightweight compared to processes, and so a sort of natural app flow > seems from lightest-weight, on out. To me, this seems practical for making > life easier for developers, because you can imagine "promoting" an async > task caught unexpectedly blocking, to a thread, while still having the > lightest-weight loop have control over it (promotion out, as well as > cancellation while promoted). > > As for multiple task loops, or loops off in a thread, I haven't thought > about it too much, but this seems like nothing new nor unreasonable. I'm > thinking of the base-stations we talk over in our mobile connections, which > are multiple diskless servers, and hot-promote to "master" server status on > hardware failure (or live capacity upgrade, i.e. inserting processors). > This pattern seems both reasonable and useful in this context, i.e. the > concept of a master loop (which implies communication/control channels - a > complication). With some thought, some reasonable ground rules and > simplifications, and I would expect much can be done. > > Appreciate the discussions! > > - Yarko > On Fri, Jun 9, 2017 at 1:23 PM, Guido van Rossum wrote: > >> Great write-up! I actually find the async nature of HTTP (both versions) >> a compelling reason to switch to asyncio. For HTTP/1.1 this sounds mostly >> like it would make the implementation easier; for HTTP/2 it sounds like it >> would just be better for the user-side as well (if the user just wants one >> resource they can safely continue to use the synchronous HTTP/1.1 version >> of the API.) >> >> On Fri, Jun 9, 2017 at 9:55 AM, Cory Benfield wrote: >> >>> >>> On 9 Jun 2017, at 17:28, Guido van Rossum wrote: >>> >>> At least one of us is still confused. The one-event-loop-per-thread >>> model is supported in asyncio without passing the loop around explicitly. >>> The get_event_loop() implementation stores all its state in thread-locals >>> instance, so it returns the thread's event loop. (Because this is an >>> "advanced" model, you have to explicitly create the event loop with >>> new_event_loop() and make it the default loop for the thread with >>> set_event_loop().) >>> >>> >>> Aha, ok, so the confused one is me. I did not know this. =) That >>> definitely works a lot better. 
It admittedly works less well if someone is >>> doing their own custom event loop stuff, but that?s probably an acceptable >>> limitation up until the time that Python 2 goes quietly into the night. >>> >>> All in all, I'm a bit curious why you would need to use asyncio at all >>> when you've got a thread per request anyway. >>> >>> >>> Yeah, so this is a bit of a diversion from the original topic of this >>> thread but I think it?s an idea worth discussing in this space. I want to >>> reframe the question a bit if you don?t mind, so shout if you think I?m not >>> responding to quite what you were asking. In my understanding, the question >>> you?re implicitly asking is this: >>> >>> "If you have a thread-safe library today (that is, one that allows users >>> to do threaded I/O with appropriate resource pooling and management), why >>> move to a model built on asyncio?? >>> >>> There are many answers to this question that differ for different >>> libraries with different uses, but for HTTP libraries like urllib3 here are >>> our reasons. >>> >>> The first is that it turns out that even for HTTP/1.1 you need to write >>> something that amounts to a partial event loop to properly handle the >>> protocol. Good HTTP clients need to watch for responses while they?re >>> uploading body data because if a response arrives during that process body >>> upload should be terminated immediately. This is also required for sensibly >>> handling things like Expect: 100-continue, as well as spotting other >>> intermediate responses and connection teardowns sensibly and without >>> throwing exceptions. >>> >>> Today urllib3 does not do this, and it has caused us pain, so our v2 >>> branch includes a backport of the Python 3 selectors module and a >>> hand-written partially-complete event loop that only handles the specific >>> cases we need. This is an extra thing for us to debug and maintain, and >>> ultimately it?d be easier to just delegate the whole thing to event loops >>> written by others who promise to maintain them and make them efficient. >>> >>> The second answer is that I believe good asyncio support in libraries is >>> a vital part of the future of this language, and ?good? asyncio support IMO >>> does as little as possible to block the main event loop. Running all of the >>> complex protocol parsing and state manipulation of the Requests stack on a >>> background thread is not cheap, and involves a lot of GIL swapping around. >>> We have found several bug reports complaining about using Requests with >>> largish-numbers of threads, indicating that our big stack of Python code >>> really does cause contention on the GIL if used heavily. In general, having >>> to defer to a thread to run *Python* code in asyncio is IMO a nasty >>> anti-pattern that should be avoided where possible. It is much less bad to >>> defer to a thread to then block on a syscall (e.g. to get an ?async? >>> getaddrinfo), but doing so to run a big big stack of Python code is vastly >>> less pleasant for the main event loop. >>> >>> For this reason, we?d ideally treat asyncio as the first-class citizen >>> and retrofit on the threaded support, rather than the other way around. >>> This goes doubly so when you consider the other reasons for wanting to use >>> asyncio. >>> >>> The third answer is that HTTP/2 makes all of this much harder. HTTP/2 is >>> a *highly* concurrent protocol. 
Connections send a lot of control frames >>> back and forth that are invisible to the user working at the semantic HTTP >>> level but that nonetheless need relatively low-latency turnaround (e.g. >>> PING frames). It turns out that in the traditional synchronous HTTP model >>> urllib3 only gets access to the socket to do work when the user calls into >>> our code. If the user goes a ?long? time without calling into urllib3, we >>> take a long time to process any data off the connection. In the best case >>> this causes latency spikes as we process all the data that queued up in the >>> socket. In the worst case, this causes us to lose connections we should >>> have been able to keep because we failed to respond to a PING frame in a >>> timely manner. >>> >>> My experience is that purely synchronous libraries handling HTTP/2 >>> simply cannot provide a positive user experience. HTTP/2 flat-out >>> *requires* either an event loop or a dedicated background thread, and in >>> practice in your dedicated background thread you?d also just end up writing >>> an event loop (see answer 1 again). For this reason, it is basically >>> mandatory for HTTP/2 support in Python to either use an event loop or to >>> spawn out a dedicated C thread that does not hold the GIL to do the I/O (as >>> this thread will be regularly woken up to handle I/O events). >>> >>> Hopefully this (admittedly horrifyingly long) response helps illuminate >>> why we?re interested in asyncio support. It should be noted that if we find >>> ourselves unable to get it in the short term we may simply resort to >>> offering an ?async? API that involves us doing the rough equivalent of >>> running in a thread-pool executor, but I won?t be thrilled about it. ;) >>> >>> Cory >>> >> >> >> >> -- >> --Guido van Rossum (python.org/~guido) >> >> _______________________________________________ >> Async-sig mailing list >> Async-sig at python.org >> https://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From pfreixes at gmail.com Thu Jun 15 17:40:04 2017 From: pfreixes at gmail.com (Pau Freixes) Date: Thu, 15 Jun 2017 23:40:04 +0200 Subject: [Async-sig] Asyncio, implementing a new fair scheduling reactor Message-ID: Hi guys, recently I've been trying to implement a POC of the default event loop implemented by asyncio but using a fair scheduling reactor. At the moment is just a POC [1], something to test the rationale and pending to be evolved in someone more mature, but before of that I would prefer to share my thoughts and get all of the comments from you. The current implementation is based on a FIFO queue that is filled with all of the callbacks that have to be executed, these callbacks can stand for: 1) Run tasks, either to be started or resumed 2) Run future callbacks. 3) Run scheduled callbacks. 4) Run file descriptors callbacks. Worth mentioning that most of them internally are chained in somehow, perhaps a future callback can wake up a resumed task. Also, have in mind that the API published by asyncio to schedule callbacks can be used by asyncio itself or by the user, callsoon for example. The usual flow the reactor is the following one: 1) Check the file descriptors with events, and stack the handlers into the reactor. 2) Pop all outdated scheduled callbacks and push them into the reactor. 
3) Iterate for N first elements at the queue, where N stands for the number of the handles stacked at that moment. Future handles stacked during that iteration won't be handled, they must wait until next whole iteration 4) Go to the point 1. As you can observe here, the IO is only made once per loop and should wait until all handles that are in a specific moment are executed. This implements in somehow a natural backpressure, the read and also the accept the new connections will rely on the buffers run by the operating system. That implementation can be seen as simple, but it stands on a solid strategy and follows KISS design that helps to scare the bugs. Why fair scheduling? Not all code that is written in the same module, in terms of loop sharing, has the same requirements. Some part might need N and other parts M. When this implementation cant be decoupled, and it means that the cost of placing them into a separated pieces inside of your architecture are too expensive, in that scenario the developer cant express this difference to make the underlying implementation aware of that. For example, an API with a regular endpoint accessed by the user and another one with the health-check of the system, which has completely different requirements in terms of IO. Not only due to the nature of the resources accessed, also because of the frequency of use. Meanwhile, the healthcheck is accessed to a known a frequency at X seconds, the other endpoint has a variable frequency of use. Do you believe that asyncio will be able to preserve the health-check frequency at any moment? Absolutely not. Therefore, the idea of implementing a fair scheduling reactor is based on the needed of address these kind of situations, giving to the developer an interface to isolate different resources. Basic principles The basic principles of the implementation are: - The cost of the scheduling has to be the same of the current implementation, no overhead - The design has to follow the current one, having the implicit backpressure that was commented. I will focus in the second principle, taking into account that the first one is a matter of implementation. To achieve the same behavior, the new implementation only split the resources - handles, schedules, file descriptors - in isolated partitions to then implement for each partition the same algorithm than the current one. The developer can create a new partition using a new function called `spawn`, this function takes as an argument a coroutine, the task wrapped to that coroutine and all of the resources created inside this coroutine will belong to that partition. For example: >>> async def background_task(): >>> task = [ fetch() for i in range(1000)] >>> return (await asyncio.gather(*t)) >>> >>> async def foo(): >>> return (await asyncio.spawn(background_task())) All resources created inside the scope of the `background_tasks` are isolated to one partition. The 1000 sockets will schedule callbacks that will be stacked in the same queue. The partition is by default identified with the hash of the task that warps the `background_task`, but the user can pass an alternative value. >>> async def foo(): >>> return (await asyncio.spawn(healtheck(), partition='healthcheck')) Internally the implementation has a default ROOT partition that is used for all of these resources that are not executed inside of the scope of a spawn function. As you can guess, if you don't use the spawn method the reactor will run exactly as the current implementation. 
Having all the resources in the same queue. Round robin between partitions. The differents partitions that exist at some moment share the CPU resource using a round robin strategy. It gives the same chance to all partitions to run the same amount of handles, but with one particularity. Each time that a partition runs out of handles, the loop is restarted again to handle the file descriptors and the delayed calls but only for that specific partition that runs out of handles. The side effect is clear, have the same backpressure mechanism. But, per partition. The black hole of the current implementation. There is always a but, at least I've found a situation where this strategy can perform in the same way as the current one, without applying any fair scheduling. Although the code uses the spawn method. Have a look at the following snippet: >>> async def healtheck(request): >>> await request.resp() >>> >>> async def view(request): >>> return (await asyncio.spawn(healthcheck(request))) The task that wraps the healtcheck coroutine that is being isolated in a partition, won't be scheduled until the data from the file descriptor that is read by a callback that is in fact executed inside of the ROOT partition. Therefore, in the worst case scenario, the fair scheduling will become a simple FIFO scheduling. IMHO there is not an easy way to solve that issue, or at least without changing the current picture. And try to solve it, might end up having a messy implementation and a buggy code. Although, I believe that this still worth it, having in mind the benefits that it will bring us for all of those cases where the user needs to isolate resources. Thoughts, comments, and others will be welcomed. [1] https://github.com/pfreixes/qloop -- --pau From njs at pobox.com Thu Jun 15 18:13:43 2017 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 15 Jun 2017 15:13:43 -0700 Subject: [Async-sig] Asyncio, implementing a new fair scheduling reactor In-Reply-To: References: Message-ID: A few quick thoughts: You might find these notes interesting: https://github.com/python-trio/trio/issues/32 It sounds like a more precise description of your scheduler would be "hierarchical FIFO"? i.e., there's a top-level scheduler that selects between "child" schedulers in round-robin/FIFO fashion, and each child scheduler is round-robin/FIFO within its schedulable entities? [1] Generally I would think of a "fair" scheduler as one that notices when e.g. one task is blocking the event loop for a long time when it runs, and penalizing it by not letting it run as often. For your motivating example of a health-check: it sounds like another way to express your goal would be by attaching a static "priority level" to the health-check task(s), such that they get to run first whenever they're ready. Have you considered that as an alternative approach? But also... isn't part of the point of a healthcheck that it *should* get slow if the system is overloaded? -n [1] http://intronetworks.cs.luc.edu/current/html/queuing.html#hierarchical-queuing On Thu, Jun 15, 2017 at 2:40 PM, Pau Freixes wrote: > Hi guys, recently I've been trying to implement a POC of the default > event loop implemented by asyncio but using a fair scheduling reactor. > At the moment is just > a POC [1], something to test the rationale and pending to be evolved > in someone more mature, but before of that I would prefer to share my > thoughts and get all of the comments from you. 
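For what it's worth, a toy model of that "hierarchical FIFO" reading of the proposal, stripped of all I/O, could look like this (hypothetical class, not part of asyncio or of the qloop POC):

from collections import deque

class HierarchicalFIFO:
    # Top level: round robin over named partitions.
    # Inside each partition: plain FIFO, exactly like the current loop.
    def __init__(self):
        self._partitions = {}    # partition name -> deque of callbacks

    def call_soon(self, callback, partition="ROOT"):
        self._partitions.setdefault(partition, deque()).append(callback)

    def run_pass(self):
        # One scheduling pass: each partition runs only the callbacks it
        # had queued when the pass started, so late arrivals wait a turn.
        for queue in list(self._partitions.values()):
            for _ in range(len(queue)):
                queue.popleft()()

sched = HierarchicalFIFO()
sched.call_soon(lambda: print("user request"), partition="ROOT")
sched.call_soon(lambda: print("healthcheck"), partition="healthcheck")
sched.run_pass()

A static priority level for the healthcheck partition would instead mean always draining that queue first, rather than giving every partition an equal turn.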
> > The current implementation is based on a FIFO queue that is filled > with all of the callbacks that have to be executed, these callbacks > can stand for: > > 1) Run tasks, either to be started or resumed > 2) Run future callbacks. > 3) Run scheduled callbacks. > 4) Run file descriptors callbacks. > > Worth mentioning that most of them internally are chained in somehow, > perhaps a future callback can wake up a resumed task. Also, have in > mind that the API published by asyncio to schedule callbacks can be > used by asyncio itself or by the user, callsoon for example. > > The usual flow the reactor is the following one: > > 1) Check the file descriptors with events, and stack the handlers into > the reactor. > 2) Pop all outdated scheduled callbacks and push them into the reactor. > 3) Iterate for N first elements at the queue, where N stands for the number of > the handles stacked at that moment. Future handles stacked during that > iteration won't be handled, they must wait until next whole iteration > 4) Go to the point 1. > > As you can observe here, the IO is only made once per loop and should > wait until all handles that are in a specific moment are executed. > > This implements in somehow a natural backpressure, the read and also > the accept the new connections will rely on the buffers run by the > operating system. > > That implementation can be seen as simple, but it stands on a solid > strategy and follows KISS design that helps to scare the bugs. > > Why fair scheduling? > > Not all code that is written in the same module, in terms of loop > sharing, has the same requirements. Some part might need N and other > parts M. When this implementation cant be decoupled, and it means that > the cost of placing them into a separated pieces inside of your > architecture are too expensive, in that scenario the developer cant > express this difference to make the underlying implementation aware of > that. > > For example, an API with a regular endpoint accessed by the user and > another one with the health-check of the system, which has completely > different requirements in terms of IO. Not only due to the nature of > the resources accessed, also because of the frequency of use. > Meanwhile, the healthcheck is accessed to a known a frequency at X > seconds, the other endpoint has a variable frequency of use. > > Do you believe that asyncio will be able to preserve the health-check > frequency at any moment? Absolutely not. > > Therefore, the idea of implementing a fair scheduling reactor is based > on the needed of address these kind of situations, giving to the > developer an interface to isolate different resources. > > Basic principles > > The basic principles of the implementation are: > > - The cost of the scheduling has to be the same of the current > implementation, no overhead > - The design has to follow the current one, having the implicit > backpressure that was commented. > > I will focus in the second principle, taking into account that the > first one is a matter of implementation. > > To achieve the same behavior, the new implementation only split the > resources - handles, schedules, file descriptors - in isolated > partitions to then implement for each partition the same algorithm > than the current one. The developer can create a new partition using a > new function called `spawn`, this function takes as an argument a > coroutine, the task wrapped to that coroutine and all of the resources > created inside this coroutine will belong to that partition. 
For > example: > >>>> async def background_task(): >>>> task = [ fetch() for i in range(1000)] >>>> return (await asyncio.gather(*t)) >>>> >>>> async def foo(): >>>> return (await asyncio.spawn(background_task())) > > > All resources created inside the scope of the `background_tasks` are > isolated to one partition. The 1000 sockets will schedule callbacks > that will be stacked in the same queue. > > The partition is by default identified with the hash of the task that > warps the `background_task`, but the user can pass an alternative > value. > >>>> async def foo(): >>>> return (await asyncio.spawn(healtheck(), partition='healthcheck')) > > Internally the implementation has a default ROOT partition that is > used for all of these resources that are not executed inside of the > scope of a spawn function. As you can guess, if you don't use the > spawn method the reactor will run exactly as the current > implementation. Having all the resources in the same queue. > > > Round robin between partitions. > > The differents partitions that exist at some moment share the CPU > resource using a round robin strategy. It gives the same chance to all > partitions to run the same amount of handles, but with one > particularity. Each time that a partition runs out of handles, the > loop is restarted again to handle the file descriptors and the delayed > calls but only for that specific partition that runs out of handles. > > The side effect is clear, have the same backpressure mechanism. But, > per partition. > > > The black hole of the current implementation. > > There is always a but, at least I've found a situation where this > strategy can perform in the same way as the current one, without > applying any fair scheduling. Although the code uses the spawn method. > > Have a look at the following snippet: > >>>> async def healtheck(request): >>>> await request.resp() >>>> >>>> async def view(request): >>>> return (await asyncio.spawn(healthcheck(request))) > > > The task that wraps the healtcheck coroutine that is being isolated in > a partition, won't be scheduled until the data from the file > descriptor that is read by a callback that is in fact executed inside > of the ROOT partition. Therefore, in the worst case scenario, the fair > scheduling will become a simple FIFO scheduling. > > IMHO there is not an easy way to solve that issue, or at least without > changing the current picture. And try to solve it, might end up having > a messy implementation and a buggy code. > > Although, I believe that this still worth it, having in mind the > benefits that it will bring us for all of those cases where the user > needs to isolate resources. > > Thoughts, comments, and others will be welcomed. > > [1] https://github.com/pfreixes/qloop > > -- > --pau > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ -- Nathaniel J. Smith -- https://vorpus.org From dimaqq at gmail.com Fri Jun 16 07:47:03 2017 From: dimaqq at gmail.com (Dima Tisnek) Date: Fri, 16 Jun 2017 13:47:03 +0200 Subject: [Async-sig] Asyncio, implementing a new fair scheduling reactor In-Reply-To: References: Message-ID: Just a couple of thoughts: 1. A good place to start is Linux BFS (simple, correct, good performance). Typical test case is "make -j4", perhaps there's a way to simulate something similar? 2. 
There's a space to research async/await scheduling (in academic sense), older research may have been focused on .net, and current research probably on browsers and node, in any case javascript. d. On 15 June 2017 at 23:40, Pau Freixes wrote: > Hi guys, recently I've been trying to implement a POC of the default > event loop implemented by asyncio but using a fair scheduling reactor. > At the moment is just > a POC [1], something to test the rationale and pending to be evolved > in someone more mature, but before of that I would prefer to share my > thoughts and get all of the comments from you. > > The current implementation is based on a FIFO queue that is filled > with all of the callbacks that have to be executed, these callbacks > can stand for: > > 1) Run tasks, either to be started or resumed > 2) Run future callbacks. > 3) Run scheduled callbacks. > 4) Run file descriptors callbacks. > > Worth mentioning that most of them internally are chained in somehow, > perhaps a future callback can wake up a resumed task. Also, have in > mind that the API published by asyncio to schedule callbacks can be > used by asyncio itself or by the user, callsoon for example. > > The usual flow the reactor is the following one: > > 1) Check the file descriptors with events, and stack the handlers into > the reactor. > 2) Pop all outdated scheduled callbacks and push them into the reactor. > 3) Iterate for N first elements at the queue, where N stands for the number of > the handles stacked at that moment. Future handles stacked during that > iteration won't be handled, they must wait until next whole iteration > 4) Go to the point 1. > > As you can observe here, the IO is only made once per loop and should > wait until all handles that are in a specific moment are executed. > > This implements in somehow a natural backpressure, the read and also > the accept the new connections will rely on the buffers run by the > operating system. > > That implementation can be seen as simple, but it stands on a solid > strategy and follows KISS design that helps to scare the bugs. > > Why fair scheduling? > > Not all code that is written in the same module, in terms of loop > sharing, has the same requirements. Some part might need N and other > parts M. When this implementation cant be decoupled, and it means that > the cost of placing them into a separated pieces inside of your > architecture are too expensive, in that scenario the developer cant > express this difference to make the underlying implementation aware of > that. > > For example, an API with a regular endpoint accessed by the user and > another one with the health-check of the system, which has completely > different requirements in terms of IO. Not only due to the nature of > the resources accessed, also because of the frequency of use. > Meanwhile, the healthcheck is accessed to a known a frequency at X > seconds, the other endpoint has a variable frequency of use. > > Do you believe that asyncio will be able to preserve the health-check > frequency at any moment? Absolutely not. > > Therefore, the idea of implementing a fair scheduling reactor is based > on the needed of address these kind of situations, giving to the > developer an interface to isolate different resources. > > Basic principles > > The basic principles of the implementation are: > > - The cost of the scheduling has to be the same of the current > implementation, no overhead > - The design has to follow the current one, having the implicit > backpressure that was commented. 
> > I will focus in the second principle, taking into account that the > first one is a matter of implementation. > > To achieve the same behavior, the new implementation only split the > resources - handles, schedules, file descriptors - in isolated > partitions to then implement for each partition the same algorithm > than the current one. The developer can create a new partition using a > new function called `spawn`, this function takes as an argument a > coroutine, the task wrapped to that coroutine and all of the resources > created inside this coroutine will belong to that partition. For > example: > >>>> async def background_task(): >>>> task = [ fetch() for i in range(1000)] >>>> return (await asyncio.gather(*t)) >>>> >>>> async def foo(): >>>> return (await asyncio.spawn(background_task())) > > > All resources created inside the scope of the `background_tasks` are > isolated to one partition. The 1000 sockets will schedule callbacks > that will be stacked in the same queue. > > The partition is by default identified with the hash of the task that > warps the `background_task`, but the user can pass an alternative > value. > >>>> async def foo(): >>>> return (await asyncio.spawn(healtheck(), partition='healthcheck')) > > Internally the implementation has a default ROOT partition that is > used for all of these resources that are not executed inside of the > scope of a spawn function. As you can guess, if you don't use the > spawn method the reactor will run exactly as the current > implementation. Having all the resources in the same queue. > > > Round robin between partitions. > > The differents partitions that exist at some moment share the CPU > resource using a round robin strategy. It gives the same chance to all > partitions to run the same amount of handles, but with one > particularity. Each time that a partition runs out of handles, the > loop is restarted again to handle the file descriptors and the delayed > calls but only for that specific partition that runs out of handles. > > The side effect is clear, have the same backpressure mechanism. But, > per partition. > > > The black hole of the current implementation. > > There is always a but, at least I've found a situation where this > strategy can perform in the same way as the current one, without > applying any fair scheduling. Although the code uses the spawn method. > > Have a look at the following snippet: > >>>> async def healtheck(request): >>>> await request.resp() >>>> >>>> async def view(request): >>>> return (await asyncio.spawn(healthcheck(request))) > > > The task that wraps the healtcheck coroutine that is being isolated in > a partition, won't be scheduled until the data from the file > descriptor that is read by a callback that is in fact executed inside > of the ROOT partition. Therefore, in the worst case scenario, the fair > scheduling will become a simple FIFO scheduling. > > IMHO there is not an easy way to solve that issue, or at least without > changing the current picture. And try to solve it, might end up having > a messy implementation and a buggy code. > > Although, I believe that this still worth it, having in mind the > benefits that it will bring us for all of those cases where the user > needs to isolate resources. > > Thoughts, comments, and others will be welcomed. 
> > [1] https://github.com/pfreixes/qloop > > -- > --pau > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ From pfreixes at gmail.com Sat Jun 17 05:50:25 2017 From: pfreixes at gmail.com (Pau Freixes) Date: Sat, 17 Jun 2017 11:50:25 +0200 Subject: [Async-sig] Asyncio, implementing a new fair scheduling reactor In-Reply-To: References: Message-ID: Hi Nathaniel ... Let me share my thoughts regarding your response > It sounds like a more precise description of your scheduler would be > "hierarchical FIFO"? i.e., there's a top-level scheduler that selects > between "child" schedulers in round-robin/FIFO fashion, and each child > scheduler is round-robin/FIFO within its schedulable entities? [1] > Generally I would think of a "fair" scheduler as one that notices when > e.g. one task is blocking the event loop for a long time when it runs, > and penalizing it by not letting it run as often. Agree with the controversial meaning of fair scheduling, I would rename my title as Fair Queuing [1]. Regarding your notes, pretty interesting, and your thoughts about how the current lack could be tackled in the near future. But, I would say that these notes are more about flow control at the network level, where mine was more oriented on business logic. I'm more inclined to think that both works at different level and different granularity, and are not excluding. The draft that I presented is keen on allow the user - in a way of pattern - to split and isolate resources from top to bottom. Yes, I was thinking about different strategies that might give more granularity and fine control to the user such as weighted partitions, or even have just two queues LOW and HIGH. But these last ones didn't appeal to me for the following reasons: - Give to the user the chance to mix partitions and weights might end up having some code difficult to understand. - The LOW and HIGH priority was too much restrictive. Give control to the user to configure what is LOW and what is HIGH might end up also having an overall performance worst than it was expected. The idea behind the Fair queuing is: Give enough control to the user but without taking the chance to screw up the whole thing. > But also... isn't part of the point of a healthcheck that it *should* > get slow if the system is overloaded? Not really, the response of a health-check is dichotomic: Yes or No. The problem on sharing resources between the health-check and the user flow is when the last one impacts on the first one having, as a result, false negatives. The dynamic allocation of resources to scale up horizontally to suit more traffic and reduce the pressure to the main flow is run by other actors, that of course can rely on metrics that are sent out by the user flow. [1] https://en.wikipedia.org/wiki/Fair_queuing -- --pau From mehaase at gmail.com Wed Jun 21 13:50:57 2017 From: mehaase at gmail.com (Mark E. Haase) Date: Wed, 21 Jun 2017 13:50:57 -0400 Subject: [Async-sig] Cancelling SSL connection Message-ID: (I'm not sure if this is a newbie question or a bug report or something in between. I apologize in advance if its off-topic. Let me know if I should post this somewhere else.) If a task is cancelled while SSL is being negotiated, then an SSLError is raised, but there's no way (as far as I can tell) for the caller to catch it. 
(The example below is pretty contrived, but in an application I'm working on, the user can cancel downloads at any time.) Here's an example: import asyncio, random, ssl async def download(host): ssl_context = ssl.create_default_context() reader, writer = await asyncio.open_connection(host, 443, ssl=ssl_context) request = f'HEAD / HTTP/1.1\r\nHost: {host}\r\n\r\n' writer.write(request.encode('ascii')) lines = list() while True: newdata = await reader.readline() if newdata == b'\r\n': break else: lines.append(newdata.decode('utf8').rstrip('\r\n')) return lines[0] async def main(): while True: task = asyncio.Task(download('www.python.org')) await asyncio.sleep(random.uniform(0.0, 0.5)) task.cancel() try: response = await task print(response) except asyncio.CancelledError: print('request cancelled!') except ssl.SSLError: print('caught SSL error') await asyncio.sleep(1) loop = asyncio.get_event_loop() loop.run_until_complete(main()) loop.close() Running this script yields the following output: HTTP/1.1 200 OK request cancelled! HTTP/1.1 200 OK HTTP/1.1 200 OK : SSL handshake failed Traceback (most recent call last): File "/usr/lib/python3.6/asyncio/base_events.py", line 803, in _create_connection_transport yield from waiter File "/usr/lib/python3.6/asyncio/tasks.py", line 304, in _wakeup future.result() concurrent.futures._base.CancelledError During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/usr/lib/python3.6/asyncio/sslproto.py", line 577, in _on_handshake_complete raise handshake_exc File "/usr/lib/python3.6/asyncio/sslproto.py", line 638, in _process_write_backlog ssldata = self._sslpipe.shutdown(self._finalize) File "/usr/lib/python3.6/asyncio/sslproto.py", line 155, in shutdown ssldata, appdata = self.feed_ssldata(b'') File "/usr/lib/python3.6/asyncio/sslproto.py", line 219, in feed_ssldata self._sslobj.unwrap() File "/usr/lib/python3.6/ssl.py", line 692, in unwrap return self._sslobj.shutdown() ssl.SSLError: [SSL] shutdown while in init (_ssl.c:2299) Is this a bug that I should file, or is there some reason that it's intended to work this way? I can work around it with asyncio.shield(), but I think I would prefer for the asyncio/sslproto.py to catch the SSLError and ignore it. Maybe I'm being short sighted. Thanks, Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From dimaqq at gmail.com Wed Jun 21 15:49:27 2017 From: dimaqq at gmail.com (Dima Tisnek) Date: Wed, 21 Jun 2017 21:49:27 +0200 Subject: [Async-sig] Cancelling SSL connection In-Reply-To: References: Message-ID: Looks like a bug in the `ssl` module, not `asyncio`. Refer to https://github.com/openssl/openssl/issues/710 IMO `ssl` module should be prepared for this. I'd say post a bug to cpython and see what core devs have to say about it :) Please note exact versions of python and openssl ofc. my 2c: openssl has been a moving target every so often, it's quite possible that this change in the API escaped the devs. On 21 June 2017 at 19:50, Mark E. Haase wrote: > (I'm not sure if this is a newbie question or a bug report or something in > between. I apologize in advance if its off-topic. Let me know if I should > post this somewhere else.) > > If a task is cancelled while SSL is being negotiated, then an SSLError is > raised, but there's no way (as far as I can tell) for the caller to catch > it. (The example below is pretty contrived, but in an application I'm > working on, the user can cancel downloads at any time.) 
Here's an example: > > import asyncio, random, ssl > > async def download(host): > ssl_context = ssl.create_default_context() > reader, writer = await asyncio.open_connection(host, 443, > ssl=ssl_context) > request = f'HEAD / HTTP/1.1\r\nHost: {host}\r\n\r\n' > writer.write(request.encode('ascii')) > lines = list() > while True: > newdata = await reader.readline() > if newdata == b'\r\n': > break > else: > lines.append(newdata.decode('utf8').rstrip('\r\n')) > return lines[0] > > async def main(): > while True: > task = asyncio.Task(download('www.python.org')) > await asyncio.sleep(random.uniform(0.0, 0.5)) > task.cancel() > try: > response = await task > print(response) > except asyncio.CancelledError: > print('request cancelled!') > except ssl.SSLError: > print('caught SSL error') > await asyncio.sleep(1) > > loop = asyncio.get_event_loop() > loop.run_until_complete(main()) > loop.close() > > Running this script yields the following output: > > HTTP/1.1 200 OK > request cancelled! > HTTP/1.1 200 OK > HTTP/1.1 200 OK > : SSL handshake > failed > Traceback (most recent call last): > File "/usr/lib/python3.6/asyncio/base_events.py", line 803, in > _create_connection_transport > yield from waiter > File "/usr/lib/python3.6/asyncio/tasks.py", line 304, in _wakeup > future.result() > concurrent.futures._base.CancelledError > > During handling of the above exception, another exception occurred: > > Traceback (most recent call last): > File "/usr/lib/python3.6/asyncio/sslproto.py", line 577, in > _on_handshake_complete > raise handshake_exc > File "/usr/lib/python3.6/asyncio/sslproto.py", line 638, in > _process_write_backlog > ssldata = self._sslpipe.shutdown(self._finalize) > File "/usr/lib/python3.6/asyncio/sslproto.py", line 155, in shutdown > ssldata, appdata = self.feed_ssldata(b'') > File "/usr/lib/python3.6/asyncio/sslproto.py", line 219, in > feed_ssldata > self._sslobj.unwrap() > File "/usr/lib/python3.6/ssl.py", line 692, in unwrap > return self._sslobj.shutdown() > ssl.SSLError: [SSL] shutdown while in init (_ssl.c:2299) > > Is this a bug that I should file, or is there some reason that it's intended > to work this way? I can work around it with asyncio.shield(), but I think I > would prefer for the asyncio/sslproto.py to catch the SSLError and ignore > it. Maybe I'm being short sighted. > > Thanks, > Mark > > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > From njs at pobox.com Wed Jun 21 18:47:04 2017 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 21 Jun 2017 15:47:04 -0700 Subject: [Async-sig] Cancelling SSL connection In-Reply-To: References: Message-ID: SSLObject.unwrap has the contract that if it finishes successfully, then the SSL connection has been cleanly shut down and both sides remain in sync, and can continue to use the socket in unencrypted mode. When asyncio calls unwrap before the handshake has completed, then this contract is impossible to fulfill, and raising an error is the right thing to do. So imo the ssl module is correct here, and this is a (minor) bug in asyncio. On Jun 21, 2017 12:49 PM, "Dima Tisnek" wrote: > Looks like a bug in the `ssl` module, not `asyncio`. > > Refer to https://github.com/openssl/openssl/issues/710 > IMO `ssl` module should be prepared for this. 
> > I'd say post a bug to cpython and see what core devs have to say about it > :) > Please note exact versions of python and openssl ofc. > > my 2c: openssl has been a moving target every so often, it's quite > possible that this change in the API escaped the devs. > > On 21 June 2017 at 19:50, Mark E. Haase wrote: > > (I'm not sure if this is a newbie question or a bug report or something > in > > between. I apologize in advance if its off-topic. Let me know if I should > > post this somewhere else.) > > > > If a task is cancelled while SSL is being negotiated, then an SSLError is > > raised, but there's no way (as far as I can tell) for the caller to catch > > it. (The example below is pretty contrived, but in an application I'm > > working on, the user can cancel downloads at any time.) Here's an > example: > > > > import asyncio, random, ssl > > > > async def download(host): > > ssl_context = ssl.create_default_context() > > reader, writer = await asyncio.open_connection(host, 443, > > ssl=ssl_context) > > request = f'HEAD / HTTP/1.1\r\nHost: {host}\r\n\r\n' > > writer.write(request.encode('ascii')) > > lines = list() > > while True: > > newdata = await reader.readline() > > if newdata == b'\r\n': > > break > > else: > > lines.append(newdata.decode('utf8').rstrip('\r\n')) > > return lines[0] > > > > async def main(): > > while True: > > task = asyncio.Task(download('www.python.org')) > > await asyncio.sleep(random.uniform(0.0, 0.5)) > > task.cancel() > > try: > > response = await task > > print(response) > > except asyncio.CancelledError: > > print('request cancelled!') > > except ssl.SSLError: > > print('caught SSL error') > > await asyncio.sleep(1) > > > > loop = asyncio.get_event_loop() > > loop.run_until_complete(main()) > > loop.close() > > > > Running this script yields the following output: > > > > HTTP/1.1 200 OK > > request cancelled! > > HTTP/1.1 200 OK > > HTTP/1.1 200 OK > > : SSL > handshake > > failed > > Traceback (most recent call last): > > File "/usr/lib/python3.6/asyncio/base_events.py", line 803, in > > _create_connection_transport > > yield from waiter > > File "/usr/lib/python3.6/asyncio/tasks.py", line 304, in _wakeup > > future.result() > > concurrent.futures._base.CancelledError > > > > During handling of the above exception, another exception occurred: > > > > Traceback (most recent call last): > > File "/usr/lib/python3.6/asyncio/sslproto.py", line 577, in > > _on_handshake_complete > > raise handshake_exc > > File "/usr/lib/python3.6/asyncio/sslproto.py", line 638, in > > _process_write_backlog > > ssldata = self._sslpipe.shutdown(self._finalize) > > File "/usr/lib/python3.6/asyncio/sslproto.py", line 155, in > shutdown > > ssldata, appdata = self.feed_ssldata(b'') > > File "/usr/lib/python3.6/asyncio/sslproto.py", line 219, in > > feed_ssldata > > self._sslobj.unwrap() > > File "/usr/lib/python3.6/ssl.py", line 692, in unwrap > > return self._sslobj.shutdown() > > ssl.SSLError: [SSL] shutdown while in init (_ssl.c:2299) > > > > Is this a bug that I should file, or is there some reason that it's > intended > > to work this way? I can work around it with asyncio.shield(), but I > think I > > would prefer for the asyncio/sslproto.py to catch the SSLError and ignore > > it. Maybe I'm being short sighted. 
> > > > Thanks, > > Mark > > > > _______________________________________________ > > Async-sig mailing list > > Async-sig at python.org > > https://mail.python.org/mailman/listinfo/async-sig > > Code of Conduct: https://www.python.org/psf/codeofconduct/ > > > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mehaase at gmail.com Fri Jun 23 10:11:10 2017 From: mehaase at gmail.com (Mark E. Haase) Date: Fri, 23 Jun 2017 10:11:10 -0400 Subject: [Async-sig] Cancelling SSL connection In-Reply-To: References: Message-ID: Thanks Dima & Nathaniel. I opened an asyncio bug. ( http://bugs.python.org/issue30740) Cheers, Mark On Wed, Jun 21, 2017 at 6:47 PM, Nathaniel Smith wrote: > SSLObject.unwrap has the contract that if it finishes successfully, then > the SSL connection has been cleanly shut down and both sides remain in > sync, and can continue to use the socket in unencrypted mode. When asyncio > calls unwrap before the handshake has completed, then this contract is > impossible to fulfill, and raising an error is the right thing to do. So > imo the ssl module is correct here, and this is a (minor) bug in asyncio. > > On Jun 21, 2017 12:49 PM, "Dima Tisnek" wrote: > >> Looks like a bug in the `ssl` module, not `asyncio`. >> >> Refer to https://github.com/openssl/openssl/issues/710 >> IMO `ssl` module should be prepared for this. >> >> I'd say post a bug to cpython and see what core devs have to say about it >> :) >> Please note exact versions of python and openssl ofc. >> >> my 2c: openssl has been a moving target every so often, it's quite >> possible that this change in the API escaped the devs. >> >> On 21 June 2017 at 19:50, Mark E. Haase wrote: >> > (I'm not sure if this is a newbie question or a bug report or something >> in >> > between. I apologize in advance if its off-topic. Let me know if I >> should >> > post this somewhere else.) >> > >> > If a task is cancelled while SSL is being negotiated, then an SSLError >> is >> > raised, but there's no way (as far as I can tell) for the caller to >> catch >> > it. (The example below is pretty contrived, but in an application I'm >> > working on, the user can cancel downloads at any time.) Here's an >> example: >> > >> > import asyncio, random, ssl >> > >> > async def download(host): >> > ssl_context = ssl.create_default_context() >> > reader, writer = await asyncio.open_connection(host, 443, >> > ssl=ssl_context) >> > request = f'HEAD / HTTP/1.1\r\nHost: {host}\r\n\r\n' >> > writer.write(request.encode('ascii')) >> > lines = list() >> > while True: >> > newdata = await reader.readline() >> > if newdata == b'\r\n': >> > break >> > else: >> > lines.append(newdata.decode('utf8').rstrip('\r\n')) >> > return lines[0] >> > >> > async def main(): >> > while True: >> > task = asyncio.Task(download('www.python.org')) >> > await asyncio.sleep(random.uniform(0.0, 0.5)) >> > task.cancel() >> > try: >> > response = await task >> > print(response) >> > except asyncio.CancelledError: >> > print('request cancelled!') >> > except ssl.SSLError: >> > print('caught SSL error') >> > await asyncio.sleep(1) >> > >> > loop = asyncio.get_event_loop() >> > loop.run_until_complete(main()) >> > loop.close() >> > >> > Running this script yields the following output: >> > >> > HTTP/1.1 200 OK >> > request cancelled! 
>> > HTTP/1.1 200 OK >> > HTTP/1.1 200 OK >> > : SSL >> handshake >> > failed >> > Traceback (most recent call last): >> > File "/usr/lib/python3.6/asyncio/base_events.py", line 803, in >> > _create_connection_transport >> > yield from waiter >> > File "/usr/lib/python3.6/asyncio/tasks.py", line 304, in _wakeup >> > future.result() >> > concurrent.futures._base.CancelledError >> > >> > During handling of the above exception, another exception occurred: >> > >> > Traceback (most recent call last): >> > File "/usr/lib/python3.6/asyncio/sslproto.py", line 577, in >> > _on_handshake_complete >> > raise handshake_exc >> > File "/usr/lib/python3.6/asyncio/sslproto.py", line 638, in >> > _process_write_backlog >> > ssldata = self._sslpipe.shutdown(self._finalize) >> > File "/usr/lib/python3.6/asyncio/sslproto.py", line 155, in >> shutdown >> > ssldata, appdata = self.feed_ssldata(b'') >> > File "/usr/lib/python3.6/asyncio/sslproto.py", line 219, in >> > feed_ssldata >> > self._sslobj.unwrap() >> > File "/usr/lib/python3.6/ssl.py", line 692, in unwrap >> > return self._sslobj.shutdown() >> > ssl.SSLError: [SSL] shutdown while in init (_ssl.c:2299) >> > >> > Is this a bug that I should file, or is there some reason that it's >> intended >> > to work this way? I can work around it with asyncio.shield(), but I >> think I >> > would prefer for the asyncio/sslproto.py to catch the SSLError and >> ignore >> > it. Maybe I'm being short sighted. >> > >> > Thanks, >> > Mark >> > >> > _______________________________________________ >> > Async-sig mailing list >> > Async-sig at python.org >> > https://mail.python.org/mailman/listinfo/async-sig >> > Code of Conduct: https://www.python.org/psf/codeofconduct/ >> > >> _______________________________________________ >> Async-sig mailing list >> Async-sig at python.org >> https://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.jerdonek at gmail.com Sun Jun 25 17:13:12 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Sun, 25 Jun 2017 14:13:12 -0700 Subject: [Async-sig] "read-write" synchronization Message-ID: I'm relatively new to async programming in Python and am thinking through possibilities for doing "read-write" synchronization. I'm using asyncio, and the synchronization primitives that asyncio exposes are relatively simple [1]. Have options for async read-write synchronization already been discussed in any detail? I'm interested in designs where "readers" don't need to acquire a lock -- only writers. It seems like one way to deal with the main race condition I see that comes up would be to use loop.time(). Does that ring a bell, or might there be a much simpler way? Thanks, --Chris [1] https://docs.python.org/3/library/asyncio-sync.html From andrew.svetlov at gmail.com Sun Jun 25 17:16:33 2017 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Sun, 25 Jun 2017 21:16:33 +0000 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: There is https://github.com/aio-libs/aiorwlock On Mon, Jun 26, 2017 at 12:13 AM Chris Jerdonek wrote: > I'm relatively new to async programming in Python and am thinking > through possibilities for doing "read-write" synchronization. > > I'm using asyncio, and the synchronization primitives that asyncio > exposes are relatively simple [1]. Have options for async read-write > synchronization already been discussed in any detail? 
> > I'm interested in designs where "readers" don't need to acquire a lock > -- only writers. It seems like one way to deal with the main race > condition I see that comes up would be to use loop.time(). Does that > ring a bell, or might there be a much simpler way? > > Thanks, > --Chris > > > [1] https://docs.python.org/3/library/asyncio-sync.html > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -- Thanks, Andrew Svetlov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.jerdonek at gmail.com Sun Jun 25 17:24:50 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Sun, 25 Jun 2017 14:24:50 -0700 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: Thank you. I had seen that, but it seems heavier weight than needed. And it also requires locking on reading. --Chris On Sun, Jun 25, 2017 at 2:16 PM, Andrew Svetlov wrote: > There is https://github.com/aio-libs/aiorwlock > > On Mon, Jun 26, 2017 at 12:13 AM Chris Jerdonek > wrote: >> >> I'm relatively new to async programming in Python and am thinking >> through possibilities for doing "read-write" synchronization. >> >> I'm using asyncio, and the synchronization primitives that asyncio >> exposes are relatively simple [1]. Have options for async read-write >> synchronization already been discussed in any detail? >> >> I'm interested in designs where "readers" don't need to acquire a lock >> -- only writers. It seems like one way to deal with the main race >> condition I see that comes up would be to use loop.time(). Does that >> ring a bell, or might there be a much simpler way? >> >> Thanks, >> --Chris >> >> >> [1] https://docs.python.org/3/library/asyncio-sync.html >> _______________________________________________ >> Async-sig mailing list >> Async-sig at python.org >> https://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct: https://www.python.org/psf/codeofconduct/ > > -- > Thanks, > Andrew Svetlov From chris.jerdonek at gmail.com Sun Jun 25 17:54:44 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Sun, 25 Jun 2017 14:54:44 -0700 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: The read-write operations I'm protecting will have coroutines inside that need to be awaited on, so I don't think I'll be able to take advantage to that extreme. But I think I might be able to use your point to simplify the logic a little. (To rephrase, you're reminding me that context switches can't happen at arbitrary lines of code. I only need to be prepared for the cases where there's an await / yield from.) --Chris On Sun, Jun 25, 2017 at 2:30 PM, Guido van Rossum wrote: > The secret is that as long as you don't yield no other task will run so you > don't need locks at all. > > On Jun 25, 2017 2:24 PM, "Chris Jerdonek" wrote: >> >> Thank you. I had seen that, but it seems heavier weight than needed. >> And it also requires locking on reading. >> >> --Chris >> >> On Sun, Jun 25, 2017 at 2:16 PM, Andrew Svetlov >> wrote: >> > There is https://github.com/aio-libs/aiorwlock >> > >> > On Mon, Jun 26, 2017 at 12:13 AM Chris Jerdonek >> > >> > wrote: >> >> >> >> I'm relatively new to async programming in Python and am thinking >> >> through possibilities for doing "read-write" synchronization. 
>> >> >> >> I'm using asyncio, and the synchronization primitives that asyncio >> >> exposes are relatively simple [1]. Have options for async read-write >> >> synchronization already been discussed in any detail? >> >> >> >> I'm interested in designs where "readers" don't need to acquire a lock >> >> -- only writers. It seems like one way to deal with the main race >> >> condition I see that comes up would be to use loop.time(). Does that >> >> ring a bell, or might there be a much simpler way? >> >> >> >> Thanks, >> >> --Chris >> >> >> >> >> >> [1] https://docs.python.org/3/library/asyncio-sync.html >> >> _______________________________________________ >> >> Async-sig mailing list >> >> Async-sig at python.org >> >> https://mail.python.org/mailman/listinfo/async-sig >> >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >> > >> > -- >> > Thanks, >> > Andrew Svetlov >> _______________________________________________ >> Async-sig mailing list >> Async-sig at python.org >> https://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct: https://www.python.org/psf/codeofconduct/ From gvanrossum at gmail.com Sun Jun 25 17:30:39 2017 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sun, 25 Jun 2017 14:30:39 -0700 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: The secret is that as long as you don't yield no other task will run so you don't need locks at all. On Jun 25, 2017 2:24 PM, "Chris Jerdonek" wrote: > Thank you. I had seen that, but it seems heavier weight than needed. > And it also requires locking on reading. > > --Chris > > On Sun, Jun 25, 2017 at 2:16 PM, Andrew Svetlov > wrote: > > There is https://github.com/aio-libs/aiorwlock > > > > On Mon, Jun 26, 2017 at 12:13 AM Chris Jerdonek < > chris.jerdonek at gmail.com> > > wrote: > >> > >> I'm relatively new to async programming in Python and am thinking > >> through possibilities for doing "read-write" synchronization. > >> > >> I'm using asyncio, and the synchronization primitives that asyncio > >> exposes are relatively simple [1]. Have options for async read-write > >> synchronization already been discussed in any detail? > >> > >> I'm interested in designs where "readers" don't need to acquire a lock > >> -- only writers. It seems like one way to deal with the main race > >> condition I see that comes up would be to use loop.time(). Does that > >> ring a bell, or might there be a much simpler way? > >> > >> Thanks, > >> --Chris > >> > >> > >> [1] https://docs.python.org/3/library/asyncio-sync.html > >> _______________________________________________ > >> Async-sig mailing list > >> Async-sig at python.org > >> https://mail.python.org/mailman/listinfo/async-sig > >> Code of Conduct: https://www.python.org/psf/codeofconduct/ > > > > -- > > Thanks, > > Andrew Svetlov > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sun Jun 25 18:09:16 2017 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 25 Jun 2017 15:09:16 -0700 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: On Sun, Jun 25, 2017 at 2:13 PM, Chris Jerdonek wrote: > I'm relatively new to async programming in Python and am thinking > through possibilities for doing "read-write" synchronization. 
> > I'm using asyncio, and the synchronization primitives that asyncio > exposes are relatively simple [1]. Have options for async read-write > synchronization already been discussed in any detail? As a general comment: I used to think rwlocks were a simple extension to regular locks, but it turns out there's actually this huge increase in design complexity. Do you want your lock to be read-biased, write-biased, task-fair, phase-fair? Can you acquire a write lock if you already hold one (i.e., are write locks reentrant)? What about acquiring a read lock if you already hold the write lock? Can you atomically upgrade/downgrade a lock? This makes it much harder to come up with a one-size-fits-all design suitable for adding to something like the python stdlib. -n -- Nathaniel J. Smith -- https://vorpus.org From chris.jerdonek at gmail.com Sun Jun 25 18:27:38 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Sun, 25 Jun 2017 15:27:38 -0700 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: On Sun, Jun 25, 2017 at 3:09 PM, Nathaniel Smith wrote: > On Sun, Jun 25, 2017 at 2:13 PM, Chris Jerdonek > wrote: >> I'm using asyncio, and the synchronization primitives that asyncio >> exposes are relatively simple [1]. Have options for async read-write >> synchronization already been discussed in any detail? > > As a general comment: I used to think rwlocks were a simple extension > to regular locks, but it turns out there's actually this huge increase > in design complexity. Do you want your lock to be read-biased, > write-biased, task-fair, phase-fair? Can you acquire a write lock if > you already hold one (i.e., are write locks reentrant)? What about > acquiring a read lock if you already hold the write lock? Can you > atomically upgrade/downgrade a lock? This makes it much harder to come > up with a one-size-fits-all design suitable for adding to something > like the python stdlib. I agree. And my point about asyncio's primitives wasn't a criticism or request that more be added. I was asking more if there has been any discussion of general approaches and patterns that take advantage of the event loop's single thread, etc. Maybe what I'll do is briefly write up the approach I have in mind, and people can let me know if I'm on the right track. :) --Chris From yarkot1 at gmail.com Sun Jun 25 18:38:41 2017 From: yarkot1 at gmail.com (Yarko Tymciurak) Date: Sun, 25 Jun 2017 22:38:41 +0000 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: On Sun, Jun 25, 2017 at 4:54 PM Chris Jerdonek wrote: > The read-write operations I'm protecting will have coroutines inside > that need to be awaited on, so I don't think I'll be able to take > advantage to that extreme. > > But I think I might be able to use your point to simplify the logic a > little. (To rephrase, you're reminding me that context switches can't > happen at arbitrary lines of code. I only need to be prepared for the > cases where there's an await / yield from.) The "secret" Guido refers to we should pull out front and center, explicitly at all times - asynchronous programming is nothing more than cooperative multitasking. Patterns suited for preemptive multi-tasking (executive-based, interrupt based, etc.) are suspect, potentially misplaced when they show up in a cooperative multitasking context. 
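To make the cooperative-multitasking point concrete, here is a minimal
illustration (the counter and function names are made up for this example;
everything in it is plain asyncio): a shared counter can be updated by many
tasks without a lock as long as there is no await between the read and the
write, while putting an await in the middle lets other tasks interleave and
clobber each other's updates.

    import asyncio

    counter = 0

    async def safe_increment():
        # No await between the read and the write, so the event loop
        # cannot switch to another task in the middle: no lock is needed.
        global counter
        counter += 1

    async def racy_increment():
        # Awaiting between the read and the write yields to the event
        # loop; other tasks can run in the gap, and the assignment below
        # then overwrites whatever they did.
        global counter
        current = counter
        await asyncio.sleep(0)  # suspension point
        counter = current + 1

    async def main():
        global counter
        counter = 0
        await asyncio.gather(*[safe_increment() for _ in range(1000)])
        print(counter)  # 1000

        counter = 0
        await asyncio.gather(*[racy_increment() for _ in range(1000)])
        print(counter)  # much less than 1000: every task read the
                        # counter before any task wrote it back

    asyncio.get_event_loop().run_until_complete(main())

In other words, the only places a reader/writer scheme has to defend itself
are the awaits.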
To be a well-behaved (capable of effective cooperation) task in such a
system, you should guard against getting embroiled in potentially blocking
I/O tasks whose latency you are not able to control (within facilities
available in a cooperative multitasking context). That raises a couple of
questions: to be well-behaved, simple control flow is desirable (i.e. not
nested layers of yields, except perhaps for a pipeline case); and
"read/write" control from memory space w/in the process (since external I/O
is generally not for async) begs the question: what for? Eliminate
globals, encapsulate and limit access as needed through usual programming
methods.

I'm sure someone will find an edge case to challenge my above rule-of-thumb,
but as you're new to this, I think this is a pretty good place to start.
Ask yourself if what you're trying to do w/ async is suited for async.

Cheers,
Yarko

>
>
> --Chris
>
>
> On Sun, Jun 25, 2017 at 2:30 PM, Guido van Rossum
> wrote:
> > The secret is that as long as you don't yield no other task will run so
> you
> > don't need locks at all.
> >
> > On Jun 25, 2017 2:24 PM, "Chris Jerdonek"
> wrote:
> >>
> >> Thank you. I had seen that, but it seems heavier weight than needed.
> >> And it also requires locking on reading.
> >>
> >> --Chris
> >>
> >> On Sun, Jun 25, 2017 at 2:16 PM, Andrew Svetlov
> >> wrote:
> >> > There is https://github.com/aio-libs/aiorwlock
> >> >
> >> > On Mon, Jun 26, 2017 at 12:13 AM Chris Jerdonek
> >> >
> >> > wrote:
> >> >>
> >> >> I'm relatively new to async programming in Python and am thinking
> >> >> through possibilities for doing "read-write" synchronization.
> >> >>
> >> >> I'm using asyncio, and the synchronization primitives that asyncio
> >> >> exposes are relatively simple [1]. Have options for async read-write
> >> >> synchronization already been discussed in any detail?
> >> >>
> >> >> I'm interested in designs where "readers" don't need to acquire a
> lock
> >> >> -- only writers. It seems like one way to deal with the main race
> >> >> condition I see that comes up would be to use loop.time(). Does that
> >> >> ring a bell, or might there be a much simpler way?
> >> >>
> >> >> Thanks,
> >> >> --Chris
> >> >>
> >> >>
> >> >> [1] https://docs.python.org/3/library/asyncio-sync.html
> >> >> _______________________________________________
> >> >> Async-sig mailing list
> >> >> Async-sig at python.org
> >> >> https://mail.python.org/mailman/listinfo/async-sig
> >> >> Code of Conduct: https://www.python.org/psf/codeofconduct/
> >> >
> >> > --
> >> > Thanks,
> >> > Andrew Svetlov
> >> _______________________________________________
> >> Async-sig mailing list
> >> Async-sig at python.org
> >> https://mail.python.org/mailman/listinfo/async-sig
> >> Code of Conduct: https://www.python.org/psf/codeofconduct/
> _______________________________________________
> Async-sig mailing list
> Async-sig at python.org
> https://mail.python.org/mailman/listinfo/async-sig
> Code of Conduct: https://www.python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From gvanrossum at gmail.com Sun Jun 25 23:33:57 2017 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sun, 25 Jun 2017 20:33:57 -0700 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: On Sun, Jun 25, 2017 at 3:38 PM, Yarko Tymciurak wrote: > To be a well-behaved (capable of effective cooperation) task in such a > system, you should guard against getting embroiled in potentially blocking > I/O tasks whose latency you are not able to control (within facilities > available in a cooperative multitasking context). The raises a couple of > questions: to be well-behaved, simple control flow is desireable (i.e. not > nested layers of yields, except perhaps for a pipeline case); and > "read/write" control from memory space w/in the process (since external I/O > is generally not for async) begs the question: what for? Eliminate > globals, encapsulate and limit access as needed through usual programming > methods. > Before anyone takes this paragraph too seriously, there seem to be a bunch of misunderstandings underlying this paragraph. - *All* blocking I/O is wrong in an async task, regardless of whether you can control its latency. (The only safe way to do I/O is using a primitive that works with `await`.) - There's nothing wrong with `yield` itself. (You shouldn't do I/O in a generator used in an async task -- but that's just due to the general ban on I/O.) - Using async tasks don't make globals more risky than regular code (in fact they are safer here than in traditional multi-threaded code). - What on earth is "read/write" control from memory space w/in the process? -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.jerdonek at gmail.com Sun Jun 25 23:34:50 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Sun, 25 Jun 2017 20:34:50 -0700 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: So here's one approach I'm thinking about for implementing readers-writer synchronization. Does this seem reasonable as a starting point, or am I missing something much simpler? I know there are various things you can prioritize for (readers vs. writers, etc), but I'm less concerned about those for now. The global state is-- * reader_count: an integer count of the active (reading) readers * writer_lock: an asyncio Lock object * no_readers_event: an asyncio Event object signaling no active readers * no_writer_event: an asyncio Event object signaling no active writer Untested pseudo-code for a writer-- async with writer_lock: no_writer_event.clear() # Wait for the readers to finish. await no_readers_event.wait() # Do the write. await write() # Awaken waiting readers. no_writer_event.set() Untested pseudo-code for a reader-- while True: await no_writer_event.wait() # Check the writer_lock again in case a new writer has # started writing. if not writer_lock.locked(): # Then we can do the read. break reader_count += 1 if reader_count == 1: no_readers_event.clear() # Do the read. await read() reader_count -= 1 if reader_count == 0: # Awaken any waiting writer. no_readers_event.set() One thing I'm not clear about is when the writer_lock is released and the no_writer_event set, are there any guarantees about what coroutine will be awakened first -- a writer waiting on the lock or the readers waiting on the no_writer_event? 
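For concreteness, the pieces above could be assembled into something like
the following untested sketch. The class name, the do_read/do_write
callables, the try/finally blocks, and the initial .set() calls on both
events are additions for illustration only; it inherits the same open
questions about wake-up order and fairness.

    import asyncio

    class ReadWriteSync:
        # Illustrative assembly of the pseudo-code above, not a vetted design.

        def __init__(self):
            self.reader_count = 0
            self.writer_lock = asyncio.Lock()
            self.no_readers_event = asyncio.Event()
            self.no_writer_event = asyncio.Event()
            # Initially there are no active readers and no active writer.
            self.no_readers_event.set()
            self.no_writer_event.set()

        async def write(self, do_write):
            async with self.writer_lock:
                self.no_writer_event.clear()
                # Wait for the readers to finish.
                await self.no_readers_event.wait()
                try:
                    # Do the write.
                    return await do_write()
                finally:
                    # Awaken waiting readers.
                    self.no_writer_event.set()

        async def read(self, do_read):
            while True:
                await self.no_writer_event.wait()
                # Check the writer_lock again in case a new writer has
                # started writing.
                if not self.writer_lock.locked():
                    break
            self.reader_count += 1
            if self.reader_count == 1:
                self.no_readers_event.clear()
            try:
                # Do the read.
                return await do_read()
            finally:
                self.reader_count -= 1
                if self.reader_count == 0:
                    # Awaken any waiting writer.
                    self.no_readers_event.set()

    async def main():
        rw = ReadWriteSync()

        async def read_op():
            await asyncio.sleep(0.1)
            return 'read result'

        async def write_op():
            await asyncio.sleep(0.1)

        results = await asyncio.gather(
            rw.read(read_op), rw.read(read_op), rw.write(write_op))
        print(results)

    asyncio.get_event_loop().run_until_complete(main())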
Similarly, is there a way to avoid having to have readers check the writer_lock again when a reader waiting on no_writer_event is awakened? --Chris On Sun, Jun 25, 2017 at 3:27 PM, Chris Jerdonek wrote: > On Sun, Jun 25, 2017 at 3:09 PM, Nathaniel Smith wrote: >> On Sun, Jun 25, 2017 at 2:13 PM, Chris Jerdonek >> wrote: >>> I'm using asyncio, and the synchronization primitives that asyncio >>> exposes are relatively simple [1]. Have options for async read-write >>> synchronization already been discussed in any detail? >> >> As a general comment: I used to think rwlocks were a simple extension >> to regular locks, but it turns out there's actually this huge increase >> in design complexity. Do you want your lock to be read-biased, >> write-biased, task-fair, phase-fair? Can you acquire a write lock if >> you already hold one (i.e., are write locks reentrant)? What about >> acquiring a read lock if you already hold the write lock? Can you >> atomically upgrade/downgrade a lock? This makes it much harder to come >> up with a one-size-fits-all design suitable for adding to something >> like the python stdlib. > > I agree. And my point about asyncio's primitives wasn't a criticism or > request that more be added. I was asking more if there has been any > discussion of general approaches and patterns that take advantage of > the event loop's single thread, etc. > > Maybe what I'll do is briefly write up the approach I have in mind, > and people can let me know if I'm on the right track. :) > > --Chris From yarkot1 at gmail.com Mon Jun 26 00:01:32 2017 From: yarkot1 at gmail.com (Yarko Tymciurak) Date: Sun, 25 Jun 2017 23:01:32 -0500 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: On Sun, Jun 25, 2017 at 10:46 PM, Yarko Tymciurak wrote: > > > On Sun, Jun 25, 2017 at 10:33 PM, Guido van Rossum > wrote: > >> On Sun, Jun 25, 2017 at 3:38 PM, Yarko Tymciurak >> wrote: >> >>> To be a well-behaved (capable of effective cooperation) task in such a >>> system, you should guard against getting embroiled in potentially blocking >>> I/O tasks whose latency you are not able to control (within facilities >>> available in a cooperative multitasking context). The raises a couple of >>> questions: to be well-behaved, simple control flow is desireable (i.e. not >>> nested layers of yields, except perhaps for a pipeline case); and >>> "read/write" control from memory space w/in the process (since external I/O >>> is generally not for async) begs the question: what for? Eliminate >>> globals, encapsulate and limit access as needed through usual programming >>> methods. >>> >> >> Before anyone takes this paragraph too seriously, there seem to be a >> bunch of misunderstandings underlying this paragraph. >> > > yes - thanks for the clarifications... I'm speaking from the perspective > of an ECE, and thinking in the small-scale (embedded) of things like when > in general is cooperative multitasking (very light-weight) more performant > than pre-emptive... so from that space: > >> >> - *All* blocking I/O is wrong in an async task, regardless of whether you >> can control its latency. (The only safe way to do I/O is using a primitive >> that works with `await`.) >> > yes, and from ECE perspective the only I/O is "local" device (e.g. RAM, which itself has rather deterministic setup and write times...), etc. 
my more general point (sorry - should have made it explicit) is that if you call a library routine, you may not expect it's calling external I/O, so that requires either care (or defensively guarding against it, e.g. with timers ... another story). This in particular is an error which I saw in OpenStack swift project - they depended on fast local storage device I/O. Except when devices started failing. Then they mistakenly assumed this was python's fault - missing the programming error of doing async (gevent - but same issue) I/O (which might be ok, within limits, but was not guarded against - was done in an unreliable way). So - whether intentionally doing such "risky" but seemingly reliable and "ok" I/O and failing to put in place guards, as must be in cooperative multitasking, or if you just get surprised that some library you thought was inoccuous is somewhere doing some surprise I/O (logging? anything...).... in cooperative multi-tasking, you can get away with some things, but it is _all_ your responsibility to guard against problems. That was my point here. >> - There's nothing wrong with `yield` itself. (You shouldn't do I/O in a >> generator used in an async task -- but that's just due to the general ban >> on I/O.) >> > Yes; as above. But I'm calling local variables (strictly speaking) I/O too. And you might consider REDIS as "to RAM, so how different is that?" --- well, it's through another process, and ... up to a preemptive scheduler, and all sorts of things. So, sure - you _can_ do it, if you put in guards. But don't. Or at least, have very specific good reasons, and understand the coding cost of trying to do so. In other words - don't. > >> - Using async tasks don't make globals more risky than regular code (in >> fact they are safer here than in traditional multi-threaded code). >> >> - What on earth is "read/write" control from memory space w/in the >> process? >> > Sorry - these last two were a bit of a joke on my part. The silly: only valid I/O is to variables. But you don't need that, because you have normal variable scoping/encapsulation rules. So (I suppose my joke continued), the only reason to have "read/write controls left is against (!) global variables. Answer - don't; and you don'' need R/W controls, because you have normal encapsulation controls of variables from the language. So - in cooperative multitasking, my argument goes, there can be (!) no reasonable motivation for R/W controls. -- Yarko > >> >> -- >> --Guido van Rossum (python.org/~guido) >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yarkot1 at gmail.com Sun Jun 25 23:46:02 2017 From: yarkot1 at gmail.com (Yarko Tymciurak) Date: Sun, 25 Jun 2017 22:46:02 -0500 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: On Sun, Jun 25, 2017 at 10:33 PM, Guido van Rossum wrote: > On Sun, Jun 25, 2017 at 3:38 PM, Yarko Tymciurak > wrote: > >> To be a well-behaved (capable of effective cooperation) task in such a >> system, you should guard against getting embroiled in potentially blocking >> I/O tasks whose latency you are not able to control (within facilities >> available in a cooperative multitasking context). The raises a couple of >> questions: to be well-behaved, simple control flow is desireable (i.e. not >> nested layers of yields, except perhaps for a pipeline case); and >> "read/write" control from memory space w/in the process (since external I/O >> is generally not for async) begs the question: what for? 
Eliminate >> globals, encapsulate and limit access as needed through usual programming >> methods. >> > > Before anyone takes this paragraph too seriously, there seem to be a bunch > of misunderstandings underlying this paragraph. > yes - thanks for the clarifications... I'm speaking from the perspective of an ECE, and thinking in the small-scale (embedded) of things like when in general is cooperative multitasking (very light-weight) more performant than pre-emptive... so from that space: > > - *All* blocking I/O is wrong in an async task, regardless of whether you > can control its latency. (The only safe way to do I/O is using a primitive > that works with `await`.) > > - There's nothing wrong with `yield` itself. (You shouldn't do I/O in a > generator used in an async task -- but that's just due to the general ban > on I/O.) > > - Using async tasks don't make globals more risky than regular code (in > fact they are safer here than in traditional multi-threaded code). > > - What on earth is "read/write" control from memory space w/in the process? > > -- > --Guido van Rossum (python.org/~guido) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dimaqq at gmail.com Mon Jun 26 04:43:41 2017 From: dimaqq at gmail.com (Dima Tisnek) Date: Mon, 26 Jun 2017 10:43:41 +0200 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: Chris, this led to an interesting discussion, which then went pretty far from the original concern. Perhaps you can share your use-case, both as pseudo-code and a link to real code. I'm specifically interested to see why/where you'd like to use a read-write async lock, to evaluate if this is something common or specific, and if, perhaps, some other paradigm (like queue, worker pool, ...) may be more useful in general case. I'm also curious if a full set of async sync primitives may one day lead to async monitors. Granted, simple use of async monitor is really a future/promise, but perhaps there are complex use cases in the UI/react domain with its promise/stream dichotomy. Cheers, d. On 25 June 2017 at 23:13, Chris Jerdonek wrote: > I'm relatively new to async programming in Python and am thinking > through possibilities for doing "read-write" synchronization. > > I'm using asyncio, and the synchronization primitives that asyncio > exposes are relatively simple [1]. Have options for async read-write > synchronization already been discussed in any detail? > > I'm interested in designs where "readers" don't need to acquire a lock > -- only writers. It seems like one way to deal with the main race > condition I see that comes up would be to use loop.time(). Does that > ring a bell, or might there be a much simpler way? > > Thanks, > --Chris > > > [1] https://docs.python.org/3/library/asyncio-sync.html > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ From chris.jerdonek at gmail.com Mon Jun 26 05:28:26 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Mon, 26 Jun 2017 02:28:26 -0700 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: On Mon, Jun 26, 2017 at 1:43 AM, Dima Tisnek wrote: > Perhaps you can share your use-case, both as pseudo-code and a link to > real code. 
> > I'm specifically interested to see why/where you'd like to use a > read-write async lock, to evaluate if this is something common or > specific, and if, perhaps, some other paradigm (like queue, worker > pool, ...) may be more useful in general case. > > I'm also curious if a full set of async sync primitives may one day > lead to async monitors. Granted, simple use of async monitor is really > a future/promise, but perhaps there are complex use cases in the > UI/react domain with its promise/stream dichotomy. Thank you, Dima. In my last email I shared pseudo-code for an approach to read-write synchronization that is independent of use case. [1] For the use case, my original purpose in mind was to synchronize many small file operations on disk like creating and removing directories that possibly share intermediate segments. The real code isn't public. But these would be operations like os.makedirs() and os.removedirs() that would be wrapped by loop.run_in_executor() to be non-blocking. The directory removal using os.removedirs() is the operation I thought should require exclusive access, so as not to interfere with directory creations in progress. Perhaps a simpler, dirtier approach would be not to synchronize at all and simply retry directory creations that fail until they succeed. That could be enough to handle rare cases where simultaneous creation and removal causes an error. You could view this an EAFP approach. Either way, I think the process of thinking through patterns for read-write synchronization is helpful for getting a better general feel and understanding of async. --Chris > > Cheers, > d. > > On 25 June 2017 at 23:13, Chris Jerdonek wrote: >> I'm relatively new to async programming in Python and am thinking >> through possibilities for doing "read-write" synchronization. >> >> I'm using asyncio, and the synchronization primitives that asyncio >> exposes are relatively simple [1]. Have options for async read-write >> synchronization already been discussed in any detail? >> >> I'm interested in designs where "readers" don't need to acquire a lock >> -- only writers. It seems like one way to deal with the main race >> condition I see that comes up would be to use loop.time(). Does that >> ring a bell, or might there be a much simpler way? >> >> Thanks, >> --Chris >> >> >> [1] https://docs.python.org/3/library/asyncio-sync.html >> _______________________________________________ >> Async-sig mailing list >> Async-sig at python.org >> https://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct: https://www.python.org/psf/codeofconduct/ From dimaqq at gmail.com Mon Jun 26 12:25:49 2017 From: dimaqq at gmail.com (Dima Tisnek) Date: Mon, 26 Jun 2017 18:25:49 +0200 Subject: [Async-sig] async generator confusion or bug? Message-ID: Hi group, I'm trying to cross-use an sync generator across several async functions. Is it allowed or a completely bad idea? (if so, why?) Here's MRE: import asyncio async def generator(): while True: x = yield print("received", x) await asyncio.sleep(0.1) async def user(name, g): print("sending", name) await g.asend(name) async def helper(): g = generator() await g.asend(None) await asyncio.gather(*[user(f"user-{x}", g) for x in range(3)]) if __name__ == "__main__": asyncio.get_event_loop().run_until_complete(helper()) And the output it produces when ran (py3.6.1): sending user-1 received user-1 sending user-2 sending user-0 received None received None Where are those None's coming from in the end? Where did "user-0" and "user-1" data go? 
Is this a bug, or am I hopelessly confused? Thanks! From dimaqq at gmail.com Mon Jun 26 13:02:17 2017 From: dimaqq at gmail.com (Dima Tisnek) Date: Mon, 26 Jun 2017 19:02:17 +0200 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: A little epiphany on my part: In threaded world, a lock (etc.) can be used for 2 distinct purposes: *1 synchronise [access to resource in the] library implementation, and *2 synchronise users of a library It's easy since taken lock has an owner (thread). Both library and user stack frames belong to either this thread or some other. In the async world, users are opaque to library implementation (technically own async threads). Therefore only use case #1 is valid. Moreover, it occurs to me that lock/unlock pair must be confined to same async function. Going beyond that restriction is bug-prone like crazy (even for me). Chris, coming back to your use-case. Do you want to synchronise side-effect creation/deletion for the sanity of side-effects only? Or do you imply that callers' actions are synchronised too? In other words, do your callers use those directories out of band? P.S./O.T. when it comes to directories, you probably want hierarchical locks rather than RW. On 26 June 2017 at 11:28, Chris Jerdonek wrote: > On Mon, Jun 26, 2017 at 1:43 AM, Dima Tisnek wrote: >> Perhaps you can share your use-case, both as pseudo-code and a link to >> real code. >> >> I'm specifically interested to see why/where you'd like to use a >> read-write async lock, to evaluate if this is something common or >> specific, and if, perhaps, some other paradigm (like queue, worker >> pool, ...) may be more useful in general case. >> >> I'm also curious if a full set of async sync primitives may one day >> lead to async monitors. Granted, simple use of async monitor is really >> a future/promise, but perhaps there are complex use cases in the >> UI/react domain with its promise/stream dichotomy. > > Thank you, Dima. In my last email I shared pseudo-code for an approach > to read-write synchronization that is independent of use case. [1] > > For the use case, my original purpose in mind was to synchronize many > small file operations on disk like creating and removing directories > that possibly share intermediate segments. The real code isn't public. > But these would be operations like os.makedirs() and os.removedirs() > that would be wrapped by loop.run_in_executor() to be non-blocking. > The directory removal using os.removedirs() is the operation I thought > should require exclusive access, so as not to interfere with directory > creations in progress. > > Perhaps a simpler, dirtier approach would be not to synchronize at all > and simply retry directory creations that fail until they succeed. > That could be enough to handle rare cases where simultaneous creation > and removal causes an error. You could view this an EAFP approach. > > Either way, I think the process of thinking through patterns for > read-write synchronization is helpful for getting a better general > feel and understanding of async. > > --Chris > > >> >> Cheers, >> d. >> >> On 25 June 2017 at 23:13, Chris Jerdonek wrote: >>> I'm relatively new to async programming in Python and am thinking >>> through possibilities for doing "read-write" synchronization. >>> >>> I'm using asyncio, and the synchronization primitives that asyncio >>> exposes are relatively simple [1]. Have options for async read-write >>> synchronization already been discussed in any detail? 
>>> >>> I'm interested in designs where "readers" don't need to acquire a lock >>> -- only writers. It seems like one way to deal with the main race >>> condition I see that comes up would be to use loop.time(). Does that >>> ring a bell, or might there be a much simpler way? >>> >>> Thanks, >>> --Chris >>> >>> >>> [1] https://docs.python.org/3/library/asyncio-sync.html >>> _______________________________________________ >>> Async-sig mailing list >>> Async-sig at python.org >>> https://mail.python.org/mailman/listinfo/async-sig >>> Code of Conduct: https://www.python.org/psf/codeofconduct/ From yselivanov at gmail.com Mon Jun 26 13:48:37 2017 From: yselivanov at gmail.com (Yury Selivanov) Date: Mon, 26 Jun 2017 13:48:37 -0400 Subject: [Async-sig] async generator confusion or bug? In-Reply-To: References: Message-ID: Hi Dima, > On Jun 26, 2017, at 12:25 PM, Dima Tisnek wrote: > > Hi group, > > I'm trying to cross-use an sync generator across several async functions. > Is it allowed or a completely bad idea? (if so, why?) It is allowed, but leads to complex code. > > Here's MRE: > > import asyncio > > > async def generator(): > while True: > x = yield > print("received", x) > await asyncio.sleep(0.1) > > > async def user(name, g): > print("sending", name) > await g.asend(name) > > > async def helper(): > g = generator() > await g.asend(None) > > await asyncio.gather(*[user(f"user-{x}", g) for x in range(3)]) > > > if __name__ == "__main__": > asyncio.get_event_loop().run_until_complete(helper()) > > > And the output it produces when ran (py3.6.1): > > sending user-1 > received user-1 > sending user-2 > sending user-0 > received None > received None > > > Where are those None's coming from in the end? > Where did "user-0" and "user-1" data go? Interesting. If I replace "gather" with three consecutive awaits of "asend", everything works as expected. So there's some weird interaction of asend/gather, or maybe you did find a bug. Need more time to investigate. Would you mind to open an issue on bugs.python? Thanks, Yury From andrew.svetlov at gmail.com Mon Jun 26 13:53:25 2017 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Mon, 26 Jun 2017 17:53:25 +0000 Subject: [Async-sig] async generator confusion or bug? In-Reply-To: References: Message-ID: IIRC gather collects coroutines in arbitrary order, maybe it's the source of misunderstanding? On Mon, Jun 26, 2017 at 8:48 PM Yury Selivanov wrote: > Hi Dima, > > > On Jun 26, 2017, at 12:25 PM, Dima Tisnek wrote: > > > > Hi group, > > > > I'm trying to cross-use an sync generator across several async functions. > > Is it allowed or a completely bad idea? (if so, why?) > > It is allowed, but leads to complex code. > > > > > Here's MRE: > > > > import asyncio > > > > > > async def generator(): > > while True: > > x = yield > > print("received", x) > > await asyncio.sleep(0.1) > > > > > > async def user(name, g): > > print("sending", name) > > await g.asend(name) > > > > > > async def helper(): > > g = generator() > > await g.asend(None) > > > > await asyncio.gather(*[user(f"user-{x}", g) for x in range(3)]) > > > > > > if __name__ == "__main__": > > asyncio.get_event_loop().run_until_complete(helper()) > > > > > > And the output it produces when ran (py3.6.1): > > > > sending user-1 > > received user-1 > > sending user-2 > > sending user-0 > > received None > > received None > > > > > > Where are those None's coming from in the end? > > Where did "user-0" and "user-1" data go? > > > Interesting. 
If I replace "gather" with three consecutive awaits of > "asend", everything works as expected. So there's some weird interaction > of asend/gather, or maybe you did find a bug. Need more time to > investigate. > > Would you mind to open an issue on bugs.python? > > Thanks, > Yury > > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -- Thanks, Andrew Svetlov -------------- next part -------------- An HTML attachment was scrubbed... URL: From yselivanov at gmail.com Mon Jun 26 13:55:03 2017 From: yselivanov at gmail.com (Yury Selivanov) Date: Mon, 26 Jun 2017 13:55:03 -0400 Subject: [Async-sig] async generator confusion or bug? In-Reply-To: References: Message-ID: <27F15750-EA96-4A24-8B21-83A9D6B71F7A@gmail.com> > On Jun 26, 2017, at 1:53 PM, Andrew Svetlov wrote: > > IIRC gather collects coroutines in arbitrary order, maybe it's the source of misunderstanding? Yes, but that does not explain "receiving None" messages. Let's move this discussion to the bug tracker. Yury From chris.jerdonek at gmail.com Mon Jun 26 14:21:47 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Mon, 26 Jun 2017 11:21:47 -0700 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: On Mon, Jun 26, 2017 at 10:02 AM, Dima Tisnek wrote: > Chris, coming back to your use-case. > Do you want to synchronise side-effect creation/deletion for the > sanity of side-effects only? > Or do you imply that callers' actions are synchronised too? > In other words, do your callers use those directories out of band? If I understand your question, the former. The callers aren't / need not be synchronized, and they aren't aware of the underlying synchronization happening inside the higher-level create() and delete() functions they would be using. (These are the two higher-level functions described in my pseudocode.) The synchronization is needed inside these create() and delete() functions since the low-level directory operations occur in different threads (because they are wrapped by run_in_executor()). --Chris > > > P.S./O.T. when it comes to directories, you probably want hierarchical > locks rather than RW. > > > On 26 June 2017 at 11:28, Chris Jerdonek wrote: >> On Mon, Jun 26, 2017 at 1:43 AM, Dima Tisnek wrote: >>> Perhaps you can share your use-case, both as pseudo-code and a link to >>> real code. >>> >>> I'm specifically interested to see why/where you'd like to use a >>> read-write async lock, to evaluate if this is something common or >>> specific, and if, perhaps, some other paradigm (like queue, worker >>> pool, ...) may be more useful in general case. >>> >>> I'm also curious if a full set of async sync primitives may one day >>> lead to async monitors. Granted, simple use of async monitor is really >>> a future/promise, but perhaps there are complex use cases in the >>> UI/react domain with its promise/stream dichotomy. >> >> Thank you, Dima. In my last email I shared pseudo-code for an approach >> to read-write synchronization that is independent of use case. [1] >> >> For the use case, my original purpose in mind was to synchronize many >> small file operations on disk like creating and removing directories >> that possibly share intermediate segments. The real code isn't public. 
>> But these would be operations like os.makedirs() and os.removedirs() >> that would be wrapped by loop.run_in_executor() to be non-blocking. >> The directory removal using os.removedirs() is the operation I thought >> should require exclusive access, so as not to interfere with directory >> creations in progress. >> >> Perhaps a simpler, dirtier approach would be not to synchronize at all >> and simply retry directory creations that fail until they succeed. >> That could be enough to handle rare cases where simultaneous creation >> and removal causes an error. You could view this an EAFP approach. >> >> Either way, I think the process of thinking through patterns for >> read-write synchronization is helpful for getting a better general >> feel and understanding of async. >> >> --Chris >> >> >>> >>> Cheers, >>> d. >>> >>> On 25 June 2017 at 23:13, Chris Jerdonek wrote: >>>> I'm relatively new to async programming in Python and am thinking >>>> through possibilities for doing "read-write" synchronization. >>>> >>>> I'm using asyncio, and the synchronization primitives that asyncio >>>> exposes are relatively simple [1]. Have options for async read-write >>>> synchronization already been discussed in any detail? >>>> >>>> I'm interested in designs where "readers" don't need to acquire a lock >>>> -- only writers. It seems like one way to deal with the main race >>>> condition I see that comes up would be to use loop.time(). Does that >>>> ring a bell, or might there be a much simpler way? >>>> >>>> Thanks, >>>> --Chris >>>> >>>> >>>> [1] https://docs.python.org/3/library/asyncio-sync.html >>>> _______________________________________________ >>>> Async-sig mailing list >>>> Async-sig at python.org >>>> https://mail.python.org/mailman/listinfo/async-sig >>>> Code of Conduct: https://www.python.org/psf/codeofconduct/ From dimaqq at gmail.com Mon Jun 26 14:56:39 2017 From: dimaqq at gmail.com (Dima Tisnek) Date: Mon, 26 Jun 2017 20:56:39 +0200 Subject: [Async-sig] async generator confusion or bug? In-Reply-To: <27F15750-EA96-4A24-8B21-83A9D6B71F7A@gmail.com> References: <27F15750-EA96-4A24-8B21-83A9D6B71F7A@gmail.com> Message-ID: Thanks Yuri for quick reply. http://bugs.python.org/issue30773 created :) On 26 June 2017 at 19:55, Yury Selivanov wrote: > >> On Jun 26, 2017, at 1:53 PM, Andrew Svetlov wrote: >> >> IIRC gather collects coroutines in arbitrary order, maybe it's the source of misunderstanding? > > Yes, but that does not explain "receiving None" messages. Let's move this discussion to the bug tracker. 
> > Yury > From dimaqq at gmail.com Mon Jun 26 15:37:19 2017 From: dimaqq at gmail.com (Dima Tisnek) Date: Mon, 26 Jun 2017 21:37:19 +0200 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: Chris, here's a simple RWLock implementation and analysis: ``` import asyncio class RWLock: def __init__(self): self.cond = asyncio.Condition() self.readers = 0 self.writer = False async def lock(self, write=False): async with self.cond: # write requested: there cannot be readers or writers # read requested: there can be other readers but not writers while self.readers and write or self.writer: self.cond.wait() if write: self.writer = True else: self.readers += 1 # self.cond.notifyAll() would be good taste # however no waiters can be unblocked by this state change async def unlock(self, write=False): async with self.cond: if write: self.writer = False else: self.readers -= 1 self.cond.notifyAll() # notify (one) could be used `if not write:` ``` Note that `.unlock` cannot validate that it's called by same coroutine as `.lock` was. That's because there's no concept for "current_thread" for coroutines -- there can be many waiting on each other in the stack. Obv., this code could be nicer: * separate context managers for read and write cases * .unlock can be automatic (if self.writer: unlock_for_write()) at the cost of opening doors wide open to bugs * policy can be introduced if `.lock` identified itself (by an object(), since there's no thread id) in shared state * notifyAll() makes real life use O(N^2) for N being number of simultaneous write lock requests Feel free to use it :) On 26 June 2017 at 20:21, Chris Jerdonek wrote: > On Mon, Jun 26, 2017 at 10:02 AM, Dima Tisnek wrote: >> Chris, coming back to your use-case. >> Do you want to synchronise side-effect creation/deletion for the >> sanity of side-effects only? >> Or do you imply that callers' actions are synchronised too? >> In other words, do your callers use those directories out of band? > > If I understand your question, the former. The callers aren't / need > not be synchronized, and they aren't aware of the underlying > synchronization happening inside the higher-level create() and > delete() functions they would be using. (These are the two > higher-level functions described in my pseudocode.) > > The synchronization is needed inside these create() and delete() > functions since the low-level directory operations occur in different > threads (because they are wrapped by run_in_executor()). > > --Chris > >> >> >> P.S./O.T. when it comes to directories, you probably want hierarchical >> locks rather than RW. >> >> >> On 26 June 2017 at 11:28, Chris Jerdonek wrote: >>> On Mon, Jun 26, 2017 at 1:43 AM, Dima Tisnek wrote: >>>> Perhaps you can share your use-case, both as pseudo-code and a link to >>>> real code. >>>> >>>> I'm specifically interested to see why/where you'd like to use a >>>> read-write async lock, to evaluate if this is something common or >>>> specific, and if, perhaps, some other paradigm (like queue, worker >>>> pool, ...) may be more useful in general case. >>>> >>>> I'm also curious if a full set of async sync primitives may one day >>>> lead to async monitors. Granted, simple use of async monitor is really >>>> a future/promise, but perhaps there are complex use cases in the >>>> UI/react domain with its promise/stream dichotomy. >>> >>> Thank you, Dima. In my last email I shared pseudo-code for an approach >>> to read-write synchronization that is independent of use case. 
[1] >>> >>> For the use case, my original purpose in mind was to synchronize many >>> small file operations on disk like creating and removing directories >>> that possibly share intermediate segments. The real code isn't public. >>> But these would be operations like os.makedirs() and os.removedirs() >>> that would be wrapped by loop.run_in_executor() to be non-blocking. >>> The directory removal using os.removedirs() is the operation I thought >>> should require exclusive access, so as not to interfere with directory >>> creations in progress. >>> >>> Perhaps a simpler, dirtier approach would be not to synchronize at all >>> and simply retry directory creations that fail until they succeed. >>> That could be enough to handle rare cases where simultaneous creation >>> and removal causes an error. You could view this an EAFP approach. >>> >>> Either way, I think the process of thinking through patterns for >>> read-write synchronization is helpful for getting a better general >>> feel and understanding of async. >>> >>> --Chris >>> >>> >>>> >>>> Cheers, >>>> d. >>>> >>>> On 25 June 2017 at 23:13, Chris Jerdonek wrote: >>>>> I'm relatively new to async programming in Python and am thinking >>>>> through possibilities for doing "read-write" synchronization. >>>>> >>>>> I'm using asyncio, and the synchronization primitives that asyncio >>>>> exposes are relatively simple [1]. Have options for async read-write >>>>> synchronization already been discussed in any detail? >>>>> >>>>> I'm interested in designs where "readers" don't need to acquire a lock >>>>> -- only writers. It seems like one way to deal with the main race >>>>> condition I see that comes up would be to use loop.time(). Does that >>>>> ring a bell, or might there be a much simpler way? >>>>> >>>>> Thanks, >>>>> --Chris >>>>> >>>>> >>>>> [1] https://docs.python.org/3/library/asyncio-sync.html >>>>> _______________________________________________ >>>>> Async-sig mailing list >>>>> Async-sig at python.org >>>>> https://mail.python.org/mailman/listinfo/async-sig >>>>> Code of Conduct: https://www.python.org/psf/codeofconduct/ From dimaqq at gmail.com Mon Jun 26 15:38:40 2017 From: dimaqq at gmail.com (Dima Tisnek) Date: Mon, 26 Jun 2017 21:38:40 +0200 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: - self.cond.wait() + await self.cond.wait() I've no tests for this :P On 26 June 2017 at 21:37, Dima Tisnek wrote: > Chris, here's a simple RWLock implementation and analysis: > > ``` > import asyncio > > > class RWLock: > def __init__(self): > self.cond = asyncio.Condition() > self.readers = 0 > self.writer = False > > async def lock(self, write=False): > async with self.cond: > # write requested: there cannot be readers or writers > # read requested: there can be other readers but not writers > while self.readers and write or self.writer: > self.cond.wait() > if write: self.writer = True > else: self.readers += 1 > # self.cond.notifyAll() would be good taste > # however no waiters can be unblocked by this state change > > async def unlock(self, write=False): > async with self.cond: > if write: self.writer = False > else: self.readers -= 1 > self.cond.notifyAll() # notify (one) could be used `if not write:` > ``` > > Note that `.unlock` cannot validate that it's called by same coroutine > as `.lock` was. > That's because there's no concept for "current_thread" for coroutines > -- there can be many waiting on each other in the stack. 
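For illustration, usage of the RWLock quoted above would look roughly like
this (assuming the class is in scope and the `await self.cond.wait()` fix
from this message is applied; an untested sketch, not part of asyncio):

```
rwlock = RWLock()   # the class quoted above

async def reader():
    await rwlock.lock()              # write=False by default: shared/read acquisition
    try:
        ...                          # read shared state
    finally:
        await rwlock.unlock()        # release from the same coroutine, by convention

async def writer():
    await rwlock.lock(write=True)    # exclusive acquisition
    try:
        ...                          # mutate shared state
    finally:
        await rwlock.unlock(write=True)
```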
> > Obv., this code could be nicer: > * separate context managers for read and write cases > * .unlock can be automatic (if self.writer: unlock_for_write()) at the > cost of opening doors wide open to bugs > * policy can be introduced if `.lock` identified itself (by an > object(), since there's no thread id) in shared state > * notifyAll() makes real life use O(N^2) for N being number of > simultaneous write lock requests > > Feel free to use it :) > > > > On 26 June 2017 at 20:21, Chris Jerdonek wrote: >> On Mon, Jun 26, 2017 at 10:02 AM, Dima Tisnek wrote: >>> Chris, coming back to your use-case. >>> Do you want to synchronise side-effect creation/deletion for the >>> sanity of side-effects only? >>> Or do you imply that callers' actions are synchronised too? >>> In other words, do your callers use those directories out of band? >> >> If I understand your question, the former. The callers aren't / need >> not be synchronized, and they aren't aware of the underlying >> synchronization happening inside the higher-level create() and >> delete() functions they would be using. (These are the two >> higher-level functions described in my pseudocode.) >> >> The synchronization is needed inside these create() and delete() >> functions since the low-level directory operations occur in different >> threads (because they are wrapped by run_in_executor()). >> >> --Chris >> >>> >>> >>> P.S./O.T. when it comes to directories, you probably want hierarchical >>> locks rather than RW. >>> >>> >>> On 26 June 2017 at 11:28, Chris Jerdonek wrote: >>>> On Mon, Jun 26, 2017 at 1:43 AM, Dima Tisnek wrote: >>>>> Perhaps you can share your use-case, both as pseudo-code and a link to >>>>> real code. >>>>> >>>>> I'm specifically interested to see why/where you'd like to use a >>>>> read-write async lock, to evaluate if this is something common or >>>>> specific, and if, perhaps, some other paradigm (like queue, worker >>>>> pool, ...) may be more useful in general case. >>>>> >>>>> I'm also curious if a full set of async sync primitives may one day >>>>> lead to async monitors. Granted, simple use of async monitor is really >>>>> a future/promise, but perhaps there are complex use cases in the >>>>> UI/react domain with its promise/stream dichotomy. >>>> >>>> Thank you, Dima. In my last email I shared pseudo-code for an approach >>>> to read-write synchronization that is independent of use case. [1] >>>> >>>> For the use case, my original purpose in mind was to synchronize many >>>> small file operations on disk like creating and removing directories >>>> that possibly share intermediate segments. The real code isn't public. >>>> But these would be operations like os.makedirs() and os.removedirs() >>>> that would be wrapped by loop.run_in_executor() to be non-blocking. >>>> The directory removal using os.removedirs() is the operation I thought >>>> should require exclusive access, so as not to interfere with directory >>>> creations in progress. >>>> >>>> Perhaps a simpler, dirtier approach would be not to synchronize at all >>>> and simply retry directory creations that fail until they succeed. >>>> That could be enough to handle rare cases where simultaneous creation >>>> and removal causes an error. You could view this an EAFP approach. >>>> >>>> Either way, I think the process of thinking through patterns for >>>> read-write synchronization is helpful for getting a better general >>>> feel and understanding of async. >>>> >>>> --Chris >>>> >>>> >>>>> >>>>> Cheers, >>>>> d. 
>>>>> >>>>> On 25 June 2017 at 23:13, Chris Jerdonek wrote: >>>>>> I'm relatively new to async programming in Python and am thinking >>>>>> through possibilities for doing "read-write" synchronization. >>>>>> >>>>>> I'm using asyncio, and the synchronization primitives that asyncio >>>>>> exposes are relatively simple [1]. Have options for async read-write >>>>>> synchronization already been discussed in any detail? >>>>>> >>>>>> I'm interested in designs where "readers" don't need to acquire a lock >>>>>> -- only writers. It seems like one way to deal with the main race >>>>>> condition I see that comes up would be to use loop.time(). Does that >>>>>> ring a bell, or might there be a much simpler way? >>>>>> >>>>>> Thanks, >>>>>> --Chris >>>>>> >>>>>> >>>>>> [1] https://docs.python.org/3/library/asyncio-sync.html >>>>>> _______________________________________________ >>>>>> Async-sig mailing list >>>>>> Async-sig at python.org >>>>>> https://mail.python.org/mailman/listinfo/async-sig >>>>>> Code of Conduct: https://www.python.org/psf/codeofconduct/ From njs at pobox.com Mon Jun 26 18:22:47 2017 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 26 Jun 2017 15:22:47 -0700 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: On Mon, Jun 26, 2017 at 12:37 PM, Dima Tisnek wrote: > Note that `.unlock` cannot validate that it's called by same coroutine > as `.lock` was. > That's because there's no concept for "current_thread" for coroutines > -- there can be many waiting on each other in the stack. This is also a surprisingly complex design question. Your async RWLock actually matches how Python's threading.Lock works: you're explicitly allowed to acquire it in one thread and then release it from another. People sometimes find this surprising, and it prevents some kinds of error-checking. For example, this code *probably* deadlocks: lock = threading.Lock() lock.acquire() # probably deadlocks lock.acquire() but the interpreter can't detect this and raise an error, because in theory some other thread might come along and call lock.release(). On the other hand, it is sometimes useful to be able to acquire a lock in one thread and then "hand it off" to e.g. a child thread. (Reentrant locks, OTOH, do have an implicit concept of ownership -- they kind of have to, if you think about it -- so even if you don't need reentrancy they can be useful because they'll raise a noisy error if you accidentally try to release a lock from the wrong thread.) In trio we do have a current_task() concept, and the basic trio.Lock [1] does track ownership, and I even have a Semaphore-equivalent that tracks ownership as well [2]. The motivation here is that I want to provide nice debugging tools to detect things like deadlocks, which is only possible when your primitives have some kind of ownership tracking. So far this just means that we detect and error on these kinds of simple cases: lock = trio.Lock() await lock.acquire() # raises an error await lock.acquire() But I have ambitions to do more [3] :-). However, this raises some tricky design questions around how and whether to support the "passing ownership" cases. Of course you can always fall back on something like a raw Semaphore, but it turns out that trio.run_in_worker_thread (our equivalent of asyncio's run_in_executor) actually wants to do something like pass ownership from the calling task into the spawned thread. 
So far I've handled this by adding acquire_on_behalf_of/release_on_behalf_of methods to the primitive that run_in_worker_thread uses, but this isn't really fully baked yet. -n [1] https://trio.readthedocs.io/en/latest/reference-core.html#trio.Lock [2] https://trio.readthedocs.io/en/latest/reference-core.html#trio.CapacityLimiter [3] https://github.com/python-trio/trio/issues/182 -- Nathaniel J. Smith -- https://vorpus.org From chris.jerdonek at gmail.com Mon Jun 26 21:41:16 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Mon, 26 Jun 2017 18:41:16 -0700 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: On Mon, Jun 26, 2017 at 12:37 PM, Dima Tisnek wrote: > Chris, here's a simple RWLock implementation and analysis: > ... > Obv., this code could be nicer: > * separate context managers for read and write cases > * .unlock can be automatic (if self.writer: unlock_for_write()) at the > cost of opening doors wide open to bugs > * policy can be introduced if `.lock` identified itself (by an > object(), since there's no thread id) in shared state > * notifyAll() makes real life use O(N^2) for N being number of > simultaneous write lock requests > > Feel free to use it :) Thanks, Dima. However, as I said in my earlier posts, I'm actually more interested in exploring approaches to synchronizing readers and writers in async code that don't require locking on reads. (This is also why I've always been saying RW "synchronization" instead of RW "locking.") I'm interested in this because I think the single-threadedness of the event loop might be what makes this simplification possible over the traditional multi-threaded approach (along the lines Guido was mentioning). It also makes the "fast path" faster. Lastly, the API for the callers is just to call read() or write(), so there is no need for a general RWLock construct or to work through RWLock semantics of the sort Nathaniel mentioned. I coded up a working version of the pseudo-code I included in an earlier email so people can see how it works. I included it at the bottom of this email and also in this gist: https://gist.github.com/cjerdonek/858e1467f768ee045849ea81ddb47901 --Chris import asyncio import random NO_READERS_EVENT = asyncio.Event() NO_WRITERS_EVENT = asyncio.Event() WRITE_LOCK = asyncio.Lock() class State: reader_count = 0 mock_file_data = 'initial' async def read_file(): data = State.mock_file_data print(f'read: {data}') async def write_file(data): print(f'writing: {data}') State.mock_file_data = data await asyncio.sleep(0.5) async def write(data): async with WRITE_LOCK: NO_WRITERS_EVENT.clear() # Wait for the readers to finish. await NO_READERS_EVENT.wait() # Do the file write. await write_file(data) # Awaken waiting readers. NO_WRITERS_EVENT.set() async def read(): while True: await NO_WRITERS_EVENT.wait() # Check the writer_lock again in case a new writer has # started writing. if WRITE_LOCK.locked(): print(f'cannot read: still writing: {State.mock_file_data!r}') else: # Otherwise, we can do the read. break State.reader_count += 1 if State.reader_count == 1: NO_READERS_EVENT.clear() # Do the file read. await read_file() State.reader_count -= 1 if State.reader_count == 0: # Awaken any waiting writer. 
NO_READERS_EVENT.set() async def delayed(coro): await asyncio.sleep(random.random()) await coro async def test_synchronization(): NO_READERS_EVENT.set() NO_WRITERS_EVENT.set() coros = [ read(), read(), read(), read(), read(), read(), write('apple'), write('banana'), ] # Add a delay before each coroutine for variety. coros = [delayed(coro) for coro in coros] await asyncio.gather(*coros) if __name__ == '__main__': asyncio.get_event_loop().run_until_complete(test_synchronization()) # Sample output: # # read: initial # read: initial # read: initial # read: initial # writing: banana # writing: apple # cannot read: still writing: 'apple' # cannot read: still writing: 'apple' # read: apple # read: apple From yselivanov at gmail.com Mon Jun 26 21:54:50 2017 From: yselivanov at gmail.com (Yury Selivanov) Date: Mon, 26 Jun 2017 21:54:50 -0400 Subject: [Async-sig] async generator confusion or bug? In-Reply-To: References: Message-ID: (Posting here, rather than to the issue, because I think this actually needs more exposure). I looked at the code (genobject.c) and I think I know what's going on here. Normally, when you work with an asynchronous generator (AG) you interact with it through "asend" or "athrow" *coroutines*. Each AG has its own private state, and when you await on "asend" coroutine you are changing that state. The state changes on each "asend.send" or "asend.throw" call. The normal relation between AGs and asends is 1 to 1. AG - asend However, in your example you change that to 1 to many: asend / AG - asend \ asend Both 'ensure_future' and 'gather' will wrap each asend coroutine into an 'asyncio.Task'. And each Task will call "asend.send(None)" right in its '__init__', which changes the underlying *shared* AG instance completely out of order. I don't see how this can be fixed (or that it even needs to be fixed), so I propose to simply raise an exception if an AG has more than one asends changing it state *at the same time*. Thoughts? Yury > On Jun 26, 2017, at 12:25 PM, Dima Tisnek wrote: > > Hi group, > > I'm trying to cross-use an sync generator across several async functions. > Is it allowed or a completely bad idea? (if so, why?) > > Here's MRE: > > import asyncio > > > async def generator(): > while True: > x = yield > print("received", x) > await asyncio.sleep(0.1) > > > async def user(name, g): > print("sending", name) > await g.asend(name) > > > async def helper(): > g = generator() > await g.asend(None) > > await asyncio.gather(*[user(f"user-{x}", g) for x in range(3)]) > > > if __name__ == "__main__": > asyncio.get_event_loop().run_until_complete(helper()) > > > And the output it produces when ran (py3.6.1): > > sending user-1 > received user-1 > sending user-2 > sending user-0 > received None > received None > > > Where are those None's coming from in the end? > Where did "user-0" and "user-1" data go? > > Is this a bug, or am I hopelessly confused? > Thanks! > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ From guido at python.org Mon Jun 26 22:46:53 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 26 Jun 2017 19:46:53 -0700 Subject: [Async-sig] async generator confusion or bug? In-Reply-To: References: Message-ID: It does look complicated. 
The crux of the problem seems to be that helper() is essentially async def helper(): g = generator() await g.asend(None) await asyncio.gather(user("user-0", g), user("user-1", g), user("user-2", g)) which means that it is attempting to wait for three calls to user() concurrently. This feels to me similar to three threads attempting to call next() or send() on the same generator in parallel, which AFAIR is explicitly guarded against somewhere. So detecting disallowing a similar situation for async generators makes sense (and can be considered a bugfix). On Mon, Jun 26, 2017 at 6:54 PM, Yury Selivanov wrote: > (Posting here, rather than to the issue, because I think this actually > needs more exposure). > > I looked at the code (genobject.c) and I think I know what's going on > here. Normally, when you work with an asynchronous generator (AG) you > interact with it through "asend" or "athrow" *coroutines*. > > Each AG has its own private state, and when you await on "asend" coroutine > you are changing that state. The state changes on each "asend.send" or > "asend.throw" call. The normal relation between AGs and asends is 1 to 1. > > AG - asend > > However, in your example you change that to 1 to many: > > asend > / > AG - asend > \ > asend > > Both 'ensure_future' and 'gather' will wrap each asend coroutine into an > 'asyncio.Task'. And each Task will call "asend.send(None)" right in its > '__init__', which changes the underlying *shared* AG instance completely > out of order. > > I don't see how this can be fixed (or that it even needs to be fixed), so > I propose to simply raise an exception if an AG has more than one asends > changing it state *at the same time*. > > Thoughts? > > Yury > > > On Jun 26, 2017, at 12:25 PM, Dima Tisnek wrote: > > > > Hi group, > > > > I'm trying to cross-use an sync generator across several async functions. > > Is it allowed or a completely bad idea? (if so, why?) > > > > Here's MRE: > > > > import asyncio > > > > > > async def generator(): > > while True: > > x = yield > > print("received", x) > > await asyncio.sleep(0.1) > > > > > > async def user(name, g): > > print("sending", name) > > await g.asend(name) > > > > > > async def helper(): > > g = generator() > > await g.asend(None) > > > > await asyncio.gather(*[user(f"user-{x}", g) for x in range(3)]) > > > > > > if __name__ == "__main__": > > asyncio.get_event_loop().run_until_complete(helper()) > > > > > > And the output it produces when ran (py3.6.1): > > > > sending user-1 > > received user-1 > > sending user-2 > > sending user-0 > > received None > > received None > > > > > > Where are those None's coming from in the end? > > Where did "user-0" and "user-1" data go? > > > > Is this a bug, or am I hopelessly confused? > > Thanks! > > _______________________________________________ > > Async-sig mailing list > > Async-sig at python.org > > https://mail.python.org/mailman/listinfo/async-sig > > Code of Conduct: https://www.python.org/psf/codeofconduct/ > > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Jun 26 23:19:18 2017 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 26 Jun 2017 20:19:18 -0700 Subject: [Async-sig] async generator confusion or bug? 
In-Reply-To: References: Message-ID: I actually thought that async generators already guarded against this using their ag_running attribute. If I try running Dima's example with async_generator, I get: sending user-1 received user-1 sending user-2 sending user-0 Traceback (most recent call last): [...] ValueError: async generator already executing The relevant code is here: https://github.com/njsmith/async_generator/blob/e303e077c9dcb5880c0ce9930d560b282f8288ec/async_generator/impl.py#L273-L279 But I added this in the first place because I thought it was needed for compatibility with native async generators :-) -n On Jun 26, 2017 6:54 PM, "Yury Selivanov" wrote: > (Posting here, rather than to the issue, because I think this actually > needs more exposure). > > I looked at the code (genobject.c) and I think I know what's going on > here. Normally, when you work with an asynchronous generator (AG) you > interact with it through "asend" or "athrow" *coroutines*. > > Each AG has its own private state, and when you await on "asend" coroutine > you are changing that state. The state changes on each "asend.send" or > "asend.throw" call. The normal relation between AGs and asends is 1 to 1. > > AG - asend > > However, in your example you change that to 1 to many: > > asend > / > AG - asend > \ > asend > > Both 'ensure_future' and 'gather' will wrap each asend coroutine into an > 'asyncio.Task'. And each Task will call "asend.send(None)" right in its > '__init__', which changes the underlying *shared* AG instance completely > out of order. > > I don't see how this can be fixed (or that it even needs to be fixed), so > I propose to simply raise an exception if an AG has more than one asends > changing it state *at the same time*. > > Thoughts? > > Yury > > > On Jun 26, 2017, at 12:25 PM, Dima Tisnek wrote: > > > > Hi group, > > > > I'm trying to cross-use an sync generator across several async functions. > > Is it allowed or a completely bad idea? (if so, why?) > > > > Here's MRE: > > > > import asyncio > > > > > > async def generator(): > > while True: > > x = yield > > print("received", x) > > await asyncio.sleep(0.1) > > > > > > async def user(name, g): > > print("sending", name) > > await g.asend(name) > > > > > > async def helper(): > > g = generator() > > await g.asend(None) > > > > await asyncio.gather(*[user(f"user-{x}", g) for x in range(3)]) > > > > > > if __name__ == "__main__": > > asyncio.get_event_loop().run_until_complete(helper()) > > > > > > And the output it produces when ran (py3.6.1): > > > > sending user-1 > > received user-1 > > sending user-2 > > sending user-0 > > received None > > received None > > > > > > Where are those None's coming from in the end? > > Where did "user-0" and "user-1" data go? > > > > Is this a bug, or am I hopelessly confused? > > Thanks! > > _______________________________________________ > > Async-sig mailing list > > Async-sig at python.org > > https://mail.python.org/mailman/listinfo/async-sig > > Code of Conduct: https://www.python.org/psf/codeofconduct/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yselivanov at gmail.com Tue Jun 27 00:13:05 2017 From: yselivanov at gmail.com (Yury Selivanov) Date: Tue, 27 Jun 2017 00:13:05 -0400 Subject: [Async-sig] async generator confusion or bug? In-Reply-To: References: Message-ID: <5DD12A9F-18FF-4F15-A373-79FB3341D04F@gmail.com> Thanks Guido and Nathaniel. I'll work on a fix. 
Yury > On Jun 26, 2017, at 11:19 PM, Nathaniel Smith wrote: > > I actually thought that async generators already guarded against this using their ag_running attribute. If I try running Dima's example with async_generator, I get: > > sending user-1 > received user-1 > sending user-2 > sending user-0 > Traceback (most recent call last): > [...] > ValueError: async generator already executing > > The relevant code is here: > https://github.com/njsmith/async_generator/blob/e303e077c9dcb5880c0ce9930d560b282f8288ec/async_generator/impl.py#L273-L279 > > But I added this in the first place because I thought it was needed for compatibility with native async generators :-) > > -n > > On Jun 26, 2017 6:54 PM, "Yury Selivanov" wrote: > (Posting here, rather than to the issue, because I think this actually needs more exposure). > > I looked at the code (genobject.c) and I think I know what's going on here. Normally, when you work with an asynchronous generator (AG) you interact with it through "asend" or "athrow" *coroutines*. > > Each AG has its own private state, and when you await on "asend" coroutine you are changing that state. The state changes on each "asend.send" or "asend.throw" call. The normal relation between AGs and asends is 1 to 1. > > AG - asend > > However, in your example you change that to 1 to many: > > asend > / > AG - asend > \ > asend > > Both 'ensure_future' and 'gather' will wrap each asend coroutine into an 'asyncio.Task'. And each Task will call "asend.send(None)" right in its '__init__', which changes the underlying *shared* AG instance completely out of order. > > I don't see how this can be fixed (or that it even needs to be fixed), so I propose to simply raise an exception if an AG has more than one asends changing it state *at the same time*. > > Thoughts? > > Yury > > > On Jun 26, 2017, at 12:25 PM, Dima Tisnek wrote: > > > > Hi group, > > > > I'm trying to cross-use an sync generator across several async functions. > > Is it allowed or a completely bad idea? (if so, why?) > > > > Here's MRE: > > > > import asyncio > > > > > > async def generator(): > > while True: > > x = yield > > print("received", x) > > await asyncio.sleep(0.1) > > > > > > async def user(name, g): > > print("sending", name) > > await g.asend(name) > > > > > > async def helper(): > > g = generator() > > await g.asend(None) > > > > await asyncio.gather(*[user(f"user-{x}", g) for x in range(3)]) > > > > > > if __name__ == "__main__": > > asyncio.get_event_loop().run_until_complete(helper()) > > > > > > And the output it produces when ran (py3.6.1): > > > > sending user-1 > > received user-1 > > sending user-2 > > sending user-0 > > received None > > received None > > > > > > Where are those None's coming from in the end? > > Where did "user-0" and "user-1" data go? > > > > Is this a bug, or am I hopelessly confused? > > Thanks! > > _______________________________________________ > > Async-sig mailing list > > Async-sig at python.org > > https://mail.python.org/mailman/listinfo/async-sig > > Code of Conduct: https://www.python.org/psf/codeofconduct/ > From chris.jerdonek at gmail.com Tue Jun 27 03:29:10 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Tue, 27 Jun 2017 00:29:10 -0700 Subject: [Async-sig] question re: asyncio.Condition lock acquisition order Message-ID: I have a couple questions about asyncio's synchronization primitives. Say a coroutine acquires an asyncio Condition's underlying lock, calls notify() (or notify_all()), and then releases the lock. 
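Concretely, the scenario is something like this (a minimal sketch):

```
import asyncio

cond = asyncio.Condition()

async def notifier():
    async with cond:          # acquires the Condition's underlying lock
        cond.notify_all()     # wakes coroutines blocked in cond.wait()
    # the underlying lock is released here

async def waits_on_condition():
    async with cond:
        await cond.wait()     # a coroutine waiting on the Condition itself

async def waits_on_lock():
    async with cond:          # a coroutine waiting to acquire the underlying lock
        ...
```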
In terms of which coroutines will acquire the lock next, is any preference given between (1) coroutines waiting to acquire the underlying lock, and (2) coroutines waiting on the Condition object itself? The documentation doesn't seem to say anything about this. Also, more generally (and I'm sure this question gets asked a lot), does asyncio provide any guarantees about the order in which awaiting coroutines are awakened? For example, for synchronization primitives, does each primitive maintain a FIFO queue of who will be awakened next, or are there no guarantees about the order? Thanks a lot, --Chris From andrew.svetlov at gmail.com Tue Jun 27 03:48:58 2017 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Tue, 27 Jun 2017 07:48:58 +0000 Subject: [Async-sig] question re: asyncio.Condition lock acquisition order In-Reply-To: References: Message-ID: AFAIK No any guarantee On Tue, Jun 27, 2017 at 10:29 AM Chris Jerdonek wrote: > I have a couple questions about asyncio's synchronization primitives. > > Say a coroutine acquires an asyncio Condition's underlying lock, calls > notify() (or notify_all()), and then releases the lock. In terms of > which coroutines will acquire the lock next, is any preference given > between (1) coroutines waiting to acquire the underlying lock, and (2) > coroutines waiting on the Condition object itself? The documentation > doesn't seem to say anything about this. > > Also, more generally (and I'm sure this question gets asked a lot), > does asyncio provide any guarantees about the order in which awaiting > coroutines are awakened? For example, for synchronization primitives, > does each primitive maintain a FIFO queue of who will be awakened > next, or are there no guarantees about the order? > > Thanks a lot, > --Chris > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -- Thanks, Andrew Svetlov -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Jun 27 06:29:23 2017 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 27 Jun 2017 03:29:23 -0700 Subject: [Async-sig] question re: asyncio.Condition lock acquisition order In-Reply-To: References: Message-ID: On Tue, Jun 27, 2017 at 12:29 AM, Chris Jerdonek wrote: > I have a couple questions about asyncio's synchronization primitives. > > Say a coroutine acquires an asyncio Condition's underlying lock, calls > notify() (or notify_all()), and then releases the lock. In terms of > which coroutines will acquire the lock next, is any preference given > between (1) coroutines waiting to acquire the underlying lock, and (2) > coroutines waiting on the Condition object itself? The documentation > doesn't seem to say anything about this. > > Also, more generally (and I'm sure this question gets asked a lot), > does asyncio provide any guarantees about the order in which awaiting > coroutines are awakened? For example, for synchronization primitives, > does each primitive maintain a FIFO queue of who will be awakened > next, or are there no guarantees about the order? In fact asyncio.Lock's implementation is careful to maintain strict FIFO fairness, i.e. whoever calls acquire() first is guaranteed to get the lock first. Whether this is something you feel you can depend on I'll leave to your conscience :-). 
Though the docs do say "only one coroutine proceeds when a release() call resets the state to unlocked; first coroutine which is blocked in acquire() is being processed", which I think might be intended to say that they're FIFO-fair? asyncio.Condition internally maintains a FIFO list so that notify(1) is guaranteed to wake up the task that called wait() first. But if you notify multiple tasks at once, then I don't think there's any guarantee that they'll get the lock in FIFO order -- basically notify{,_all} just wakes them up, and then the next time they run they try to call lock.acquire(), so it depends on the underlying scheduler to decide who gets to run first. There's also an edge condition where if a task blocked in wait() gets cancelled, then... well, it's complicated. If notify has not been called yet, then it wakes up, reacquires the lock, and then raises CancelledError. If it's already been notified and is waiting to acquire the lock, then I think it goes to the back of the line of tasks waiting for the lock, but otherwise swallows the CancelledError. And then returns None, which is not a documented return value. In case it's interesting for comparison -- hopefully these comments aren't getting annoying -- trio does provide documented fairness guarantees for all its synchronization primitives: https://trio.readthedocs.io/en/latest/reference-core.html#fairness There's some question about whether this is a great idea or what the best definition of "fairness" is, so it also provides trio.StrictFIFOLock for cases where FIFO fairness is actually a requirement for correctness and you want to document this in the code: https://trio.readthedocs.io/en/latest/reference-core.html#trio.StrictFIFOLock And trio.Condition.notify moves tasks from the Condition wait queue directly to the Lock wait queue while preserving FIFO order. (The trade-off is that this means that trio.Condition can only be used with trio.Lock exactly, while asyncio.Condition works with any object that provides the asyncio.Lock interface.) Also, it has a similar edge case around cancellation, because cancellation and condition variables are very tricky :-). Though I guess trio's version arguably a little less quirky because it acts the same regardless of whether it's in the wait-for-notify or wait-for-lock phase, it will only ever drop to the back of the line once, and cancellation in trio is level-triggered rather than edge-triggered so discarding the notification isn't a big deal. -n -- Nathaniel J. Smith -- https://vorpus.org From chris.jerdonek at gmail.com Tue Jun 27 07:15:28 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Tue, 27 Jun 2017 04:15:28 -0700 Subject: [Async-sig] question re: asyncio.Condition lock acquisition order In-Reply-To: References: Message-ID: On Tue, Jun 27, 2017 at 3:29 AM, Nathaniel Smith wrote: > In fact asyncio.Lock's implementation is careful to maintain strict > FIFO fairness, i.e. whoever calls acquire() first is guaranteed to get > the lock first. Whether this is something you feel you can depend on > I'll leave to your conscience :-). Though the docs do say "only one > coroutine proceeds when a release() call resets the state to unlocked; > first coroutine which is blocked in acquire() is being processed", > which I think might be intended to say that they're FIFO-fair? > ... Thanks. 
All that is really interesting, especially the issue you linked to in the Trio docs re: fairness: https://github.com/python-trio/trio/issues/54 Thinking through the requirements I want for my RW synchronization use case in more detail, I think I want the completion of any "write" to be followed by exhausting all "reads." I'm not sure if that qualifies as barging. Hopefully this will be implementable easily enough with the available primitives, given what you say. Can anything similar be said not about synchronization primitives, but about awakening coroutines in general? Do event loops maintain strict FIFO queues when it comes to deciding which awaiting coroutine to awaken? (I hope that question makes sense!) --Chris From njs at pobox.com Tue Jun 27 18:48:40 2017 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 27 Jun 2017 15:48:40 -0700 Subject: [Async-sig] question re: asyncio.Condition lock acquisition order In-Reply-To: References: Message-ID: On Tue, Jun 27, 2017 at 4:15 AM, Chris Jerdonek wrote: > On Tue, Jun 27, 2017 at 3:29 AM, Nathaniel Smith wrote: >> In fact asyncio.Lock's implementation is careful to maintain strict >> FIFO fairness, i.e. whoever calls acquire() first is guaranteed to get >> the lock first. Whether this is something you feel you can depend on >> I'll leave to your conscience :-). Though the docs do say "only one >> coroutine proceeds when a release() call resets the state to unlocked; >> first coroutine which is blocked in acquire() is being processed", >> which I think might be intended to say that they're FIFO-fair? >> ... > > Thanks. All that is really interesting, especially the issue you > linked to in the Trio docs re: fairness: > https://github.com/python-trio/trio/issues/54 > > Thinking through the requirements I want for my RW synchronization use > case in more detail, I think I want the completion of any "write" to > be followed by exhausting all "reads." I'm not sure if that qualifies > as barging. Hopefully this will be implementable easily enough with > the available primitives, given what you say. I've only seen the term "barging" used in discussions of regular locks, though I'm not an expert, just someone with eclectic reading habits. But RWLocks have some extra subtleties that "barging" vs "non-barging" don't really capture. Say you have the following sequence: task w0 acquires for write task r1 attempts to acquire for read (and blocks) task r2 attempts to acquire for read (and blocks) task w1 attempts to acquire for write (and blocks) task r3 attempts to acquire for read (and blocks) task w0 releases the write lock task r4 attempts to acquire for read What happens? If r1+r2+r3+r4 are able to take the lock, then you're "read-biased" (which is essentially the same as barging for readers, but it can be extra dangerous for RWLocks, because if you have a heavy read load it's very easy for readers to starve writers). If tasks r1+r2 wake up, but r3+r4 have to wait, then you're "task-fair" (the equivalent of FIFO fairness for RWLocks). If r1+r2+r3 wake up, but r4 has to wait, then you're "phase fair". There are some notes here that are poorly organized but perhaps retain some small interest: https://github.com/python-trio/trio/blob/master/trio/_core/_parking_lot.py If I ever implement one of these it'll probably be phase-fair, because (a) it has some nice theoretical properties, and (b) it happens to be particularly easy to implement using my existing wait-queue primitive, and task-fair isn't :-). 
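To spell that out with the sequence above, here is who could end up holding
the lock immediately after w0 releases, under each policy (just a summary of
the cases described, not an implementation):

```
outcomes = {
    "read-biased": {"r1", "r2", "r3", "r4"},  # every reader barges in; w1 can starve
    "task-fair":   {"r1", "r2"},              # FIFO: w1 goes next, then r3 and r4
    "phase-fair":  {"r1", "r2", "r3"},        # r4 waits for the next read phase
}
```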
> Can anything similar be said not about synchronization primitives, but > about awakening coroutines in general? Do event loops maintain strict > FIFO queues when it comes to deciding which awaiting coroutine to > awaken? (I hope that question makes sense!) Something like that. There's some complication because there are two ways that a task can become runnable: directly by another piece of code in the system (e.g., releasing a lock), or via some I/O (e.g., bytes arriving on a socket). If you really wanted to ensure that tasks ran exactly in the order that they became runnable, then you need to check for I/O constantly, but this is inefficient. So usually what cooperative scheduling systems guarantee is a kind of "batched FIFO": they do a poll for I/O (a which point they may discover some new runnable tasks), and then take a snapshot of all the runnable tasks, and then run all of the tasks in their snapshot once before considering any new tasks. So this isn't quite strict FIFO, but it's fair-like-FIFO (the discrepancy between when each task should run under strict FIFO, and when it actually runs, is bounded by the number of active tasks; there's no possibility of a runnable task being left unscheduled for an arbitrary amount of time). Curio used to allow woken-by-code tasks to starve out woken-by-I/O tasks, and you might be interested in the discussion in the PR that changed that: https://github.com/dabeaz/curio/pull/127 In trio I actually randomize the order within each batch because I don't want people to accidentally encode assumptions about the scheduler (e.g. in their test suites). This is because I have hopes of eventually doing something fancier :-): https://github.com/python-trio/trio/issues/32 ("If you liked issue #54, you'll love #32!"). Many systems are not this paranoid though, and actually are strict-FIFO for tasks that are woken-by-code - but this is definitely one of those features where depending on it is dubious. In asyncio for example the event loop is pluggable and the scheduling policy is a feature of the event loop, so even if the implementation in the stdlib is strict FIFO you don't know about third-party ones. -n -- Nathaniel J. Smith -- https://vorpus.org From njs at pobox.com Tue Jun 27 18:52:48 2017 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 27 Jun 2017 15:52:48 -0700 Subject: [Async-sig] "read-write" synchronization In-Reply-To: References: Message-ID: On Mon, Jun 26, 2017 at 6:41 PM, Chris Jerdonek wrote: > On Mon, Jun 26, 2017 at 12:37 PM, Dima Tisnek wrote: >> Chris, here's a simple RWLock implementation and analysis: >> ... >> Obv., this code could be nicer: >> * separate context managers for read and write cases >> * .unlock can be automatic (if self.writer: unlock_for_write()) at the >> cost of opening doors wide open to bugs >> * policy can be introduced if `.lock` identified itself (by an >> object(), since there's no thread id) in shared state >> * notifyAll() makes real life use O(N^2) for N being number of >> simultaneous write lock requests >> >> Feel free to use it :) > > Thanks, Dima. However, as I said in my earlier posts, I'm actually > more interested in exploring approaches to synchronizing readers and > writers in async code that don't require locking on reads. 
> (This is also why I've always been saying RW "synchronization" instead
> of RW "locking.")
>
> I'm interested in this because I think the single-threadedness of the
> event loop might be what makes this simplification possible over the
> traditional multi-threaded approach (along the lines Guido was
> mentioning). It also makes the "fast path" faster. Lastly, the API for
> the callers is just to call read() or write(), so there is no need for
> a general RWLock construct or to work through RWLock semantics of the
> sort Nathaniel mentioned.
>
> I coded up a working version of the pseudo-code I included in an
> earlier email so people can see how it works. I included it at the
> bottom of this email and also in this gist:
> https://gist.github.com/cjerdonek/858e1467f768ee045849ea81ddb47901

FWIW, to me this just looks like an implementation of an async RWLock?
It's common for async synchronization primitives to be simpler
internally than threading primitives because the async ones don't need
to worry about being pre-empted at arbitrary points, but from the
caller's point of view you still have basically a blocking acquire()
method, and then you do your stuff (potentially blocking while you're
at it), and then you call a non-blocking release(), just like every
other async lock.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

From chris.jerdonek at gmail.com Tue Jun 27 19:39:02 2017
From: chris.jerdonek at gmail.com (Chris Jerdonek)
Date: Tue, 27 Jun 2017 16:39:02 -0700
Subject: [Async-sig] "read-write" synchronization
In-Reply-To:
References:
Message-ID:

On Tue, Jun 27, 2017 at 3:52 PM, Nathaniel Smith wrote:
> On Mon, Jun 26, 2017 at 6:41 PM, Chris Jerdonek
> wrote:
>> I coded up a working version of the pseudo-code I included in an
>> earlier email so people can see how it works. I included it at the
>> bottom of this email and also in this gist:
>> https://gist.github.com/cjerdonek/858e1467f768ee045849ea81ddb47901
>
> FWIW, to me this just looks like an implementation of an async RWLock?
> It's common for async synchronization primitives to be simpler
> internally than threading primitives because the async ones don't need
> to worry about being pre-empted at arbitrary points, but from the
> caller's point of view you still have basically a blocking acquire()
> method, and then you do your stuff (potentially blocking while you're
> at it), and then you call a non-blocking release(), just like every
> other async lock.

Yes and no, I think. Internally, the implementation does just amount to
applying an async RWLock. But the difference I was getting at is that
the use case doesn't require exposing the RWLock in the API (e.g.
underlying acquire() and release() methods). This means you can avoid
having to think about some of the tricky design questions you started
discussing in an earlier email of yours:

> This is also a surprisingly complex design question. Your async RWLock
> actually matches how Python's threading.Lock works: you're explicitly
> allowed to acquire it in one thread and then release it from another.
> People sometimes find this surprising, and it prevents some kinds of
> error-checking. For example, this code *probably* deadlocks:
> ...

So my point was just that if the API is narrowed to exposing only
"read" and "write" operations (to support the easier task of
synchronizing reads and writes) and the RWLock kept private, you can
avoid having to think through and support full-blown RWLock design and
use cases, like with issues around passing ownership, etc.
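For illustration, a narrowed interface of this kind might look roughly
as follows (the class and method names are invented here and are not
taken from the gist; cancellation handling is omitted):

import asyncio


class ReadWriteSynchronizer:
    """Illustrative sketch: callers only ever see read() and write()."""

    def __init__(self):
        self._active_reads = 0
        self._no_reads = asyncio.Event()   # set while no read is in progress
        self._no_reads.set()
        self._write_lock = asyncio.Lock()  # serializes writes

    async def read(self, coro_func, *args):
        # Reads run concurrently with each other; they only wait for an
        # in-progress write to finish (registering briefly takes the lock).
        async with self._write_lock:
            self._active_reads += 1
            self._no_reads.clear()
        try:
            return await coro_func(*args)
        finally:
            self._active_reads -= 1
            if self._active_reads == 0:
                self._no_reads.set()

    async def write(self, coro_func, *args):
        # A write needs exclusive access: no other write running, and all
        # previously started reads drained.
        async with self._write_lock:
            await self._no_reads.wait()
            return await coro_func(*args)

A caller would then just do `await sync.read(fetch, key)` or
`await sync.write(cleanup)`, and the underlying lock never escapes the
class.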
The API restricts how the RWLock is ever used, so it needn't be a
complete RWLock implementation.

--Chris

From chris.jerdonek at gmail.com Wed Jun 28 19:32:46 2017
From: chris.jerdonek at gmail.com (Chris Jerdonek)
Date: Wed, 28 Jun 2017 16:32:46 -0700
Subject: [Async-sig] question re: asyncio.Condition lock acquisition order
In-Reply-To:
References:
Message-ID:

On Tue, Jun 27, 2017 at 3:48 PM, Nathaniel Smith wrote:
> On Tue, Jun 27, 2017 at 4:15 AM, Chris Jerdonek
>> Thinking through the requirements I want for my RW synchronization use
>> case in more detail, I think I want the completion of any "write" to
>> be followed by exhausting all "reads." I'm not sure if that qualifies
>> as barging. Hopefully this will be implementable easily enough with
>> the available primitives, given what you say.
>
> I've only seen the term "barging" used in discussions of regular
> locks, though I'm not an expert, just someone with eclectic reading
> habits. But RWLocks have some extra subtleties that "barging" vs
> "non-barging" don't really capture. Say you have the following
> sequence:
>
> task w0 acquires for write
> task r1 attempts to acquire for read (and blocks)
> task r2 attempts to acquire for read (and blocks)
> task w1 attempts to acquire for write (and blocks)
> task r3 attempts to acquire for read (and blocks)
> task w0 releases the write lock
> task r4 attempts to acquire for read
>
> What happens? If r1+r2+r3+r4 are able to take the lock, then you're
> "read-biased" (which is essentially the same as barging for readers,
> but it can be extra dangerous for RWLocks, because if you have a heavy
> read load it's very easy for readers to starve writers).

All really interesting and informative again. Thank you, Nathaniel.

Regarding the above, in my case the "writes" will be a background
cleanup task that can happen as time is available. So it will be okay
if it is starved.

--Chris

From dimaqq at gmail.com Fri Jun 30 06:11:46 2017
From: dimaqq at gmail.com (Dima Tisnek)
Date: Fri, 30 Jun 2017 12:11:46 +0200
Subject: [Async-sig] async documentation methods
Message-ID:

Hi all,

I'm working to improve async docs, and I wonder if/how async methods
ought to be marked in the documentation, for example
library/async-sync.rst:

""" ... It [lock] has two basic methods, `acquire()` and `release()`. ... """

In fact, these methods are not symmetric, the former is asynchronous
and the latter synchronous:

Definitions are `async def acquire()` and `def release()`.
Likewise the user is expected to call `await .acquire()` and `.release()`.

This is user-facing documentation, IMO it should be clearer.
Although there are examples for this specific case, I'm concerned with
general documentation best practice.

Should this example read, e.g.:
* two methods, `async acquire()` and `release()`
or perhaps
* two methods, used `await x.acquire()` and `x.release()`
or something else?

If there's a good example already in Python docs or in some 3rd party
docs, please tell.

Likewise, should there be marks on iterators? async generators? things
that ought to be used as context managers?

Cheers,
d.

From andrew.svetlov at gmail.com Fri Jun 30 06:28:45 2017
From: andrew.svetlov at gmail.com (Andrew Svetlov)
Date: Fri, 30 Jun 2017 10:28:45 +0000
Subject: [Async-sig] async documentation methods
In-Reply-To:
References:
Message-ID:

I like "two methods, `async acquire()` and `release()`"

Regarding extra markup -- I created the sphinxcontrib-asyncio [1]
library for it.
Hmm, README is pretty empty but we do use the library for documenting
aio-libs and aiohttp [2] itself.

We use ".. comethod:: connect(request)" for methods and "cofunction"
for top-level functions.

Additional markup for methods that could be used as async context
managers:

.. comethod:: delete(url, **kwargs)
   :async-with:
   :coroutine:

and `:async-for:` for async iterators.

1. https://github.com/aio-libs/sphinxcontrib-asyncio
2. https://github.com/aio-libs/aiohttp

On Fri, Jun 30, 2017 at 1:11 PM Dima Tisnek wrote:
> Hi all,
>
> I'm working to improve async docs, and I wonder if/how async methods
> ought to be marked in the documentation, for example
> library/async-sync.rst:
>
> """ ... It [lock] has two basic methods, `acquire()` and `release()`. ...
> """
>
> In fact, these methods are not symmetric, the former is asynchronous
> and the latter synchronous:
>
> Definitions are `async def acquire()` and `def release()`.
> Likewise the user is expected to call `await .acquire()` and `.release()`.
>
> This is user-facing documentation, IMO it should be clearer.
> Although there are examples for this specific case, I'm concerned with
> general documentation best practice.
>
> Should this example read, e.g.:
> * two methods, `async acquire()` and `release()`
> or perhaps
> * two methods, used `await x.acquire()` and `x.release()`
> or something else?
>
> If there's a good example already in Python docs or in some 3rd party
> docs, please tell.
>
> Likewise, should there be marks on iterators? async generators? things
> that ought to be used as context managers?
>
> Cheers,
> d.
> _______________________________________________
> Async-sig mailing list
> Async-sig at python.org
> https://mail.python.org/mailman/listinfo/async-sig
> Code of Conduct: https://www.python.org/psf/codeofconduct/
>
-- 
Thanks,
Andrew Svetlov
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From brett at python.org Fri Jun 30 11:31:52 2017
From: brett at python.org (Brett Cannon)
Date: Fri, 30 Jun 2017 15:31:52 +0000
Subject: [Async-sig] async documentation methods
In-Reply-To:
References:
Message-ID:

Curio uses `.. asyncfunction:: acquire` and it renders as
`await acquire()` at least in the function definition.

On Fri, 30 Jun 2017 at 03:36 Andrew Svetlov wrote:
> I like "two methods, `async acquire()` and `release()`"
>
> Regarding extra markup -- I created the sphinxcontrib-asyncio [1]
> library for it. Hmm, README is pretty empty but we do use the library
> for documenting aio-libs and aiohttp [2] itself.
>
> We use ".. comethod:: connect(request)" for methods and "cofunction"
> for top-level functions.
>
> Additional markup for methods that could be used as async context
> managers:
>
> .. comethod:: delete(url, **kwargs)
>    :async-with:
>    :coroutine:
>
> and `:async-for:` for async iterators.
>
>
> 1. https://github.com/aio-libs/sphinxcontrib-asyncio
> 2. https://github.com/aio-libs/aiohttp
>
> On Fri, Jun 30, 2017 at 1:11 PM Dima Tisnek wrote:
>
>> Hi all,
>>
>> I'm working to improve async docs, and I wonder if/how async methods
>> ought to be marked in the documentation, for example
>> library/async-sync.rst:
>>
>> """ ... It [lock] has two basic methods, `acquire()` and `release()`. ...
>> """
>>
>> In fact, these methods are not symmetric, the former is asynchronous
>> and the latter synchronous:
>>
>> Definitions are `async def acquire()` and `def release()`.
>> Likewise the user is expected to call `await .acquire()` and `.release()`.
>>
>> This is user-facing documentation, IMO it should be clearer.
>> Although there are examples for this specific case, I'm concerned with
>> general documentation best practice.
>>
>> Should this example read, e.g.:
>> * two methods, `async acquire()` and `release()`
>> or perhaps
>> * two methods, used `await x.acquire()` and `x.release()`
>> or something else?
>>
>> If there's a good example already in Python docs or in some 3rd party
>> docs, please tell.
>>
>> Likewise, should there be marks on iterators? async generators? things
>> that ought to be used as context managers?
>>
>> Cheers,
>> d.
>> _______________________________________________
>> Async-sig mailing list
>> Async-sig at python.org
>> https://mail.python.org/mailman/listinfo/async-sig
>> Code of Conduct: https://www.python.org/psf/codeofconduct/
>>
> --
> Thanks,
> Andrew Svetlov
> _______________________________________________
> Async-sig mailing list
> Async-sig at python.org
> https://mail.python.org/mailman/listinfo/async-sig
> Code of Conduct: https://www.python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From yselivanov at gmail.com Fri Jun 30 14:41:33 2017
From: yselivanov at gmail.com (Yury Selivanov)
Date: Fri, 30 Jun 2017 14:41:33 -0400
Subject: [Async-sig] async documentation methods
In-Reply-To:
References:
Message-ID: <6b4085a6-6783-4ec8-b40d-2d3328dec926@Spark>

Hi Dima,

Have you seen https://github.com/asyncio-docs? I'm trying to get some
work going there to improve asyncio docs in 3.7. Will start committing
more of my time there soon.

Thanks,
Yury

On Jun 30, 2017, 6:11 AM -0400, Dima Tisnek wrote:
> Hi all,
>
> I'm working to improve async docs, and I wonder if/how async methods
> ought to be marked in the documentation, for example
> library/async-sync.rst:
>
> """ ... It [lock] has two basic methods, `acquire()` and `release()`. ... """
>
> In fact, these methods are not symmetric, the former is asynchronous
> and the latter synchronous:
>
> Definitions are `async def acquire()` and `def release()`.
> Likewise the user is expected to call `await .acquire()` and `.release()`.
>
> This is user-facing documentation, IMO it should be clearer.
> Although there are examples for this specific case, I'm concerned with
> general documentation best practice.
>
> Should this example read, e.g.:
> * two methods, `async acquire()` and `release()`
> or perhaps
> * two methods, used `await x.acquire()` and `x.release()`
> or something else?
>
> If there's a good example already in Python docs or in some 3rd party
> docs, please tell.
>
> Likewise, should there be marks on iterators? async generators? things
> that ought to be used as context managers?
>
> Cheers,
> d.
> _______________________________________________
> Async-sig mailing list
> Async-sig at python.org
> https://mail.python.org/mailman/listinfo/async-sig
> Code of Conduct: https://www.python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
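For reference, the asymmetry Dima describes can be seen directly with
asyncio.Lock: acquire() is a coroutine while release() is a plain
method. A minimal, standalone illustration:

import asyncio


async def main():
    lock = asyncio.Lock()

    await lock.acquire()    # coroutine: docs would show `await lock.acquire()`
    try:
        print("locked?", lock.locked())   # -> locked? True
    finally:
        lock.release()      # plain method: docs would show `lock.release()`

    # The same asymmetry is hidden behind the async context manager form:
    async with lock:
        print("locked?", lock.locked())   # -> locked? True


if __name__ == "__main__":
    asyncio.get_event_loop().run_until_complete(main())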