From asmodehn at gmail.com Sat Dec 2 12:50:40 2017 From: asmodehn at gmail.com (Asmodehn Shade) Date: Sun, 3 Dec 2017 02:50:40 +0900 Subject: [Async-sig] async await as basic paradigm - relation to delimited continuations ? Message-ID: Hi everyone, I have been working on distributed programming for a while now; however, I was away from Python land and didn't get onto the twisted/tornado/asyncio train until a few years ago. While writing asyncio code and exploring different abstract computing paradigms (my recent hobby seems to be exploring new ways to do computing), I recently discovered curio. I read a few GitHub issues, and although I do enjoy experiments, I am also aware of the need for mathematical foundations in order for a project to be able to compose between projects and groups of people, and ultimately scale and persist, as well as find its own roots to support an ecosystem. I was wondering if there is someone on this mailing list versed enough in maths and computer science who could see some kind of association/relation between async/await and delimited continuations... and hopefully formalize it somehow? As a side effect, this might give some improvement ideas to curio... The way I see it so far, very rough and naive: - the event loop is "doing the work", that is, a sequence of delimited continuations; - async/await denotes how the work to be done can be composed (wrapping functions that could already compose by themselves, to add additional runtime semantics); - tasks are like delimited continuations, and they will be ordered at runtime to execute in a specific sequence. Or am I completely misled? If we scale this up and "distribute" it (actor model style), one actor can match one event loop, and there are probably many associations to draw there, that could indicate for curio (or libraries using it) whether it is sensible or not to completely identify the event_loop with a thread...
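To make the rough picture above concrete, here is a toy sketch -- not curio or asyncio internals, just plain generators, with all names invented for the example -- of an "event loop" that drives suspendable computations. Each `yield` hands control back to the scheduler, and the suspended remainder of the task is, loosely speaking, a continuation delimited by the scheduler:

```python
from collections import deque

def run(tasks):
    """A toy 'event loop': resume each task round-robin until done.

    Each `yield` suspends a task and hands control back here; the
    rest of the task is then a continuation the scheduler can resume.
    """
    ready = deque(tasks)
    results = []
    while ready:
        task = ready.popleft()
        try:
            next(task)                  # resume the continuation
        except StopIteration as stop:
            results.append(stop.value)  # task ran to completion
        else:
            ready.append(task)          # reschedule the suspended task
    return results

def counter(name, n):
    total = 0
    for i in range(n):
        total += i
        yield                           # suspension point
    return (name, total)

print(run([counter("a", 2), counter("b", 3)]))  # -> [('a', 1), ('b', 3)]
```

The composition story async/await adds is then visible by analogy: `await` is `yield from`, i.e. splicing one delimited computation into another.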
As a broad overview: 1) imperative concepts can be implemented from functional/math concepts (Haskell's do notation, its relation with category theory); 2) functional concepts are currently implemented with imperative languages (Scheme interpreters, the Haskell compiler, etc.); 3) looking at async programming and functional/math concepts these days, I am wondering how these relate, and how far down the rabbit hole goes... Thanks a lot for sharing your views on this :-). -- AlexV -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.jerdonek at gmail.com Sun Dec 3 00:59:57 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Sat, 2 Dec 2017 21:59:57 -0800 Subject: [Async-sig] subclassing CancelledError Message-ID: Hi, I want to ask how people feel about the following. If you raise a subclass of CancelledError from within a task and then call task.result(), CancelledError is raised rather than the subclass. Here is some code to illustrate:

class MyCancelledError(CancelledError): pass

async def raise_my_cancel():
    raise MyCancelledError()

task = asyncio.ensure_future(raise_my_cancel())
try:
    await task
except Exception:
    pass
assert task.cancelled()
# Raises CancelledError and not MyCancelledError.
task.result()

Does this seem right to people? Is there a justification for this? If it would help for the discussion, I could provide a use case. Thanks a lot, --Chris From andrew.svetlov at gmail.com Sun Dec 3 05:11:49 2017 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Sun, 03 Dec 2017 10:11:49 +0000 Subject: [Async-sig] subclassing CancelledError In-Reply-To: References: Message-ID: IIRC at very early stages Guido van Rossum decided to *freeze* `CancelledError`: user code should not derive from the exception. Like you never derive from StopIteration. On Sun, Dec 3, 2017 at 8:00 AM Chris Jerdonek wrote: > Hi, I want to ask how people feel about the following.
> > If you raise a subclass of CancelledError from within a task and then > call task.result(), CancelledError is raised rather than the subclass. > > Here is some code to illustrate: > > class MyCancelledError(CancelledError): pass > > async def raise_my_cancel(): > raise MyCancelledError() > > task = asyncio.ensure_future(raise_my_cancel()) > try: > await task > except Exception: > pass > assert task.cancelled() > # Raises CancelledError and not MyCancelledError. > task.result() > > Does this seem right to people? Is there a justification for this? > > If it would help for the discussion, I could provide a use case. > > Thanks a lot, > --Chris > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -- Thanks, Andrew Svetlov -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Sun Dec 3 12:04:30 2017 From: guido at python.org (Guido van Rossum) Date: Sun, 3 Dec 2017 09:04:30 -0800 Subject: [Async-sig] subclassing CancelledError In-Reply-To: References: Message-ID: Sounds like an implementation issue. The Future class has a boolean flag indicating whether it's been cancelled, and everything then raises CancelledError when that flag is set. I suppose we could replace that flag with an instance of CancelledError, and when we *catch* CancelledError we set it to that. But it seems messy and I'm not sure that you should attempt this. IMO you should define an exception that does *not* derive from CancelledError and raise that -- it will properly be transmitted. What's your reason for not doing that? IOW why are you trying to pump more than a bit through the narrow "cancelled" channel? 
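For reference, a minimal sketch of the suggested alternative (the name `MyAppError` is made up for illustration): an exception that does *not* derive from CancelledError is stored on the task and re-raised intact, and the task correctly reports not-cancelled.

```python
import asyncio

class MyAppError(Exception):
    """Hypothetical app-specific error -- deliberately NOT derived
    from CancelledError, per the suggestion above."""

async def fail():
    raise MyAppError("details survive intact")

async def main():
    task = asyncio.ensure_future(fail())
    try:
        await task
    except MyAppError:
        pass  # the exact exception type arrives, nothing is flattened
    assert isinstance(task.exception(), MyAppError)
    assert not task.cancelled()  # the task failed; it wasn't cancelled
    return task

loop = asyncio.new_event_loop()
task = loop.run_until_complete(main())
loop.close()
print(type(task.exception()).__name__)  # -> MyAppError
```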
On Sun, Dec 3, 2017 at 2:11 AM, Andrew Svetlov wrote: > IIRC at very early stages Guido van Rossum decided to *freeze* > `CancelledError`: user code should not derive from the exception. Like you > never derive from StopIteration. > > On Sun, Dec 3, 2017 at 8:00 AM Chris Jerdonek > wrote: > >> Hi, I want to ask how people feel about the following. >> >> If you raise a subclass of CancelledError from within a task and then >> call task.result(), CancelledError is raised rather than the subclass. >> >> Here is some code to illustrate: >> >> class MyCancelledError(CancelledError): pass >> >> async def raise_my_cancel(): >> raise MyCancelledError() >> >> task = asyncio.ensure_future(raise_my_cancel()) >> try: >> await task >> except Exception: >> pass >> assert task.cancelled() >> # Raises CancelledError and not MyCancelledError. >> task.result() >> >> Does this seem right to people? Is there a justification for this? >> >> If it would help for the discussion, I could provide a use case. >> >> Thanks a lot, >> --Chris >> _______________________________________________ >> Async-sig mailing list >> Async-sig at python.org >> https://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >> > -- > Thanks, > Andrew Svetlov > > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.jerdonek at gmail.com Sun Dec 3 16:53:12 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Sun, 3 Dec 2017 13:53:12 -0800 Subject: [Async-sig] subclassing CancelledError In-Reply-To: References: Message-ID: On Sun, Dec 3, 2017 at 9:04 AM, Guido van Rossum wrote: > Sounds like an implementation issue. 
The Future class has a boolean flag > indicating whether it's been cancelled, and everything then raises > CancelledError when that flag is set. I suppose we could replace that flag > with an instance of CancelledError, and when we *catch* CancelledError we > set it to that. But it seems messy and I'm not sure that you should attempt > this. IMO you should define an exception that does *not* derive from > CancelledError and raise that -- it will properly be transmitted. What's > your reason for not doing that? IOW why are you trying to pump more than a > bit through the narrow "cancelled" channel? My use case is mostly for diagnostic / logging purposes. I want to log all exceptions bubbling out of a task, and I want to do so from within the coroutine itself rather than when I call task.result(). I also want to be able to detect if something wasn't logged when I call task.result() (to avoid double logging and errors passing silently, etc), and I'd also prefer that task.cancelled(), etc. still return the correct values. My first idea was to define LoggedError(Exception) and LoggedCancelledError(CancelledError) subclasses, and raise a new exception from within the task after logging. If LoggedCancelledError doesn't derive from CancelledError, then task.cancelled() won't reflect that the task was cancelled. Could it simplify things on the asyncio side if the CancelledError and non-CancelledError code paths shared more logic (e.g. by both setting an exception instead of only the generic case setting an exception)? --Chris > > On Sun, Dec 3, 2017 at 2:11 AM, Andrew Svetlov > wrote: >> >> IIRC at very early stages Guido van Rossum decided to *freeze* >> `CancelledError`: user code should not derive from the exception. Like you >> never derive from StopIteration. >> >> On Sun, Dec 3, 2017 at 8:00 AM Chris Jerdonek >> wrote: >>> >>> Hi, I want to ask how people feel about the following. 
>>> >>> If you raise a subclass of CancelledError from within a task and then >>> call task.result(), CancelledError is raised rather than the subclass. >>> >>> Here is some code to illustrate: >>> >>> class MyCancelledError(CancelledError): pass >>> >>> async def raise_my_cancel(): >>> raise MyCancelledError() >>> >>> task = asyncio.ensure_future(raise_my_cancel()) >>> try: >>> await task >>> except Exception: >>> pass >>> assert task.cancelled() >>> # Raises CancelledError and not MyCancelledError. >>> task.result() >>> >>> Does this seem right to people? Is there a justification for this? >>> >>> If it would help for the discussion, I could provide a use case. >>> >>> Thanks a lot, >>> --Chris >>> _______________________________________________ >>> Async-sig mailing list >>> Async-sig at python.org >>> https://mail.python.org/mailman/listinfo/async-sig >>> Code of Conduct: https://www.python.org/psf/codeofconduct/ >> >> -- >> Thanks, >> Andrew Svetlov >> >> _______________________________________________ >> Async-sig mailing list >> Async-sig at python.org >> https://mail.python.org/mailman/listinfo/async-sig >> Code of Conduct: https://www.python.org/psf/codeofconduct/ >> > > > > -- > --Guido van Rossum (python.org/~guido) From guido at python.org Sun Dec 3 20:35:06 2017 From: guido at python.org (Guido van Rossum) Date: Sun, 3 Dec 2017 17:35:06 -0800 Subject: [Async-sig] subclassing CancelledError In-Reply-To: References: Message-ID: I think it's too late to change the cancellation logic. Cancellation is tricky. On Sun, Dec 3, 2017 at 1:53 PM, Chris Jerdonek wrote: > On Sun, Dec 3, 2017 at 9:04 AM, Guido van Rossum wrote: > > Sounds like an implementation issue. The Future class has a boolean flag > > indicating whether it's been cancelled, and everything then raises > > CancelledError when that flag is set. I suppose we could replace that > flag > > with an instance of CancelledError, and when we *catch* CancelledError we > > set it to that. 
But it seems messy and I'm not sure that you should > attempt > > this. IMO you should define an exception that does *not* derive from > > CancelledError and raise that -- it will properly be transmitted. What's > > your reason for not doing that? IOW why are you trying to pump more than > a > > bit through the narrow "cancelled" channel? > > My use case is mostly for diagnostic / logging purposes. I want to log > all exceptions bubbling out of a task, and I want to do so from within > the coroutine itself rather than when I call task.result(). I also > want to be able to detect if something wasn't logged when I call > task.result() (to avoid double logging and errors passing silently, > etc), and I'd also prefer that task.cancelled(), etc. still return the > correct values. > > My first idea was to define LoggedError(Exception) and > LoggedCancelledError(CancelledError) subclasses, and raise a new > exception from within the task after logging. If LoggedCancelledError > doesn't derive from CancelledError, then task.cancelled() won't > reflect that the task was cancelled. > > Could it simplify things on the asyncio side if the CancelledError and > non-CancelledError code paths shared more logic (e.g. by both setting > an exception instead of only the generic case setting an exception)? > > --Chris > > > > > > On Sun, Dec 3, 2017 at 2:11 AM, Andrew Svetlov > > > wrote: > >> > >> IIRC at very early stages Guido van Rossum decided to *freeze* > >> `CancelledError`: user code should not derive from the exception. Like > you > >> never derive from StopIteration. > >> > >> On Sun, Dec 3, 2017 at 8:00 AM Chris Jerdonek > > >> wrote: > >>> > >>> Hi, I want to ask how people feel about the following. > >>> > >>> If you raise a subclass of CancelledError from within a task and then > >>> call task.result(), CancelledError is raised rather than the subclass. 
> >>> > >>> Here is some code to illustrate: > >>> > >>> class MyCancelledError(CancelledError): pass > >>> > >>> async def raise_my_cancel(): > >>> raise MyCancelledError() > >>> > >>> task = asyncio.ensure_future(raise_my_cancel()) > >>> try: > >>> await task > >>> except Exception: > >>> pass > >>> assert task.cancelled() > >>> # Raises CancelledError and not MyCancelledError. > >>> task.result() > >>> > >>> Does this seem right to people? Is there a justification for this? > >>> > >>> If it would help for the discussion, I could provide a use case. > >>> > >>> Thanks a lot, > >>> --Chris > >>> _______________________________________________ > >>> Async-sig mailing list > >>> Async-sig at python.org > >>> https://mail.python.org/mailman/listinfo/async-sig > >>> Code of Conduct: https://www.python.org/psf/codeofconduct/ > >> > >> -- > >> Thanks, > >> Andrew Svetlov > >> > >> _______________________________________________ > >> Async-sig mailing list > >> Async-sig at python.org > >> https://mail.python.org/mailman/listinfo/async-sig > >> Code of Conduct: https://www.python.org/psf/codeofconduct/ > >> > > > > > > > > -- > > --Guido van Rossum (python.org/~guido) > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Dec 7 03:28:07 2017 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 7 Dec 2017 00:28:07 -0800 Subject: [Async-sig] ANN: Trio v0.2.0 released Message-ID: Hi all, I'm proud to announce the release of Trio v0.2.0. Trio is a new async concurrency library for Python that's obsessed with usability and correctness -- we want to make it easy to get things right. This is the second public release, and it contains major new features and bugfixes from 14 contributors. 
You can read the full release notes here: https://trio.readthedocs.io/en/latest/history.html#trio-0-2-0-2017-12-06 Some things I'm particularly excited about are: - Comprehensive support for async file I/O - The new 'nursery.start' method for clean startup of complex task trees - The new high-level networking API -- this is roughly the same level of abstraction as twisted/asyncio's protocols/transports. Includes luxuries like happy eyeballs for more robust client connections, and server helpers that integrate with nursery.start. - Complete support for using SSL/TLS encryption over arbitrary transports. You can even do SSL-over-SSL, which is useful for HTTPS proxies and AFAIK not supported by any other Python library. - Task-local storage. - Our new contributing guide: https://trio.readthedocs.io/en/latest/contributing.html To get started with Trio, the best place to start is our tutorial: https://trio.readthedocs.io/en/latest/tutorial.html It doesn't assume any prior familiarity with concurrency or async/await. Share and enjoy, -n -- Nathaniel J. Smith -- https://vorpus.org From yselivanov at gmail.com Thu Dec 7 21:03:46 2017 From: yselivanov at gmail.com (Yury Selivanov) Date: Thu, 7 Dec 2017 21:03:46 -0500 Subject: [Async-sig] APIs for high-bandwidth large I/O? In-Reply-To: <20171018200441.505e8145@fsol> References: <20171018200441.505e8145@fsol> Message-ID: <1199ac88-719c-41cb-b852-e347a566e94c@Spark> Hi Antoine, Thanks for posting this, and sorry for the delayed reply! I've known about the possibility of optimizing asyncio Protocols for a while. I noticed that `Protocol.data_received()` requires making one extra copy of the received data when I was working on the initial version of uvloop. Back then my main priority was to make uvloop fully compatible with asyncio, so I wasn't really thinking about improving asyncio's design. Let me explain the current flaw of `Protocol.data_received()` so that other people on the list can catch up with the discussion: 1.
Currently, when a Transport is reading data, it uses the `sock.recv()` call, which returns a `bytes` object, which is then pushed to `Protocol.data_received()`. Every time `sock.recv()` is called, a new bytes object is allocated. 2. Typically, protocols need to accumulate the bytes objects they receive until they have enough buffered data to be parsed. Usually a `deque` is used for that; less optimized code just concatenates all bytes objects into one. 3. When enough data is gathered and a protocol message can be parsed out of it, there's usually a need to concatenate a few buffers from the `deque` or get a slice of the concatenated buffer. At this point, we've copied the received data two times. I propose to add another Protocol base class to asyncio: BufferedProtocol. It won't have the 'data_received()' method; instead it will have 'get_buffer()' and 'buffer_updated(nbytes)' methods:

    class asyncio.BufferedProtocol:

        def get_buffer(self) -> memoryview:
            pass

        def buffer_updated(self, nbytes: int):
            pass

When the protocol's transport is ready to receive data, it will call `protocol.get_buffer()`. The latter must return an object that implements the buffer protocol. The transport will request a writable buffer over the returned object and receive data *into* that buffer. When the `sock.recv_into(buffer)` call is done, the `protocol.buffer_updated(nbytes)` method will be called, with the number of bytes received into the buffer passed as the first argument. I've implemented the proposed design in uvloop (branch 'get_buffer', [1]) and adjusted your benchmark [2] to use it. Here are benchmark results from my machine (macOS): vanilla asyncio: 120-135 Mb/s; uvloop: 320-330 Mb/s; uvloop/get_buffer: 600-650 Mb/s. The benchmark is quite unstable, but it's clear that Protocol.get_buffer() makes it possible to implement framing much more efficiently.
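As a rough illustration of the data flow just described, here is a self-contained sketch with no real sockets -- `FrameReceiver` and `fake_transport_feed` are invented for the example, and only the two proposed method names come from the email. The point is that the frame buffer is allocated once and filled in place, so no intermediate bytes objects are created on the read path:

```python
class FrameReceiver:
    """Receives fixed-size frames with no extra copies on the read path."""

    def __init__(self, frame_size):
        self._buf = bytearray(frame_size)   # allocated once, reused forever
        self._pos = 0
        self.frames = []

    def get_buffer(self):
        # Hand the transport the unfilled tail of our frame buffer.
        return memoryview(self._buf)[self._pos:]

    def buffer_updated(self, nbytes):
        self._pos += nbytes
        if self._pos == len(self._buf):     # a whole frame has arrived
            self.frames.append(bytes(self._buf))
            self._pos = 0

def fake_transport_feed(proto, data):
    """Simulate a transport's sock.recv_into() loop over in-memory data."""
    view = memoryview(data)
    while view:
        buf = proto.get_buffer()
        n = min(len(buf), len(view))
        buf[:n] = view[:n]                  # stand-in for sock.recv_into(buf)
        proto.buffer_updated(n)
        view = view[n:]

r = FrameReceiver(4)
fake_transport_feed(r, b"abcdefgh")
print(r.frames)  # two complete 4-byte frames
```

(For readers coming to this thread later: the API that eventually landed in Python 3.7 as `asyncio.BufferedProtocol` passes a `sizehint` argument to `get_buffer()`.)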
I'm also working on porting the asyncpg library to use get_buffer(), as it has a fairly good benchmark suite. So far I'm seeing a 5-15% speed boost on all benchmarks. What's more important is that get_buffer() makes asyncpg's buffer implementation simpler! I'm quite happy with these results and I propose to implement the get_buffer() API (or its equivalent) in Python 3.7. I've opened an issue [3] to discuss the implementation details. [1] https://github.com/MagicStack/uvloop/tree/get_buffer [2] https://gist.github.com/1st1/1c606e5b83ef0e9c41faf21564d75ad7 Thanks, Yury On Oct 18, 2017, 2:31 PM -0400, Antoine Pitrou , wrote: > > Hi, > > I am currently looking into ways to optimize large data transfers for a > distributed computing framework > (https://github.com/dask/distributed/). We are using Tornado but the > question is more general, as it turns out that certain kinds of API are > an impediment to such optimizations.
uvloop uses hand-crafted > Cython code + the C libuv library, still, a pure Python version of > Tornado does better thanks to an improved buffering logic in the > streaming layer. > > Even the Tornado result is not ideal. When profiling, we see that > 50% of the runtime is actual IO calls (socket.send and socket.recv), > but the rest is still overhead. Especially, buffering on the read side > still has costly memory copies (b''.join calls take 22% of the time!). > > For a framed layer, you shouldn't need so many copies. Once you've > read the frame length, you can allocate the frame upfront and read into > it. It is at odds, however, with the API exposed by asyncio's Protocol: > data_received() gives you a new bytes object as soon as data arrives. > It's already too late: a spurious memory copy will have to occur. > > Tornado's IOStream is less constrained, but it supports too many read > schemes (including several types of callbacks). So I crafted a limited > version of IOStream (*) that supports little functionality, but is able > to use socket.recv_into() when asked for a given number of bytes. When > benchmarked, this version achieves 950 MB/s. This is still without C > code! > > (*) see > https://github.com/tornadoweb/tornado/compare/master...pitrou:stream_readinto?expand=1 > > When profiling that limited version of IOStream, we see that 68% of the > runtime is actual IO calls (socket.send and socket.recv_into). > Still, 21% of the total runtime is spent allocating a 100 MB buffer for > each frame! That's 70% of the non-IO overhead! Whether or not there > are smart ways to reuse that writable buffer depends on how the > application intends to use data: does it throw it away before the next > read or not? It doesn't sound easily doable in the general case. > > > So I'm wondering which kind of APIs async libraries could expose to > make those use cases faster. I know curio and trio have socket objects > which would probably fit the bill. 
I don't know if there are > higher-level concepts that may be as adequate for achieving the highest > performance. > > Also, since asyncio is the de facto standard now, I wonder if asyncio > might grow such a new API. That may be troublesome: asyncio already > has Protocols and Streams, and people often complain about its > extensive API surface that's difficult for beginners :-) > > > Addendum: asyncio streams > ------------------------- > > I didn't think asyncio streams would be a good solution, but I still > wrote a benchmark variant for them out of curiosity, and it turns out I > was right. The results: > - vanilla asyncio streams achieve 300 MB/s > - asyncio + uvloop streams achieve 550 MB/s > > The benchmark script is at > https://gist.github.com/pitrou/202221ca9c9c74c0b48373ac89e15fd7 > > Regards > > Antoine. > > > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yselivanov at gmail.com Thu Dec 7 21:05:29 2017 From: yselivanov at gmail.com (Yury Selivanov) Date: Thu, 7 Dec 2017 21:05:29 -0500 Subject: [Async-sig] APIs for high-bandwidth large I/O? In-Reply-To: <1199ac88-719c-41cb-b852-e347a566e94c@Spark> References: <20171018200441.505e8145@fsol> <1199ac88-719c-41cb-b852-e347a566e94c@Spark> Message-ID: <0fa00196-0592-4a5a-b27b-dc9b8dfd50a4@Spark> On Dec 7, 2017, 9:03 PM -0500, Yury Selivanov , wrote: [..] > I'm quite happy with these results and I propose to implement the get_buffer() API (or its equivalent) in Python 3.7. I've opened an issue [3] to discuss the implementation details. Issue: https://bugs.python.org/issue32251 Yury -------------- next part -------------- An HTML attachment was scrubbed...
URL: From yselivanov at gmail.com Thu Dec 7 21:07:45 2017 From: yselivanov at gmail.com (Yury Selivanov) Date: Thu, 7 Dec 2017 21:07:45 -0500 Subject: [Async-sig] APIs for high-bandwidth large I/O? In-Reply-To: <1199ac88-719c-41cb-b852-e347a566e94c@Spark> References: <20171018200441.505e8145@fsol> <1199ac88-719c-41cb-b852-e347a566e94c@Spark> Message-ID: On Dec 7, 2017, 9:03 PM -0500, Yury Selivanov , wrote: > > I'm quite happy with these results and I propose to implement the get_buffer() API (or its equivalent) in Python 3.7. I've opened an issue [3] to discuss the implementation details. > I apologize for the noise, my email client is acting weird. Here's the URL of the CPython issue I created to discuss the proposed design: https://bugs.python.org/issue32251. Yury -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.jerdonek at gmail.com Mon Dec 25 00:55:47 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Sun, 24 Dec 2017 21:55:47 -0800 Subject: [Async-sig] task.result() and exception traceback display Message-ID: Hi, I noticed that if a task in asyncio raises an exception, then the displayed traceback can be "polluted" by intermediate calls to task.result(). Also, the calls to task.result() can appear out of order relative to each other and to other lines.
Here is an example: import asyncio async def raise_error(): raise ValueError() async def main(): task = asyncio.ensure_future(raise_error()) try: await task # call 1 except Exception: pass try: task.result() # call 2 except Exception: pass task.result() # call 3 asyncio.get_event_loop().run_until_complete(main()) The above outputs-- Traceback (most recent call last): File "test.py", line 24, in asyncio.get_event_loop().run_until_complete(main()) File "/Users/.../3.6.4rc1/lib/python3.6/asyncio/base_events.py", line 467, in run_until_complete return future.result() File "test.py", line 21, in main task.result() # call 3 File "test.py", line 17, in main task.result() # call 2 File "test.py", line 12, in main await task # call 1 File "test.py", line 5, in raise_error raise ValueError() ValueError Notice that the "call 2" line appears in the traceback, even though it doesn't come into play in the exception. Also, the lines don't obey the "most recent call last" rule. If this rule were followed, it should be something more like-- Traceback (most recent call last): File "test.py", line 24, in asyncio.get_event_loop().run_until_complete(main()) File "/Users/.../3.6.4rc1/lib/python3.6/asyncio/base_events.py", line 467, in run_until_complete return future.result() File "test.py", line 12, in main await task # call 1 File "test.py", line 5, in raise_error raise ValueError() File "test.py", line 17, in main task.result() # call 2 File "test.py", line 21, in main task.result() # call 3 ValueError If people agree there's an issue along these lines, I can file an issue in the tracker. I didn't seem to find one when searching for open issues with search terms like "asyncio traceback". 
Thanks, --Chris From njs at pobox.com Mon Dec 25 03:48:15 2017 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 25 Dec 2017 00:48:15 -0800 Subject: [Async-sig] task.result() and exception traceback display In-Reply-To: References: Message-ID: I haven't thought about this enough to have an opinion about whether this is correct or how it could be improved, but I can explain why you're seeing what you're seeing :-). The traceback is really a trace of where the exception went after it was raised, with new lines added to the top as it bubbles out. So the bottom line is the 'raise' statement, because that's where it was created, and then it bubbled onto the 'call 1' line and was caught. Then it was raised again and bubbled onto the 'call 2' line. Etc. So you should think of it not as a snapshot of your stack when it was created, but as a travelogue. -n On Sun, Dec 24, 2017 at 9:55 PM, Chris Jerdonek wrote: > Hi, > > I noticed that if a task in asyncio raises an exception, then the > displayed traceback can be "polluted" by intermediate calls to > task.result(). Also, the calls to task.result() can appear out of > order relative to each other and to other lines. 
> > Here is an example: > > import asyncio > > async def raise_error(): > raise ValueError() > > async def main(): > task = asyncio.ensure_future(raise_error()) > > try: > await task # call 1 > except Exception: > pass > > try: > task.result() # call 2 > except Exception: > pass > > task.result() # call 3 > > asyncio.get_event_loop().run_until_complete(main()) > > The above outputs-- > > Traceback (most recent call last): > File "test.py", line 24, in > asyncio.get_event_loop().run_until_complete(main()) > File "/Users/.../3.6.4rc1/lib/python3.6/asyncio/base_events.py", > line 467, in run_until_complete > return future.result() > File "test.py", line 21, in main > task.result() # call 3 > File "test.py", line 17, in main > task.result() # call 2 > File "test.py", line 12, in main > await task # call 1 > File "test.py", line 5, in raise_error > raise ValueError() > ValueError > > Notice that the "call 2" line appears in the traceback, even though it > doesn't come into play in the exception. Also, the lines don't obey > the "most recent call last" rule. If this rule were followed, it > should be something more like-- > > Traceback (most recent call last): > File "test.py", line 24, in > asyncio.get_event_loop().run_until_complete(main()) > File "/Users/.../3.6.4rc1/lib/python3.6/asyncio/base_events.py", > line 467, in run_until_complete > return future.result() > File "test.py", line 12, in main > await task # call 1 > File "test.py", line 5, in raise_error > raise ValueError() > File "test.py", line 17, in main > task.result() # call 2 > File "test.py", line 21, in main > task.result() # call 3 > ValueError > > If people agree there's an issue along these lines, I can file an > issue in the tracker. I didn't seem to find one when searching for > open issues with search terms like "asyncio traceback". 
> > Thanks, > --Chris > _______________________________________________ > Async-sig mailing list > Async-sig at python.org > https://mail.python.org/mailman/listinfo/async-sig > Code of Conduct: https://www.python.org/psf/codeofconduct/ -- Nathaniel J. Smith -- https://vorpus.org From chris.jerdonek at gmail.com Mon Dec 25 04:46:32 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Mon, 25 Dec 2017 01:46:32 -0800 Subject: [Async-sig] task.result() and exception traceback display In-Reply-To: References: Message-ID: On Mon, Dec 25, 2017 at 12:48 AM, Nathaniel Smith wrote: > I haven't thought about this enough to have an opinion about whether > this is correct or how it could be improved, but I can explain why > you're seeing what you're seeing :-). > > The traceback is really a trace of where the exception went after it > was raised, with new lines added to the top as it bubbles out. So the > bottom line is the 'raise' statement, because that's where it was > created, and then it bubbled onto the 'call 1' line and was caught. > Then it was raised again and bubbled onto the 'call 2' line. Etc. So > you should think of it not as a snapshot of your stack when it was > created, but as a travelogue. Thanks, Nathaniel. That's a really good explanation. Also, here's a way to see the same behavior without async: def main(): exc = ValueError() try: raise exc # call 1 except Exception: pass try: raise exc # call 2 except Exception: pass raise exc # call 3 main() With this, the traceback looks like-- Traceback (most recent call last): File "test.py", line 16, in main() File "test.py", line 14, in main raise exc # call 3 File "test.py", line 10, in main raise exc # call 2 File "test.py", line 5, in main raise exc # call 1 ValueError Since you can see that the later calls are getting added on the top, it's almost as if it should read: most recent calls **first**. :) (I wonder if there's a 4-word phrase that does accurately describe what's happening.) 
--Chris

>
> -n
>
> On Sun, Dec 24, 2017 at 9:55 PM, Chris Jerdonek
> wrote:
>> Hi,
>>
>> I noticed that if a task in asyncio raises an exception, then the
>> displayed traceback can be "polluted" by intermediate calls to
>> task.result(). Also, the calls to task.result() can appear out of
>> order relative to each other and to other lines.
>>
>> Here is an example:
>>
>> import asyncio
>>
>> async def raise_error():
>>     raise ValueError()
>>
>> async def main():
>>     task = asyncio.ensure_future(raise_error())
>>
>>     try:
>>         await task # call 1
>>     except Exception:
>>         pass
>>
>>     try:
>>         task.result() # call 2
>>     except Exception:
>>         pass
>>
>>     task.result() # call 3
>>
>> asyncio.get_event_loop().run_until_complete(main())
>>
>> The above outputs--
>>
>> Traceback (most recent call last):
>>   File "test.py", line 24, in <module>
>>     asyncio.get_event_loop().run_until_complete(main())
>>   File "/Users/.../3.6.4rc1/lib/python3.6/asyncio/base_events.py",
>> line 467, in run_until_complete
>>     return future.result()
>>   File "test.py", line 21, in main
>>     task.result() # call 3
>>   File "test.py", line 17, in main
>>     task.result() # call 2
>>   File "test.py", line 12, in main
>>     await task # call 1
>>   File "test.py", line 5, in raise_error
>>     raise ValueError()
>> ValueError
>>
>> Notice that the "call 2" line appears in the traceback, even though it
>> doesn't come into play in the exception. Also, the lines don't obey
>> the "most recent call last" rule.
>> If this rule were followed, it
>> should be something more like--
>>
>> Traceback (most recent call last):
>>   File "test.py", line 24, in <module>
>>     asyncio.get_event_loop().run_until_complete(main())
>>   File "/Users/.../3.6.4rc1/lib/python3.6/asyncio/base_events.py",
>> line 467, in run_until_complete
>>     return future.result()
>>   File "test.py", line 12, in main
>>     await task # call 1
>>   File "test.py", line 5, in raise_error
>>     raise ValueError()
>>   File "test.py", line 17, in main
>>     task.result() # call 2
>>   File "test.py", line 21, in main
>>     task.result() # call 3
>> ValueError
>>
>> If people agree there's an issue along these lines, I can file an
>> issue in the tracker. I didn't seem to find one when searching for
>> open issues with search terms like "asyncio traceback".
>>
>> Thanks,
>> --Chris
>> _______________________________________________
>> Async-sig mailing list
>> Async-sig at python.org
>> https://mail.python.org/mailman/listinfo/async-sig
>> Code of Conduct: https://www.python.org/psf/codeofconduct/
>
>
>
> --
> Nathaniel J. Smith -- https://vorpus.org

From pfreixes at gmail.com Sun Dec 31 12:32:21 2017
From: pfreixes at gmail.com (Pau Freixes)
Date: Sun, 31 Dec 2017 18:32:21 +0100
Subject: [Async-sig] Asyncio loop instrumentation
Message-ID:

Hi, folks

First of all, I hope that you have had a good 2017, and I wish you the
best for 2018.

This email is a follow-up to plan B of the first proposal [1] to
articulate a way to measure the load of the asyncio loop. The main
objections to the first implementation focused on the technical debt
it imposed, considering that the feature was definitely outside the
main scope of the asyncio loop. Nathaniel proposed a plan B based on
implementing some kind of instrumentation that would allow developers
to build features such as the load metric themselves.
I put off the plan for a while, wrongly feeling that an implementation
of the loop wired with the proper events would hurt the loop's
performance. Far from it: in terms of performance penalty, the
suggested implementation is almost negligible, at least for what I
consider the happy path, meaning the case where no instruments are
listening for these events.

This new implementation of the load method - remember that it returns
a load factor between 0.0 and 1.0 that tells you how busy your loop
is - is based on an instrument and can be checked with the following
snippet:

async def coro(loop, idx):
    await asyncio.sleep(idx % 10)
    if load() > 0.9:
        return False
    start = loop.time()
    while loop.time() - start < 0.02:
        pass
    return True

async def run(loop, n):
    tasks = [coro(loop, i) for i in range(n)]
    results = await asyncio.gather(*tasks)
    abandoned = len([r for r in results if not r])
    print("Load reached for {} coros/sec: {}, abandoned {}/{}".format(
        n / 10, load(), abandoned, n))

async def main(loop):
    await run(loop, 100)

loop = asyncio.get_event_loop()
loop.add_instrument(LoadInstrument)
loop.run_until_complete(main(loop))

The `LoadInstrument` [2] meets the contract of the `LoopInstrument` [3],
which allows it to listen for the loop signals that are used to
calculate the load of the loop.

For this proposal [4], a POC, I've preferred to keep a reduced list of
events:

* `loop_start` : Executed when the loop starts for the first time.
* `tick_start` : Executed when a new loop tick is started.
* `io_start` : Executed when a new IO process starts.
* `io_end` : Executed when the IO process ends.
* `tick_end` : Executed when the loop tick ends.
* `loop_stop` : Executed when the loop stops.

The idea of giving just this short list of events is to avoid
overcomplicating third-party loop implementations, keeping to the
minimum set of events that a typical reactor has to implement.
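To make the contract concrete without checking out the patched branch, here is a rough, self-contained sketch of what an instrument built on those six events could look like. The class body and hook signatures below are my own illustration - only the linked POC defines the real ones - and the loop is simulated by calling the hooks by hand:

```python
import time

class LoadInstrument:
    """Illustrative instrument: load = time spent outside the IO wait
    divided by total wall-clock time. The six methods mirror the event
    names from the proposal; the real signatures may differ."""

    def __init__(self):
        self._busy = 0.0        # time spent running callbacks
        self._elapsed = 0.0     # total observed wall-clock time
        self._tick_start = 0.0
        self._io_start = 0.0
        self._idle = 0.0

    def loop_start(self):
        pass

    def tick_start(self):
        self._tick_start = time.monotonic()
        self._idle = 0.0

    def io_start(self):
        self._io_start = time.monotonic()

    def io_end(self):
        # Time blocked on the selector is idle time, not load.
        self._idle += time.monotonic() - self._io_start

    def tick_end(self):
        tick = time.monotonic() - self._tick_start
        self._elapsed += tick
        self._busy += max(0.0, tick - self._idle)

    def loop_stop(self):
        pass

    def load(self):
        return min(1.0, self._busy / self._elapsed) if self._elapsed else 0.0

# Driving one simulated tick by hand, the way a wired loop would:
inst = LoadInstrument()
inst.loop_start()
inst.tick_start()
inst.io_start()
time.sleep(0.02)   # pretend the selector blocked for ~20 ms
inst.io_end()
time.sleep(0.02)   # pretend callbacks then ran for ~20 ms
inst.tick_end()
inst.loop_stop()
print(round(inst.load(), 2))   # roughly 0.5: half the tick was real work
```

The point of the sketch is that the load computation lives entirely in the instrument; the loop only has to emit the six events.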
I would like to gather your feedback about this new approach and, if
you believe that it might be interesting, about which next steps
should be taken.

Cheers,

[1] https://mail.python.org/pipermail/async-sig/2017-August/000382.html
[2] https://github.com/pfreixes/asyncio_load_instrument/blob/master/asyncio_load_instrument/instrument.py#L8
[3] https://github.com/pfreixes/cpython/blob/asyncio_loop_instrumentation/Lib/asyncio/loop_instruments.py#L9
[4] https://github.com/pfreixes/cpython/commit/adc3ba46979394997c40aa89178b4724442b28eb

--
--pau

From solipsis at pitrou.net Sun Dec 31 14:02:47 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 31 Dec 2017 20:02:47 +0100
Subject: [Async-sig] Asyncio loop instrumentation
References:
Message-ID: <20171231200247.755b5c42@fsol>

On Sun, 31 Dec 2017 18:32:21 +0100
Pau Freixes wrote:
>
> This new implementation of the load method - remember that it returns
> a load factor between 0.0 and 1.0 that tells you how busy your loop
> is -

What does it mean exactly? Is it the ratio of CPU time over wall clock
time?

Depending on your needs, the `psutil` library (*) and/or the new
`time.thread_time` function (**) may also help.

(*) https://psutil.readthedocs.io/en/latest/
(**) https://docs.python.org/3.7/library/time.html#time.thread_time

> For this proposal [4], a POC, I've preferred to keep a reduced list of
> events:
>
> * `loop_start` : Executed when the loop starts for the first time.
> * `tick_start` : Executed when a new loop tick is started.
> * `io_start` : Executed when a new IO process starts.
> * `io_end` : Executed when the IO process ends.
> * `tick_end` : Executed when the loop tick ends.
> * `loop_stop` : Executed when the loop stops.

What do you call an "IO process" in this context?

Regards

Antoine.
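Antoine's "CPU time over wall clock time" reading is easy to prototype without touching the loop at all, using the `time.thread_time` function he points to (Python 3.7+). The helper below is a hypothetical illustration, not part of the proposal:

```python
import time

def cpu_ratio(fn):
    """Ratio of CPU time over wall-clock time for one call of fn,
    measured for the current thread with time.thread_time (3.7+)."""
    w0, c0 = time.monotonic(), time.thread_time()
    fn()
    wall = time.monotonic() - w0
    cpu = time.thread_time() - c0
    return cpu / wall if wall else 0.0

def mostly_sleeping():
    # An IO-bound-looking workload: the thread is blocked, not computing.
    time.sleep(0.05)

def mostly_spinning():
    # A CPU-bound workload: busy-wait for the same wall-clock duration.
    t0 = time.monotonic()
    while time.monotonic() - t0 < 0.05:
        pass

print(round(cpu_ratio(mostly_sleeping), 2))   # near 0.0
print(round(cpu_ratio(mostly_spinning), 2))   # near 1.0
```

A per-thread measurement like this is one concrete answer to "what does the load factor mean": sleeping in the selector drives the ratio toward 0, running callbacks drives it toward 1.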
From yselivanov at gmail.com Sun Dec 31 14:12:35 2017
From: yselivanov at gmail.com (Yury Selivanov)
Date: Sun, 31 Dec 2017 22:12:35 +0300
Subject: [Async-sig] Asyncio loop instrumentation
In-Reply-To: <20171231200247.755b5c42@fsol>
References: <20171231200247.755b5c42@fsol>
Message-ID: <2B052CB7-FFCB-494C-97BA-DA8859B49598@gmail.com>

When PEP 567 is accepted, I plan to implement advanced instrumentation
in uvloop to monitor basically all io/callback/loop events. I'm still
-1 on doing this in asyncio, at least in 3.7, because I'd like us to
have some time to experiment with such instrumentation in real
production code (preferably at scale).

Yury

Sent from my iPhone

> On Dec 31, 2017, at 10:02 PM, Antoine Pitrou wrote:
>
> On Sun, 31 Dec 2017 18:32:21 +0100
> Pau Freixes wrote:
>>
>> This new implementation of the load method - remember that it returns
>> a load factor between 0.0 and 1.0 that tells you how busy your loop
>> is -
>
> What does it mean exactly? Is it the ratio of CPU time over wall clock
> time?
>
> Depending on your needs, the `psutil` library (*) and/or the new
> `time.thread_time` function (**) may also help.
>
> (*) https://psutil.readthedocs.io/en/latest/
> (**) https://docs.python.org/3.7/library/time.html#time.thread_time
>
>> For this proposal [4], a POC, I've preferred to keep a reduced list of
>> events:
>>
>> * `loop_start` : Executed when the loop starts for the first time.
>> * `tick_start` : Executed when a new loop tick is started.
>> * `io_start` : Executed when a new IO process starts.
>> * `io_end` : Executed when the IO process ends.
>> * `tick_end` : Executed when the loop tick ends.
>> * `loop_stop` : Executed when the loop stops.
>
> What do you call an "IO process" in this context?
>
> Regards
>
> Antoine.
>
>
> _______________________________________________
> Async-sig mailing list
> Async-sig at python.org
> https://mail.python.org/mailman/listinfo/async-sig
> Code of Conduct: https://www.python.org/psf/codeofconduct/