[Web-SIG] Server-side async API implementation sketches
Alex Grönholm
alex.gronholm at nextday.fi
Sun Jan 9 19:09:28 CET 2011
On 09.01.2011 19:03, P.J. Eby wrote:
> At 06:06 AM 1/9/2011 +0200, Alex Grönholm wrote:
>> A new feature here is that the application itself yields a (status,
>> headers) tuple and then chunks of the body (or futures).
>
> Hm. I'm not sure if I like that. The typical app developer really
> shouldn't be yielding multiple body strings in the first place. I
> much prefer that the canonical example of a WSGI app just return a
> list with a single bytestring -- preferably in a single statement for
> the entire return operation, whether it's a yield or a return.
Uh, so don't yield multiple body strings then? How is that so difficult?
>
>
> IOW, I want it to look like the normal way to do things is to just
> return the whole response at once, and use the additional difficulty of
> creating a second iterator to discourage people writing iterated
> bodies when they should just write everything to a BytesIO and be done
> with it.
I fail to understand why a second iterator is necessary when we can get
away with just one.
>
>
> Also, it makes middleware simpler: the last line can just yield the
> result of calling the app, or a modified version, i.e.:
>
> yield app(environ)
>
> or:
>
> s, h, b = app(environ)
> # ... modify or replace s, h, b
> yield s, h, b
Asynchronous applications may not be ready to send the status line as
the first thing coming out of the generator. Consider an app that
receives a file. The first thing coming out of the app is a future. The
app needs to receive the entire file before it can determine what status
line to send. Maybe there was an I/O error writing the file, so it needs
to send a 500 response instead of 200. This is not possible with a body
iterator, and if we are already iterating the application generator, I
really don't understand why the body needs to be an iterator as well.
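
A minimal sketch of the upload case, assuming a server loop that sends each future's result (or exception) back into the application generator. All names here (upload_app, run_app, done_future, the 'demo.receive' environ key) are made up for illustration and are not part of any proposal:

```python
from concurrent.futures import Future


def upload_app(environ):
    """Single-generator app: a future comes out first, the status line later."""
    # 'demo.receive' is a hypothetical environ key standing in for an
    # asynchronous "receive the uploaded file" call that returns a Future.
    try:
        size = yield environ['demo.receive']()
    except IOError:
        # The status can only be chosen after the upload has been attempted.
        yield '500 Internal Server Error', [('Content-Type', 'text/plain')]
        yield b'upload failed'
    else:
        yield '200 OK', [('Content-Type', 'text/plain')]
        yield ('received %d bytes' % size).encode('ascii')


def run_app(app, environ):
    """Toy driver: resolves futures on the spot and collects everything else."""
    gen = app(environ)
    output = []
    try:
        item = next(gen)
        while True:
            if isinstance(item, Future):
                try:
                    value = item.result()
                except Exception as exc:
                    # Deliver the failure at the point where the app yielded.
                    item = gen.throw(exc)
                else:
                    item = gen.send(value)
            else:
                output.append(item)
                item = next(gen)
    except StopIteration:
        pass
    return output


def done_future(result=None, error=None):
    """Helper for the demo: a Future that is already resolved."""
    future = Future()
    if error is not None:
        future.set_exception(error)
    else:
        future.set_result(result)
    return future
```

Running it with done_future(result=10) produces the 200 status followed by the body, while done_future(error=IOError('disk full')) produces the 500 status, all from the same single generator.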
>
>
> In your approach, the above samples have to be rewritten as:
>
> return app(environ)
>
> or:
>
> result = app(environ)
> s, h = yield result
> # ... modify or replace s, h
> yield s, h
>
> for data in result:
>     # modify b as we go
>     yield data
>
> Only that last bit doesn't actually work, because you have to be able
> to send future results back *into* the result. Try actually making
> some code that runs on this protocol and yields to futures during the
> body iteration.
Did you miss the gist posted by myself (and improved by Alice)?
>
> Really, this modified protocol can't work with a full async API the
> way my coroutine-based version does, AND the middleware is much more
> complicated. In my version, your do-nothing middleware looks like this:
>
>
> class NullMiddleware(object):
>     def __init__(self, app):
>         self.app = app
>
>     def __call__(self, environ):
>         # ACTION: pre-application environ mangling
>
>         s, h, body = yield self.app(environ)
>
>         # modify or replace s, h, body here
>
>         yield s, h, body
>
>
> If you want to actually process the body in some way, it looks like:
>
> class NullMiddleware(object):
>
>     def __init__(self, app):
>         self.app = app
>
>     def __call__(self, environ):
>         # ACTION: pre-application environ mangling
>
>         s, h, body = yield self.app(environ)
>
>         # modify or replace s, h, body here
>
>         yield s, h, self.process(body)
>
>     def process(self, body_iter):
>         while True:
>             chunk = yield body_iter
>             if chunk is None:
>                 break
>             # process/modify chunk here
>             yield chunk
>
> And that's still a lot simpler than your sketch.
>
> Personally, I would write both of the above as:
>
> def null_middleware(app):
>
>     def wrapped(environ):
>         # ACTION: pre-application environ mangling
>         s, h, body = yield app(environ)
>
>         # modify or replace s, h, body here
>         yield s, h, process(body)
>
>     def process(body_iter):
>         while True:
>             chunk = yield body_iter
>             if chunk is None:
>                 break
>             # process/modify chunk here
>             yield chunk
>
>     return wrapped
>
> But that's just personal taste. Even as a class, it's much easier to
> write. The above middleware pattern works with the sketches I gave on
> the PEAK wiki, and I've now updated the wiki to include an example app
> and middleware for clarity.
>
> Really, the only hole in this approach is dealing with applications
> that block. The elephant in the room here is that while it's easy to
> write these example applications so they don't block, in practice
> people read files and do database queries and whatnot in their
> requests, and those APIs are generally synchronous. So, unless they
> somehow fold their entire application into a future, it doesn't work.
>
>
>> I liked the idea of having a separate async_read() method in
>> wsgi.input, which would set the underlying socket in nonblocking mode
>> and return a future. The event loop would watch the socket and read
>> data into a buffer and trigger the callback when the given amount of
>> data has been read. Conversely, .read() would set the socket in
>> blocking mode. What kinds of problems would this cause?
>
> That you could never *call* the .read() method outside of a future, or
> else you would block the server, thereby obliterating the point of
> having the async API in the first place.
>
Outside of the application/middleware you mean? I hope there isn't any
more confusion left about what a future is. The fact is that you cannot
use synchronous API calls directly from an async app no matter what.
Some workaround is always necessary.
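
One such workaround, sketched under the assumption that a thread pool is available to the application: hand the blocking call to a worker thread and yield the resulting future. The names blocking_query, app, drive and the 'demo.name' key are hypothetical, and drive() is only a toy that waits on each future; a real server would register a callback and go back to its event loop instead.

```python
from concurrent.futures import ThreadPoolExecutor, Future

# Assumption: a pool like this would be provided by the server or the
# framework; here it is simply a module-level object.
executor = ThreadPoolExecutor(max_workers=2)


def blocking_query(name):
    """Stand-in for a synchronous database or file API."""
    return 'hello, %s' % name


def app(environ):
    # The blocking call runs in a worker thread; the app only yields the
    # future and is resumed with the result once it is available.
    name = environ.get('demo.name', 'world')
    text = yield executor.submit(blocking_query, name)
    yield '200 OK', [('Content-Type', 'text/plain')]
    yield text.encode('utf-8')


def drive(gen):
    """Toy stand-in for the server: waits on futures, collects the rest."""
    output = []
    try:
        item = next(gen)
        while True:
            if isinstance(item, Future):
                # Blocks here for simplicity; see the note above.
                item = gen.send(item.result())
            else:
                output.append(item)
                item = next(gen)
    except StopIteration:
        pass
    return output
```

Calling drive(app({'demo.name': 'web-sig'})) yields the status tuple and then the encoded body, without the application code ever calling the synchronous API on the event loop's thread.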