Fun with IO

Maxime S maxischmeii at gmail.com
Tue Jan 21 11:17:57 EST 2020


Hi,

On Fri, 17 Jan 2020 at 20:11, Frank Millman <frank at chagford.com> wrote:


> It works perfectly. However, some pdf's can be large, and there could be
> concurrent requests, so I wanted to minimise the memory footprint. So I
> tried passing the client_writer directly to the handler -
>
>      await pdf_handler(client_writer)
>      client_writer.write(b'\r\n')
>
> It works! ReportLab accepts client_writer as a file-like object, and
> writes to it directly. I cannot use chunking, so I just let it do its
> thing.
>
> Can anyone see any problem with this?
>
>
If the socket is slower than the PDF generation (which is almost always
the case, unless you have a very fast network), the data still has to be
buffered in memory, in this case in the writer's buffer. writer.write() is
a plain non-blocking method, not a coroutine, so it cannot wait for the
socket and has to buffer whatever cannot be sent immediately. There is an
interesting blog post about this that I recommend reading:
https://lucumr.pocoo.org/2020/1/1/async-pressure/
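
To make the buffering visible, here is a minimal sketch (not your code:
the function name, the 16 MiB payload and the loopback server are all
made up for illustration). The peer never reads, write() still returns
immediately, and the unsent bytes end up in the transport's buffer:

```python
import asyncio


async def buffered_bytes_after_write(payload_size: int = 16 * 1024 * 1024) -> int:
    """Write to a peer that never reads; return the bytes left in the writer's buffer."""
    release = asyncio.Event()

    async def idle_peer(reader, writer):
        # Hypothetical "slow client": it accepts the connection but never reads.
        await release.wait()
        writer.close()

    # Port 0 lets the OS pick a free port on the loopback interface.
    server = await asyncio.start_server(idle_peer, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    reader, writer = await asyncio.open_connection("127.0.0.1", port)

    # write() is not a coroutine: it returns at once, and whatever the
    # kernel did not accept is kept in the transport's own buffer.
    writer.write(b"x" * payload_size)
    pending = writer.transport.get_write_buffer_size()

    # Tear everything down without waiting for the buffer to flush.
    writer.transport.abort()
    release.set()
    server.close()
    return pending


if __name__ == "__main__":
    print(asyncio.run(buffered_bytes_after_write()), "bytes still buffered")
```

The exact number depends on the kernel's socket buffers, but with a
payload this large most of it should still be sitting in memory.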

Unfortunately, ReportLab calls write() synchronously and never yields to
the event loop, so there is no way to avoid buffering the entire PDF in
memory without modifying ReportLab to make it async-aware.

This version is still better than the one with BytesIO, though, because in
that version the PDF was buffered twice: once in BytesIO and once in the
writer. You can fix that in the BytesIO version by calling
await writer.drain() after each write, and then the two versions are
essentially equivalent.
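
Roughly what I mean by draining after each write, as a sketch only:
send_with_backpressure, demo and the 64 KiB chunk size are names and
values I made up, and the BytesIO stands in for the PDF your handler
generated.

```python
import asyncio
import io

CHUNK_SIZE = 64 * 1024  # arbitrary chunk size chosen for this sketch


async def send_with_backpressure(source, writer, chunk_size=CHUNK_SIZE):
    """Copy a file-like object to a StreamWriter, draining after each write."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        writer.write(chunk)
        # drain() suspends while the writer's buffer is above its high-water
        # mark, so the writer never holds much more than that at a time.
        await writer.drain()
        total += len(chunk)
    return total


async def demo(payload: bytes) -> bytes:
    """Serve payload over a loopback connection and return what the client received."""

    async def handler(reader, writer):
        # The BytesIO plays the role of the already-generated PDF.
        await send_with_backpressure(io.BytesIO(payload), writer)
        writer.close()
        await writer.wait_closed()

    server = await asyncio.start_server(handler, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    received = await reader.read()  # read until the server closes the connection
    writer.close()
    await writer.wait_closed()
    server.close()
    return received


if __name__ == "__main__":
    data = b"%PDF-" + bytes(300_000)
    print(asyncio.run(demo(data)) == data)
```

The PDF is still held once in full (in the BytesIO), which is why I say
this only brings the two versions level rather than making it better.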

Regards,

Maxime.

