python - handling HTTP requests asynchronously

Tue May 10 19:54:47 EDT 2016

On Sat, May 7, 2016 at 4:46 PM <lordluke80 at gmail.com> wrote:

> On Saturday, 7 May 2016 at 21:04:47 UTC+2, Michael Selik wrote:
> > On Fri, May 6, 2016 at 3:01 AM <lordluke80 at gmail.com> wrote:
> >
> > > The PDF is generated through an external API. Since it is currently
> > > generated on demand, this is handled synchronously via an HTTP
> > > request/response.
> >
> >
> > Are you sending the request or are you receiving the request?
> > If you are sending, you can just use threads as you are only doing IO.
> > If you are receiving the requests and generating PDFs, you may want to
> > use subprocesses if the PDF generation is compute-intensive.
> >
> >
> > > 3) multiprocessing module, with e.g. 10 as workers limit.
> > >
> >
> > multiprocessing.Pool is an easy way to use subprocesses.
> > multiprocessing.pool.ThreadPool is an easy way to use threads. It's not
> > well documented, but has the exact same interface as Pool.
> >
> > > The goal is to use the API concurrently (e.g. send 10 or 100 HTTP
> > > requests simultaneously, depending on how many concurrent requests the
> > > API can handle).
> > >
> >
> > Sounds like you want to use threads. How does the API let you know you're
> > hitting it too frequently? Perhaps you want to code an exponential
> > backoff and retry wrapper for your API requests.
>
> Thanks for the reply.
> Currently the Django view that does the job makes three HTTP requests:
> - the first is a POST that sends the payload used during the PDF
> rendering; the response contains a URL to check the PDF generation
> progress;
> - the second polls that URL in a loop, checking the progress of the PDF
> generation. Once the response contains 'status': 'complete', it also
> gives a URL for retrieving the file;
> - the third is a GET to retrieve the file; the response contains the
> binary content of the file, which is read, wrapped as an attachment of a
> Django HTTP response, and returned to the user.
>
> The goal is to reuse the existing code as much as possible, possibly
> running it concurrently, and saving the file to a folder instead.
>

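For what it's worth, the three requests described above could be pulled out
of the view into a plain function and run in a thread pool, roughly like
this. It is only a sketch under assumptions: I don't know the real API, so
the JSON field names "status_url" and "file_url", the helper names (fetch,
with_backoff, generate_pdf, generate_all) and the polling limits are all
made up for illustration. Only the 'status': 'complete' check comes from
your description. The retry wrapper backs off on any network error, since I
don't know how the API signals that it is being hit too frequently.

```python
import json
import time
import urllib.request
from multiprocessing.pool import ThreadPool
from pathlib import Path


def fetch(url, payload=None):
    """Return the response body as bytes; POST JSON if payload is given, else GET."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(url, data=data,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return resp.read()


def with_backoff(func, retries=5, base_delay=1.0):
    """Retry func on network errors, doubling the delay after each failure."""
    def wrapper(*args, **kwargs):
        for attempt in range(retries):
            try:
                return func(*args, **kwargs)
            except OSError:  # urllib's URLError and HTTPError subclass OSError
                if attempt == retries - 1:
                    raise
                time.sleep(base_delay * 2 ** attempt)
    return wrapper


def generate_pdf(name, payload, api_url, out_dir, fetch=fetch,
                 poll_interval=2.0, max_polls=150):
    """Run the three requests for one PDF and save the result in out_dir."""
    fetch = with_backoff(fetch)
    # 1) POST the payload; "status_url" is an assumed field name.
    status_url = json.loads(fetch(api_url, payload))["status_url"]
    # 2) Poll until the API reports 'status': 'complete'.
    for _ in range(max_polls):
        status = json.loads(fetch(status_url))
        if status.get("status") == "complete":
            break
        time.sleep(poll_interval)
    else:
        raise RuntimeError("%s: not complete after %d polls" % (name, max_polls))
    # 3) GET the file ("file_url" is also assumed) and write it to disk.
    path = Path(out_dir) / (name + ".pdf")
    path.write_bytes(fetch(status["file_url"]))
    return path


def generate_all(jobs, api_url, out_dir, workers=10, **options):
    """jobs is a list of (name, payload) pairs; returns the saved paths in order."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)

    def run(job):
        name, payload = job
        return generate_pdf(name, payload, api_url, out_dir, **options)

    with ThreadPool(workers) as pool:
        return pool.map(run, jobs)
```

The workers argument is the knob for "10 or 100 requests at a time". Threads
should be enough here because each job spends nearly all of its time waiting
on the network.
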
I'm not a Django expert. Does the Django View require all that stuff to
happen before Django can send an HTTP Response back to the user? If so, I
suggest you respond to the user immediately: "now generating a bunch of
PDFs..." and take care of the rest in the background. Perhaps just write
the info for the PDFs to generate to a file, database, or message queue.
Have a separate program periodically read the file, database, or message
queue to do that work, and then maybe email the user when it's completed.
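
To make the "write it down now, do the work later" idea concrete, here is a
minimal sketch that uses an SQLite table as the queue. The table layout and
the function names (open_queue, enqueue, process_pending, run_worker) are
invented for the example, and it assumes a single worker process; a real
message queue would take care of locking between several workers for you.

```python
import json
import sqlite3
import time


def open_queue(path):
    """Open (and create if needed) the job table; the schema is illustrative."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS jobs ("
               "id INTEGER PRIMARY KEY, payload TEXT NOT NULL, "
               "state TEXT NOT NULL DEFAULT 'pending')")
    db.commit()
    return db


def enqueue(db, payload):
    """Called from the web request: record the job and return at once."""
    with db:  # commits on success, rolls back on error
        cur = db.execute("INSERT INTO jobs (payload) VALUES (?)",
                         (json.dumps(payload),))
    return cur.lastrowid


def process_pending(db, handler):
    """Run handler(payload) for every pending job; return how many succeeded."""
    rows = db.execute("SELECT id, payload FROM jobs WHERE state = 'pending' "
                      "ORDER BY id").fetchall()
    done = 0
    for job_id, payload in rows:
        try:
            handler(json.loads(payload))
            state = "done"
            done += 1
        except Exception:
            state = "failed"  # keep the row so the failure can be inspected
        with db:
            db.execute("UPDATE jobs SET state = ? WHERE id = ?", (state, job_id))
    return done


def run_worker(db, handler, interval=30.0):
    """The separate program: check the queue periodically, forever."""
    while True:
        process_pending(db, handler)
        time.sleep(interval)
```

The view would call enqueue() and respond right away; the handler passed to
run_worker() is where the PDF generation and the notification email would go.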