HTTP server benchmarking/load testing in Python

Thomas Passin list1 at tompassin.net
Wed Jan 25 13:21:23 EST 2023


On 1/25/2023 10:53 AM, Dino wrote:
> 
> Hello, I could use something like Apache ab in Python ( 
> https://httpd.apache.org/docs/2.4/programs/ab.html ).
> 
> The reason why ab doesn't quite cut it for me is that I need to define a 
> pool of HTTP requests and I want the tool to run those (as opposed to 
> running the same request over and over again)
> 
> Does such a marvel exist?
> 
> Thinking about it, it doesn't necessarily need to be Python, but I guess 
> I would have a chance to tweak things if it was.


I actually have a Python program that does exactly this.  The intention 
was to simulate a large number of independent users hitting a particular 
web site as rapidly as possible, to see what the typical throughput is. 
The program is somewhat specialized in the nature of the requests, but 
the method is easy enough to implement.

The requests are composed from a pool of 300 pieces, and for each 
request, four pieces are selected randomly with replacement and combined 
to form the entire request. The idea here is to try to minimize caching, 
so as to better assess the throughput for random queries.
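
A rough sketch of that selection step, not the actual program: the pool 
contents, the "/search?q=" path and the name make_path are all placeholders 
invented for illustration.  With 300 pieces and four draws there are 
300**4 = 8,100,000,000 ordered combinations, so repeated requests are rare.

```python
import random
from urllib.parse import quote

# Hypothetical pool: 300 placeholder terms standing in for the real pieces.
POOL = [f"term{i:03d}" for i in range(300)]
PIECES_PER_REQUEST = 4


def make_path(rng=random):
    """Pick four pieces at random, with replacement, and build a request path.

    The "/search?q=" prefix is an assumption; substitute whatever the
    server under test actually expects.
    """
    pieces = rng.choices(POOL, k=PIECES_PER_REQUEST)  # choices() samples with replacement
    return "/search?q=" + quote(" ".join(pieces))


if __name__ == "__main__":
    for _ in range(3):
        print(make_path())
```

Passing a seeded random.Random instance as rng makes a run repeatable, 
which helps when comparing two server configurations.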

The program runs a configurable number of threads.  Each thread tries to 
maintain an average query rate, but you have to throttle them to prevent 
an exponential buildup of the request queue.  If you run the program, 
the server machine (usually the same as the querying machine) is likely 
to get very hot - it can be quite a stress test - and you want to 
monitor the CPU temperatures just in case.
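
One way the threaded, throttled loop could look, using only the standard 
library.  This is a sketch under assumptions, not the program described 
above: the names fetch, worker and run, the fixed per-thread rate, and the 
"do not try to catch up" policy are all choices made for the example.

```python
import http.client
import threading
import time
import urllib.request


def fetch(url):
    """Issue one GET and discard the body; errors propagate to the caller."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        resp.read()


def worker(make_url, rate, stop_at, counts, lock, do_request):
    """Send about `rate` requests per second until the monotonic time `stop_at`."""
    interval = 1.0 / rate
    next_due = time.monotonic()
    while time.monotonic() < stop_at:
        try:
            do_request(make_url())
            key = "ok"
        except (OSError, http.client.HTTPException):
            key = "failed"
        with lock:
            counts[key] += 1
        # Throttle: wait for the next slot.  If the request overran its
        # slot, restart the schedule from now instead of sending a burst
        # to catch up, so a slow server does not get a growing backlog.
        next_due += interval
        delay = next_due - time.monotonic()
        if delay > 0:
            time.sleep(delay)
        else:
            next_due = time.monotonic()


def run(make_url, threads=4, rate=5.0, duration=10.0, do_request=fetch):
    """Run `threads` workers for `duration` seconds; return (counts, ok per second)."""
    counts = {"ok": 0, "failed": 0}
    lock = threading.Lock()
    start = time.monotonic()
    stop_at = start + duration
    pool = [
        threading.Thread(
            target=worker,
            args=(make_url, rate, stop_at, counts, lock, do_request),
        )
        for _ in range(threads)
    ]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    elapsed = time.monotonic() - start
    return counts, counts["ok"] / elapsed
```

For example, run(lambda: "http://localhost:8000/", threads=8, rate=20) 
would aim for roughly 160 requests per second in total against a 
(hypothetical) local server.  The do_request parameter exists so the 
throttling can be tried out with a stub before pointing it at a real host.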

I can't share the actual code for copyright reasons, but the above 
description should be helpful.  The actual code is not very complicated 
nor hard to develop.

I also have a version that uses async techniques instead of threads.  To 
give a feel for using a program like this, I think I can show the 
'__main__' bit:

if __name__ == '__main__':
    handleCmdLine()

    # Warm up [redacted] in case it is not ready to get flooded with queries
    for n in range(WARMUP_REPS):
        HTTPClient(HOST, setpath(), True)
    asyncore.loop()

    # Warmup done, reset hit counter
    reps = 0

    # And away we go ...
    for n in range(NUMCLIENTS):
        HTTPClient(HOST, setpath())

    start = clock()
    asyncore.loop(timeout=50)
    now = clock()

    sys.stderr.write('\n')
    reps_per_sec = reps / (now - start)
    print('%0.1f hits/sec' % reps_per_sec)
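
Note for current Python versions: asyncore was removed from the standard 
library in Python 3.12 and time.clock in 3.8, so the excerpt above needs 
an older interpreter.  The same warm-up-then-measure shape can be sketched 
with asyncio and time.perf_counter.  This is only an illustration; 
http_get, client and measure are made-up names, and the bare HTTP/1.0 
request is a simplification rather than what HTTPClient does.

```python
import asyncio
import time


async def http_get(host, port, path):
    """Send a bare HTTP/1.0 GET and return the response status line."""
    reader, writer = await asyncio.open_connection(host, port)
    try:
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        writer.write(request.encode("ascii"))
        await writer.drain()
        status = await reader.readline()
        await reader.read()  # consume the rest until the server closes
        return status.decode("latin-1").strip()
    finally:
        writer.close()
        await writer.wait_closed()


async def client(fetch, deadline, counter):
    """Call `fetch` back to back until `deadline`, counting completed calls."""
    while time.perf_counter() < deadline:
        await fetch()
        counter[0] += 1


async def measure(fetch, num_clients=10, warmup_reps=5, duration=5.0):
    """Warm up, then run `num_clients` concurrent clients; return hits/sec.

    `fetch` is any zero-argument coroutine function that performs one
    request.  An exception raised by `fetch` aborts the measurement.
    """
    for _ in range(warmup_reps):
        await fetch()

    counter = [0]  # shared hit counter, reset after the warm-up
    start = time.perf_counter()
    deadline = start + duration
    await asyncio.gather(
        *(client(fetch, deadline, counter) for _ in range(num_clients))
    )
    elapsed = time.perf_counter() - start
    return counter[0] / elapsed
```

A call might look like 
asyncio.run(measure(lambda: http_get("localhost", 8000, "/"))), where the 
host, port and path are placeholders for the site being tested.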


There are also some polling-based systems available that do a similar 
job.  I don't remember the name of the one I tried a few years ago.  It 
ran a server to run the queries and reported the results via your browser.



