Load testing with Autocannon
Already one of the fastest tools available for testing web-server performance, Autocannon gets even faster with new access to Node.js worker threads.
To ensure a blazing fast web server, you must first measure its performance so you can optimise it to its fullest potential. Autocannon is one of many tools available for this purpose, as described in this overview of key features and use cases.
Users can simulate many thousands of requests per second with Autocannon, but Node.js’s single-threaded nature always set a cap on how far it could scale. That was until now. Support recently added to Autocannon gives it access to Node’s worker_threads module, which allows for more effective CPU utilisation and means Autocannon can simulate even higher volumes of web traffic than it could before. Check out the latest release, Autocannon v7.
For an idea of just how fast Autocannon can be, try this example:
Setting up a test server
I’m setting up a basic HTTP server in Go, but if you’re not familiar with the language, you can set up a Node.js cluster instead, as sketched below. You need to ensure it’s not a plain, single-threaded Node server, or you won’t know whether the output numbers have been limited by the capacity of the server or by Autocannon’s performance.
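Here is a minimal sketch of such a clustered Node server; the port and response payload are arbitrary choices for this example:

const cluster = require('node:cluster')
const http = require('node:http')
const os = require('node:os')

if (cluster.isPrimary) {
  // Fork one worker per CPU so the server itself is unlikely to be the bottleneck
  for (let i = 0; i < os.cpus().length; i++) {
    cluster.fork()
  }
} else {
  http.createServer((req, res) => {
    res.end('hello world')
  }).listen(3000)
}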
To see whether your target server is the bottleneck, check the process stats, for example in Activity Monitor on macOS. In one such run, the process with PID 33621 was a basic single-threaded Node server. Looking at % CPU, it’s apparent that, because the server is utilising one CPU fully, it might have hit its physical limit.
On the other hand, the process with PID 33625 was the Autocannon process with three workers, which has the potential to utilise more CPU if the target server allows.
Now, let’s start firing requests with different scenarios: First, we’ll see how Autocannon performs with a single thread. Then, we’ll compare the results with the new multithreaded Autocannon.
Autocannon without workers
autocannon http://localhost:3000/ -d 10 -c 30
Autocannon with workers
autocannon http://localhost:3000/ -d 10 -c 30 -w 3
In the first example, we did not use worker threads, and the result was already an astonishing 54k requests per second. However, in this example the Autocannon process gets capped by a single CPU thread. We changed this in the second example, running Autocannon with three workers. This time, we hit close to 112k requests per second!
Nonetheless, it is difficult to interpret the results without something to compare them with. Let’s look at how a similar setup performs when the load is generated using wrk instead of Autocannon. wrk is a popular open-source HTTP benchmarking tool written in C. Capable of generating significant load on a server, wrk supports response processing, custom reporting and HTTP request message generation via Lua scripts.
wrk http://localhost:3000/ -d 10 -c 30 -t 3
The setup is the same — three threads, ten seconds and 30 connections — but this time we get 29k requests per second, roughly a quarter of what we achieved in the previous example with Autocannon. Note that single-threaded Autocannon is not nearly as fast as wrk because it is limited by its inability to utilise multiple cores. This is the primary motivation behind implementing worker support in Autocannon.
How it works
Now that we’ve established the basis, let’s dive deeper to see what really happens and how Autocannon goes from fast to blazing fast when using workers.
Splitting the task
When Autocannon is executed, it hands over the configuration to the worker manager, which breaks down the config for each worker and fires up the number of workers requested. For example, if there are 30 connections and three workers, it gives ten connections to each worker. Or, if 90k requests are set using the --amount option, each worker gets 30k requests to make. This greatly improves performance by eliminating the single CPU thread cap. Each worker starts to batter the target server individually and reports back to the aggregator, which then collates each worker’s output.
Aggregating
Autocannon’s output histogram contains information including latencies, throughput and requests/sec. Tracking these in a single thread is pretty easy using HdrHistogramJS, but this becomes far more challenging when dealing with multiple workers. Each worker essentially runs an isolated task of firing requests as fast as possible for a given configuration. They track each of the above-mentioned metrics, and latencies can be added up easily after the process completes. But aggregating requests and throughput is tricky because, even though each worker is firing, for example, 10k requests per second, the effective load on the server is 30k requests per second (assuming a three-worker setup). Tracking this in individual histograms isn’t helpful because summing them up at the end would give us incorrect output.
Let’s see how Autocannon resolves this issue:
Instead of creating histograms for requests and throughput for each worker, Autocannon creates them in the manager. Each worker abstracts a histogram-like API and passes it to the test runner, which records data on every tick (one second). The runner is agnostic of where this data goes but, in reality, the worker updates the main histogram using postMessage.
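The pattern looks roughly like this — a simplified sketch, not Autocannon’s actual code; the message shape and the requestsProxy facade are illustrative, and it assumes the hdr-histogram-js package is installed:

const { Worker, isMainThread, parentPort } = require('node:worker_threads')

if (isMainThread) {
  const hdr = require('hdr-histogram-js')
  const requests = hdr.build() // the single requests histogram lives in the manager

  const worker = new Worker(__filename)
  worker.on('message', (msg) => {
    // every sample recorded by every worker lands in the same histogram
    if (msg.type === 'record') requests.recordValue(msg.value)
  })
  worker.on('exit', () => console.log(requests.totalCount))
} else {
  // histogram-like facade handed to the runner inside the worker
  const requestsProxy = {
    recordValue (value) {
      parentPort.postMessage({ type: 'record', name: 'requests', value })
    }
  }
  // on each one-second tick the runner records as usual,
  // unaware that the sample actually travels to the main thread
  requestsProxy.recordValue(10000)
}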
Customising requests
Running the autocannon command fires the same request every time, but for any non-trivial scenario, you typically want to customise this behaviour and fire specific requests based on various conditions.
Let’s assume a hypothetical server with two APIs:
POST /product — creates a new product
POST /catalog — associates a product with an existing catalog
We need to hit the create API with the new product’s name, read the generated ID from the response and use it in the next request to associate the product with the catalog, as in the sketch below.
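Here is a minimal sketch of such a run, assuming the hypothetical endpoints above; the response shape (a 201 status with a JSON body containing an id field) and the catalogId value are made up for illustration:

const autocannon = require('autocannon')

autocannon({
  url: 'http://localhost:3000',
  connections: 10,
  duration: 10,
  requests: [
    {
      method: 'POST',
      path: '/product',
      body: JSON.stringify({ name: 'Cannonball' }),
      // read the generated ID and stash it for the next request
      onResponse: (status, body, context) => {
        if (status === 201) {
          context.productId = JSON.parse(body).id
        }
      }
    },
    {
      method: 'POST',
      path: '/catalog',
      // use the stashed ID to build this request's body
      setupRequest: (req, context) => ({
        ...req,
        body: JSON.stringify({ productId: context.productId, catalogId: 42 })
      })
    }
  ]
}, console.log)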
In the above example, we fire a set of two requests each time. The first one creates a product, receives its ID and makes another request to add that product to a catalog. This is possible with the help of the setupRequest and onResponse options.
The first request has an onResponse callback, which saves the response into the context object. This context object is available in the second request, which uses it to mutate the req object and add a body. After that, the context is reinitialised.
When using workers
When using workers, the functions passed to setupRequest and onResponse cannot be cloned while passing options to workers (see the documentation on the structured clone algorithm). Due to this limitation, the behaviour of these options is slightly different: instead of supplying a function, you need to pass in the absolute path of a file that exports a function.
The above example changes to:
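Under the same assumptions as before, the sketch might become (the file names are illustrative):

const path = require('node:path')
const autocannon = require('autocannon')

autocannon({
  url: 'http://localhost:3000',
  workers: 3,
  requests: [
    {
      method: 'POST',
      path: '/product',
      body: JSON.stringify({ name: 'Cannonball' }),
      // absolute path to a file exporting the onResponse function
      onResponse: path.join(__dirname, 'on-response.js')
    },
    {
      method: 'POST',
      path: '/catalog',
      // absolute path to a file exporting the setupRequest function
      setupRequest: path.join(__dirname, 'setup-request.js')
    }
  ]
}, console.log)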
This allows the options to be passed down to the workers, which then require these files before starting the task. The on-response.js and setup-request.js files look like this:
on-response.js
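A sketch mirroring the inline callback above (the 201 status and response shape remain assumptions):

module.exports = (status, body, context) => {
  // save the generated product ID for the next request in the sequence
  if (status === 201) {
    context.productId = JSON.parse(body).id
  }
}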
setup-request.js
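And its counterpart, again mirroring the inline version (the catalogId is a stand-in):

module.exports = (req, context) => ({
  ...req,
  // attach the product ID captured by on-response.js
  body: JSON.stringify({ productId: context.productId, catalogId: 42 })
})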
Customising clients
Similar to the above section, there is an option to customise at the client level. For example:
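A minimal sketch, with a made-up header and body (client.setHeaders and client.setBody come from Autocannon’s client API):

const autocannon = require('autocannon')

autocannon({
  url: 'http://localhost:3000',
  connections: 10,
  duration: 10,
  setupClient: (client) => {
    // every request made over this connection carries these values
    client.setHeaders({ 'x-api-key': 'dummy-key' })
    client.setBody(JSON.stringify({ name: 'Cannonball' }))
  }
}, console.log)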
This ensures that all requests this client creates will have the specified body or header. To access this client instance, a setupClient option needs to be passed, as shown above.
When using workers
Similarly, when using workers, it’s not possible to pass a function to setupClient. Hence, a file path needs to be given, which will then be required.
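For instance (the setup-client.js file name is illustrative):

const path = require('node:path')
const autocannon = require('autocannon')

autocannon({
  url: 'http://localhost:3000',
  workers: 3,
  // absolute path to a file exporting the setupClient function
  setupClient: path.join(__dirname, 'setup-client.js')
}, console.log)

where setup-client.js exports the same kind of function as before:

module.exports = (client) => {
  client.setHeaders({ 'x-api-key': 'dummy-key' })
}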
Conclusion
Autocannon is pretty flexible. As shown in the examples above, you can run it from the command line as well as programmatically to implement complex testing scenarios. You can also use numerous other options and settings to tune it to your specific requirements. Check out Autocannon’s official documentation and try it out if you haven’t already.