Using Google Benchmark
Adding benchmarks to our build to measure the performance of our code, and track how it changes over time
Our tests tell us if our code is correct, but not if it's fast. To measure performance, we use a benchmark. A benchmark is a program designed to run a piece of code repeatedly under controlled conditions to produce stable and comparable performance metrics.
Specifically, we'll focus on micro-benchmarks. A micro-benchmark is an automated script that measures the performance of a specific part of our program - perhaps a single function or algorithm.
Google Benchmark is the de facto standard for C++ micro-benchmarking. It is a library that provides a framework for quickly creating these benchmarks, and it takes care of details like running the code enough times to get a stable result and preventing the compiler from optimizing the measured code away.
Integrating Google Benchmark
First, let's add benchmark to our vcpkg.json manifest.
vcpkg.json
{
"name": "greeter",
"dependencies": [
"gtest",
"spdlog",
"benchmark"
]
}
Next, we'll create a new benchmarks/ directory for our benchmark code and its CMakeLists.txt.
benchmarks/CMakeLists.txt
cmake_minimum_required(VERSION 3.23)
find_package(benchmark CONFIG REQUIRED)
add_executable(GreeterBenchmarks bench_main.cpp)
target_link_libraries(GreeterBenchmarks PRIVATE
GreeterLib
benchmark::benchmark
)
Finally, we add this new directory to our root CMakeLists.txt:
CMakeLists.txt
cmake_minimum_required(VERSION 3.23)
project(Greeter)
include(cmake/Coverage.cmake)
include(cmake/Sanitize.cmake)
add_subdirectory(app)
add_subdirectory(greeter)
enable_testing()
add_subdirectory(tests)
add_subdirectory(benchmarks)
Writing a Benchmark
A benchmark looks very similar to a GoogleTest case. Let's write one in benchmarks/bench_main.cpp to measure our Greeter::greet() method.
benchmarks/bench_main.cpp
#include <benchmark/benchmark.h>
#include <greeter/Greeter.h>
static void BM_Greeter_Greet(benchmark::State& state) {
Greeter g;
// This loop is the core of the benchmark
for (auto _ : state) {
// This code gets timed
std::string result = g.greet();
// Prevent the result from being optimized away
benchmark::DoNotOptimize(result);
}
}
// Register the function as a benchmark
BENCHMARK(BM_Greeter_Greet);
// Run all benchmarks
BENCHMARK_MAIN();
Running the Benchmark
To get useful benchmarking results, we should build our project in "Release" mode. This enables compiler optimizations and prevents debugging helpers from distorting our measurements.
Let's add some presets for this, if we haven't already:
CMakePresets.json
{
"version": 3,
"configurePresets": [
// ... other presets
{
"name": "release",
"inherits": "default",
"cacheVariables": {
"CMAKE_BUILD_TYPE": "Release"
}
}
],
"buildPresets": [
// ... other presets
{
"name": "release",
"configurePreset": "release",
"configuration": "Release"
}
]
}
We can now configure and build our project with these new presets. From the project root:
cmake --preset=release
cmake --build --preset=release
Building the GreeterBenchmarks target should have produced a GreeterBenchmarks (or GreeterBenchmarks.exe) executable in the build/benchmarks directory. We can run it in the usual way from the project root:
./build/benchmarks/GreeterBenchmarks
Google Benchmark will run each benchmark and produce a detailed report showing the average time taken per iteration, CPU time, and other useful metrics.
-----------------------------------------------
Benchmark Time CPU Iterations
-----------------------------------------------
BM_Greeter_Greet 72.3 ns 69.8 ns 8960000
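The console table is handy for a quick look, but since we also want to track how performance changes over time, it's worth knowing that Google Benchmark can write the same report to a file using its standard --benchmark_out and --benchmark_out_format flags. The results.json filename below is just an example:
./build/benchmarks/GreeterBenchmarks --benchmark_out=results.json --benchmark_out_format=json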
Parameterized Benchmarks
A simple benchmark is useful, but often we'll want to run the same benchmark across many cases. For example, we might want to see how our code behaves across a variety of different input sizes, or to compare the performance of multiple options.
Much like GoogleTest, Google Benchmark includes utilities to help us create parameterized benchmarks that run the same code with different arguments.
Let's modify our Greeter class to greet a specific person by name, and then benchmark how the greet() method performs with names of different lengths.
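The updated Greeter itself isn't reproduced here, but based on how the benchmark below uses it, a minimal version might look roughly like this. The mName member and the constructor signature are our assumptions, not necessarily the lesson's exact code:
greeter/Greeter.h (illustrative sketch)
#pragma once
#include <string>
class Greeter {
public:
// Assumed constructor: store the name of the person to greet
explicit Greeter(std::string name) : mName(std::move(name)) {}
// Build the greeting string each time it's called
std::string greet() const { return "Hello, " + mName + "!"; }
private:
std::string mName;
};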
Now, we can update our benchmark code. We'll pass arguments to the benchmark function using the Arg() method, representing the length of the name we want to test:
benchmarks/bench_main.cpp
#include <benchmark/benchmark.h>
#include <greeter/Greeter.h>
#include <string>
static void BM_Greeter_Greet(benchmark::State& state) {
// state.range(0) is the first argument to the benchmark
std::string name(state.range(0), 'x');
Greeter g(name);
for (auto _ : state) {
std::string result = g.greet();
benchmark::DoNotOptimize(result);
}
}
// Register benchmarks with different arguments
BENCHMARK(BM_Greeter_Greet)->Arg(8);
BENCHMARK(BM_Greeter_Greet)->Arg(64);
BENCHMARK(BM_Greeter_Greet)->Arg(512);
BENCHMARK(BM_Greeter_Greet)->Arg(4096);
BENCHMARK(BM_Greeter_Greet)->Arg(32768);
BENCHMARK_MAIN();
This Arg() approach is generally flexible, but the specific case of testing different input sizes is so common that a shortcut is available in the form of the Range() method:
// Before:
BENCHMARK(BM_Greeter_Greet)->Arg(8);
BENCHMARK(BM_Greeter_Greet)->Arg(64);
BENCHMARK(BM_Greeter_Greet)->Arg(512);
BENCHMARK(BM_Greeter_Greet)->Arg(4096);
BENCHMARK(BM_Greeter_Greet)->Arg(32768);
// After:
BENCHMARK(BM_Greeter_Greet)->Range(8, 32768);
This use of Range(8, 32768) tells Google Benchmark to run this benchmark multiple times. It will start with an argument of 8, and for each subsequent run, it will multiply the argument by 8 until it reaches or exceeds 32768 (which is 8^5).
If we run this new benchmark, we'll get a table of results showing how the performance scales with the input size:
cmake --preset release
cmake --build --preset release
./build/benchmarks/GreeterBenchmarks
------------------------------------------------------
Benchmark Time CPU Iterations
------------------------------------------------------
BM_Greeter_Greet/8 15.7 ns 15.7 ns 44800000
BM_Greeter_Greet/64 91.6 ns 92.1 ns 7466667
BM_Greeter_Greet/512 106 ns 106 ns 5600000
BM_Greeter_Greet/4096 167 ns 167 ns 4480000
BM_Greeter_Greet/32768 1507 ns 1507 ns 497778
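The factor of 8 between arguments is just Range()'s default multiplier, which is what produced the 8, 64, 512, 4096, 32768 sequence above. If we want different steps, Google Benchmark's RangeMultiplier() method changes that factor. For example (a sketch, not part of the lesson's code):
// Double the argument each time: 8, 16, 32, ..., 32768
BENCHMARK(BM_Greeter_Greet)->RangeMultiplier(2)->Range(8, 32768);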
Other Benchmark Features
Google Benchmark has a wide range of features to support benchmarking. Here are a few other capabilities you might find useful:
- Fixtures: Just like in GoogleTest, you can create a fixture class (by inheriting from benchmark::Fixture) to handle complex setup and teardown logic that can be shared across multiple benchmarks.
- Time Units: You can control the time unit reported in the output (nanoseconds, microseconds, etc.) by calling Unit(benchmark::kMillisecond) on your benchmark registration.
- Complexity Analysis: Google Benchmark can automatically compute the asymptotic complexity of your code - e.g., O(N) or O(N log N) - if you provide it with the input size via the Arg() or Range() methods.
- Custom Counters: You can report your own custom metrics (e.g., "bytes processed per second") from within your benchmark loop; the sketch after this list shows one way to do that.
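As a quick taste of how a few of these fit together, here's a sketch that is not part of the lesson's project. BM_SortNames and its vector-of-strings workload are made-up placeholders, but SetComplexityN(), state.counters, Complexity(), and Unit() are Google Benchmark's real API:
benchmarks/bench_main.cpp (illustrative sketch)
#include <benchmark/benchmark.h>
#include <algorithm>
#include <string>
#include <vector>
static void BM_SortNames(benchmark::State& state) {
// Made-up workload: sort a vector of state.range(0) strings
std::vector<std::string> names(state.range(0), "example");
for (auto _ : state) {
std::sort(names.begin(), names.end());
benchmark::DoNotOptimize(names);
}
// Report the input size so Google Benchmark can fit a complexity curve
state.SetComplexityN(state.range(0));
// Custom counter: names processed per second across all iterations
state.counters["NamesPerSec"] = benchmark::Counter(
static_cast<double>(state.iterations() * state.range(0)),
benchmark::Counter::kIsRate);
}
BENCHMARK(BM_SortNames)
->Range(8, 32768)
->Complexity(benchmark::oAuto) // fit the best-matching curve automatically
->Unit(benchmark::kMicrosecond); // report times in microseconds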
For a complete guide to all the features, the official Google Benchmark documentation is the best resource.
Summary
In this lesson, we've seen how to start measuring performance within our build.
- Micro-benchmarking: A benchmark is a program for measuring the performance of a small piece of code in a controlled environment.
- Google Benchmark: This is an extremely popular library for C++ benchmarking. We integrated it into our project using vcpkg and a dedicated benchmarks target.
- Writing Benchmarks: We learned the basic structure of a benchmark, including the for (auto _ : state) loop and the importance of benchmark::DoNotOptimize() to prevent the compiler from removing the code being tested.
- Parameterized Benchmarks: We used the Arg() and Range() methods to run our benchmark with a variety of inputs, allowing us to see how performance scales.