Using Google Benchmark

Adding benchmarks to our build to measure the performance of our code, and track how it changes over time

Greg Filak

Our tests tell us if our code is correct, but not if it's fast. To measure performance, we use a benchmark. A benchmark is a program designed to run a piece of code repeatedly under controlled conditions to produce stable and comparable performance metrics.

Specifically, we'll focus on micro-benchmarks. A micro-benchmark is an automated script that measures the performance of a specific part of our program - perhaps a single function or algorithm.

Google Benchmark is the de facto standard for C++ micro-benchmarking. It provides a framework for quickly creating these benchmarks, and it takes care of details like running the code enough times to get a stable result and preventing the compiler from optimizing the measured code away.

Integrating Google Benchmark

First, let's add the benchmark package to our vcpkg.json manifest.

vcpkg.json

{
  "name": "greeter",
  "dependencies": [
    "gtest",
    "spdlog",
    "benchmark"
  ]
}

Next, we'll create a new benchmarks/ directory for our benchmark code and its CMakeLists.txt.

benchmarks/CMakeLists.txt

cmake_minimum_required(VERSION 3.23)

find_package(benchmark CONFIG REQUIRED)

add_executable(GreeterBenchmarks bench_main.cpp)

target_link_libraries(GreeterBenchmarks PRIVATE
  GreeterLib
  benchmark::benchmark
)

Finally, we add this new directory to our root CMakeLists.txt:

CMakeLists.txt

cmake_minimum_required(VERSION 3.23)
project(Greeter)

include(cmake/Coverage.cmake)
include(cmake/Sanitize.cmake)

add_subdirectory(app)
add_subdirectory(greeter)

enable_testing()
add_subdirectory(tests)

add_subdirectory(benchmarks)

Writing a Benchmark

A benchmark looks very similar to a GoogleTest case. Let's write one in benchmarks/bench_main.cpp to measure our Greeter::greet() method.

benchmarks/bench_main.cpp

#include <benchmark/benchmark.h>
#include <greeter/Greeter.h>

static void BM_Greeter_Greet(benchmark::State& state) {
  Greeter g;
  // This loop is the core of the benchmark
  for (auto _ : state) {
    // This code gets timed
    std::string result = g.greet();
    // Prevent the result from being optimized away
    benchmark::DoNotOptimize(result);
  }
}

// Register the function as a benchmark
BENCHMARK(BM_Greeter_Greet);

// Run all benchmarks
BENCHMARK_MAIN();

Running the Benchmark

To get useful benchmarking results, we should build our project in "Release" mode. This enables compiler optimizations and prevents debugging helpers from distorting our measurements - timing an unoptimized build tells us little about real-world performance.

Let's add some presets for this, if we haven't already:

CMakePresets.json

{
  "version": 3,
  "configurePresets": [
    // ... other presets
    {
      "name": "release",
      "inherits": "default",
      "cacheVariables": {
        "CMAKE_BUILD_TYPE": "Release"
      }
    }
  ],
  "buildPresets": [
    // ... other presets
    {
      "name": "release",
      "configurePreset": "release",
      "configuration": "Release"
    }
  ]
}

We can now configure and build our project with these new presets. From the project root:

cmake --preset=release
cmake --build --preset=release

Building the GreeterBenchmarks target should have generated a GreeterBenchmarks (or GreeterBenchmarks.exe on Windows) executable in the build/benchmarks directory. We can run it in the usual way from the project root:

./build/benchmarks/GreeterBenchmarks

Google Benchmark will run the benchmark and produce a detailed report showing the average wall-clock time per iteration, the CPU time, and the number of iterations it performed:

-----------------------------------------------
Benchmark            Time      CPU   Iterations
-----------------------------------------------
BM_Greeter_Greet  72.3 ns  69.8 ns      8960000
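Since our goal is to track how performance changes over time, it's worth knowing that the report format is configurable. Google Benchmark's standard command-line flags let us emit machine-readable output - for example, JSON that we can archive or feed into other tools:

./build/benchmarks/GreeterBenchmarks --benchmark_format=json
./build/benchmarks/GreeterBenchmarks --benchmark_out=results.json --benchmark_out_format=json

The first command prints JSON to the console instead of the table above; the second keeps the console table and additionally writes the results to results.json.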

Parameterized Benchmarks

A simple benchmark is useful, but often we'll want to run the same benchmark across many cases. For example, we might want to see how our code behaves across a variety of different input sizes, or to compare the performance of multiple options.

Much like GoogleTest, Google Benchmark includes utilities to help us create parameterized benchmarks that run the same code with different arguments.

Let's modify our Greeter class to greet a specific person by name, and then benchmark how the greet() method performs with names of different lengths.

Both the greeter library and its tests need corresponding updates for the new constructor.
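As a minimal sketch, the updated class needs a constructor that accepts a name and a greet() method that uses it. The exact implementation in the lesson's files may differ, but greeter/Greeter.h might look something like this:

greeter/Greeter.h

#pragma once

#include <string>
#include <utility>

class Greeter {
public:
  // Store the name of the person to greet
  explicit Greeter(std::string name) : name_{std::move(name)} {}

  // Build the greeting; the work scales with the name's length
  std::string greet() const { return "Hello, " + name_ + "!"; }

private:
  std::string name_;
};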

Now, we can update our benchmarking. We'll pass arguments to the benchmark function using the Arg() method, representing the length of the name we want to test:

benchmarks/bench_main.cpp

#include <benchmark/benchmark.h>
#include <greeter/Greeter.h>
#include <string>

static void BM_Greeter_Greet(benchmark::State& state) {
  // state.range(0) is the first argument to the benchmark
  std::string name(state.range(0), 'x');
  Greeter g(name);

  for (auto _ : state) {
    std::string result = g.greet();
    benchmark::DoNotOptimize(result);
  }
}

// Register benchmarks with different arguments
BENCHMARK(BM_Greeter_Greet)->Arg(8);
BENCHMARK(BM_Greeter_Greet)->Arg(64);
BENCHMARK(BM_Greeter_Greet)->Arg(512);
BENCHMARK(BM_Greeter_Greet)->Arg(4096);
BENCHMARK(BM_Greeter_Greet)->Arg(32768);

BENCHMARK_MAIN();

The Arg() approach is flexible, but the specific case of sweeping across input sizes is so common that a shortcut is available in the form of the Range() method:

// Before:
BENCHMARK(BM_Greeter_Greet)->Arg(8);
BENCHMARK(BM_Greeter_Greet)->Arg(64);
BENCHMARK(BM_Greeter_Greet)->Arg(512);
BENCHMARK(BM_Greeter_Greet)->Arg(4096);
BENCHMARK(BM_Greeter_Greet)->Arg(32768);

// After:
BENCHMARK(BM_Greeter_Greet)->Range(8, 32768);

This use of Range(8, 32768) tells Google Benchmark to run this benchmark multiple times. It will start with an argument of 8, and for each subsequent run, it will multiply the argument by 8 until it reaches or exceeds 32,768 (which is 8^5).
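The multiplier of 8 is Google Benchmark's default. If we want finer-grained data points, we can change the step with the RangeMultiplier() method:

// Sweep powers of two instead: 8, 16, 32, ..., 32768
BENCHMARK(BM_Greeter_Greet)->RangeMultiplier(2)->Range(8, 32768);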

If we run this new benchmark, we'll get a table of results showing how the performance scales with the input size:

cmake --preset=release
cmake --build --preset=release
./build/benchmarks/GreeterBenchmarks

------------------------------------------------------
Benchmark                  Time        CPU  Iterations
------------------------------------------------------
BM_Greeter_Greet/8      15.7 ns    15.7 ns    44800000
BM_Greeter_Greet/64     91.6 ns    92.1 ns     7466667
BM_Greeter_Greet/512     106 ns     106 ns     5600000
BM_Greeter_Greet/4096    167 ns     167 ns     4480000
BM_Greeter_Greet/32768  1507 ns    1507 ns      497778

Other Benchmark Features

Google Benchmark has a wide range of features to support benchmarking. Here are a few other capabilities you might find useful:

  • Fixtures: Just like in GoogleTest, you can create a fixture class (by inheriting from benchmark::Fixture) to handle complex setup and teardown logic that can be shared across multiple benchmarks - see the sketch after this list.
  • Time Units: You can control the time unit reported in the output (nanoseconds, microseconds, etc.) by calling Unit(benchmark::kMillisecond) on your benchmark registration.
  • Complexity Analysis: Google Benchmark can automatically estimate the asymptotic complexity of your code - e.g., O(n), O(n^2) - if you report the input size with state.SetComplexityN() and call Complexity() on the benchmark registration.
  • Custom Counters: You can report your own custom metrics (e.g., "bytes processed per second") from within your benchmark loop.
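As a taste of the fixture, time unit, and custom counter features, here's a minimal sketch that could live alongside our existing benchmarks in bench_main.cpp. It assumes the Greeter constructor from the earlier sketch, and names like GreeterFixture are our own:

#include <benchmark/benchmark.h>
#include <greeter/Greeter.h>

#include <memory>
#include <string>

// A fixture shares setup/teardown logic across benchmarks.
// SetUp() runs before timing starts, so construction cost is not measured.
class GreeterFixture : public benchmark::Fixture {
public:
  void SetUp(benchmark::State&) override {
    greeter = std::make_unique<Greeter>("Alice");
  }
  void TearDown(benchmark::State&) override { greeter.reset(); }

  std::unique_ptr<Greeter> greeter;
};

BENCHMARK_DEFINE_F(GreeterFixture, Greet)(benchmark::State& state) {
  for (auto _ : state) {
    std::string result = greeter->greet();
    benchmark::DoNotOptimize(result);
  }
  // Custom counter: report greetings per second alongside the timings
  state.counters["greetings/s"] = benchmark::Counter(
      static_cast<double>(state.iterations()), benchmark::Counter::kIsRate);
}

// Registering separately (rather than with BENCHMARK_F) lets us chain
// options such as the reported time unit
BENCHMARK_REGISTER_F(GreeterFixture, Greet)->Unit(benchmark::kNanosecond);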

For a complete guide to all the features, the official Google Benchmark documentation is the best resource.

Summary

In this lesson, we've seen how to start measuring performance within our build.

  • Micro-benchmarking: A benchmark is a program for measuring the performance of a small piece of code in a controlled environment.
  • Google Benchmark: This is an extremely popular library for C++ benchmarking. We integrated it into our project using vcpkg and a dedicated benchmarks target.
  • Writing Benchmarks: We learned the basic structure of a benchmark, including the for (auto _ : state) loop and the importance of benchmark::DoNotOptimize() to prevent the compiler from removing the code being tested.
  • Parameterized Benchmarks: We used the Arg() and Range() methods to run our benchmark with a variety of inputs, allowing us to see how performance scales.