Comparators and Projections
Learn how C++20 Projections allow us to separate sorting logic from data layout, the mechanics of std::invoke and the hardware reality of sorting large objects.
In the previous chapter, we explored the mechanics of views and pipes. We learned how to compose algorithms into lazy-evaluated pipelines that process data efficiently.
However, all our examples so far have been trivial. We've been sorting lists of integers or filtering arrays of floats. In the real world, we rarely store primitive types. We store Players, Transactions, NetworkPackets, or Monsters.
When we apply algorithms to these complex objects, we run into a "knowledge gap." If you ask std::ranges::sort() to sort a std::vector<int>, it knows exactly what to do: compare the integers using the < operator.
If you ask it to sort a std::vector<Player>, it panics. It doesn't know how to compare two players. Do you want to sort by health? By name? By ID? And even if it knew you wanted to sort by "Health," it doesn't know how to access that data inside your struct.
Previously, solving this required writing custom comparators. In C++20, we have a much more elegant tool: projections.
In this lesson, we will learn how projections allow us to separate the "key" from the "value," drastically reducing boilerplate code.
Keys vs. Payloads
To understand the philosophy behind projections, we need to borrow a concept from database theory: the distinction between the key and the payload.
When you sort a spreadsheet of employees by their salary, the "Salary" column is the key. It is the only piece of data that matters for the ordering. The rest of the row (Name, Address, Phone Number) is just the payload - data that is effectively along for the ride.
In C++, we typically bundle the key and the payload together into a single struct.
struct Player {
// Potential Keys
int id;
int score;
float health;
// Heavy Payload
std::string name;
std::vector<std::string> inventory;
char padding[1024];
};The Old Way: Bundled Logic
Before C++20, if we wanted to sort players by score, we had to write a comparator that manually extracted the keys from both objects.
Standard library algorithms that require comparisons accept an additional argument where we can supply that custom logic.
Below, we provide this in the form of a lambda which std::sort() will call every time it needs a decision. It will provide two elements from our container as arguments, and we should return true if the first argument should come before the second:
std::sort(players.begin(), players.end(),
[](const Player& a, const Player& b) {
return a.score < b.score;
}
);This works, but it mixes two distinct logic steps into one lambda:
- Extraction: Getting
.scorefrom the object. - Comparison: Using
<to check the order.
If we wanted to order our players from highest to lowest, we'd have to provide a completely different comparator, just to change the < to >.
If we wanted to check if a range contains a player who hasn't scored yet, perhaps with the help of an algorithm like std::find_if(), we'd have to write yet another comparator:
std::find_if(players.begin(), players.end(),
[](const Player& p) {
return p.score == 0;
}
);The C++20 Solution: Projections
C++20 algorithms in the <ranges> library accept a new, optional argument: the projection.
A projection is simply a function (or member pointer) that the algorithm applies to every element before it performs an operation. It extracts the key from the payload.
Instead of writing a custom comparator, we can tell the algorithm: "I want to sort using the default comparison logic (std::ranges::less), but please sort based on the score member variable."
#include <vector>
#include <algorithm>
#include <ranges>
#include "Player.h"
void ProcessPlayers(std::vector<Player>& players) {
using std::ranges::sort;
// Sort by score using default ordering (ascending)
sort(players, {}, &Player::score);
// Sort by score (descending order)
sort(players, std::ranges::greater{}, &Player::score);
// Sort by Health
sort(players, {}, &Player::health);
// Find a player with score == 0
auto it = std::ranges::find(players, 0, &Player::score);
}How it Works: std::invoke()
You might be wondering: how can we pass &Player::score as a function? It's a member pointer, not a function pointer.
The standard library uses a utility called std::invoke() to handle this. It allows us to use a standard "do something with an object" indirection that gets converted to the appropriate raw C++ syntax at compile time:
- If we pass a function pointer, lambda, or functor to
std::invoke(), the syntax gets converted tofunc(obj). - With a member function pointer, the syntax becomes
(obj.*func)(). - If we pass a member data pointer, the syntax becomes
obj.*member.
Because the standard algorithms use std::invoke() internally, you can use any of these as a projection.
struct Monster {
int id;
int GetPowerLevel() const { return id * 10; }
};
void Examples(std::vector<Monster>& monsters) {
// 1. Data Member Projection
std::ranges::sort(monsters, {}, &Monster::id);
// 2. Member Function Projection
// Sorts by the result of GetPowerLevel()
std::ranges::sort(monsters, {}, &Monster::GetPowerLevel);
// 3. Lambda Projection
// Sorts by a computed value (e.g., negative ID for descending)
std::ranges::sort(monsters, {}, [](const Monster& m) {
return -m.id;
});
}Standard Library Support
Projections aren't just for sorting. They are available across the entire std::ranges namespace, on pretty much any algorithm where they'd be useful. This allows us to use simple algorithms for complex queries.
Here are some examples:
std::ranges::find()
Normally, find() looks for an element equal to a value. With a projection, we can look for an object whose key equals a value:
// Find the player with ID 42
auto it = std::ranges::find(players, 42, &Player::id);
if (it != players.end()) {
// 'it' points to the whole Player object
std::cout << "Found: " << it->name << "\n";
}std::ranges::max_element()
Finding the "best" item becomes trivial:
// Find the player with the highest health
auto toughest = std::ranges::max_element(players, {}, &Player::health);std::ranges::count()
Counting items based on a property:
// Count how many players have a score of 0
int noobs = std::ranges::count(players, 0, &Player::score);Note how readable this is. We don't see any lambdas or loop bodies. The code reads almost like an English sentence: "Count players [where] 0 [matches] score."
Expensive Projections
There is an important performance trap to be aware of. Algorithms like sort() evaluate the projection every time they perform a comparison. For a sort, this happens times.
If your projection is a simple member access (&Player::score), this is fine. It compiles to a single CPU instruction (an offset load).
However, if our projection performs a calculation, that calculation will be repeated millions of times.
// BAD: Computing distance inside the projection
// This square root runs O(n log n) times!
std::ranges::sort(points, {}, [](const Point& p) {
return std::sqrt(p.x*p.x + p.y*p.y);
});The compiler can sometimes optimize this, but you shouldn't rely on it. If the key is expensive to compute, we should intervene here. The proxy sort pattern generally works well in these scenarios - we can compute the keys once into a separate std::vector<float> keys, and then sort indices based on that key vector. We cover proxy sorting in the next lesson.
Summary
In this lesson, we modernized how we interact with algorithms:
- Keys vs. Payloads: We learned that algorithms usually only care about a tiny slice of your object (the key).
- Projections: We used C++20 projections to extract these keys cleanly, removing the need for verbose custom comparators.
- The
std::invoke()Utility: We saw that member pointers like&Class::Memberare first-class citizens in modern C++, serving as efficient, readable extractors.
In the next lesson, we will look at the hardware reality of projections. We will see that while projections make our code cleaner, they do not necessarily make it faster. We will explore how sorting "heavy" objects clogs the memory bus, and how we can use the proxy sort pattern to cheat the physics of memory latency.
Proxy Sort and Structure of Arrays
Learn why sorting large objects is slow, and how to optimize it using Proxy Sorting and Data-Oriented Design (SoA).