Defining Ranges using Sentinels

An alternative way of defining ranges, and why we sometimes need to use them
This lesson is part of the course:

Professional C++

Comprehensive course covering advanced concepts, and how to use them on large-scale projects.

Free, Unlimited Access
3D Character Concept Art
Ryan McCombe
Ryan McCombe
Updated

In earlier lessons, we saw how any container with appropriate begin() and end() methods is a range.

The range starts at the memory address denoted by the begin() iterator and ends at the memory address denoted by the end() iterator.

There is an alternative way to define a range. We establish the starting point as an iterator as before, but rather than defining the ending as a specific memory address, we instead define it using a sentinel.

We’ll cover this pattern in this lesson, and explain the reasons why we would need to use it.

Limitations of End Iterators

Using an iterator to define where a range ends doesn’t give us all the flexibility of traditional loops.

When the end of a range is defined by an iterator, we need to know where that iterator should point to in advance. That’s not always possible - sometimes our end condition is only uncovered in the process of running the algorithm.

For example, imagine we wanted our range of numbers to end at the first negative number. We don’t necessarily know where that is in advance, so we can’t provide an iterator to it unless we first run a different algorithm to figure out where that iterator should point to.

With a traditional for loop, this would have been an easy problem to solve. We could write whatever expression we desire as our continue clause:

#include <iostream>
#include <vector>

int main(){
  std::vector R{1, 4, 3, 8, -2, 5};

  for (
    auto it = R.begin();
    *it >= 0; 
    ++it
  ) {
    std::cout << *it << ", ";
  }
}
1, 4, 3, 8,

Let’s see how we can empower ranges to have the same capability, by introducing sentinels.

What Are Sentinels?

A sentinel is a fairly generic concept within programming. It simply means something that can signal an algorithm to end.

In computer programming, a sentinel value (also referred to as a flag value, trip value, rogue value, signal value, or dummy data) is a special value in the context of an algorithm that uses its presence as a condition of termination, typically in a loop or recursive algorithm.

We were already familiar with this concept when we were writing loops. For example, we saw a similar behavior when we first introduced iterators within a basic for loop. With loops, we just approach it from the opposite direction - we defined when we wanted the loop to continue, rather than when we wanted it to stop:

#include <iostream>
#include <vector>

int main(){
  std::vector R{1, 4, 3, 8, -2, 5};

  for (
    auto it = R.begin();
    it != R.end(); // Continue if true 
    ++it
  ) {
    std::cout << *it << ", ";
  }
}
1, 4, 3, 8, -2, 5,

For reasons we covered earlier in this chapter, we want to move away from for loops. They’re quite verbose and don’t generalize well.

Using the iterator-sentinel form of a range, we can get the same behavior using a range-based algorithm, like std::ranges::for_each():

#include <algorithm>
#include <iostream>
#include <vector>

void Log(int x){ std::cout << x << ", "; } 

int main(){
  std::vector R{1, 4, 3, 8, -2, 5};
  std::ranges::for_each(
    R.begin(), R.end(), Log 
  );
}
1, 4, 3, 8, -2, 5,

In this case, rather than providing our range as a single value, we’ve provided it as two values - an iterator where the range starts, and a sentinel that will signal when the range ends.

Within the context of C++ ranges, sentinels are objects that can be compared to iterators, using the equality (==) operator. If the operator returns true, that’s the signal for the algorithm to stop.

Iterator vs Range-Based Algorithms

A common source of confusion at this point, particularly with those familiar with the older iterator-based algorithms, is how the range-based variation is any different.

In scenarios like these where we’re defining our inputs as a pair of arguments, the algorithms seem to have an identical API and behavior:

std::ranges::for_each(R.begin(), R.end(), Log);
std::        for_each(R.begin(), R.end(), Log);

The key difference is:

  • For the iterator-based algorithms, the ending point must be an iterator
  • For the range algorithms, the ending point can be any sentinel

A point of confusion is that iterators can be used as sentinels. That’s what we’re doing in this case, so the algorithms are effectively the same

However, not all sentinels are iterators. We have other options, as we’ll see later in this lesson.

As such, the range-based variations of these algorithms are more flexible than their older, iterator-based counterparts

Iterators as Sentinels

It is somewhat common for the sentinel to be an iterator. That was the case in our previous example - specifically, the iterator that was returned from R.end()

std::ranges::for_each(
  R.begin(), R.end(), Log 
);

In this example, the starting iterator moves through our container as the algorithm progresses. At each step, the algorithm compares the current iterator to the sentinel.

The comparison is done by the == operator. After enough iterations, that operator will return true, as the advancing iterator will eventually point to the same memory address that was returned by R.end().

The option to define a range as an iterator-sentinel pair gives us slightly more flexibility right away. For example, we can run the algorithm only on a subrange, by modifying either of the arguments.

Below, we move the starting iterator one position forward, and move the sentinel iterator one position backward. This causes us to exclude the first and last objects from our algorithm execution:

#include <algorithm>
#include <iostream>
#include <vector>

void Log(int x){ std::cout << x << ", "; } 

int main(){
  std::vector R{1, 4, 3, 8, -2, 5};
  std::ranges::for_each(
    R.begin() + 1, R.end() - 1, Log 
  );
}
4, 3, 8, -2,

Creating Custom Sentinels

The key point to realize is that our sentinel does not need to be an iterator. It just needs to be an object that can be compared to an iterator, using the == and != operators. As such, an object created from this type could be a sentinel:

struct Sentinel {
  bool operator==(SomeIteratorType I){
    /* ... */
  }
};

Let's see a real example, using our proposal for a range that stops at the first negative value.

A sentinel that implements this requirement is going to receive some iterator type as an argument to a == operator. That operator will need to return true if the value that iterator is pointing at is less than 0

Below, we’ve created an initial version of this, which we’ll improve through subsequent examples:

#include <iostream>
#include <vector>

struct Sentinel {
  bool operator==(
    std::vector<int>::const_iterator Iter
  ) const {
    return *Iter< 0;
  }
};

int main(){
  std::vector R{1, 4, 3, 8, -2, 5};
  Sentinel S{};

  for (
    auto it = R.begin();
    it != S;
    ++it
  ) {
    std::cout << *it << ", ";
  }
}
1, 4, 3, 8,

What about the != operator?

Sentinels typically need to define both the == and != operators for any iterator type they need to support. But, as of C++20, if we don’t define the != operator, the compiler can rewrite any expressions that use it in terms of the == operator instead.

For example, an expression like A != B can be rewritten as !(A == B)

If we’re writing code to an earlier standard, we can’t benefit from expression rewriting, so we just explicitly define the != operators as usual.

We cover expression rewriting in detail in a dedicated lesson later in this course:

With our sentinel now being defined by a custom type we created, we can implement logic that is as complex as we need for our use case.

Commonly, in addition to analyzing the value the iterator is pointing at, a sentinel will also need to know when the container it’s acting on naturally ends.

In this example, this means that if our input range didn’t contain any negative numbers, our algorithm stops running when there are no more objects in the container.

Improving our sentinel type to support that scenario could look something like this:

#include <iostream>
#include <vector>

struct Sentinel {
  bool operator==(
    std::vector<int>::const_iterator Iter
  ) const {
    return Iter == ContainerEnd || *Iter < 0;
  }

  std::vector<int>::const_iterator ContainerEnd;
};

int main(){
  std::vector R{1, 4, 3, 8, -2, 5};
  Sentinel S{R.end()};

  for (
    auto it = R.begin();
    it != S;
    ++it
  ) {
    std::cout << *it << ", ";
  }
}
1, 4, 3, 8,

Template Sentinels

Sentinel types are typically implemented as templates. This is because they don’t necessarily know (or care) what specific type of iterator they’re going to be receiving as part of the == operator.

The sentinel in our previous example is only compatible with inputs that specifically have a type of std::vector<int>.

We can make it more flexible by converting it to a template type, calling our templated iterator type T:

#include <iostream>
#include <vector>

template <typename T>
struct Sentinel {
  bool operator==(T Iter) const {
    return Iter == ContainerEnd || *Iter < 0;
  }

  T ContainerEnd;
};

int main(){
  std::vector R{1, 4, 3, 8, -2, 5};
  Sentinel S{R.end()}; // Unchanged 

  for (
    auto it = R.begin();
    it != S;
    ++it
  ) {
    std::cout << *it << ", ";
  }
}
1, 4, 3, 8,

Note, that in this example, the main function didn’t need to be updated to specify the Sentinel template type.

This is because the template type T is the same type as the ContainerEnd variable within our struct.

As we’re providing a value for that member when we’re initializing our Sentinel, the compiler can use class template argument deduction (CTAD) to automatically infer what T is.

Using Sentinels with Algorithms

Now that we can define sentinels using any logic we require, we can go back to using algorithms rather than raw for loops:

#include <iostream>
#include <vector>
#include <algorithm>

struct Sentinel {/*...*/}; void Log(int x){ std::cout << x << ", "; } int main(){ std::vector R{1, 4, 3, 8, -2, 5}; std::ranges::for_each( R.begin(), Sentinel{R.end()}, Log ); }
1, 4, 3, 8,

If our custom sentinel is going to see recurring use through a large code base, we can consider packaging it into a range-compatible container.

This container can then work exactly the way our project needs. The basic foundations involve defining the begin() method as normal, but having our end() method return our custom sentinel, rather than a simple iterator:

#include <iostream>
#include <vector>
#include <algorithm>

struct Sentinel {/*...*/}; template <typename T> class PositiveRange { public: PositiveRange( std::initializer_list<T> Numbers) : Container{Numbers}, Sentinel{Container.end()} {} auto begin() const{ return Container.begin(); } auto end() const{ return Sentinel; } private: std::vector<T> Container; Sentinel<typename std::vector< T>::const_iterator> Sentinel; };

This means consumers do not need to worry about constructing the sentinel - that’s all encapsulated way. They can treat our range as a simple, single object:

void Log(int x){ std::cout << x << ", "; }

int main(){
  PositiveRange R{1, 4, 3, 8, -2, 5};
  std::ranges::for_each(R, Log);
}
1, 4, 3, 8,

Finding where Ranges End

A common requirement we will have for our programs is to know how large a range is, or where it ends. When that ending is defined by a sentinel, this can require some extra work.

If we just want to find the first object for which the sentinel returns true, the std::ranges::find() algorithm, which we cover in the next section, can help us.

However, this is rarely needed. When using almost any algorithm on our range, that algorithm naturally needs to determine where the range ends.

Running a second process to calculate the same thing again is a waste of resources. Fortunately, most of the range-based standard library algorithms return an object that includes an iterator pointing at where the sentinel was triggered.

For example, the std::ranges::for_each() algorithm returns an object with two fields:

  • in - An iterator to where the sentinel was triggered
  • fun - A reference to the function that we provided to the algorithm - such as the Log function from our earlier examples

Different algorithms may have slightly different return values. Still, it’s somewhat common that any algorithm that accepts an input range will have a return value that includes an iterator to where the range ended.

By accessing this property, typically by structured binding, we can uncover where the sentinel was triggered. We can then use that information for any follow-up actions we might need to do:

#include <iostream>
#include <vector>
#include <algorithm>

struct Sentinel {/*...*/};
class PositiveRange {/*...*/}; void Log(int x){ std::cout << x << ", "; } int main(){ PositiveRange R{1, 4, 3, 8, -2, 5}; auto [in, fun]{std::ranges::for_each(R, Log)}; std::cout << "\nObject at sentinel: " << *in; std::cout << "\nSize of range: " << std::distance(R.begin(), in); std::cout << "\nLast in range: " << *(in - 1); }
1, 4, 3, 8,
Object at sentinel: -2
Size of range: 4
Last in range: 8

Summary

In this lesson, we explored the concept and applications of sentinels in C++ ranges, demonstrating how they offer flexibility in defining range endpoints beyond traditional iterators. We also examined how to create custom sentinel types and use them with algorithms.

Main Points Covered

  • Understanding that a range in C++ can be defined using a sentinel instead of an end iterator.
  • Exploring the limitations of end iterators in ranges and how sentinels provide greater flexibility.
  • Defining what a sentinel is and its role in terminating algorithms or loops.
  • Demonstrating the use of iterators as sentinels in standard range-based algorithms.
  • Creating custom sentinels to implement specific stopping conditions for ranges.
  • Implementing template sentinels for greater flexibility with different iterator types.
  • Using custom sentinels with standard library algorithms, such as std::ranges::for_each.
  • Packaging custom sentinels into range-compatible containers for simplified use.
  • Using structured binding to extract information from algorithms that utilize sentinels.
  • Gaining an understanding of how to determine the size and endpoint of ranges defined by sentinels.

Was this lesson useful?

Next Lesson

Creating Custom Iterators using C++20 Concepts

A detailed guide to implementing a custom iterator type from scratch, using modern recommended techniques
3D Character Concept Art
Ryan McCombe
Ryan McCombe
Updated
A computer programmer
This lesson is part of the course:

Professional C++

Comprehensive course covering advanced concepts, and how to use them on large-scale projects.

Free, Unlimited Access
A computer programmer
This lesson is part of the course:

Professional C++

Comprehensive course covering advanced concepts, and how to use them on large-scale projects.

Free, unlimited access

This course includes:

  • 124 Lessons
  • 550+ Code Samples
  • 96% Positive Reviews
  • Regularly Updated
  • Help and FAQ
Next Lesson

Creating Custom Iterators using C++20 Concepts

A detailed guide to implementing a custom iterator type from scratch, using modern recommended techniques
3D Character Concept Art
Contact|Privacy Policy|Terms of Use
Copyright © 2024 - All Rights Reserved