Binary Search in C++

An introduction to the advantages of binary search, and how to use it with the C++ standard library algorithms binary_search(), lower_bound(), upper_bound(), and equal_range()
This lesson is part of the course:

Professional C++

Comprehensive course covering advanced concepts, and how to use them on large-scale projects.

Free, Unlimited Access
3D Character Concept Art
Ryan McCombe
Ryan McCombe
Updated

The standard library search functions covered in the previous lesson, like std::ranges::find() are examples of linear search. We can imagine the algorithm simply progressing through our container in a straight line, checking every object in turn until it finds what we’re looking for, or reaches the end of the container.

If our containers have many objects, or we’re searching frequently, this can degrade our performance. These are $O(n)$ algorithms - that is, they scale linearly with the size of the collection we’re searching. If we double the number of objects, the algorithm takes twice as long to complete on average.

If our container is sorted, we can instead use binary search algorithms, which have better performance.

What is Binary Search?

An example of a binary search that we might do in the real world might involve finding a specific numbered page in a book. If we have a book with around 200 pages, for example, and were looking for page 100, we wouldn’t start at the beginning and check page 1, then 2, then 3.

Instead, we’d take advantage of the fact that the pages are in order. We’d open the book somewhere in the middle, and then check what page is there. If it’s page 120, we know the page we’re looking for - 100 - is to the left. We can eliminate every page to the right from our search, effectively cutting the problem in half.

We repeat this process on the remaining search space, ie the pages on the left. We open a page somewhere in the middle, check the page number, and use it to cut our problem in half again. After two lookups, the search space is now a quarter of its original size. We repeat this over and over and over until we find what we’re looking for.

This approach is an example of logarithmic scaling. If the book doubles in size, from 200 to 400 pages, our search only takes one additional step. After the first lookup, we cut the search space in half, back to 200 pages again.

As such, logarithmic scaling is much preferred to linear scaling

As the previous example suggests, binary search benefits from the ability to jump around our collection and freely access any object. In other words, they benefit from the input supporting random access, in the form of a range that satisfies the random_access_range concept, or an iterator that satisfies random_access_iterator.

The standard library algorithms we cover in this section benefit from random access, but they do not require it. If our input does not support random access, the algorithms will have a slower, linear run time.

std::ranges::binary_search()

The standard library’s binary search algorithms are available by including <algorithm>. In its most basic use, std::ranges::binary_search() accepts two arguments:

  1. The range to search
  2. The value to search for

It returns a boolean value that is true if the object was found. Later in this lesson, we cover std::ranges::lower_bound(), which we can use if we want to access the object that was found.

Below, we search our sorted vector for the number 4:

#include <algorithm>
#include <iostream>
#include <vector>

int main() {
  std::vector Numbers{1, 2, 3, 4, 5};

  bool Result{
      std::ranges::binary_search(Numbers, 4)};

  std::cout << "The number 4 "
            << (Result ? "was" : "was not")
            << " found";
}
The number 4 was found

Remember, our container needs to be sorted for binary search to be effective. Below, we try to use it on an unsorted container, and the incorrect result is returned:

#include <algorithm>
#include <iostream>
#include <vector>

int main() {
  std::vector Numbers{3, 2, 5, 4, 1};

  bool Result{
      std::ranges::binary_search(Numbers, 4)};

  std::cout << "The number 4 "
            << (Result ? "was" : "was not")
            << " found";
}
The number 4 was not found

There is some nuance in what it means to be "sorted". We go into that and explain how to change that meaning, later in the lesson

Partitioned Containers

In almost all cases where we use binary search, our container will be entirely sorted. But, in niche cases, this is not strictly necessary. Technically, our container only needs to be partitioned with respect to the object we’re searching for.

We can think of a partitioned container as being split into two parts, a left and right. Whether an object is in the left or right partition depends on whether it satisfies a specific rule.

For example, if we’re searching for 4 using binary search, our container must be partitioned such that all numbers less than 4 are to the left of all numbers that are not less than 4. The following example satisfies this rule:

{3, 1, 2, 4, 6, 5}

This container is not fully sorted, but it is correctly partitioned with respect to 4 so, if 4 is what we’re searching for, it would be valid to use binary search:

#include <algorithm>
#include <iostream>
#include <vector>

int main() {
  std::vector Nums{3, 1, 2, 4, 6, 5};

  bool Result{
    std::ranges::binary_search(Nums, 4)};                        

  std::cout << "The number 4 "
            << (Result ? "was" : "was not")
            << " found";
}
The number 4 was found

But it would not be valid if we were searching for 2, for example, because our container is not partitioned with respect to 2. There is a 3 in the left partition when it should be in the right.

We cover partitioning and partition algorithms in more detail later in the chapter.

std::ranges::lower_bound()

When doing binary searches, the lower_bound() function is often more useful than binary_search().

The two functions are closely related - lower_bound() also accepts a sorted collection, a value to search for, and optional comparison and projection functions.

It will use our comparison function (or the default < operator if we don’t provide one) to return an iterator. That iterator will point to the position where our element would be if it were in our sorted container.

Below, we show an example where the element in that position is indeed equal to our target:

#include <algorithm>
#include <iostream>
#include <vector>

int main() {
  std::vector Numbers{1, 2, 3, 4, 5};

  auto Result{
      std::ranges::lower_bound(Numbers, 4)};

  std::cout
      << "The number 4 would be in position "
      << std::distance(Numbers.begin(), Result);

  std::cout
      << "\nThe element in that position is "
      << *Result;
}
The number 4 would be in position 3
The element in that position is 4

Here, we rerun the example with a target that is not in our container:

#include <algorithm>
#include <iostream>
#include <vector>

int main() {
  std::vector Numbers{2, 4, 6, 8, 10};

  auto Result{
      std::ranges::lower_bound(Numbers, 5)};

  std::cout
      << "The number 5 would be in position "
      << std::distance(Numbers.begin(), Result);

  std::cout
      << "\nThe element in that position is "
      << *Result;
}
The number 5 would be in position 2
The element in that position is 6

It’s also possible that our element would be at the end of the container, in which case lower_bound() will return the sentinel of our range. Dereferencing that would be dangerous, but we can test for it:

#include <algorithm>
#include <iostream>
#include <vector>

int main() {
  std::vector Numbers{2, 4, 6, 8, 10};

  auto Result{
      std::ranges::lower_bound(Numbers, 100)};

  std::cout
      << "The number 100 would be in position "
      << std::distance(Numbers.begin(), Result);

  if (Result == Numbers.end()) {
    std::cout
        << "\nThis is the end of the container";
  }
}
The number 100 would be in position 5
This is the end of the container

If lower_bound() returns the sentinel, we can infer the container doesn't contain the object we're looking for.

Accessing the Object that was Found

The binary_search() algorithm returns a boolean representing whether or not the object was found. Our requirements may be slightly different - we may want access to the item that was found.

The standard library doesn’t provide a direct function for that, but we can quickly build the functionality using std::lower_bound().

std::lower_bound() will return an iterator pointing to where the element would be. By inspecting that element, we can examine if we found what we’re looking for.

Remember, lower_bound() can return the sentinel, so we should check for that before we dereference it:

#include <algorithm>
#include <iostream>
#include <vector>

int main() {
  std::vector Numbers{1, 2, 3, 4, 5};
  int Target{4};

  auto Result{std::ranges::lower_bound(Numbers,
                                       Target)};

  if (Result != Numbers.end() &&
      *Result == Target) {
    std::cout << "Object found: " << *Result;
  } else {
    std::cout << "Object not found";
  }
}
Object found: 4

lower_bound() with Comparison Functions

The previous example shows the simple case: we’re not using a comparison function, and we can compare our objects using the == operator. We need to clarify an important insight to adapt our previous example to support comparison functions.

The comparison function we provide to binary_search() and lower_bound() will be given two objects. It will return true if the first argument is to the left of the second argument within our container.

We can use this to come up with another definition of equality. Two elements are equal if the comparison function returns false regardless of the order of the arguments. For example:

  • If our comparison function is the < operator
  • And if A < B is false
  • And if B < A is false
  • Then A and B are equal

When using a custom comparison function, lower_bound() will return an iterator to the first object for which Comp(Object, Target) returns false. If we change the order of arguments and Comp(Target, Object) also returns false, we can infer Target and Object are equal.

Let's update our earlier example to use this:

#include <algorithm>
#include <iostream>
#include <vector>

struct Player {
  Player(const char* Name) : Name{Name} {}
  std::string Name;
};

int main() {
  std::vector<Player> Players{
      "Aragorn", "Frodo", "Gandalf", "Gimli",
      "Legolas"};
  Player Target{"Legolas"};

  auto Comparer{
      [](const Player& a, const Player& b) {
    return a.Name < b.Name;
  }};

  auto Result{std::ranges::lower_bound(
    Players, Target, Comparer)};

  if (Result != Players.end() &&
      !Comparer(Target, *Result)) {
    std::cout << "Object found: "
              << Result->Name;
  }
}
Object found: Legolas

Using lower_bound() to maintain sorted containers

The lower_bound() algorithm returning an iterator to where an element would be in our sorted container has another useful property. When we want to add an element to our collection, and we want to maintain the sort order, lower_bound() can give us the position we need to insert it.

We can then use our container-specific methods, like the insert() function on std::vector to add our object to the correct place:

#include <algorithm>
#include <iostream>
#include <vector>

int main() {
  std::vector Numbers{1, 2, 3, 5};
  int NumberToInsert{4};

  auto Position{std::ranges::lower_bound(
      Numbers, NumberToInsert)};

  Numbers.insert(Position, NumberToInsert);

  std::cout << "Numbers: ";
  for (auto Num : Numbers) {
    std::cout << Num << ", ";
  }
}
Numbers: 1, 2, 3, 4, 5,

std::ranges::upper_bound()

When determining where to insert an object into a sorted container, there can be multiple valid options. For example, let's imagine we have the following collection:

{1, 2, 3, 4, 5}

If we want to insert another 4 into this range whilst maintaining the sort order, there are two valid options - before the existing 4, or after the existing 4.

As we saw in the previous example, the lower_bound() algorithm will return the earliest valid position. The upper_bound() algorithm will return the last valid position:

         ↓ Lower
{1, 2, 3, 4, 5}
      Upper ↑

Below, we use these algorithms in a code example:

#include <algorithm>
#include <iostream>
#include <vector>

int main() {
  std::vector Numbers{1, 2, 3, 4, 5};

  auto Lower{
    std::ranges::lower_bound(Numbers, 4)};

  std::cout
    << "Lower Bound Index "
    << std::distance(Numbers.begin(), Lower);

  auto Upper{
    std::ranges::upper_bound(Numbers, 4)};

  std::cout
    << "\nUpper Bound Index "
    << std::distance(Numbers.begin(), Upper);
}
Lower Bound Index 3
Upper Bound Index 4

When our container contains multiple matches, the lower and upper bounds get farther apart:

         ↓ Lower
{1, 2, 3, 4, 4, 4, 5}
            Upper ↑

Below, we show an example of this in code, and additionally use the distance between these two iterators to determine how many matches were found:

#include <algorithm>
#include <iostream>
#include <vector>

int main() {
  std::vector Numbers{1, 2, 3, 4, 4, 4, 5};

  auto Lower{
    std::ranges::lower_bound(Numbers, 4)};

  std::cout
    << "Lower Bound Index "
    << std::distance(Numbers.begin(), Lower);

  auto Upper{
    std::ranges::upper_bound(Numbers, 4)};

  std::cout
    << "\nUpper Bound Index "
    << std::distance(Numbers.begin(), Upper);

  std::cout
    << "\nQuantity of 4s found: "
    << std::distance(Lower, Upper);
}
Lower Bound Index 3
Upper Bound Index 6
Quantity of 4s found: 3

When our container does not already contain the value we’re looking for, there is only one valid position where such a value could be inserted. Therefore, the lower and upper bounds point to the same position:

   Lower ↓
{1, 2, 3, 5}
   Upper ↑

Adapting our previous code example to use this container looks like this:

#include <algorithm>
#include <iostream>
#include <vector>

int main() {
  std::vector Numbers{1, 2, 3, 5};

  auto Lower{
    std::ranges::lower_bound(Numbers, 4)};

  std::cout
    << "Lower Bound Index "
    << std::distance(Numbers.begin(), Lower);

  auto Upper{
    std::ranges::upper_bound(Numbers, 4)};

  std::cout
    << "\nUpper Bound Index "
    << std::distance(Numbers.begin(), Upper);

  std::cout
    << "\nQuantity of 4s found: "
    << std::distance(Lower, Upper);
}
Lower Bound Index 3
Upper Bound Index 3
Quantity of 4s found: 0

std::ranges::equal_range()

The std::ranges::equal_range() function determines both the lower and upper bound in a single algorithm. It returns a std::pair where first is the lower bound, and second is the upper bound:

#include <algorithm>
#include <iostream>
#include <vector>

int main() {
  std::vector Numbers{1, 2, 3, 4, 4, 4, 5};

  auto [Lower, Upper]{
    std::ranges::equal_range(Numbers, 4)};

  std::cout
    << "Lower Bound Index "
    << std::distance(Numbers.begin(), Lower);

  std::cout
    << "\nUpper Bound Index "
    << std::distance(Numbers.begin(), Upper);

  std::cout
    << "\nQuantity of 4s found: "
    << std::distance(Lower, Upper);
}
Lower Bound Index 3
Upper Bound Index 6
Quantity of 4s found: 3

Custom Sort Behaviour

By default, the standard library binary search algorithms expect our container to contain objects that implement the < operator, and for those objects to be sorted in ascending order. For example, if ObjectA < ObjectB, then ObjectA should be to the left of ObjectB in our range.

However, the algorithms have an additional optional parameter, which we can use to pass a custom comparison function. That function defines how our objects are sorted. It accepts two objects of the same type in our container and returns true if the first argument comes before the second argument.

Below, we use this parameter to communicate that our container is sorted in descending order:

#include <algorithm>
#include <iostream>
#include <vector>

int main() {
  std::vector Numbers{5, 4, 3, 2, 1};

  auto GreaterThan{
      [](int x, int y) { return x > y; }};

  bool Result{std::ranges::binary_search(
      Numbers, 4, GreaterThan)};

  std::cout << "The number 4 "
            << (Result ? "was" : "was not")
            << " found";
}
The number 4 was found

This function also allows us to define the sort order of objects that don’t implement comparison operators. Below, we use it to search a collection of custom Player objects that are sorted alphabetically by their Name:

#include <algorithm>
#include <iostream>
#include <vector>

struct Player {
  Player(const char* Name) : Name{Name} {}
  std::string Name;
};

int main() {
  std::vector<Player> Players{
      "Aragorn", "Frodo", "Gandalf", "Gimli",
      "Legolas"};

  auto Comparer{[](const Player& a,
                   const Player& b) {
    return a.Name < b.Name;
  }};

  bool Result{std::ranges::binary_search(
      Players, Player{"Legolas"},
      Comparer)};

  std::cout << "The player Legolas "
            << (Result ? "was" : "was not")
            << " found";
}
The player Legolas was found

Projection Function

Like with most range-based algorithms, the algorithms we covered in this section accept a projection function. Below, we use it with a projector that transforms integers to their absolute value. We pass {} as the third argument, causing the algorithm to use its default parameter for the comparison function:

#include <algorithm>
#include <iostream>
#include <vector>

int Projector(int x) { return std::abs(x); }

int main() {
  std::vector Nums{-1, 4, -5, 9};

  bool Result{std::ranges::binary_search(
    Nums, 5, {}, Projector
  )};                        

  std::cout << "The projection of 5 "
            << (Result ? "was" : "was not")
            << " found";
}
The projection of 5 was found

When using a projection function with binary search, our container also needs to be sorted (or partitioned) based on that projection function. For example:

  • if our call to binary_search() is using the default comparison function <
  • and our call to binary_search() is using a projection function Proj
  • and if Proj(A) < Proj(B) is true
  • then A should be before B in our container.

Iterator-Based Algorithms

The ranges::binary_search() and ranges::lower_bound() algorithms use the ranges feature, added in C++20.

When targetting older versions, we can omit the ranges component of our identifiers. For example, instead of std::ranges::binary_search(), we could use std::binary_search(). These algorithms accept the input collection in the form of an iterator pair:

#include <algorithm>
#include <iostream>
#include <vector>

int main() {
  std::vector Numbers{1, 2, 3, 4, 5};

  bool Result{std::binary_search(
      Numbers.begin(), Numbers.end(), 4)};

  std::cout << "The number 4 "
            << (Result ? "was" : "was not")
            << " found";
}
The number 4 was found

The iterator-based variations of these algorithms do not allow us to provide a projection function.

Summary

In this lesson, we explored binary search, and how to use it in C++ through the standard library’s std::ranges::binary_search() and std::ranges::lower_bound() algorithms.

Main Points Learned

  • Binary search offers logarithmic scaling, making it more efficient than linear search for sorted collections.
  • std::ranges::binary_search() checks for the presence of a value within a range, returning a boolean.
  • std::ranges::lower_bound() finds the earliest position where a value would be inserted within an ordered collection.
  • std::ranges::upper_bound() finds the latest position where a value would be inserted within an ordered collection.
  • std::ranges::equal_range() returns both the lower and upper bound in a single algorithm
  • Custom comparison functions can be used to define sorting behavior for binary search operations.
  • Containers must support random access for optimal binary search performance, but the algorithms adapt for linear time complexity without it.
  • Partitioned containers can also utilize binary search, provided they are partitioned with respect to the search value.
  • Projection functions allow for searches based on transformed or projected values.
  • Inserting elements while maintaining sort order can be achieved with the help of std::ranges::lower_bound().
  • For older C++ versions or specific needs, std::binary_search() and std::lower_bound() without the ranges namespace are available and operate with iterators.

Was this lesson useful?

Next Lesson

Counting Algorithms

An introduction to the 5 main counting algorithms in the C++ standard library: count, count_if, any_of, none_of and all_of
3D Character Concept Art
Ryan McCombe
Ryan McCombe
Updated
A computer programmer
This lesson is part of the course:

Professional C++

Comprehensive course covering advanced concepts, and how to use them on large-scale projects.

Free, Unlimited Access
Standard Library Algorithms
Next Lesson

Counting Algorithms

An introduction to the 5 main counting algorithms in the C++ standard library: count, count_if, any_of, none_of and all_of
3D Character Concept Art
Contact|Privacy Policy|Terms of Use
Copyright © 2024 - All Rights Reserved