Polynomial Algorithms
Learn how input size affects execution speed and the most common class of algorithms.
In the previous lesson, we looked at the cost of operations. We learned that not all operations are created equal; fetching data from the CPU cache is effectively free, while fetching data from main RAM is incredibly expensive.
Factors like locality and caching determine how expensive each operation is, whilst algorithm analysis focuses on how many of those operations are required to complete a task.
Big O notation is a high-level classification of how the time required to complete an algorithm changes based on how much data needs to be handled. It's the answer to questions like: "if the number of users doubles, does my system still perform the same, or does it get slower? And if it gets slower, by how much?"
When we analyze algorithms, this idea of the "amount of data", the "input size", or the "number of objects" an algorithm needs to handle is denoted by the variable n.
For example, if we want to find a specific user in our collection, the amount of work we need to do might depend on the number of users, n, in our collection.
Constant Time
Most algorithms fall into a category called polynomial. This means the runtime is proportional to n raised to some power k. Mathematically, this looks like O(n^k). In polynomial time, k is a constant, like 0, 1, 2, and so on.
Constant time is the gold standard of algorithmic efficiency. It means the execution time is independent of the input size. Whether you have ten elements or ten billion, the code takes approximately the same amount of time to complete.
In our polynomial model, this is O(n^0). Since any number to the power of 0 is 1, we write this as O(1).
The classic example of this is accessing an element in an array (or std::vector) by its index.
constant_time.cpp
#include <vector>
#include <iostream>
void PrintFirstElement(const std::vector<int>& Data) {
if (Data.empty()) return;
// This operation takes the same time regardless
// of whether Data has 1 element or 1,000,000.
std::cout << Data[0] << "\n";
}
Why is this constant? Because of the contiguous memory layout we discussed in the intro.
The compiler knows the memory address where the vector starts, and it knows the size of each element. To find the element at index i, it simply performs one calculation:

ElementAddress = StartAddress + (i * ElementSize)
This is a single instruction for the CPU. It doesn't need to "search" for the element; it teleports directly to it. This is why we prefer contiguous containers like std::vector whenever possible - they unlock O(1) access patterns.
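To see that calculation in code, here is a minimal sketch (the ShowIndexing function and its output format are purely illustrative) showing that indexing is nothing more than pointer arithmetic from the start of the buffer:
#include <iostream>
#include <vector>

// A sketch of what Data[i] does under the hood. Data.data() returns a
// pointer to the first element, and pointer arithmetic automatically
// scales the offset by the element size, so Start + i lands exactly on
// the i-th element - one calculation, no searching.
void ShowIndexing(const std::vector<int>& Data, size_t i) {
  if (i >= Data.size()) return;
  const int* Start = Data.data();
  const int* Element = Start + i;  // StartAddress + (i * sizeof(int)) bytes
  std::cout << *Element << " == " << Data[i] << "\n";
}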
Linear Time
Linear time means the work grows in direct proportion to the input. If we double the data, we double the amount of work our algorithm needs to do.
In our polynomial model, this is O(n^1), or simply O(n).
This is the complexity of "touching every element." If you need to print every player's name, or sum up a list of transaction values, you are doing O(n) work.
linear_time.cpp
#include <vector>
#include <iostream>
void PrintAll(const std::vector<int>& Data) {
// A simple loop is the hallmark of O(n)
for (int Value : Data) {
std::cout << Value << "\n";
}
}
bool Contains(const std::vector<int>& Data, int SearchKey) {
// Even though this might return early, in the
// worst case we have to check every element.
for (int Value : Data) {
if (Value == SearchKey) return true;
}
return false;
}
Linear algorithms are the workhorse of modern computing. Remember, CPUs are designed with prefetchers - components that predict what memory you will need next.
When you iterate linearly through a std::vector, the access pattern is perfectly predictable. The prefetcher pulls data from RAM into the L1 cache before your code even asks for it.
Because of the realities of hardware, a linear scan over a contiguous array is often faster than theoretically superior algorithms, particularly for small to medium datasets.
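As a hedged illustration of that point, the sketch below compares a plain linear scan with a lookup in a hash-based container. Both answer the same question, and for small collections the contiguous scan is frequently just as fast in practice despite its worse theoretical classification. The function names and sample values here are illustrative, not taken from a real benchmark:
#include <algorithm>
#include <iostream>
#include <unordered_set>
#include <vector>

// Both functions answer "is Key present?". ContainsLinear is O(n);
// ContainsHashed is (amortized) constant time. For small n, the
// contiguous scan often wins in practice thanks to cache locality
// and prefetching.
bool ContainsLinear(const std::vector<int>& Data, int Key) {
  return std::find(Data.begin(), Data.end(), Key) != Data.end();
}

bool ContainsHashed(const std::unordered_set<int>& Data, int Key) {
  return Data.contains(Key);
}

int main() {
  std::vector<int> Flat{3, 1, 4, 1, 5, 9, 2, 6};
  std::unordered_set<int> Hashed(Flat.begin(), Flat.end());
  std::cout << ContainsLinear(Flat, 5) << "\n";    // prints 1
  std::cout << ContainsHashed(Hashed, 5) << "\n";  // prints 1
}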
Quadratic Time
Quadratic time is where things start to get dangerous. This complexity occurs when the work grows in proportion to the square of the input. If you double the data, the work quadruples. This is a quadratic algorithm - O(n^2).
This typically happens when you nest loops - processing the entire collection for each element in the collection. A classic example is checking for collisions between all entities in a game, or checking for duplicates in an unsorted list.
quadratic_time.cpp
#include <vector>
#include <iostream>
void FindDuplicates(const std::vector<int>& Entities) {
// Outer loop: n iterations
for (size_t i = 0; i < Entities.size(); ++i) {
// Inner loop: n iterations
for (size_t j = 0; j < Entities.size(); ++j) {
// We perform this check n * n times
if (i != j && Entities[i] == Entities[j]) {
std::cout << "Duplicate detected!\n";
}
}
}
}
To visualize this, imagine a grid. If you have a list of 5 items, a linear algorithm walks a line of 5 steps. A quadratic algorithm has to fill in a 5x5 grid, performing 25 operations.
As n increases, linear algorithms tend to scale in a controllable way, whilst the "grid" of work required by a quadratic algorithm explodes.
Quadratic algorithms are where we begin to enter the danger zone. An algorithm that feels instant in testing can bring a system to a crawl as n gets larger.
Naive Solutions
Our previous algorithm to find duplicates in a collection is an example of a "naive" or "brute force" solution. The same problem can be solved in many different ways. The following solution removes the nested loop:
#include <vector>
#include <iostream>
#include <unordered_set>
void FindDuplicates(const std::vector<int>& Entities) {
  std::unordered_set<int> Seen;
  for (int Value : Entities) {   // O(n)
    if (Seen.contains(Value)) {  // O(1)
      std::cout << "Duplicate detected!\n";
    }
    Seen.insert(Value);          // O(1)
  }
}
The total complexity here is O(n) multiplied by the constant costs of contains() and insert(). And, since Big O ignores multiplicative constants, we now have a linear solution - O(n).
For large values of n, this is likely to be an improvement over our initial attempt, as long as the additional memory required for the Seen container isn't problematic.
Comparing Polynomial Algorithms
We've now seen the three most common types of polynomial algorithms:
- Constant: O(n^0), usually written as O(1)
- Linear: O(n^1), usually written as O(n)
- Quadratic: O(n^2)
This pattern continues. We can have cubic algorithms - O(n^3), quartic algorithms - O(n^4), and so on.
It is quite rare that the algorithms we create to solve real-world problems fit exactly into these equations. For example, an algorithm might require n^2 + 3n operations. Another might require 2n^2 + 5 operations.
Big O notation simplifies things to just the core classification - both of these algorithms would be categorized as O(n^2). This simplification allows us to communicate and compare approaches more easily.
Specifically, Big O makes the following simplifications:
Treat All Operations as Equal
As we saw in the previous lesson, some operations can take much more time than others, particularly if the required data isn't in the cache. To keep things simple, Big O ignores that: it assumes all operations are equally expensive.
Ignore Constants
One algorithm might require n operations whilst another might require exactly n + 100. We ignore the additive constant - both of these algorithms are linear - O(n).
One algorithm might require n operations whilst another might require 2n. We also ignore multiplicative constants - these algorithms are still both linear - O(n).
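For instance, here is a minimal sketch (the SumAndCount function is hypothetical) of a loop that performs a little setup work plus a handful of operations per element; Big O discards both the additive and multiplicative constants and classifies it simply as linear:
#include <iostream>
#include <vector>

// Roughly 2 + 2n operations: two setup assignments, then about two
// operations per element. Big O drops the additive 2 and the
// multiplicative 2, leaving O(n).
void SumAndCount(const std::vector<int>& Data) {
  long long Sum = 0;  // setup
  int Count = 0;      // setup
  for (int Value : Data) {
    Sum += Value;  // per-element work
    Count += 1;    // per-element work
  }
  std::cout << "Sum: " << Sum << ", Count: " << Count << "\n";
}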
Only Include the Dominant Term
One algorithm might require n^2 operations whilst another might require n^2 + n. As we saw in the graphs above, the term with the highest order (that is, the largest exponent) dominates the behavior of the function as n increases. So, both of these are classified as quadratic - O(n^2).
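As a sketch of that simplification (the DominantTermExample function is purely illustrative), the following performs roughly n^2 + n units of work; the nested loops dominate, so the whole function is classified as O(n^2):
#include <iostream>
#include <vector>

// The nested loops perform about n * n comparisons; the final loop
// performs n prints. The n^2 term dominates as n grows, so this
// function is classified as O(n^2).
void DominantTermExample(const std::vector<int>& Data) {
  int Matches = 0;
  for (size_t i = 0; i < Data.size(); ++i) {    // n iterations
    for (size_t j = 0; j < Data.size(); ++j) {  // n iterations each
      if (Data[i] == Data[j]) ++Matches;        // runs n * n times
    }
  }
  for (int Value : Data) {  // n iterations
    std::cout << Value << "\n";
  }
  std::cout << "Matches: " << Matches << "\n";
}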
Structure Dictates Complexity
Remember, the algorithm we use to work with our data is intrinsically linked to how that data is structured. The physical shape of our data sets the speed limit for how fast a task can be accomplished.
Consider the simple task of finding the last element in a collection. If our data is stored in a contiguous array (like std::vector), finding the element at the last index, or any index, is a simple arithmetic calculation based on where the array starts in memory and the size of each element.
It doesn't matter if the array has 5 items or 5 billion; the CPU performs one calculation and jumps instantly to the end. This is a constant time, O(1), algorithm.
However, if your data is stored in a linked list, this algorithm isn't available because the nodes could be scattered anywhere in RAM. To find the last node, we have to follow the links from one node to the next. The more objects we have in our list, the more nodes we need to traverse, so the algorithm scales linearly, O(n), with the input size.
The task is the same for both containers, but the algorithmic complexity can be different depending on how each container organizes its collection in memory.
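As a brief sketch of that idea (the function names here are illustrative, and both functions assume the container is not empty), getting the last element is a single step on a std::vector but a full traversal on a singly linked std::forward_list:
#include <forward_list>
#include <vector>

// Same task, different complexity, dictated by the container's layout.
int LastOfVector(const std::vector<int>& Data) {
  return Data.back();  // one address calculation - O(1)
}

int LastOfList(const std::forward_list<int>& Data) {
  int Last = 0;
  for (int Value : Data) {  // must follow every link - O(n)
    Last = Value;
  }
  return Last;
}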
Theory vs. Engineering
Academic resources often disproportionately focus on abstract Big O analysis and underrepresent the importance of physical hardware, memory layout, and the "constant factors" that Big O ignores. In this abstract world, a linear solution is strictly superior to a quadratic solution, but that's not always true in the real world.
When n is small, the overhead of a complex algorithm can dwarf the actual work being done. You can implement a complex algorithm that should be highly efficient in theory, but if it doesn't consider the physical reality of the hardware, it can perform terribly in the real world.
Meanwhile, a beginner with no academic knowledge might store everything in arrays and solve every problem by looping over those arrays. Their approach can create highly performant systems by accident because they're taking full advantage of the caches through locality, and they're consistently accessing their data in predictable ways that the prefetcher can help with.
We can use Big O to prevent architectural disasters, but day-to-day, we use profiling to test and optimize the actual implementation.
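As a minimal sketch of that day-to-day measurement (timing with std::chrono here stands in for a real profiler), we can bracket the code we care about and compare implementations directly:
#include <chrono>
#include <iostream>
#include <numeric>
#include <vector>

int main() {
  std::vector<int> Data(1'000'000);
  std::iota(Data.begin(), Data.end(), 0);  // fill with 0, 1, 2, ...

  auto Start = std::chrono::steady_clock::now();
  long long Sum = std::accumulate(Data.begin(), Data.end(), 0LL);
  auto End = std::chrono::steady_clock::now();

  auto Elapsed =
      std::chrono::duration_cast<std::chrono::microseconds>(End - Start);
  std::cout << "Sum: " << Sum << " took " << Elapsed.count() << "us\n";
}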
Summary
In this lesson, we explored polynomial time complexity:
- Constant (O(1)): The ideal - instant access, enabled by contiguous memory layouts and pointer arithmetic.
- Linear (O(n)): The typical reality - we need to touch every item. The hardware loves this pattern if the data is stored contiguously, making it surprisingly fast.
- Quadratic (O(n^2)): Where things begin to get dangerous - usually nested loops. Avoid this for large datasets.
We also learned that Big O is a model, not a law. For small data, a "bad" algorithm with good memory locality can beat a "good" algorithm with bad locality.
In the next lesson, we will look at logarithmic time - O(log n). We will see how we can exploit sorted data to discard half the workload with every single step, allowing us to search billions of items in nanoseconds.
Logarithmic and Linearithmic Algorithms
Working with massive datasets using divide and conquer strategies, and the required trade-offs.