Algorithm Analysis

An introduction to algorithms - the foundations of computer science. Learn how to design, analyse and compare them.
Ryan McCombe

When we're programming, we're writing code to perform a specific task. An algorithm is the set of instructions - the procedure we ask the computer to perform - in order to complete that task. Typically, one of the main requirements we have for these procedures is that they need to complete within an appropriate amount of time.

If we are writing a program that creates real time graphics, we have only a few milliseconds to create each frame, and each frame has millions of pixels.

In other contexts, we might not be dealing with short timeframes - rather, we might be dealing with large amounts of data. If we're creating software that analyses a billion records, each additional millisecond we require to process each one adds 12 days to our overall run time.
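To see where that figure comes from, we can do the arithmetic directly:

$10^9 \text{ records} \times 1\,\text{ms} = 10^9\,\text{ms} = 10^6\,\text{s} \approx 11.6\ \text{days}$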

Therefore, it is important that we be able to analyse the performance of our algorithms, understand the impact of what we find, and develop instincts for how we might improve them.

Algorithm Scaling

Generally, how long our algorithms take to run will depend on their inputs - specifically, the size of those inputs. Our input's size might be the number of records we need to process, or the number of pixels we need to colour. When the input is larger, our algorithm typically needs to do more work.

The main thing we care about is the scaling between those two variables. As our input size increases, how many more operations does our algorithm need to perform to complete its task? Based on this analysis, we can classify how our algorithm scales into certain categories. Here are some examples:

Constant Scaling

This is the best case scenario. With constant scaling, even if our input gets bigger, our algorithm doesn't need to do any extra work. An example of a problem that can be solved in constant time is: determine if the first element in an array of numbers is even.

isEven(input) {
  return input[0] % 2 == 0;
}

Here, it doesn't matter how big the input array is - our algorithm takes the same, constant, amount of time.
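As a rough C++ sketch of the same idea (assuming the input arrives as a std::vector<int> and is not empty), the work done never depends on the vector's size:

#include <vector>

// Constant time: we only ever inspect the first element,
// no matter how large the vector is
bool isEven(const std::vector<int>& input) {
  return input[0] % 2 == 0;  // assumes input is not empty
}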

Linear Scaling

With linear scaling, as our input gets bigger, our algorithm needs to perform proportionally more work. If our input is twice as large, our algorithm takes approximately twice as long to complete.

An example of a problem that can be solved in linear time is: determine how many numbers in an array are even. To solve this, we need to inspect every element in the array. If the array is twice as large, we need to do approximately twice as much work.

1  countEven(input) {
2    int Total { 0 };
3    for (i in input) {
4      if (i % 2 == 0) Total++;
5    }
6    return Total;
7  }

Here, if the array has 10 items, line 4 is executed 10 times. If the array is doubled to 20 items, line 4's execution count also doubles to 20. In other words, the number of operations increases at the same rate as the size of the input.
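A C++ sketch of this linear algorithm might look like the following (again assuming a std::vector<int> input):

#include <vector>

// Linear time: the if statement runs once per element,
// so doubling the input roughly doubles the work
int countEven(const std::vector<int>& input) {
  int total{0};
  for (int i : input) {
    if (i % 2 == 0) ++total;
  }
  return total;
}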

Quadratic Scaling and Beyond

Algorithms in this category tend to involve nested loops. Let's imagine we have this task: check if an array contains any duplicates.

One possible way to solve this would be using the following algorithm:

1  hasDuplicates(input) {
2    for (i = 0; i < input.size; i++) {
3      for (j = 0; j < input.size; j++) {
4        if (i != j && input[i] == input[j]) return true;
5      }
6    }
7
8    return false;
9  }

In this example, if our input array has 10 items, line 4 is executed up to 100 times. If the array is increased to 100 items, line 4 is now executed up to 10,000 times. With quadratic scaling, the number of operations increases much faster than the size of the input.

In general, the number of operations $O$ involved in this algorithm will be approximately $O = n^2$, where $n$ is the size of the input.

This is a quadratic relationship, so we'd describe this algorithm as quadratic, or as having quadratic scaling.

We could also have cubic algorithms, where $O = n^3$. This would be the case had we needed a third nested loop:

for (i in input) {
  for (j in input) {
    for (k in input) {
      // Code here will run n * n * n times
    }
  }
}

This pattern continues - there are quartic algorithms where $O = n^4$, and so on.

Algorithm Scaling Compared

The following chart is a visual representation of how many operations an algorithm in each of these classifications might need to perform.

A chart showing the performance characteristics of some algorithms

If we zoom out on this chart, we can see the differences can get quite extreme, even at modest input sizes.

A zoomed-out chart showing the performance characteristics of some algorithms

Even with a modest input size of 10, the huge differences between these types of algorithm are clear. This is why it's important to try to create more efficient algorithms where possible.

Simplifying the Classifications

In real scenarios, algorithms are generally going to be much more complicated. To analyse and compare them, we simplify things a lot by removing some nuance. This does not mean these nuances do not matter - we should absolutely be mindful of them when creating our software.

However, optimising our algorithms is generally not going to make as big a difference as finding an algorithm in a totally different scaling class. Therefore, to categorise algorithms into these scaling classes, we ignore nuance in 3 main areas:

1. Treat all operations as equal

In the real world, some operations can take much more time than others. For example, it's faster to add numbers together than multiply them. It's faster to read data from memory than from the hard drive.

Nevertheless, to classify algorithms, we assume all operations are equally fast.

2. Ignore Constants

One algorithm might require $n + 100$ operations whilst another might require $n$. We ignore the additive constant - both these algorithms are in the same class (linear).

One algorithm might require $50n$ operations whilst another might require $n$. We also ignore multiplicative constants - these algorithms are still both in the same class (linear).
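To illustrate, here is a hedged C++ sketch of two functions that are both linear, even though the second makes two passes over the data - roughly $2n$ operations rather than $n$ (the function names are purely illustrative):

#include <vector>

// Roughly n operations: a single pass over the data
int sumAll(const std::vector<int>& input) {
  int sum{0};
  for (int i : input) sum += i;
  return sum;
}

// Roughly 2n operations: two passes over the data.
// Ignoring the multiplicative constant, this is still linear
int sumEvensThenOdds(const std::vector<int>& input) {
  int sum{0};
  for (int i : input) {
    if (i % 2 == 0) sum += i;  // first pass: evens
  }
  for (int i : input) {
    if (i % 2 != 0) sum += i;  // second pass: odds
  }
  return sum;
}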

3. Only Include the Dominant Term

One algorithm might require $n^2 + n$ operations whilst another might require $n^2$. With large values of $n$, the $n^2$ component will be dominant and the $n$ component will be trivial. So, we classify these algorithms into the same bucket: quadratic.
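As a sketch of what that might look like in code, the following illustrative C++ function performs roughly $n^2 + n$ operations, but we'd still classify it as quadratic:

#include <vector>

// Roughly n^2 + n operations overall - classified as quadratic,
// since the n^2 term dominates for large inputs
int countMatchesAndPositives(const std::vector<int>& input) {
  int count{0};

  // Nested loop: about n * n operations
  for (int i : input) {
    for (int j : input) {
      if (i == j) ++count;
    }
  }

  // Single extra pass: about n more operations
  for (int i : input) {
    if (i > 0) ++count;
  }

  return count;
}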

Finding Better Algorithms

With our ability to analyse algorithms, we now have some guidance on ways we might seek to improve them. Earlier, we had this quadratic - $O(n^2)$ - algorithm:

hasDuplicates(input) {
  for (i = 0; i < input.size; i++) {
    for (j = 0; j < input.size; j++) {
      if (i != j && input[i] == input[j]) return true;
    }
  }

  return false;
}

Is there a way we can solve this problem in $O(1)$ time? We still need to check every element in the array to make sure there are no duplicates. If our array gets bigger, we have more to check, so constant time seems impossible.

What about $O(n)$ time? To achieve linear time, we need to iterate over our array a fixed number of times - ideally once.

To do this, we could create an additional data structure, and store values we've previously seen in that container. Then, for each value we encounter, we can check if we've already seen it.

For this overall algorithm to be done in linear time, the data structure we create will need to be able to tell us whether it contains a specific value as quickly as possible. Our best option here would be a set.

So, we can now iterate over our array once, which requires $n$ operations. For each element, we check if a value is in a set, and add it to the set if not.

Searching a set and adding an item to it can both be done in constant time. So, for each of the $n$ elements, we perform two operations. This means our algorithm will have approximately $2n$ operations in total. Dropping the multiplicative constant, we have an $O(n)$ algorithm.

Our code might look like this:

hasDuplicates(input) {
  set previousValues {};
  for (i in input) {
    if (previousValues.has(i)) {
      return true;
    } else {
      previousValues.add(i);
    }
  }
  
  return false;
}
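In C++, one way to sketch this approach is with std::unordered_set, whose insertions and lookups are constant time on average (the std::vector<int> parameter is just an assumed input type):

#include <unordered_set>
#include <vector>

// Linear on average: one pass over the input, with
// average constant-time set lookups and insertions
bool hasDuplicates(const std::vector<int>& input) {
  std::unordered_set<int> previousValues;
  for (int i : input) {
    // contains() requires C++20; count() works on older standards
    if (previousValues.contains(i)) return true;
    previousValues.insert(i);
  }
  return false;
}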

Let's compare its scaling properties to those of our old algorithm:

Input Size | Old Algorithm              | New Algorithm
100        | 10,000 Operations          | 100 Operations
1,000      | 1,000,000 Operations       | 1,000 Operations
100,000    | 10,000,000,000 Operations  | 100,000 Operations

To quantify the difference in real world terms, if our input size is 100,000 and our new algorithm takes a second to run, our old algorithm takes 28 hours.

Increase the input size to 1,000,000 and our new algorithm scales up to taking 10 seconds, but our old algorithm scales up to around four months.
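Those figures follow from the operation counts in the table above, assuming - purely for illustration - that the machine handles around 100,000 operations per second:

$\frac{10^{10}\ \text{operations}}{10^5\ \text{operations/s}} = 10^5\ \text{s} \approx 28\ \text{hours} \qquad \frac{10^{12}\ \text{operations}}{10^5\ \text{operations/s}} = 10^7\ \text{s} \approx 116\ \text{days}$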

Big O Notation

When writing and describing the properties of algorithms, the concept of "Big O" is sometimes used. This is a form of mathematical notation used to describe the concepts covered above.

We can describe constant time algorithms as $O(1)$, linear as $O(n)$, quadratic as $O(n^2)$, and so on. More specifically, this notation describes the upper bound of the run time. That is, big O describes the worst case scenario.

For example, our original, quadratic algorithm for detecting duplicates might complete within a couple of operations. Even if the input array has millions of items, if the first two numbers are the same, our algorithm will complete in constant time.

However, in the worst case scenario, there are no duplicates at all in our input array. In that scenario, our algorithm needs to check $n \times n$ combinations to reach that conclusion, so it's an $O(n^2)$ algorithm.

In general, the worst case performance is what we focus on when creating and describing algorithms. If an algorithm is described simply as "quadratic", we should interpret that as meaning "quadratic in the worst case".

The average case is less important. An algorithm that is linear in both the average and worst case is almost always better than an algorithm that is faster on average, but quadratic in the worst case.

This is because the gulf between classes can have very large implications for run times. We always have to design and prepare our software and processes for the worst case scenario. That is much more practical to do with an algorithm that always completes within 10 seconds than with one that takes only 10 milliseconds on average, but sometimes requires 3 days.

Polynomial Scaling

All algorithms covered in this lesson are examples of polynomial scaling.

Polynomials have the form $O(n^k)$, where $k$ is a constant - that is, $k$ is unrelated to the size of the data.

Linear scaling - $O(n)$ - is polynomial, as it is equivalent to $O(n^1)$. Constant scaling - $O(1)$ - is polynomial, as it is equivalent to $O(n^0)$.

There are algorithms that scale much faster than polynomials - these include exponential scaling - $O(k^n)$ - and factorial scaling - $O(n!)$.
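To get a rough sense of how much faster these grow, consider an input size of just $n = 20$:

$n^2 = 400 \qquad 2^n = 1{,}048{,}576 \qquad n! \approx 2.4 \times 10^{18}$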

We'll see examples of these in later lessons.
