Previously, we discussed how we could use arrays to manage collections of objects that have the same type. Arrays accomplish this task by storing all our objects in large, contiguous blocks of memory.
However, this is not the only way to solve this problem. There are alternatives, and each method have their own benefits and drawbacks. In this lesson, we will introduce an alternative data structure: the linked list.
Unlike arrays, linked lists can store their elements anywhere in memory - they are not restricted to using a contiguous block.
Whilst arrays and linked lists generally solve the same problem of storing a collection of items, the difference in the way items are stored in memory means arrays and linked lists have different performance characteristics.
For example, if we add or remove an item from an array, we have to move the elements that come after them in order to maintain the contiguous structure. Similarly, if we need to expand an array, we may not be able to if the next address in memory is already being used for something else.
We would need to find a totally separate block of free memory, and move every element in our array to that new block.
Adding something to a linked list is much faster - it just involves storing a new object anywhere in memory and updating the previous object to link to it. No other elements need to be updated.
This gives linked lists improved performance characteristics when compared to arrays. The benefit of linked lists comes when adding, removing, or changing the order of elements. Nothing needs to be moved around in memory - all we need to do is update the links.
The downside of this structure is that we no longer have direct access to any element in the collection. With an array, we could jump straight to the element in position 100, for example. We can do this because, with an array, we know two things:
As such, with some arithmetic, we can work out exactly where in memory any element will be. As a simplified example of this process, if we know an Array<SomeType>
starts at memory address and objects of SomeType
require slots of memory, we can infer that item Array[100]
is at memory address , given by
This is not the case with linked lists. Given the more free-form memory structure of a linked list, if we don’t already know where the 100th element is, the only way to find it is to follow the links. We have to follow the links from item 1, to 2, to 3, all the way to 100. This is a slower, linear process.
We have to choose what to optimize for, based on the problem we're trying to solve. If fast random access is more important, use an array. If fast insertion and deletion are important, a linked list might be better.
How does the "linking" of a linked list actually work? With a linked list, we are not just storing the item in each position - we are storing both the item, as well as a pointer to the next element in the list, or a nullptr
if there are no more items.
These objects that contain both the item and the pointer are commonly called "nodes". We could create these nodes using something like a std::pair
but typically we would define a struct
. For example, to construct a linked list containing the integers 10
, 20
, and 30
, we could do this:
struct Node {
int Value;
Node* Next;
};
Node Third { 30, nullptr };
Node Second { 20, &Third };
Node First { 10, &Second };
We could use templating here, so our Nodes can store any type of data, not just integers:
template<typename T>
struct Node {
T Value;
Node<T>* Next;
};
Node<int> Third { 30, nullptr };
Node<int> Second { 20, &Third };
Node<int> First { 10, &Second };
Additionally, we could update our Node
struct to also include a pointer to the previous node. This would allow us to easily traverse our linked list in either direction, forwards or backward. A linked list where each node has a link to both the previous and next node is called a "Doubly Linked List"
<typename T>
struct Node {
T Value;
Node<T>* Prev;
Node<t>* Next;
};
Of course, we don't want to be managing all these nodes and pointers manually. When we are trying to create a collection of ints
, we really just want to be working with integers, not these arbitrary Node
objects.
To achieve this, we could create a LinkedList
class to manage all these nodes for us. This class could be templated, to allow us to store any type of data in our linked lists, not just integers.
There are many existing implementations of linked lists we can use. We’ll introduce the C++ standard library implementation in the next lesson.
It is essential to understand the difference between linked lists and arrays, as well as their relative trade-offs. Time complexity is a fundamental concept in computer science, and topics like this are very commonly asked in interviews.
But, what is true in theory is not necessarily true in reality. There is a lot of nuance in how CPUs and memory work at a hardware level that subverts the academic analysis. Typically, these characteristics mean the tightly packed memory layout of arrays is preferred, even in scenarios where the theory would suggest linked lists should perform better.
Because of this, arrays should generally be the default choice in day-to-day programming.
Only use linked lists in special circumstances, where you know that the constant vs linear time complexity difference will matter. This is an instinct that will come with experience.
Comprehensive course covering advanced concepts, and how to use them on large-scale projects.