Dynamic Arrays using `std::vector`

Explore the fundamentals of dynamic arrays with an introduction to std::vector

This lesson is part of the course:

Intro to C++ Programming

Become a software engineer with C++. Starting from the basics, we guide you step by step along the way

View Full CourseFree, Unlimited Access

Abstract art representing computer programming

Ryan McCombe

Updated 1 month ago

Inevitably, we will want to store a group of related objects. That might be a collection of characters in a party, a collection of buttons on a UI, or countless other use cases.

Let's imagine we have these objects, which we want to store and manage as a single collection:

1class Character {};
2Character Frodo;
3Character Gandalf;
4Character Gimli;
5Character Legolas;
6Character Aragorn;

Programmers have invented a huge range of options for storing collections of objects. These containers are broadly called data structures, and which data structure we should use depends on our specific requirements. We'll cover a much wider range of data structures in the next course. Here, we'll introduce the most common choice - the dynamic array.

Under the hood, arrays are contiguous blocks of memory, big enough to store multiple objects in sequence.

Dynamic Arrays vs Static Arrays

Arrays broadly fall into two categories - static arrays, and dynamic arrays.

With static arrays, we need to know at compile time how much space we need. For example, if we create a static array for holding 5 characters, and we need to add a 6th, we're out of luck.

Dynamic arrays can resize themselves at run time. If we try to add a 6th character to a dynamic array that can only hold 5, the dynamic array will do a load of memory management and related tasks to make itself bigger. That all happens behind the scenes - from our perspective, it just works.

In this lesson, we're working with dynamic, resizable arrays. In the next course, we cover static arrays.

There are hundreds of implementations of arrays in C++ that we can use, and we can even create our own once we learn more advanced topics. In this lesson, we're using the standard library's implementation of dynamic arrays, which is called std::vector.

The C++ standard library using the "vector" name for their implementation of dynamic arrays is unfortunate. When we're working on projects that involve maths and physics, such as a video games, the constructs we use to represent things like positions, movement and forces are also called vectors.

These vectors are not related to arrays at all - the confusing naming conflict is just something we need to adapt to.

Creating Arrays using `std::vector`

To use std::vector, we need to #include <vector>. We can then declare a std::vector by giving it a name, and specifying the type of objects it will contain within angled brackets: < and >. The following example shows how we can declare a std::vector called MyVector that stores int objects:

1#include <vector>
2
3std::vector<int> MyVector;

Preview: Templates

When we see chevrons - < and > - in our code, we're typically using an advanced C++ feature called templates. Templates allow classes and functions to be written without specifying all of the data types that the class or function will be using.

In this case, the standard library authors created a container called std::vector that can store a collection of any type of object. As developers using std::vector, we just need to provide the type we want to store between the chevrons. This type we provide is referred to as a template argument.

When we pass int as the template argument - that is, std::vector<int> - we create an array for storing int values. But we can replace int with any type:

1std::vector<float> FloatCollection;
2std::vector<bool> BoolCollection;

Templates are particularly powerful as they allow classes and functions to be compatible with types that aren't known or don't exist at the time the template was written.

For example, the authors of std::vector couldn't know we would want to store a collection of SomeCustomType objects, but their code supports it anyway:

1struct SomeCustomType{
2  // ...
3};
4
5std::vector<SomeCustomType> MyVector;

We cover templates in much more detail, including how to create our own, in the advanced course.

We can provide a std::vector with some initial values:

1#include <vector>
2
3std::vector<int> MyVector { 1, 2, 3, 4, 5 };

When we are initializing the values at the same time we declare the array, we can optionally remove the template argument. The compiler can infer what type the vector will be storing, based on the type of objects we provided for its initial values:

1#include <vector>
2
3std::vector MyVector { 1, 2, 3, 4, 5 };

To do this, the compiler is using Class Template Argument Deduction (CTAD), which we'll cover more in our advanced course.

Accessing Array Items

We can access the members of our array with the subscript operator, which uses [] syntax. For example, to access an object within MyVector, we'd use MyVector[x], where x is the index of the element we want to access.

The index of an element within an array is simply its position within the array. However, in most programming contexts, indexing is zero-based, meaning we start counting from 0. This means the first element of the vector is at index 0, the second element is at index 1, and so on.

For example, to access the elements of our array, we would do this:

1#include <vector>
2
3std::vector MyVector { 1, 2, 3, 4, 5 };
4
5int FirstElement { MyVector[0] };
6int SecondElement { MyVector[1] };
7int ThirdElement { MyVector[2] };
8int LastElement { MyVector[4] };

As with all values, the index can be derived from any expression that evaluates to an integer:

1#include <iostream>
2#include <vector>
3
4int CalculateIndex() { return 1 + 1; }
5
6int main() {
7  std::vector MyVector{
8    "First", "Second", "Third"};
9
10  // Log out the element at index 0
11  std::cout << MyVector[3 - 3] << '\n';
12
13  // Log out the element at index 1
14  int Index{1};
15  std::cout << MyVector[Index] << '\n';
16
17  // Log out the element at index 2
18  std::cout << MyVector[CalculateIndex()];
19}

This code logs out elements at indices 0, 1, and 2 in order:

1First
2Second
3Third

Array Size

The size of an array refers to the number of elements it currently contains. This is also sometimes called the length of the array.

The size() function returns the current size of our std::vector:

1#include <iostream>
2#include <vector>
3
4int main() {
5  std::vector MyVector{
6    "First", "Second", "Third"};
7
8  std::cout << "Size: " << MyVector.size();
9}

1Size: 3

Note that because indexing is zero-based, the last element of a vector is at an index of one less than its size(). For an array of size 3, the last element is at index 2.

We can get the last element by applying this arithmetic within the subscript operator, or by using the back() function:

1#include <iostream>
2#include <vector>
3
4int main() {
5  std::vector MyVector{
6    "First", "Second", "Third"};
7
8  std::cout << "Last Element: "
9    << MyVector[MyVector.size() - 1];
10    
11  // Equivalently:
12  std::cout << "\nLast Element: "
13    << MyVector.back();
14}

1Last Element: Third
2Last Element: Third

Adding and Removing Items

Once our array is created, we'll often want to add more items to it. Typically, we want to add items to the end of arrays, as this has performance benefits over adding them to the start. We'll cover the performance characteristics of containers in more detail in the next course.

std::vector and many other data structures provide two ways of adding objects:

Pushing an object involves moving or copying an existing object into the container
Emplacing an object involves creating a new object directly within the container

Creating an object in place has performance benefits over creating it elsewhere and then moving it. Therefore, where possible, we should prefer emplacing over pushing.

With std::vector, we're adding items to the back of the container, so the respective functions are called push_back() and emplace_back()

Moving Objects using `push_back()`

If an object has already been constructed, and we want to add it to the back of our array, we can use push_back():

1#include <iostream>
2#include <vector>
3
4class Character {
5public:
6  Character(const std::string& Name)
7  : Name(Name) {}
8  std::string Name;
9};
10
11int main() {
12  Character Legolas{"Legolas"};
13  std::vector<Character> MyVector;
14  MyVector.push_back(Legolas);
15
16  std::cout << "First character: "
17    << MyVector[0].Name;
18}

1First character: Legolas

Constructing Objects using `emplace_back()`

If the object we want to add does not already exist, we can construct it right inside of the array, using emplace_back().

The arguments we call emplace_back() with will be passed to the constructor of the object type the vector stores:

1#include <iostream>
2#include <vector>
3
4class Character {/*...*/}
11
12int main() {
13  std::vector<Character> MyVector;
14  MyVector.emplace_back("Legolas");
15
16  std::cout << "First character: "
17    << MyVector[0].Name;
18}

Removing Objects using `erase()`

The std::vector type also offers an erase() method to remove objects from our array. The erase method requires we provide an iterator, which is slightly more advanced than a simple index.

We cover iterators in detail in the advanced course, but for now, note that we can get an iterator corresponding to a std::vector index using MyVector.begin() + x, where x is the index we're interested in.

So, for example, to erase the second item (the item at index 1) from our array we'd use:

1MyVector.erase(MyVector.begin() + 1);

The size() of our array will be reduced by 1, and all of the objects that were after the element we erased get moved by one position to the left to fill in the gap:

1#include <iostream>
2#include <vector>
3
4class Character {/*...*/}
11
12int main() {
13  std::vector<Character> MyVector;
14  MyVector.emplace_back("Aragorn");
15  MyVector.emplace_back("Gandalf");
16  MyVector.emplace_back("Legolas");
17
18  std::cout << "Second character is: "
19    << MyVector[1].Name;
20  MyVector.erase(MyVector.begin() + 1);
21  std::cout << "\nSecond character is now: "
22    << MyVector[1].Name;
23}

1Second character is: Gandalf
2Second character is now: Legolas

Modifying Array Items

We can modify objects in a std::vector in the usual ways. We can replace the object at a given index using the assignment operator, =:

1#include <iostream>
2#include <vector>
3
4int main() {
5  std::vector MyVector{1, 2, 3};
6
7  std::cout << "Second number: "
8    << MyVector[1];
9  MyVector[1] = 42;
10  std::cout << "\nSecond number: "
11    << MyVector[1];
12}

1Second number: 2
2Second number: 42

We can also call functions (including operators) on our objects as needed:

1#include <iostream>
2#include <vector>
3
4class Character {
5 public:
6  Character(std::string Name) : mName{Name} {}
7  std::string GetName() { return mName; };
8  void SetName(std::string Name){
9    mName = Name;
10  }
11
12 private:
13  std::string mName;
14};
15
16int main() {
17  std::vector<Character> MyVector;
18  MyVector.emplace_back("Roderick");
19
20  std::cout << "First character: "
21    << MyVector[0].GetName();
22  MyVector[0].SetName("Anna");
23  std::cout << "\nFirst character: "
24    << MyVector[0].GetName();
25}

1First character: Roderick
2First character: Anna

Storing Complex Types in Arrays

Our above example uses vectors with values, but we can store pointers and references in arrays too:

1#include <vector>
2
3class Character {};
4
5// A collection of characters
6std::vector<Character>;
7
8// A collection of character pointers
9std::vector<Character*>;
10
11// A collection of const character references
12std::vector<const Character&>;

Arrays can also store other arrays. This creates "multidimensional arrays". For example, a 3x3 grid could be represented as an array of 3 rows, each row being an array of 3 items:

1#include <vector>
2#include <iostream>
3
4std::vector<std::vector<int>> MyGrid {{
5  { 1, 2, 3 },
6  { 4, 5, 6 },
7  { 7, 8, 9 }
8}};
9
10int main() {
11  std::cout
12    << "Top Left: " << MyGrid[0][0];
13
14  std::cout << "\nBottom Right: "
15    // Alternatively: MyGrid.back().back()
16    << MyGrid[2][2];
17}

1Top Left: 1
2Bottom Right: 9

Type Aliases

Remember, we can add a using statement to alias complex types:

1using Grid = std::vector<std::vector<int>>;
2
3Grid MyGrid {{
4  { 1, 2, 3 },
5  { 4, 5, 6 },
6  { 7, 8, 9 }
7}};

Passing Arrays to Functions

We can treat a collection of objects like any other value. Below, we pass a collection to a function:

1#include <iostream>
2#include <vector>
3
4using Grid = std::vector<std::vector<int>>;
5
6void LogTopLeft(Grid GridToLog) {
7  std::cout << "Top Left: " << GridToLog[0][0];
8}
9
10int main() {
11  Grid MyGrid {{
12    { 1, 2, 3 },
13    { 4, 5, 6 },
14    { 7, 8, 9 }
15  }};
16
17  LogTopLeft(MyGrid);
18}

As with any other parameter, arrays are passed by value by default. The performance implications are particularly important here, as copying a collection of objects is inherently more expensive than copying a single object.

We can pass by reference in the usual way, with or without const:

1#include <iostream>
2#include <vector>
3
4using Grid = std::vector<std::vector<int>>;
5
6void LogTopLeft(const Grid& GridToLog) {
7  std::cout << "Top Left: " << GridToLog[0][0];
8}
9
10void SetTopLeft(Grid& GridToLog, int Value) {
11  GridToLog[0][0] = Value;
12}
13
14int main() {
15  Grid MyGrid {{
16    { 1, 2, 3 },
17    { 4, 5, 6 },
18    { 7, 8, 9 }
19  }};
20
21  SetTopLeft(MyGrid, 42);
22  LogTopLeft(MyGrid);
23}

1Top Left: 42

We can also generate and use pointers in the normal way. The following code replicates the previous example, using pointers instead of references:

1#include <iostream>
2#include <vector>
3
4using Grid = std::vector<std::vector<int>>;
5
6void LogTopLeft(const Grid* GridToLog) {
7  std::cout << "Top Left: "
8    << (*GridToLog)[0][0];
9}
10
11void SetTopLeft(Grid* GridToLog, int Value) {
12  (*GridToLog)[0][0] = Value;
13}
14
15int main() {
16  Grid MyGrid {{
17    { 1, 2, 3 },
18    { 4, 5, 6 },
19    { 7, 8, 9 }
20  }};
21
22  SetTopLeft(&MyGrid, 42);
23  LogTopLeft(&MyGrid);
24}

1Top Left: 42

Iteration using a For Loop

A common task we have when working with arrays is to perform some action on every object in the collection. We can do this with a for loop:

1#include <iostream>
2#include <vector>
3
4int main() {
5  std::vector MyVector{1, 2, 3};
6
7  for (int i{0}; i < MyVector.size(); ++i) {
8    std::cout << MyVector[i] << ", ";
9  }
10}

11, 2, 3,

The `size_t` type

There is an issue with using int values as the index of vectors: modern computers have enough memory to store a lot of objects in an array. This means that the size() of the array, or the index used to reference a specific position, can be a larger number than the maximum value storable in an int.

To deal with this problem, we have the size_t data type. size_t is guaranteed to be large enough to match the largest possible size of the array.

Because of this, it is the recommended way of storing array sizes and indices:

1#include <iostream>
2#include <vector>
3
4int main() {
5  std::vector MyVector{1, 2, 3};
6
7  for (size_t i{0}; i < MyVector.size(); ++i) {
8    std::cout << MyVector[i] << ", ";
9  }
10}

11, 2, 3,

Range-Based For Loops

Often, we usually don't need to work with indices at all - we just want to iterate over everything in the array. For those scenarios, we can use this syntax:

1#include <iostream>
2#include <vector>
3
4int main() {
5  std::vector MyVector{1, 2, 3};
6
7  for (int Number : MyVector) {
8    std::cout << Number << ", ";
9  }
10}

11, 2, 3,

This is known as a range-based for loop. We cover these in more detail in the next course, including how to make our custom types compatible with this syntax.

Similar to functions, range-based for loops can receive their parameters by reference or value, with value being the default. This means each item in the collection is copied into the body of the loop. We should consider passing by reference instead, particularly if the type we're using is expensive to copy:

1#include <iostream>
2#include <vector>
3
4int main() {
5  std::vector MyVector{1, 2, 3};
6
7  for (int& Number : MyVector) {
8    std::cout << Number << ", ";
9  }
10}

11, 2, 3,

Just like when we're passing by reference into a function if the loop body isn't going to modify that reference, we should consider marking it as const:

1#include <iostream>
2#include <vector>
3
4int main() {
5  std::vector MyVector{1, 2, 3};
6
7  for (const int& Number : MyVector) {
8    std::cout << Number << ", ";
9  }
10}

11, 2, 3,

Memory Management

std::vector and other arrays keep our objects in contiguous blocks of memory on our system. As we push or emplace objects into our array, it may run out of space at its current memory location.

Comparing Array `size()` and `capacity()`

We can see how much of our capacity we're currently using by comparing what is returned from the size() and capacity() methods:

size() returns the number of elements currently in the array.
capacity() returns the number of elements the array can hold in its current memory location.

Below, we see the std::vector initially has a capacity of 5, with all 5 spaces being taken. When we add a 6th item, the size() increases to 6, whilst the capacity() is now 7:

1#include <iostream>
2#include <vector>
3
4int main() {
5  std::vector MyVector{1, 2, 3, 4, 5};
6
7  std::cout << "Capacity: " << MyVector.size()
8    << "/" << MyVector.capacity();
9
10  MyVector.emplace_back(6);
11
12  std::cout << "\nCapacity: " << MyVector.size()
13    << "/" << MyVector.capacity();
14}

1Capacity: 5/5
2Capacity: 6/7

When we add elements beyond the current capacity, the std::vector will do some work behind the scenes to let our array grow. It will:

Allocate a new block of memory with a larger capacity.
Move all existing elements to the new location.
Deallocate the old memory block.

In our previous example, this new location has enough space to store 7 objects. We're currently only storing 6, so we have room to add one more before everything needs to be moved to a larger memory block again.

Proactively Reserving Capacity

Moving an array to a new location has a performance cost, so if we're able to reduce unnecessary movements, we should consider it.

If we know approximately how many objects our std::vector is likely to need to store, we can directly reserve() enough capacity for them. Below, we ask our std::vector to move to a location with room for 100 objects:

1#include <iostream>
2#include <vector>
3
4int main() {
5  std::vector MyVector{1, 2, 3, 4, 5};
6  MyVector.reserve(100);
7
8  std::cout << "Capacity: " << MyVector.size()
9    << "/" << MyVector.capacity();
10}

Now, it has plenty of room to grow before it needs to move again:

1Capacity: 5/100

Naturally, a std::vector with a capacity for 100 objects will consume more system memory than one with a capacity of only 5.

But if we think it's eventually going to grow to 100 objects anyway, we may as well just reserve that space from the start and eliminate all the expensive moving.

To summarise our considerations:

Reallocation can be expensive, especially for large vectors or complex objects.
Using reserve() can help avoid unnecessary reallocations.
However, over-reserving can waste memory if the extra capacity isn't used.

Summary

In this lesson, we've explored the essentials of dynamic arrays, with a particular focus on std::vector. You've learned how to create, access, modify, and efficiently manage collections of objects, laying a strong foundation for more advanced programming concepts.

Main Points Learned

Understanding the difference between static and dynamic arrays, and the flexibility of std::vector.
Creating, initializing, and accessing elements in a std::vector, including using push_back() and emplace_back() methods.
Techniques for modifying elements within a std::vector, and the implications of passing vectors to functions.
Managing complex data types and multidimensional arrays using std::vector.
Effective looping through vectors using both traditional and range-based for loops, and the importance of size_t in managing vector sizes.
How std::vector manages its capacity, and how we can interact with that mechanism using capacity() and reserve().

Next Lesson: Memory Ownership and Smart Pointers

In the next lesson, we introduce the main design pattern used to manage memory in complex applications. We also introduce smart pointers, the main mechanism used to implement this pattern. We cover:

Memory Management Simplified: We explore how smart pointers automate memory management, thereby reducing the chances of memory-related errors.
Understanding Memory Ownership: What smart pointers are and how they implement an ownership model for objects in dynamically allocated memory
Creating Unique Pointers using std::make_unique(): How to create unique pointers using std::make_unique(),
Access and Ownership Transfer: How to give other functions access to our resources, and whether we want to maintain or transfer ownership at the same time.