Multidimensional Arrays and `std::mdspan`

A guide to std::mdspan, allowing us to interact with arrays as if they have multiple dimensions

This lesson is part of the course:

Professional C++

Comprehensive course covering advanced concepts, and how to use them on large-scale projects.

Ryan McCombe

Updated 3 months ago

Earlier in the course, we introduced the concept of a multidimensional array (or vector), which we could use to represent more complex data structures.

For example, we could represent a 2D grid using an array of arrays:

#include <vector>
#include <iostream>

int main(){
  std::vector<std::vector<int>> Grid{
    {1, 2, 3},
    {4, 5, 6},
    {7, 8, 9}
  };

  std::cout << "Top Left: " << Grid[0][0];
  std::cout << "\nBottom Right: " << Grid[2][2];
}

Top Left: 1
Bottom Right: 9

There are a couple of problems with this approach.

Firstly, we lose some capability to work with the collection as a whole. For example, we can no longer simply get the size of the collection (9, in this case) without doing some extra work. We’d need to iterate over every subarray and add all their sizes together.

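Secondly, there is some performance overhead involved in using multidimensional arrays in this way. Each inner std::vector manages its own separate block of memory, so the rows are not guaranteed to sit next to each other. As a result, working with 3 arrays of 3 elements is generally less efficient than working with a single array of 9 elements.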
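
The extra work mentioned in the first point might look something like the following sketch. The TotalSize() function is a hypothetical helper introduced for illustration, not something provided by the standard library:

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: counts every element in a nested
// vector by adding up the size of each row
std::size_t TotalSize(
  const std::vector<std::vector<int>>& Grid) {
  return std::accumulate(
    Grid.begin(), Grid.end(), std::size_t{0},
    [](std::size_t Sum, const std::vector<int>& Row) {
      return Sum + Row.size();
    });
}
```

Calling TotalSize() on the 3x3 grid above returns 9. We cannot simply multiply the row count by the column count, because nothing stops the inner vectors from having different lengths.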

The Use Case for Multidimensional Spans

Because of these problems, we generally want to store our arrays in long, contiguous blocks of memory, even if the structure they represent is multi-dimensional.

We could revert to using a single array, and simply remember that this data structure is supposed to be a 3 by 3 grid, for example:

#include <vector>
#include <iostream>

int main(){
  std::vector<int> Grid{
    1, 2, 3,
    4, 5, 6,
    7, 8, 9
  };

  std::cout << "Top Left: " << Grid[0];
  std::cout << "\nBottom Right: " << Grid[8];
}

Top Left: 1
Bottom Right: 9

But, this also has two problems:

Unintuitive API

If our data is supposed to represent rows and columns, it would be much clearer if we could access it using that pattern.

For example, we’d rather access row 2, column 2 using Grid[2, 2] than use arithmetic to work out that this cell corresponds to Grid[8].
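
The arithmetic in question can be sketched as follows, assuming the grid is stored row by row. FlatIndex() is a hypothetical helper name used only for this illustration:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: converts a (Row, Col) position in a
// grid stored row by row into an index into the flat array.
// Cols is the number of columns in each row
constexpr std::size_t FlatIndex(
  std::size_t Row, std::size_t Col, std::size_t Cols) {
  assert(Col < Cols); // the column must exist within the row
  return Row * Cols + Col;
}
```

For the 3x3 grid, FlatIndex(2, 2, 3) evaluates to 2 * 3 + 2, which is 8. Having to write Grid[FlatIndex(2, 2, 3)] everywhere is exactly the kind of noise we would like the container to hide.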

Less Meaningful Types

In more complex programs, it becomes unclear what the dimensions of an object like this are supposed to be.

For example, a std::array<int, 12> could represent:

a one-dimensional array of 12 integers
a two-dimensional array such as a 6x2, 3x4, or 4x3 grid
a three-dimensional array, such as a 2x2x3 volume.

None of that is communicated through the type. When the type does not describe the type of data it’s referring to, that’s often a red flag that our design can be improved.

Terminology

Here is some common terminology and conventions that are used within the context of multidimensional arrays:

Tensor: An alternative phrase for multidimensional array.
Vector: A one-dimensional array. Note this does not relate to the std::vector container - a std::vector is just what the standard library calls an array that can be resized.
Matrix: A two-dimensional array.
Order, Degree, and Rank: The number of dimensions in an array. For example, a vector has one dimension, so it is a first-order array. A matrix has two dimensions - it’s an array of degree 2.
Shape: How many values exist in each dimension, often separated by a multiplication symbol. For example, a chess board could be represented as a two-dimensional array with a shape of 8 x 8.

Subscript Notation and Operator

In print and writing, it’s common to use subscript notation to refer to individual objects within a collection. For example, $A_{x}$ refers to the element in position $x$ of vector $A$.

$B_{xy}$ refers to the element in row $x$, column $y$ of matrix $B$. Commas can be added if there is a risk of ambiguity, e.g. $B_{12,3}$ and $B_{1,23}$.

Because the [] syntax is also generally used for this same purpose in programming, it is often referred to as the subscript operator.

Multidimensional `[]` Operator

Note: The multidimensional [] operator is a recent addition to the language, added in C++23. As of 2024, this is not yet widely supported by compilers.

A key component of being able to work with multidimensional arrays is the ability to pass multiple arguments to the [] operator.

For example, that might look like MyObject[x, y]. This has been possible since C++23.

The following example shows a custom type implementing a basic multidimensional subscript operator:

#include <iostream>

class SomeType {
public:
  int operator[](int a, int b){
    return a + b;
  }
};

int main(){
  SomeType SomeObject;
  std::cout << "Result: " << SomeObject[2, 3];
}

Result: 5

Before C++23: Comma Operator in Subscript Expressions

Prior to C++23, it was possible to implement an API like this, in a slightly contrived way. The comma symbol , is also an operator, which our custom types can overload in the usual way.

As such, an expression like A[B, C] could be supported by overloading the comma operator on B's type. Then, an expression like A[B, C] would result in C being passed as an argument to operator,() on B. We then take the return value of that operator and pass it to operator[] of A.

From C++20, the use of the comma operator within subscript expressions was deprecated. In C++23, it was removed entirely, to accommodate the more intuitive multidimensional subscript operator.
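
To make the older technique more concrete, here is a minimal sketch of how it could be arranged. All of the type names (Row, Col, Position, LegacyGrid) are hypothetical and exist only for this illustration:

```cpp
#include <array>
#include <cstddef>

// Hypothetical index types, introduced for illustration
struct Row { std::size_t Value; };
struct Col { std::size_t Value; };
struct Position {
  std::size_t RowIndex;
  std::size_t ColIndex;
};

// Overloaded comma operator: combines a Row and a Col
// into a single Position
constexpr Position operator,(Row R, Col C) {
  return Position{R.Value, C.Value};
}

class LegacyGrid {
public:
  // A single-argument subscript operator that receives
  // the Position produced by the comma operator
  int operator[](Position P) const {
    return Data[P.RowIndex * 3 + P.ColIndex];
  }

private:
  std::array<int, 9> Data{
    1, 2, 3,
    4, 5, 6,
    7, 8, 9
  };
};
```

Given a LegacyGrid called Grid, the expression Grid[(Row{2}, Col{2})] returns 9. The extra parentheses are deliberate: they ensure the comma is treated as the comma operator in every language version. Without them, C++20 compilers may warn about the deprecated usage, and C++23 compilers will instead look for a two-argument operator[].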

Below, we show how we could start building a multidimensional array class using these ideas. In this example, it’s a class that stores a 3x3 matrix of integers by wrapping a one-dimensional array.

The process of mapping the two-dimensional arguments to a single index requires some maths. The line responsible is the return statement inside operator[] below, but don’t worry if it doesn’t make sense - we’ll soon switch to using a standard library type to take care of this for us:

#include <array>
#include <iostream>

class Grid {
public:
  int operator[](size_t Row, size_t Col){
    // Each row holds 3 elements, so skip Row * 3
    // elements and then move Col positions along
    return Data[Row * 3 + Col];
  }

private:
  std::array<int, 9> Data{
    1, 2, 3,
    4, 5, 6,
    7, 8, 9
  };
};

int main(){
  Grid MyGrid;

  std::cout
    << "Top Left: " << MyGrid[0, 0]
    << "\nBottom Right: " << MyGrid[2, 2];
}

Top Left: 1
Bottom Right: 9

We could extend this idea, with the help of template parameters, to create a generalized form of multi-dimensional arrays that could be used in our projects. But, there are third-party libraries that we could use that have already done this.
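
As a rough idea of what such a generalization could look like, the sketch below turns the element type and the shape into template parameters. The StaticGrid name and its interface are hypothetical, and operator() is used instead of a multidimensional operator[] purely so the sketch also compiles with pre-C++23 compilers:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// A sketch of a fixed-size 2D grid that owns its data.
// The name and interface are hypothetical
template <typename T, std::size_t Rows, std::size_t Cols>
class StaticGrid {
public:
  // With C++23, these could be operator[] instead
  T& operator()(std::size_t Row, std::size_t Col) {
    assert(Row < Rows && Col < Cols);
    return Data[Row * Cols + Col];
  }

  const T& operator()(
    std::size_t Row, std::size_t Col) const {
    assert(Row < Rows && Col < Cols);
    return Data[Row * Cols + Col];
  }

  constexpr std::size_t Size() const {
    return Rows * Cols;
  }

private:
  // One contiguous block, with every element
  // value-initialized (0 for int)
  std::array<T, Rows * Cols> Data{};
};
```

A useful side effect of this design is that the shape becomes part of the type. A StaticGrid<int, 3, 4> and a StaticGrid<int, 4, 3> are different types, which addresses the "less meaningful types" problem we described earlier.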

As of C++23, we’re starting to see standard library classes be introduced to solve these problems too, starting with std::mdspan.

`std::mdspan`

Note: The standard library’s mdspan class is a recent addition to the language, added in C++23. As of 2024, this type is not yet available in most compilers.

The standard library’s mdspan is, at its core, similar to the regular span, which we covered in the previous lesson. It provides a "view" of an underlying array.

It is performant to create and widely compatible with different array types, including C-style arrays, std::array, and std::vector

The key difference is, as we might expect, the mdspan is designed to create views that we can interact with as if we were working with a multidimensional array.

It is a template class that has four template parameters - two required, and two optional. We’ll discuss these template parameters in more detail a little later.

For now, let's just use class template argument deduction to show a basic example.

In the following code, we create a std::mdspan that views our 6-element array as a 2x3 matrix:

#include <array>
#include <mdspan>
#include <iostream>

int main() {
  std::array Array{
    1, 2, 3,
    4, 5, 6
  };

  std::mdspan Span{Array.data(), 2, 3};
}

The first argument to the constructor is a pointer to where the source array begins in memory. With a C-style array, the array's name converts to this pointer automatically. For std::array and std::vector, we can get the pointer using the data() method.
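
The following sketch shows those three ways of obtaining the pointer side by side. SumElements() and SumFromAllThree() are hypothetical helpers written for this illustration. They do not use std::mdspan, but the first argument is obtained in the same way:

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical helper that only needs a pointer to the
// first element and a count, whatever the container is
int SumElements(const int* First, std::size_t Count) {
  int Total{0};
  for (std::size_t i{0}; i < Count; ++i) {
    Total += First[i];
  }
  return Total;
}

// Passes the same three values from three array types
int SumFromAllThree() {
  int CStyle[]{1, 2, 3};
  std::array<int, 3> Array{1, 2, 3};
  std::vector<int> Vector{1, 2, 3};

  // The C-style array's name converts to a pointer,
  // whilst the containers provide one through data()
  return SumElements(CStyle, 3)
    + SumElements(Array.data(), Array.size())
    + SumElements(Vector.data(), Vector.size());
}
```

Each call sums the values 1, 2, and 3, so SumFromAllThree() returns 18.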

After the pointer, we then have a variable number of additional arguments that represent the shape of our multidimensional view.

Variadic Functions

A function that accepts a variable number of arguments is called a variadic function. We have a dedicated lesson on these functions, and how to create our own, a little later in the course.

The number of additional arguments represents the number of dimensions our view will have, whilst the individual value of each argument represents the size of that dimension.

Above, we’re creating a two-dimensional view, so we provide two additional arguments. Their values are 2 and 3, so we’re creating a 2x3 span.

Access to elements is given through the [] operator, passing as many arguments as we have dimensions. In this example, we have two dimensions, so every use of the [] will receive two arguments:

#include <array>
#include <mdspan>
#include <iostream>

int main() {
  std::array Array{
    1, 2, 3,
    4, 5, 6
  };

  std::mdspan Span{Array.data(), 2, 3};

  std::cout << "Top Left: " << Span[0, 0];
  std::cout << "\nBottom Right: " << Span[1, 2];
}

Top Left: 1
Bottom Right: 6

Shape argument order

Note that the order of these shape parameters is significant. A 2x3 matrix is not the same as a 3x2 matrix. How we order the dimensions in the constructor corresponds with how we later access elements using the subscript operator [].

The first argument we pass to [] will map to the first dimension, and the second argument will map to the second dimension.

In the above example, we shaped our span to be 2x3. As such, the first argument to the [] operator only has two valid possibilities - 0 or 1. Our second dimension has a size of 3, so there are three valid indices for that dimension: 0, 1, or 2.

`std::mdspan` Size and Rank

The total number of elements the mdspan is viewing is available through the size() method, or empty() if we specifically want to check if the size is 0.

We can also get the rank of the span (i.e., how many dimensions it has) using the rank() method.

#include <array>
#include <mdspan>
#include <iostream>

int main() {
  std::array Array{
    1, 2, 3,
    4, 5, 6
  };

  std::mdspan Span{Array.data(), 2, 3};

  std::cout << "Size: " << Span.size();
  std::cout << "\nRank: " << Span.rank();
}

Size: 6
Rank: 2
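
The relationship between these values can be sketched with a small helper. ElementCount() is a hypothetical function written for this illustration rather than part of the std::mdspan API. The number of entries in the array plays the role of the rank, and the product of those entries is the size:

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: the total element count of a view
// is the product of the size of each dimension.
// Rank is the number of dimensions
template <std::size_t Rank>
constexpr std::size_t ElementCount(
  const std::array<std::size_t, Rank>& Shape) {
  std::size_t Count{1};
  for (std::size_t Extent : Shape) {
    Count *= Extent;
  }
  return Count;
}
```

For a 2x3 shape, this returns 2 * 3, which is 6. If any dimension has a size of 0, the product is also 0, which is the situation empty() reports.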

Getting the Extent of a Dimension

The extent (or size) of any dimension in an mdspan is available through the extent() method, passing in the index of the dimension we want to find. Similar to array indices, counting starts at 0, so the size of the first dimension is available at extent(0).

When we create a 2x3 view, extent(0) will return 2, and extent(1) will return 3:

#include <array>
#include <mdspan>
#include <iostream>

int main() {
  std::array Array{
    1, 2, 3,
    4, 5, 6,
  };

  std::mdspan Span{Array.data(), 2, 3};

  std::cout << "Rows: " << Span.extent(0);
  std::cout << "\nCols: " << Span.extent(1);
}

Rows: 2
Cols: 3

Iterating through `std::mdspan`

By using the extent() method, we can set up iterations that navigate through our array in a way that is aware of its dimensionality:

#include <array>
#include <mdspan>
#include <iostream>

int main() {
  std::array Array{
    1, 2, 3,
    4, 5, 6,
  };

  std::mdspan Span{Array.data(), 2, 3};

  std::cout << "Left Column: ";
  for (size_t i{0}; i < Span.extent(0); ++i)
    std::cout << Span[i, 0] << ", ";

  std::cout << "\nTop Row: ";
  for (size_t i{0}; i < Span.extent(1); ++i)
    std::cout << Span[0, i] << ", ";

  std::cout << "\nEverything: ";
  for (size_t i{0}; i < Span.extent(0); ++i)
    for (size_t j{0}; j < Span.extent(1); ++j)
      std::cout << Span[i, j] << ", ";
}

Left Column: 1, 4,
Top Row: 1, 2, 3,
Everything: 1, 2, 3, 4, 5, 6,
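
The same nested-loop pattern applies to any calculation that works one dimension at a time. The sketch below sums each row of a grid. It is written against a flat std::vector with manual index arithmetic, so it does not depend on C++23 library support, and RowSums() is a hypothetical helper name. With an mdspan, Data[i * Cols + j] would become Span[i, j], and the row and column counts would come from extent(0) and extent(1):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: treats a flat buffer, stored row by
// row, as a Rows x Cols grid and returns the sum of each row
std::vector<int> RowSums(
  const std::vector<int>& Data,
  std::size_t Rows, std::size_t Cols) {
  // The shape must account for every element
  assert(Data.size() == Rows * Cols);

  std::vector<int> Sums(Rows, 0);
  for (std::size_t i{0}; i < Rows; ++i) {
    for (std::size_t j{0}; j < Cols; ++j) {
      Sums[i] += Data[i * Cols + j];
    }
  }
  return Sums;
}
```

For the values 1 to 6 viewed as a 2x3 grid, this returns 6 for the first row and 15 for the second.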

`std::mdspan` Template Parameters

Previously, we used class template argument deduction (CTAD) to create our std::mdspan without providing the template parameters.

Let's discuss what those template parameters are now. There are four in total:

The type of data in the array
An extents type, documenting the shape of the std::mdspan
An optional layout policy, which allows us to customize the process whereby the multiple indices passed to the span’s [] operator get converted to the single, one-dimensional index that is used by the underlying array
An optional accessor policy, which allows us to customize how the single index returned from the LayoutPolicy is used to generate the return value of the span’s [] operator.

The first template parameter is pretty simple - it’s just the data type of each element in the view. This will be the same as the data type used by the underlying array. For example, if our array is storing int objects, our std::mdspan will be viewing int objects too.

The extents argument is what we’ll spend the rest of this lesson covering.

The final two template parameters are optional and will be covered in a future lesson.
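
To give a rough sense of what a layout policy is responsible for in the meantime, the sketch below shows two common ways of mapping a pair of indices to a single position. These are hand-written illustrative functions with hypothetical names, not the standard library's implementation. The default layout policy, std::layout_right, corresponds to the row-major scheme, whilst std::layout_left corresponds to the column-major scheme:

```cpp
#include <cstddef>

// Row-major mapping: elements in the same row are
// adjacent. Cols is the number of columns in the view
constexpr std::size_t RowMajorIndex(
  std::size_t Row, std::size_t Col, std::size_t Cols) {
  return Row * Cols + Col;
}

// Column-major mapping: elements in the same column are
// adjacent. Rows is the number of rows in the view
constexpr std::size_t ColumnMajorIndex(
  std::size_t Row, std::size_t Col, std::size_t Rows) {
  return Col * Rows + Row;
}
```

For a 2x3 view, position [1, 0] maps to index 3 under the row-major scheme, but to index 1 under the column-major scheme. The same indices can therefore refer to different elements of the underlying array depending on the layout.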

`std::extents`

The std::extents template class is how we define the shape of our multidimensional spans. The first template parameter allows us to customize what type we want to use as the index of the span. This will be an integer type, such as size_t.

We can then provide a variable number of additional parameters. Similar to the arguments passed to the mdspan constructor, the number of additional parameters we pass will become the number of dimensions in our span. The value of each of the arguments will be the size of that dimension.

Below, we recreate our 2x3 matrix, but this time we explicitly provide the template parameters:

#include <array>
#include <mdspan>

int main() {
  std::array Array{
    1, 2, 3,
    4, 5, 6
  };

  using Extents = std::extents<size_t, 2, 3>;

  std::mdspan<int, Extents> Span{Array.data()};
}

Note that we are no longer providing the dimension parameters to the mdspan constructor. They are now instead provided as template parameters to the std::extents template class at compile time.

If the extent of a dimension is not known at compile time, we can then specify it as a dynamic extent.

Dynamic Extents

Within the template parameters of std::extents, we can specify dimensions as having a dynamic extent, meaning their size is only known at run time. We do this by passing the std::dynamic_extent token in the appropriate position.

Below, we reference an extent type where the number of rows is known at compile time (2, in this example) but the number of columns is dynamic, as it is not known until run time:

std::extents<size_t, 2, std::dynamic_extent>;

At runtime, we need to provide the value for any dynamic extents by passing arguments to the mdspan constructor. In the following example, we pass 3, thereby again recreating our 2x3 matrix, this time by combining static and dynamic extents:

#include <array>
#include <mdspan>

int main(){
  std::array Array{
    1, 2, 3,
    4, 5, 6
  };

  using Extents = std::extents<
    size_t, 2, std::dynamic_extent
  >;

  std::mdspan<int, Extents> Span{
    Array.data(), 3};
}
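
One way to picture the difference between the two kinds of extent is the sketch below. It is a simplified, hand-written model using hypothetical names (Dynamic and Extent), not how std::extents is actually declared. A static extent lives entirely in the type, whilst a dynamic extent needs to be stored in the object and supplied at construction:

```cpp
#include <cstddef>
#include <limits>
#include <type_traits>

// Sentinel meaning "size supplied at run time". This is our
// own constant, mirroring the idea of std::dynamic_extent
constexpr std::size_t Dynamic{
  std::numeric_limits<std::size_t>::max()};

// Static case: the size is a template argument, so the
// object does not need to store anything
template <std::size_t N>
struct Extent {
  constexpr std::size_t Get() const { return N; }
};

// Dynamic case: the size is a data member that must be
// provided when the object is created
template <>
struct Extent<Dynamic> {
  std::size_t Value;
  constexpr std::size_t Get() const { return Value; }
};

static_assert(std::is_empty<Extent<2>>::value,
  "A static extent needs no storage");
```

In this model, Extent<2> always reports 2, whereas Extent<Dynamic>{3} reports whatever value it was given. This is why the mdspan constructor above needs an argument only for the dynamic dimension.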

`std::dextents`

Where we want the extent of every dimension to be dynamic, the std::dextents template class provides a more succinct syntax. It accepts two template parameters - the type of index to use, and the number of dimensions we want.

The following example creates a three-dimensional span, where the extent of every dimension is dynamically determined at run time.

#include <array>
#include <mdspan>
#include <iostream>

size_t GetDynamicValue(){ return 2; }

int main(){
  std::array Array{
    1, 2, 3, 4,
    5, 6, 7, 8
  };

  using Extents = std::dextents<size_t, 3>;

  std::mdspan<int, Extents> Span{
    Array.data(), GetDynamicValue(),
    GetDynamicValue(), GetDynamicValue()};

  std::cout << "Dimensions: " << Span.rank();
}

Dimensions: 3
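
When the rank and every extent are only known through values like these, the mapping from indices to a position in the underlying array has to work for any number of dimensions. The sketch below shows one way this could be done for the default row-major arrangement. FlatIndexND() is a hypothetical helper written for this illustration, not part of the standard library:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper: row-major flat index for any rank.
// Indices holds one index per dimension, and Shape holds
// the size of each dimension
template <std::size_t Rank>
constexpr std::size_t FlatIndexND(
  const std::array<std::size_t, Rank>& Indices,
  const std::array<std::size_t, Rank>& Shape) {
  std::size_t Result{0};
  for (std::size_t d{0}; d < Rank; ++d) {
    assert(Indices[d] < Shape[d]);
    // Scale what we have so far by the size of this
    // dimension, then add this dimension's index
    Result = Result * Shape[d] + Indices[d];
  }
  return Result;
}
```

For the 2x2x2 span above, the indices 1, 1, 1 map to 1 * 4 + 1 * 2 + 1, which is 7. That is the final element of the 8-element array, so Span[1, 1, 1] refers to the value 8.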

Multidimensional Arrays Beyond C++23 (`std::mdarray`)

The standard library currently only supports multidimensional arrays as a view of another container, such as a std::vector.

However, multidimensional containers that "own" their data are being considered for future versions of the language. One such container, tentatively called std::mdarray, has been proposed for a future standard.

Summary

In this lesson, we delved into the capabilities of std::mdspan in C++23, learning how it enhances our ability to interact with multidimensional arrays more intuitively and efficiently.

Key Takeaways:

std::mdspan in C++23 allows for intuitive and efficient handling of multidimensional arrays, treating them as views over contiguous memory.
The multidimensional [] operator, a recent addition in C++23, enables accessing multidimensional array elements using multiple indices.
The performance benefits and intuitive API of std::mdspan make it a preferred choice over arrays of arrays for representing multidimensional data.
The mdspan template includes parameters for the data type, extents, layout policy, and accessor policy, enhancing its versatility.
std::extents and std::dextents provide mechanisms to define the shape of multidimensional arrays, with support for both static and dynamic extents.
Iteration through std::mdspan is dimensionally aware, allowing for iteration that respects the array's multidimensional nature.
The size() and rank() methods of std::mdspan provide information about the total number of elements and the number of dimensions, respectively.
The extent() method retrieves the size of a specific dimension in an mdspan.
The lesson also touched upon the prospect of std::mdarray in future C++ standards, hinting at continued evolution in multidimensional array handling.
