Multidimensional Arrays and std::mdspan

A guide to std::mdspan, which allows us to interact with arrays as if they have multiple dimensions
This lesson is part of the course:

Professional C++

Comprehensive course covering advanced concepts, and how to use them on large-scale projects.

Character Concept Art
Ryan McCombe
Ryan McCombe
Posted

In the beginner course, we introduced the concept of a multidimensional array (or vector), which we could use to represent more complex data structures.

For example, we could represent a 2D grid using an array of arrays:

#include <vector>
#include <iostream>

int main(){
  std::vector<std::vector<int>> Grid{
    {
      {1, 2, 3},
      {4, 5, 6},
      {7, 8, 9}
    }};

  std::cout << "Top Left: " << Grid[0][0];
  std::cout << "\nBottom Right: " << Grid[2][2];
}
Top Left: 1
Bottom Right: 9

There are a couple of problems with this approach. Firstly, we lose some capability to work with the collection as a whole. For example, we can no longer simply get the size of the collection (9, in this example) without doing some extra work. We’d need to iterate over every subarray and add all their sizes together.

Secondly, there is some performance overhead involved in using multidimensional arrays in this way. Working with 3 arrays of 3 elements is generally less efficient than working with a single array of 9 elements.

We could revert back to using a single array, and simply remember that this data structure is supposed to be a 3 by 3 grid:

#include <vector>
#include <iostream>

int main(){
  std::vector<int> Grid{
    1, 2, 3,
    4, 5, 6,
    7, 8, 9
  };

  std::cout << "Top Left: " << Grid[0];
  std::cout << "\nBottom Right: " << Grid[8];
}
Top Left: 1
Bottom Right: 9

But, this then has the opposite problem. We can no longer work with our structure as a series of rows and columns. This is particularly problematic when that work is being done in a separate function, as we then also have to communicate the nature of the array to that function.

For example, a std::array<int, 12> could be a regular array of 12 integers, but it could also be a two-dimensional array such as a 6x2, 3x4, or 4x3 grid. It could even be a three-dimensional array, such as a 2x2x3 volume.

None of that is communicated through the type. When the type does not describe the type of data it’s referring to, that’s often a red flag that we have a problem.

These problems only compound when dealing with larger, more realistic examples. In most industries, we’ll often be working with arrays that can have millions of objects across 3 or more dimensions. Some machine-learning applications involve arrays with hundreds of dimensions. So, solving these problems can make our projects much easier to work with.

Terminology

Multidimensional arrays are sometimes referred to as tensors.

A one-dimensional array is often called a vector, and a two-dimensional array is called a matrix.

The number of dimensions in an array is its rank, order, or degree.

For example, a rank 2 array, 2nd-degree tensor, or matrix all refer to the same thing - an array with two dimensions

The shape of an array is how many values exist in each dimension, often separated by a multiplication symbol. For example, a chess board could be represented as a two-dimensional array with a shape of 8 x 8.

In maths, it’s common to use subscript notation to refer to individual objects within a collection. For example, AxA_{x} refers to the element in position xx of vector AA.

BxyB_{xy} or Bx,yB_{x,y} refers to the element in row xx, column yy of matrix BB.

Because the [] syntax is also generally used for this same purpose in programming, it is often referred to as the subscript operator.

Multidimensional [] Operator

Note: The multidimensional [] operator is a recent addition to the language, added in C++23, and is not yet supported by all compilers.

A key component of being able to work with multidimensional arrays is the ability to pass multiple arguments to the [] operator.

For example, that might look like MyObject[x, y]. This has become possible since C++23

The following example shows a custom type implementing a basic multidimensional subscript operator:

#include <iostream>

class SomeType {
public:
  int operator[](int a, int b){
    return a + b;
  }
};

int main(){
  SomeType SomeObject;
  std::cout << "Result: " << SomeObject[2, 3];
}
Result: 5

Before C++23: Comma Operator in Subscript Expressions

Prior to C++23, it was possible to implement an API like this, in a slightly contrived way. The comma symbol , is also an operator, which our custom types can overload in the usual way.

As such, an expression like A[B, C] could be supported by overloading the comma operator on B's type. Then, an expression like A[B, C] would result in C being passed as an argument to operator, on B. We then take the return value of that operator and pass it to operator[] of A.

From C++20, the use of the comma operator within substring expressions was deprecated. In C++23, it was removed entirely, to accommodate the more intuitive multidimensional subscript operator.

Below, we show how we could start building a multidimensional array class using these ideas. In this example, it’s a class that stores a 3x3 matrix of integers by wrapping a one-dimensional array.

The process of mapping the two-dimensional arguments to a single index requires some maths. The offending line is highlighted below, but don’t worry if it doesn’t make sense - we’ll soon switch to using a standard library container to take care of this for us:

#include <array>
#include <iostream>

class Grid {
public:
  int operator[](size_t Row, size_t Col){
    return Data[Row * 3 + Col % 3];
  }

private:
  std::array<int, 9> Data{
    1, 2, 3,
    4, 5, 6,
    7, 8, 9
  };
};

int main(){
  Grid MyGrid;

  std::cout
    << "Top Left: " << MyGrid[0, 0]
    << "\nBottom Right: " << MyGrid[2, 2];
}
Top Left: 1
Bottom Right: 9

We could extend this idea, with the help of template parameters, to create a generalized form of multi-dimensional arrays that could be used in our projects. But, there are third-party libraries that we could use that have already done this.

As of C++23, we’re starting to see standard library classes be introduced to solve these problems too, starting with std::mdspan.

std::mdspan

Note: The standard library’s mdspan class is a recent addition to the language, added in C++23, and is not yet available in all compilers.

The standard library’s mdspan is, at its core, similar to the regular span, which we covered in the previous lesson. It provides a “view” of an underlying array.

It is performant to create and widely compatible with different array types, including C-style arrays, std::array, and std::vector

The key difference is, as we might expect, the mdspan is designed to create views that we can interact with as if we were working with a multidimensional array.

It is a template class that has four template parameters - two required, and two optional. We’ll discuss these template parameters in more detail a little later.

For now, let's just use class template argument deduction to show a basic example.

In the following code, we create a std::mdspan that views our 6-element array as a 2x3 matrix:

#include <mdspan>
#include <iostream>

int main() {
  std::array Array{
    1, 2, 3,
    4, 5, 6
  };

  std::mdspan Span{Array.data(), 2, 3};
}

The first argument to the constructor is a pointer to where the source array begins in memory. With a C-style array, that is just the value of the object. For std::array and std::vector, we can get this pointer using the data() method.

After the pointer, we then have a variable number of additional arguments that represent the shape of our multidimensional view.

Variadic Functions

A function that accepts a variable number of arguments is called a variadic function. We have a dedicated lesson on these functions, and how to create our own, a little later in the course.

The number of additional arguments represents the number of dimensions our view will have, whilst the individual value of each argument represents the size of that dimension.

Above, we’re creating a two-dimensional view, so we provide two additional arguments. Their values are 2 and 3, so we’re creating a 2x3 span.

Access to elements is given through the [] operator, passing as many arguments as we have dimensions. In this example, we have two dimensions, so every use of the [] will receive two arguments:

#include <array>
#include <mdspan>
#include <iostream>

int main() {
  std::array Array{
    1, 2, 3,
    4, 5, 6
  };

  std::mdspan Span{Array.data(), 2, 3};

  std::cout << "Top Left: " << Span[0, 0];
  std::cout << "\nBottom Right: " << Span[1, 2];
}
Top Left: 1
Bottom Right: 6

Shape argument order

Note, that the order of these shape parameters is significant. A 2x3 matrix is not the same as a 3x2 matrix. How we order the dimensions in the constructor corresponds with how we later access elements using the subscript operator [].

The first argument we pass to [] will map to the first dimension, and the second argument will map to the second dimension.

In the above example, we shaped our span to be 2x3. As such, the first argument to the [] operator only has two valid possibilities - 0 or 1. Our second dimension has a size of 3, so there are valid indices for that dimension: 0, 1, or 2.

std::mdspan Size and Rank

The total number of elements the mdspan is viewing is available through the size() method, or empty() if we specifically want to check if the size is 0.

We can also get the rank of the span (i.e., how many dimensions it has) using the rank() method.

#include <array>
#include <mdspan>
#include <iostream>

int main() {
  std::array Array{
    1, 2, 3,
    4, 5, 6
  };

  std::mdspan Span{Array.data(), 2, 3};

  std::cout << "Size: " << Span.size();
  std::cout << "\nRank: " << Span.rank();
}
Size: 6
Rank: 2

Getting the Extent of a Dimension

The extent (or size) of any dimension in a mdspan is available through the extent() method, passing in the index of the dimension we want to find. Similar to array indices, counting starts at 0, so the size of the first dimension is available at extent(0).

When we create a 2x3 view, extent(0) will return 2, and extent(1) will return 3:

#include <array>
#include <mdspan>
#include <iostream>

int main() {
  std::array Array{
    1, 2, 3,
    4, 5, 6,
  };

  std::mdspan Span{Array.data(), 2, 3};

  std::cout << "Rows: " << Span.extent(0);
  std::cout << "\nCols: " << Span.extent(1);
}
Rows: 2
Cols: 3

Iterating through std::mdspan

By using the extent method, we can set up iterations that navigate through our array in a way that is aware of its dimensionality:

#include <array>
#include <mdspan>
#include <iostream>

int main() {
  std::array Array{
    1, 2, 3,
    4, 5, 6,
  };

  std::mdspan Span{Array.data(), 2, 3};

std::cout << "Left Column: ";
for (size_t i{0}; i < Span.extent(0); ++i)
  std::cout << Span[i, 0] << ", ";

std::cout << "\nTop Row: ";
for (size_t i{0}; i < Span.extent(1); ++i)
  std::cout << Span[0, i] << ", ";

std::cout << "\nEverything: ";
  for (size_t i{0}; i < Span.extent(0); ++i)
    for (size_t j{0}; j < Span.extent(1); ++j)
      std::cout << Span[i, j] << ", ";
}
Left Column: 1, 4, 
Top Row: 1, 2, 3, 
Everything: 1, 2, 3, 4, 5, 6,

std::mdspan Template Parameters

Previously, we used class template argument deduction (CTAD) to create our std::mdspan without providing the template parameters.

Let's discuss what those template parameters are now. There are four in total:

  1. The type of data in the array
  2. An extents type, documenting the shape of the std::mdspan
  3. An optional layout policy, which allows us to customize the process whereby the multiple indices passed to the span’s [] operator get converted to the single, one-dimensional index that is used by the underlying array
  4. An optional accessor policy, which allows us to customize how the single index returned from the LayoutPolicy is used to generate the return value of the span’s [] operator.

The first template parameter is pretty simple - it’s just the data type of each element in the view. This will be the same as the data type used by the underlying array. For example, if our array is storing ints, our span will be viewing ints too.

The extents argument is what we’ll spend the rest of this lesson covering.

The final two template parameters are optional and will be covered in a future lesson.

std::extents

The std::extents template class is how we define the shape of our multidimensional spans. The first template parameter allows us to customize what type we want to use as the index of the span. This will be an integer type, such as a size_t.

We then can provide a variable number of additional parameters. Similar to the arguments passed to the mdspan constructor, the number of additional parameters we pass will become the number of dimensions in our span. The value of each of the arguments will be the size of that dimension.

Below, we recreate our 2x3 matrix, but this time we explicitly provide the template parameters:

#include <array>
#include <mdspan>

int main() {
  std::array Array{
    1, 2, 3,
    4, 5, 6
  };

  using Extents = std::extents<size_t, 2, 3>;

  std::mdspan<int, Extents> Span{Array.data()};
}

Note, that we are no longer providing the dimension parameters to the mdspan constructor. They are now instead provided as template parameters to the std::extents template class at compile time.

If the extent of a dimension is not known at compile time, we can then specify it as a dynamic extent.

Dynamic Extents

Within the template parameters of std::extents, we can specify dimensions as having a dynamic extent, that is only known at run time. We do this by passing the std::dynamic_extent token in the appropriate position.

Below, we reference an extent type where the number of rows is known at compile time (2, in this example) but the number of columns is dynamic, as it is not known until run time:

std::extents<size_t, 2, std::dynamic_extent>;

At runtime, we need to provide the value for any dynamic extents by passing arguments to the mdspan constructor. In the following example, we pass 3, thereby again recreating our 2x3 matrix, this time by combining static and dynamic extents:

#include <array>
#include <mdspan>

int main(){
  std::array Array{
    1, 2, 3,
    4, 5, 6
  };

  using Extents = std::extents<
    size_t, 2, std::dynamic_extent
  >;

  std::mdspan<int, Extents> Span{
    Array.data(), 3};
}

std::dextents

Where we want the extent of every dimension to be dynamic, the std::dextents template class provides a more succinct syntax. It accepts two template parameters - the type of index to use, and the number of dimensions we want.

The following example creates a three-dimensional span, where the extent of every dimension is dynamically determined at run time.

#include <array>
#include <mdspan>
#include <iostream>

size_t GetDynamicValue(){ return 2; }

int main(){
  std::array Array{
    1, 2, 3, 4,
    5, 6, 7, 8
  };

  using Extents = std::dextents<size_t, 3>;

  std::mdspan<int, Extents> Span{
    Array.data(), GetDynamicValue(),
    GetDynamicValue(), GetDynamicValue()};

  std::cout << "Dimensions: " << Span.rank();
}
Dimensions: 3

Multidimensional Arrays in C++26 (std::mdarray)

The standard library currently only supports multidimensional arrays as a view of another container, such as a std::vector. However, multidimensional containers that “own” their data are being considered for future versions of the language. A multidimensional form of std::vector, tentatively called std::mdarray, is likely coming in C++26.

Was this lesson useful?

Edit History

  • — First Published

Ryan McCombe
Ryan McCombe
Posted
This lesson is part of the course:

Professional C++

Comprehensive course covering advanced concepts, and how to use them on large-scale projects.

Arrays and Linked Lists
7a.jpg
This lesson is part of the course:

Professional C++

Comprehensive course covering advanced concepts, and how to use them on large-scale projects.

Free, unlimited access!

This course includes:

  • 106 Lessons
  • 550+ Code Samples
  • 96% Positive Reviews
  • Regularly Updated
  • Help and FAQ
Next Lesson

Linked Lists vs Arrays

An introduction to linked lists, why we would want to use them, and how they compare to arrays
27.jpg
Contact|Privacy Policy|Terms of Use
Copyright © 2023 - All Rights Reserved