std::mdspan
std::mdspan
, which allows us to interact with arrays as if they have multiple dimensionsIn the beginner course, we introduced the concept of a multidimensional array (or vector), which we could use to represent more complex data structures.
For example, we could represent a 2D grid using an array of arrays:
#include <vector>
#include <iostream>
int main(){
std::vector<std::vector<int>> Grid{
{
{1, 2, 3},
{4, 5, 6},
{7, 8, 9}
}};
std::cout << "Top Left: " << Grid[0][0];
std::cout << "\nBottom Right: " << Grid[2][2];
}
Top Left: 1
Bottom Right: 9
There are a couple of problems with this approach. Firstly, we lose some capability to work with the collection as a whole. For example, we can no longer simply get the size of the collection (9, in this example) without doing some extra work. We’d need to iterate over every subarray and add all their sizes together.
Secondly, there is some performance overhead involved in using multidimensional arrays in this way. Working with 3 arrays of 3 elements is generally less efficient than working with a single array of 9Â elements.
We could revert back to using a single array, and simply remember that this data structure is supposed to be a 3 by 3Â grid:
#include <vector>
#include <iostream>
int main(){
std::vector<int> Grid{
1, 2, 3,
4, 5, 6,
7, 8, 9
};
std::cout << "Top Left: " << Grid[0];
std::cout << "\nBottom Right: " << Grid[8];
}
Top Left: 1
Bottom Right: 9
But, this then has the opposite problem. We can no longer work with our structure as a series of rows and columns. This is particularly problematic when that work is being done in a separate function, as we then also have to communicate the nature of the array to that function.
For example, a std::array<int, 12>
could be a regular array of 12 integers, but it could also be a two-dimensional array such as a 6x2, 3x4, or 4x3 grid. It could even be a three-dimensional array, such as a 2x2x3Â volume.
None of that is communicated through the type. When the type does not describe the type of data it’s referring to, that’s often a red flag that we have a problem.
These problems only compound when dealing with larger, more realistic examples. In most industries, we’ll often be working with arrays that can have millions of objects across 3 or more dimensions. Some machine-learning applications involve arrays with hundreds of dimensions. So, solving these problems can make our projects much easier to work with.
Multidimensional arrays are sometimes referred to as tensors.
A one-dimensional array is often called a vector, and a two-dimensional array is called a matrix.
The number of dimensions in an array is its rank, order, or degree.
For example, a rank 2 array, 2nd-degree tensor, or matrix all refer to the same thing - an array with two dimensions
The shape of an array is how many values exist in each dimension, often separated by a multiplication symbol. For example, a chess board could be represented as a two-dimensional array with a shape of 8 x 8.
In maths, it’s common to use subscript notation to refer to individual objects within a collection. For example, refers to the element in position of vector .
or refers to the element in row , column of matrix .
Because the []
syntax is also generally used for this same purpose in programming, it is often referred to as the subscript operator.
[]
OperatorNote: The multidimensional []
operator is a recent addition to the language, added in C++23, and is not yet supported by all compilers.
A key component of being able to work with multidimensional arrays is the ability to pass multiple arguments to the []
 operator.
For example, that might look like MyObject[x, y]
. This has become possible since C++23
The following example shows a custom type implementing a basic multidimensional subscript operator:
#include <iostream>
class SomeType {
public:
int operator[](int a, int b){
return a + b;
}
};
int main(){
SomeType SomeObject;
std::cout << "Result: " << SomeObject[2, 3];
}
Result: 5
Prior to C++23, it was possible to implement an API like this, in a slightly contrived way. The comma symbol ,
is also an operator, which our custom types can overload in the usual way.
As such, an expression like A[B, C]
could be supported by overloading the comma operator on B
's type. Then, an expression like A[B, C]
would result in C
being passed as an argument to operator,
on B
. We then take the return value of that operator and pass it to operator[]
of A
.
From C++20, the use of the comma operator within substring expressions was deprecated. In C++23, it was removed entirely, to accommodate the more intuitive multidimensional subscript operator.
Below, we show how we could start building a multidimensional array class using these ideas. In this example, it’s a class that stores a 3x3 matrix of integers by wrapping a one-dimensional array.
The process of mapping the two-dimensional arguments to a single index requires some maths. The offending line is highlighted below, but don’t worry if it doesn’t make sense - we’ll soon switch to using a standard library container to take care of this for us:
#include <array>
#include <iostream>
class Grid {
public:
int operator[](size_t Row, size_t Col){
return Data[Row * 3 + Col % 3];
}
private:
std::array<int, 9> Data{
1, 2, 3,
4, 5, 6,
7, 8, 9
};
};
int main(){
Grid MyGrid;
std::cout
<< "Top Left: " << MyGrid[0, 0]
<< "\nBottom Right: " << MyGrid[2, 2];
}
Top Left: 1
Bottom Right: 9
We could extend this idea, with the help of template parameters, to create a generalized form of multi-dimensional arrays that could be used in our projects. But, there are third-party libraries that we could use that have already done this.
As of C++23, we’re starting to see standard library classes be introduced to solve these problems too, starting with std::mdspan
.
std::mdspan
Note: The standard library’s mdspan
class is a recent addition to the language, added in C++23, and is not yet available in all compilers.
The standard library’s mdspan
is, at its core, similar to the regular span
, which we covered in the previous lesson. It provides a “view” of an underlying array.
It is performant to create and widely compatible with different array types, including C-style arrays, std::array
, and std::vector
The key difference is, as we might expect, the mdspan
is designed to create views that we can interact with as if we were working with a multidimensional array.
It is a template class that has four template parameters - two required, and two optional. We’ll discuss these template parameters in more detail a little later.
For now, let's just use class template argument deduction to show a basic example.
In the following code, we create a std::mdspan
that views our 6-element array as a 2x3Â matrix:
#include <mdspan>
#include <iostream>
int main() {
std::array Array{
1, 2, 3,
4, 5, 6
};
std::mdspan Span{Array.data(), 2, 3};
}
The first argument to the constructor is a pointer to where the source array begins in memory. With a C-style array, that is just the value of the object. For std::array
and std::vector
, we can get this pointer using the data()
 method.
After the pointer, we then have a variable number of additional arguments that represent the shape of our multidimensional view.
A function that accepts a variable number of arguments is called a variadic function. We have a dedicated lesson on these functions, and how to create our own, a little later in the course.
The number of additional arguments represents the number of dimensions our view will have, whilst the individual value of each argument represents the size of that dimension.
Above, we’re creating a two-dimensional view, so we provide two additional arguments. Their values are 2
and 3
, so we’re creating a 2x3 span.
Access to elements is given through the []
operator, passing as many arguments as we have dimensions. In this example, we have two dimensions, so every use of the []
will receive two arguments:
#include <array>
#include <mdspan>
#include <iostream>
int main() {
std::array Array{
1, 2, 3,
4, 5, 6
};
std::mdspan Span{Array.data(), 2, 3};
std::cout << "Top Left: " << Span[0, 0];
std::cout << "\nBottom Right: " << Span[1, 2];
}
Top Left: 1
Bottom Right: 6
Note, that the order of these shape parameters is significant. A 2x3 matrix is not the same as a 3x2 matrix. How we order the dimensions in the constructor corresponds with how we later access elements using the subscript operator []
.
The first argument we pass to []
will map to the first dimension, and the second argument will map to the second dimension.
In the above example, we shaped our span to be 2x3. As such, the first argument to the []
operator only has two valid possibilities - 0
or 1
. Our second dimension has a size of 3
, so there are valid indices for that dimension: 0
, 1
, or 2
.
std::mdspan
Size and RankThe total number of elements the mdspan
is viewing is available through the size()
method, or empty()
if we specifically want to check if the size is 0
.
We can also get the rank of the span (i.e., how many dimensions it has) using the rank()
 method.
#include <array>
#include <mdspan>
#include <iostream>
int main() {
std::array Array{
1, 2, 3,
4, 5, 6
};
std::mdspan Span{Array.data(), 2, 3};
std::cout << "Size: " << Span.size();
std::cout << "\nRank: " << Span.rank();
}
Size: 6
Rank: 2
The extent (or size) of any dimension in a mdspan
is available through the extent()
method, passing in the index of the dimension we want to find. Similar to array indices, counting starts at 0
, so the size of the first dimension is available at extent(0)
.
When we create a 2x3 view, extent(0)
will return 2
, and extent(1)
will return 3
:
#include <array>
#include <mdspan>
#include <iostream>
int main() {
std::array Array{
1, 2, 3,
4, 5, 6,
};
std::mdspan Span{Array.data(), 2, 3};
std::cout << "Rows: " << Span.extent(0);
std::cout << "\nCols: " << Span.extent(1);
}
Rows: 2
Cols: 3
std::mdspan
By using the extent
method, we can set up iterations that navigate through our array in a way that is aware of its dimensionality:
#include <array>
#include <mdspan>
#include <iostream>
int main() {
std::array Array{
1, 2, 3,
4, 5, 6,
};
std::mdspan Span{Array.data(), 2, 3};
std::cout << "Left Column: ";
for (size_t i{0}; i < Span.extent(0); ++i)
std::cout << Span[i, 0] << ", ";
std::cout << "\nTop Row: ";
for (size_t i{0}; i < Span.extent(1); ++i)
std::cout << Span[0, i] << ", ";
std::cout << "\nEverything: ";
for (size_t i{0}; i < Span.extent(0); ++i)
for (size_t j{0}; j < Span.extent(1); ++j)
std::cout << Span[i, j] << ", ";
}
Left Column: 1, 4,
Top Row: 1, 2, 3,
Everything: 1, 2, 3, 4, 5, 6,
std::mdspan
Template ParametersPreviously, we used class template argument deduction (CTAD) to create our std::mdspan
without providing the template parameters.
Let's discuss what those template parameters are now. There are four in total:
std::mdspan
[]
operator get converted to the single, one-dimensional index that is used by the underlying array[]
operator.The first template parameter is pretty simple - it’s just the data type of each element in the view. This will be the same as the data type used by the underlying array. For example, if our array is storing int
s, our span will be viewing int
s too.
The extents argument is what we’ll spend the rest of this lesson covering.
The final two template parameters are optional and will be covered in a future lesson.
std::extents
The std::extents
template class is how we define the shape of our multidimensional spans. The first template parameter allows us to customize what type we want to use as the index of the span. This will be an integer type, such as a size_t
.
We then can provide a variable number of additional parameters. Similar to the arguments passed to the mdspan
constructor, the number of additional parameters we pass will become the number of dimensions in our span. The value of each of the arguments will be the size of that dimension.
Below, we recreate our 2x3 matrix, but this time we explicitly provide the template parameters:
#include <array>
#include <mdspan>
int main() {
std::array Array{
1, 2, 3,
4, 5, 6
};
using Extents = std::extents<size_t, 2, 3>;
std::mdspan<int, Extents> Span{Array.data()};
}
Note, that we are no longer providing the dimension parameters to the mdspan
constructor. They are now instead provided as template parameters to the std::extents
template class at compile time.
If the extent of a dimension is not known at compile time, we can then specify it as a dynamic extent.
Within the template parameters of std::extents
, we can specify dimensions as having a dynamic extent, that is only known at run time. We do this by passing the std::dynamic_extent
token in the appropriate position.
Below, we reference an extent type where the number of rows is known at compile time (2, in this example) but the number of columns is dynamic, as it is not known until run time:
std::extents<size_t, 2, std::dynamic_extent>;
At runtime, we need to provide the value for any dynamic extents by passing arguments to the mdspan
constructor. In the following example, we pass 3
, thereby again recreating our 2x3 matrix, this time by combining static and dynamic extents:
#include <array>
#include <mdspan>
int main(){
std::array Array{
1, 2, 3,
4, 5, 6
};
using Extents = std::extents<
size_t, 2, std::dynamic_extent
>;
std::mdspan<int, Extents> Span{
Array.data(), 3};
}
std::dextents
Where we want the extent of every dimension to be dynamic, the std::dextents
template class provides a more succinct syntax. It accepts two template parameters - the type of index to use, and the number of dimensions we want.
The following example creates a three-dimensional span, where the extent of every dimension is dynamically determined at run time.
#include <array>
#include <mdspan>
#include <iostream>
size_t GetDynamicValue(){ return 2; }
int main(){
std::array Array{
1, 2, 3, 4,
5, 6, 7, 8
};
using Extents = std::dextents<size_t, 3>;
std::mdspan<int, Extents> Span{
Array.data(), GetDynamicValue(),
GetDynamicValue(), GetDynamicValue()};
std::cout << "Dimensions: " << Span.rank();
}
Dimensions: 3
std::mdarray
)The standard library currently only supports multidimensional arrays as a view of another container, such as a std::vector
. However, multidimensional containers that “own” their data are being considered for future versions of the language. A multidimensional form of std::vector
, tentatively called std::mdarray
, is likely coming in C++26.
— First Published
Comprehensive course covering advanced concepts, and how to use them on large-scale projects.