One of the most common ways to connect storage systems efficiently is the sparse set pattern. Sparse sets are designed for optional one-to-one relationships, where a record in one collection can be linked to zero or one records in another collection.

We cover one-to-many patterns later in the chapter, but for our use case, sparse sets are perfect. They offer $O(1)$ lookup - that is, if we have a proximity component (an index in our ProximityStorage) we can quickly find the entity that "owns" the component (the index in the EntityStorage).

This is the case in both directions - if we have the index in our EntityStorage, we can find if that entity has a related record in ProximityStorage, and retrieve that record if so.

Sparse sets also offer $O(1)$ insertion and deletion. We can create and remove these links in constant time.

The Architecture

A sparse set connects two storage systems together using two arrays:

The Sparse Array, whose size corresponds to how many entities we have. That is, the number of records in our EntityStorage.
The Dense Array, whose size corresponds to how many components we have of a specific type. For example, the number of records in a component pool such as ProximityStorage.

For example, if our world had eight entities (with IDs from 0 to 7) and two proximity components (with IDs 0 and 1), then the sizes of our sparse and dense arrays would be 8 and 2 respectively.

The relationship between an entity and a proximity component is established by the values in the array slots.

For example, if entity 5 owned the proximity component 0, then sparse[5] would contain the value 0, and dense[0] would contain 5, effectively linking those two records together.

The sparse array gets its name from the fact that it typically contains a lot of empty space - gaps that are created every time an entity does not have a corresponding component.

If an entity does not have a proximity component, we place some sentinel value such as -1 in that entity's slot within the sparse array.
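The layout described above can be sketched with two plain vectors. This is a minimal illustration rather than the class built later in the lesson. The second link (entity 2 owning component 1) and the helper names `make_sparse`, `make_dense`, `component_of`, and `owner_of` are assumptions made for the example.

```cpp
#include <vector>

// Sentinel meaning "this entity has no proximity component"
constexpr int null_id = -1;

// Eight entities (IDs 0 to 7). Entity 5 owns component 0 and,
// as an assumed second link, entity 2 owns component 1.
inline std::vector<int> make_sparse() {
  return {null_id, null_id, 1, null_id, null_id, 0, null_id, null_id};
}

// Two components (IDs 0 and 1), each storing its owner's entity ID
inline std::vector<int> make_dense() {
  return {5, 2};
}

// Entity -> component: a single array read
inline int component_of(const std::vector<int>& sparse, int entity_id) {
  return sparse[entity_id];
}

// Component -> entity: also a single array read
inline int owner_of(const std::vector<int>& dense, int component_id) {
  return dense[component_id];
}
```

Both directions are one index operation each, which is where the constant-time lookup comes from.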

We then simply repeat this pattern for every other type of component. Our audio component pool would have its own sparse set. That would be a sparse array representing the same entities, whose values map to indices of a dense array corresponding to components within AudioStorage, or -1 if that entity does not have an audio component.

Implementing the Sparse Set

In our case, we're trying to model a situation where only some entities have proximity components. So, the "sparse" array corresponds to rows in the EntityStorage, whilst the "dense" array will match the row count of the ProximityStorage.

We'll need multiple sparse sets as we add more systems and relationships between them, so we'll create a dedicated type for this. We'll use basic int indices to keep things simple and memory efficient, but this could also be size_t or a template type if we need to support larger collections:

dsa_core/include/dsa/SparseSet.h

#pragma once
#include <cstddef>
#include <vector>

class SparseSet {
public:
  // Maps Entity -> Proximity Component
  std::vector<int> m_sparse;

  // Maps Proximity Component -> Entity
  std::vector<int> m_dense;

  // What value is the "sentinel" in the sparse array?
  static constexpr int null_id = -1;

  // If m_sparse[42] == null_id, then entity 42 does
  // not have a proximity component

  // Does a specific entity have a component?
  bool contains(int entity_id) const {
    // Negative IDs and IDs we have never seen own nothing
    if (entity_id < 0) return false;
    if (static_cast<std::size_t>(entity_id) >= m_sparse.size()) return false;
    return m_sparse[entity_id] != null_id;
  }
};

Handling Insertion

To create a link between an entity and a proximity component, our sparse set needs to update its arrays.

The indices we're being asked to link may be outside the bounds of our arrays. For example, if our sparse array currently has a size of 5 (entities 0, 1, 2, 3, and 4) and a request is made to add a component to entity 7, we need to resize it.

Conceptually, this means that entities 5 and 6 probably exist too - our sparse set just hasn't seen them before because those entities don't have the component type we're tracking.

This means that, in addition to increasing the size of m_sparse to support m_sparse[7], we must also fill any intermediate slots created by that resizing with our sentinel value -1.
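The resizing step on its own can be sketched as follows. The helper name `grow_to_fit` is hypothetical and exists only for this illustration. It relies on the two-argument form of `std::vector::resize`, which copies the given value into every newly created slot and leaves existing elements untouched.

```cpp
#include <vector>

constexpr int null_id = -1;

// Grow `sparse` so that `entity_id` is a valid index, filling every
// newly created slot with the sentinel. Assumes entity_id >= 0.
inline void grow_to_fit(std::vector<int>& sparse, int entity_id) {
  if (entity_id >= static_cast<int>(sparse.size())) {
    sparse.resize(entity_id + 1, null_id);
  }
}
```

Starting from a sparse array of size 5, calling `grow_to_fit(sparse, 7)` leaves it with size 8, and slots 5, 6, and 7 all hold -1 until a link is written into one of them.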

We may want to consider how we should respond if we receive a request that would create gaps in our supposedly "dense" array, too. We could throw an exception or use a result type like std::expected but, to keep our examples focused on the key concepts, we'll just assume our inputs are always valid:

dsa_core/include/dsa/SparseSet.h

#pragma once
#include <cstddef>
#include <vector>

class SparseSet {
public:
  // ...

  // Create a link between the two storage systems
  // Assumes both indices are non-negative
  void insert(int entity_id, int dense_index) {
    if (static_cast<std::size_t>(entity_id) >= m_sparse.size()) {
      // Grow our array, and assign each new index to the sentinel
      m_sparse.resize(entity_id + 1, null_id);
    }
    m_sparse[entity_id] = dense_index;

    if (static_cast<std::size_t>(dense_index) >= m_dense.size()) {
      // Grow our array
      m_dense.resize(dense_index + 1);
    }
    m_dense[dense_index] = entity_id;
  }
};
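If we did want to reject invalid requests rather than assume valid inputs, one option is to throw before modifying anything. The sketch below is a hypothetical free function, `checked_insert`, written against plain vectors so it compiles on its own. The choice of exception types is an assumption, not something the lesson's code uses.

```cpp
#include <stdexcept>
#include <vector>

constexpr int null_id = -1;

// Hypothetical checked variant: links entity_id to dense_index, but
// refuses negative IDs and any request that would leave a gap in dense.
inline void checked_insert(std::vector<int>& sparse, std::vector<int>& dense,
                           int entity_id, int dense_index) {
  if (entity_id < 0 || dense_index < 0) {
    throw std::invalid_argument("IDs must be non-negative");
  }
  // The only valid targets are an existing slot or the next free one
  if (dense_index > static_cast<int>(dense.size())) {
    throw std::out_of_range("insert would leave a gap in the dense array");
  }

  // All checks passed, so it is now safe to modify the arrays
  if (entity_id >= static_cast<int>(sparse.size())) {
    sparse.resize(entity_id + 1, null_id);
  }
  sparse[entity_id] = dense_index;

  if (dense_index == static_cast<int>(dense.size())) {
    dense.push_back(entity_id);
  } else {
    dense[dense_index] = entity_id;
  }
}
```

Because every check runs before the first write, a rejected request leaves both arrays exactly as they were.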

Using the `SparseSet`

Let's use this to let our ProximityStorage keep track of which entities own each of the components.

We'll add a SparseSet for this, and we'll also update our Add() function to receive the entity ID that this new component is for.
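A minimal sketch of what that change might look like is shown here. The member names `m_distance`, `m_angle`, `m_occlusion`, and `m_map` match those used in the Remove() function later in this lesson. The `float` column type, the parameter order, and the returned row index are assumptions. A condensed copy of SparseSet is included only so the sketch compiles on its own.

```cpp
#include <vector>

// Condensed copy of the SparseSet from above; in the project this
// would come from dsa/SparseSet.h instead
class SparseSet {
public:
  std::vector<int> m_sparse;
  std::vector<int> m_dense;
  static constexpr int null_id = -1;

  bool contains(int entity_id) const {
    if (entity_id < 0) return false;
    if (entity_id >= static_cast<int>(m_sparse.size())) return false;
    return m_sparse[entity_id] != null_id;
  }

  void insert(int entity_id, int dense_index) {
    if (entity_id >= static_cast<int>(m_sparse.size())) {
      m_sparse.resize(entity_id + 1, null_id);
    }
    m_sparse[entity_id] = dense_index;
    if (dense_index >= static_cast<int>(m_dense.size())) {
      m_dense.resize(dense_index + 1);
    }
    m_dense[dense_index] = entity_id;
  }
};

class ProximityStorage {
public:
  // One column per field (float is an assumption)
  std::vector<float> m_distance;
  std::vector<float> m_angle;
  std::vector<float> m_occlusion;

  // Links entity IDs to rows in the columns above
  SparseSet m_map;

  // Appends a component for entity_id and returns its row index.
  // Assumes entity_id is valid and does not already own a component.
  int Add(int entity_id, float distance, float angle, float occlusion) {
    int index = static_cast<int>(m_distance.size());
    m_distance.push_back(distance);
    m_angle.push_back(angle);
    m_occlusion.push_back(occlusion);
    m_map.insert(entity_id, index);
    return index;
  }
};
```

Since new components are always appended at the end of each column, the dense index passed to insert() is always the next free slot, so Add() never creates gaps in the dense array.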

Handling Deletions

This is the most error-prone and confusing part of the process, especially if we want to use the swap-and-pop idiom. This requires a carefully coordinated set of steps:

We need to delete a record from the dense array using swap and pop.
This invalidates two indices in the sparse array (the slot of the entity being removed, and the slot of the entity whose component was moved), so those need to be updated.
We need to actually delete the component from our ProximityStorage, again using swap and pop.

Let's start with our dense array:

dsa_core/include/dsa/SparseSet.h

// ...

class SparseSet {
public:
  // ...
  int remove(int entity_id) {
    // Which component needs to be "deleted"?
    int deleted_index = m_sparse[entity_id];
    // Which component will replace it? (the last one)
    int last_entity = m_dense[m_dense.size() - 1];
    // Swap
    m_dense[deleted_index] = last_entity;
    // Pop
    m_dense.pop_back();

    // ...more coming soon

    return deleted_index;
  }
};

Our sparse array is up next. Whichever entity owned the last component needs to be pointed at that component's new index, because we just moved it. The entity whose component was removed also needs its slot reset to the sentinel, so that it no longer appears to own anything:

dsa_core/include/dsa/SparseSet.h

// ...

class SparseSet {
public:
  // ...
  int remove(int entity_id) {
    // Swap and pop the dense array (proximity components)
    int deleted_index = m_sparse[entity_id];
    int last_entity = m_dense[m_dense.size() - 1];
    m_dense[deleted_index] = last_entity;
    m_dense.pop_back();

    // Update the sparse array (entities)
    m_sparse[last_entity] = deleted_index;
    m_sparse[entity_id] = null_id;

    // ...more coming soon
  }
};

These SparseSet updates have just removed the relationship - we still need to remove the component in our ProximityStorage. To help with this, our SparseSet can return the index that it just swapped and popped:

dsa_core/include/dsa/SparseSet.h

// ...

class SparseSet {
public:
  // ...
  int remove(int entity_id) {
    // Swap and pop the dense array (proximity components)
    int deleted_index = m_sparse[entity_id];
    int last_entity = m_dense[m_dense.size() - 1];
    m_dense[deleted_index] = last_entity;
    m_dense.pop_back();

    // Update the sparse array (entities)
    m_sparse[last_entity] = deleted_index;
    m_sparse[entity_id] = null_id;

    return deleted_index;
  }
};
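To see the bookkeeping in action, it can help to trace a removal by hand. The sketch below repeats the same steps as remove() in a hypothetical free function, `remove_link`, that works on plain vectors so it compiles on its own. The sample data in the note that follows is made up for the illustration.

```cpp
#include <vector>

constexpr int null_id = -1;

// Same steps as remove(), written as a free function.
// Returns the dense index that was vacated and refilled.
// Assumes entity_id currently owns a component.
inline int remove_link(std::vector<int>& sparse, std::vector<int>& dense,
                       int entity_id) {
  int deleted_index = sparse[entity_id];
  int last_entity = dense.back();

  dense[deleted_index] = last_entity;   // Swap: last owner fills the gap
  dense.pop_back();                     // Pop: drop the now-duplicate tail

  sparse[last_entity] = deleted_index;  // Moved component has a new index
  sparse[entity_id] = null_id;          // Removed entity owns nothing now

  return deleted_index;
}
```

Suppose sparse is {-1, -1, 1, -1, -1, 0, -1, 2} and dense is {5, 2, 7}. Removing entity 5 returns 0, dense becomes {7, 2}, sparse[7] becomes 0, and sparse[5] becomes -1. The order of the two sparse writes matters: when the entity being removed is also the owner of the last component, the second write correctly overwrites the first with the sentinel.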

Over in our ProximityStorage, we can finally integrate everything. We ask the SparseSet to erase the relationship, and we use its return value to swap and pop the corresponding component:

dsa_core/include/dsa/ProximityStorage.h

// ...

class ProximityStorage {
public:
  // ...
  void Remove(int entity_id) {
    if (!m_map.contains(entity_id)) return;

    // Update the map and get the index of the gap we need to fill
    int index = m_map.remove(entity_id);

    // Perform the Swap-and-Pop on the data
    // We swap the back element into the 'index' slot...
    std::swap(m_distance[index], m_distance.back());
    // ...and then pop the back element
    m_distance.pop_back();

    // Repeat for all columns
    std::swap(m_angle[index], m_angle.back());
    m_angle.pop_back();

    std::swap(m_occlusion[index], m_occlusion.back());
    m_occlusion.pop_back();
  }
};

Bringing It All Together

We now have two storage systems, with some basic bookkeeping that links them together behind the scenes:

dsa_app/main.cpp

#include <dsa/EntityStorage.h>
#include <dsa/ProximityStorage.h>

int main() {
  EntityStorage entities;
  ProximityStorage proximity;

  // Create a player
  int alice = entities.Add("Alice");

  // Player is close
  proximity.Add(alice, 1.0f, 2.0f, 3.0f);

  // Not any more
  proximity.Remove(alice);
}
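Because this bookkeeping is easy to get subtly wrong, a consistency check can be useful in tests or debug builds. The sketch below is a hypothetical helper, `is_consistent`, that is not part of the lesson's code. It verifies that every dense slot and its sparse slot point at each other, and that no entity claims a component the dense array does not hold.

```cpp
#include <cstddef>
#include <vector>

constexpr int null_id = -1;

// Returns true when the two arrays describe the same set of links
inline bool is_consistent(const std::vector<int>& sparse,
                          const std::vector<int>& dense) {
  // Every dense slot must name an entity that points back at it
  for (std::size_t i = 0; i < dense.size(); ++i) {
    int entity_id = dense[i];
    if (entity_id < 0) return false;
    if (static_cast<std::size_t>(entity_id) >= sparse.size()) return false;
    if (sparse[entity_id] != static_cast<int>(i)) return false;
  }

  // No entity may claim a component beyond those in the dense array
  std::size_t linked = 0;
  for (int component_id : sparse) {
    if (component_id != null_id) ++linked;
  }
  return linked == dense.size();
}
```

A check like this costs linear time, so it is better suited to assertions after Add() and Remove() in test code than to the hot path.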

In the next lesson, we'll strengthen these capabilities further, letting us get the data we need in a cohesive and joined-up way, regardless of what underlying storage system it's coming from.

Summary

We have successfully transitioned from a single monolithic storage system to a flexible composition of systems.

SoA provides the fastest possible iteration speed but lacks flexibility for optional data.
Memory Indirection using techniques like vector<unique_ptr> handles optional data but destroys performance via cache misses and branch mispredictions.
Sparse Sets bridge the gap. They allow us to keep data packed in contiguous arrays while providing a mapping between storage systems.
