Padding and Alignment
Learn how memory alignment affects data serialization and how to handle it safely
When we write code, we often think about memory as a simple sequence of bytes. However, modern processors work with memory in larger chunks for efficiency. Two key concepts drive this behavior: cache lines and memory pages.
Cache lines, typically 64 bytes, are the smallest unit of data that can be transferred between the CPU cache and main memory. Similarly, memory pages, commonly 4 kilobytes, are the smallest unit of memory managed by the operating system's virtual memory system.
For optimal performance, we typically want our data to be aligned so that a single value rarely crosses one of these boundaries. An example of a boundary cross might be a 4-byte integer whose first two bytes are at the end of one cache line, and whose last two bytes are at the start of the next.
The boundary between our two cache lines might look like the following, where X represents the integer we're interested in, and A and B represent other arbitrary variables:
Line 1 | Line 2
A A X X | X X B B
Most systems handle this scenario gracefully - they perform multiple reads to grab both blocks of memory, then take the appropriate bytes from each and combine them to reconstruct our integer X.
However, this comes at a performance cost. Instead, we want to align our data to maximise the chances that it is stored entirely within the same cache line or page, eliminating the need for this additional processing.
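If we're curious what cache line size our implementation assumes, C++17 added std::hardware_destructive_interference_size in the <new> header. The following is a minimal sketch; not every standard library ships this constant yet, so we guard it with its feature-test macro:

#include <iostream>
#include <new>

int main() {
#ifdef __cpp_lib_hardware_interference_size
  // The implementation's suggested cache line size (commonly 64 bytes)
  std::cout << std::hardware_destructive_interference_size << " bytes";
#else
  std::cout << "hardware_destructive_interference_size not available";
#endif
}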
Aligning data means we simply add additional bytes in strategic positions within the memory layout of our objects. These bytes, which contain no useful data and exist only to push subsequent bytes into later memory addresses, are called padding.
We could align our previous structure by adding 2 bytes of padding after A, thereby pushing X entirely onto the next line, where it can be accessed in a single read operation.
We'll represent padding by underscores, _, and the boundary between our cache lines would now look like this:
Line 1 | Line 2
A A _ _ | X X X X B B
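In practice, we rarely hand-place padding like this. When we do need a specific alignment, standard C++ provides the alignas specifier, which asks the compiler to place every instance of a type on a chosen boundary. Here's a minimal sketch; the CacheAligned name is purely for illustration:

#include <iostream>

// Request that every instance starts on a 64-byte boundary - a typical
// cache line size. The compiler pads the type out to a multiple of this.
struct alignas(64) CacheAligned {
  int X;
};

int main() {
  std::cout << "alignment: " << alignof(CacheAligned)
            << ", size: " << sizeof(CacheAligned) << " bytes";
}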
Alignment and Padding
Let's see an example where our compiler will likely intervene, adding some padding to achieve a specific alignment:
#include <iostream>

struct MyStruct {
  char A; // 1 byte
  int B;  // 4 bytes
};

int main() {
  std::cout << sizeof(MyStruct) << " bytes";
}
Given instances of MyStruct require 1 byte for the char and 4 for the int, we might expect the overall size to be 5 bytes. However, in most scenarios, 3 bytes of padding are added to objects of this type, bringing their total size to 8:
8 bytes
This additional padding is added to ensure the B integer is placed in its natural alignment - that is, a memory address divisible by 4.
As such, we can imagine the memory layout of an instance of MyStruct looking like the following, where we have 1 byte assigned to storing the char called A, followed by 3 bytes of padding, and finally 4 bytes assigned to the int value B:
A _ _ _ B B B B
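We can check where each member actually lands using offsetof from <cstddef>. This is a small sketch; the exact offsets depend on the compiler and target, but on most desktop platforms they match the layout above:

#include <cstddef>
#include <iostream>

struct MyStruct {
  char A; // 1 byte
  int B;  // 4 bytes
};

int main() {
  // Typically prints: A at 0, B at 4, size 8 - confirming the
  // 3 bytes of padding between A and B
  std::cout << "A at " << offsetof(MyStruct, A)
            << ", B at " << offsetof(MyStruct, B)
            << ", size " << sizeof(MyStruct);
}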
Natural alignment refers to placing data at memory addresses that match the size of the data type - 32-bit integers are typically aligned to 4-byte boundaries, 64-bit doubles to 8-byte boundaries, and so on.
This alignment strategy comes from the CPU's memory access patterns: modern processors are designed to read data most efficiently when it's placed at these aligned addresses.
This typically allows them to fetch the entire value in a single operation, rather than performing multiple memory reads and then combining the results to reconstruct the required value.
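These per-type requirements can be queried directly with the alignof operator. A brief sketch; the values in the comment are typical for 64-bit desktop platforms, but they aren't guaranteed by the standard:

#include <iostream>

struct MyStruct {
  char A; // 1 byte
  int B;  // 4 bytes
};

int main() {
  // Commonly prints: char: 1, int: 4, double: 8, MyStruct: 4.
  // A struct adopts the strictest alignment among its members.
  std::cout << "char: " << alignof(char)
            << ", int: " << alignof(int)
            << ", double: " << alignof(double)
            << ", MyStruct: " << alignof(MyStruct);
}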
Let's see another example, where we simply reorder the A and B members within our struct definition:
#include <iostream>

struct MyStruct {
  int B;  // 4 bytes
  char A; // 1 byte
};

int main() {
  std::cout << sizeof(MyStruct) << " bytes";
}
Perhaps surprisingly, the compiler adds 3 bytes of padding here too:
8 bytes
In this case, the padding is added to the end of our memory layout. It looks like this:
B B B B A _ _ _
The primary reason for this padding is to deal with the common scenario where multiple instances of our objects are stored contiguously in memory, such as in a std::vector<MyStruct>.
In that context, the memory layout of two objects in an array looks like this:
B B B B A _ _ _ B B B B A _ _ _
The additional padding was added to maintain alignment in scenarios like this. The B integer in the first object is correctly aligned to byte offset 0, whilst the B in the second object is correctly aligned to byte offset 8, and so on.
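We can observe this spacing directly by measuring the distance between two adjacent elements of an array. A minimal sketch, assuming the same 8-byte layout described above:

#include <iostream>

struct MyStruct {
  int B;  // 4 bytes
  char A; // 1 byte
};

int main() {
  MyStruct Items[2]{};
  // Adjacent elements are always exactly sizeof(MyStruct) bytes apart,
  // so the trailing padding keeps every B member correctly aligned
  auto Spacing{
    reinterpret_cast<const char*>(&Items[1]) -
    reinterpret_cast<const char*>(&Items[0])};
  std::cout << "sizeof(MyStruct): " << sizeof(MyStruct)
            << ", element spacing: " << Spacing << " bytes";
}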
Packing
We can order the members of our type to make more efficient use of memory. That is, to reduce the amount of padding the compiler requires to maintain alignment.
For example, let's consider the following struct:
#include <iostream>

struct MyStruct {
  char A; // 1 byte
  int B;  // 4 bytes
  char C; // 1 byte
};

int main() {
  std::cout << sizeof(MyStruct) << " bytes";
}
Objects of this type only contain 6 bytes of useful data. However, to correctly align the integer B (including for the array context), 6 additional bytes of padding are required, taking its size to 12:
12 bytes
The memory layout of an instance of this struct looks like this:
A _ _ _ B B B B C _ _ _
By reordering our members, we can pack memory more efficiently. The following version of MyStruct contains all the same data, but only requires 8 bytes of storage:
#include <iostream>

struct MyStruct {
  int B;  // 4 bytes
  char A; // 1 byte
  char C; // 1 byte
};

int main() {
  std::cout << sizeof(MyStruct) << " bytes";
}
8 bytes
This is more efficient because only 2 bytes of padding are required to align the integer for use in arrays:
B B B B A C _ _
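If we need to eliminate padding entirely - for example, to match an externally defined binary format - most major compilers (MSVC, GCC, Clang) support the non-standard #pragma pack directive. This is a sketch rather than a recommendation: unaligned members can be slower to access, and on some architectures unaligned access isn't safe at all. The PackedStruct name is just for illustration:

#include <iostream>

// Non-standard, but widely supported: force 1-byte packing so the
// compiler inserts no padding between members
#pragma pack(push, 1)
struct PackedStruct {
  char A; // 1 byte
  int B;  // 4 bytes
  char C; // 1 byte
};
#pragma pack(pop)

int main() {
  // Typically prints 6 bytes - the raw data with no padding at all
  std::cout << sizeof(PackedStruct) << " bytes";
}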
Serializing with Padding
As we might expect, these padding and alignment behaviors have implications when it comes to serializing our objects. If we're not mindful that these "gaps" exist between our variables, our serialization and deserialization code can contain serious bugs and result in data loss.
Below, we attempt to serialize MyStruct without being aware that padding is added between A and B. We assume, therefore, that writing 5 bytes will capture all of the data:
#include <SDL.h>
#include <iostream>

struct MyStruct {
  char A;
  int B;
};

int main(int argc, char** argv) {
  SDL_RWops* rw{
    SDL_RWFromFile("example.bin", "wb")};
  if (!rw) {
    std::cerr << "Failed to open file: "
              << SDL_GetError();
    return 1;
  }

  MyStruct Serialized{'A', 42};

  // Assume MyStruct is 5 bytes
  SDL_RWwrite(rw, &Serialized, 1, 5);
  SDL_RWclose(rw);

  std::cout << "Serialized: A = "
            << Serialized.A
            << ", B = " << Serialized.B;
  return 0;
}
Serialized: A = A, B = 42
If we later read this file using the same assumptions, we'll see our B integer doesn't have the correct value:
#include <SDL.h>
#include <iostream>

struct MyStruct {
  char A;
  int B;
};

int main(int argc, char** argv) {
  SDL_RWops* rw{
    SDL_RWFromFile("example.bin", "rb")};
  if (!rw) {
    std::cerr << "Failed to open file: "
              << SDL_GetError();
    return 1;
  }

  MyStruct Deserialized;
  SDL_RWread(rw, &Deserialized, 1, 5);
  SDL_RWclose(rw);

  std::cout << "Deserialized: A = "
            << Deserialized.A
            << ", B = " << Deserialized.B;
  return 0;
}
Deserialized: A = A, B = -859045846
To solve this problem, we need to approach alignment of class and struct instances differently.
Adding Save and Load Methods
The standard way to serialize and deserialize objects while respecting alignment across a variety of platforms is to handle their data members as individual values.
Rather than serializing a MyStruct object in a single operation, we'd serialize each of its variables individually. In large programs, this is typically done by adding dedicated serialization and deserialization methods to our class or struct:
// MyStruct.h
#pragma once
#include <iostream>
#include <string>
#include <SDL.h>

class MyStruct {
public:
  char A;
  int B;

  void Save(const std::string& path) const {
    SDL_RWops* Handle{SDL_RWFromFile(
      path.c_str(), "wb")};
    if (!Handle) {
      std::cout << "Error opening file: "
                << SDL_GetError();
      return;
    }
    SDL_RWwrite(Handle, &A, sizeof(char), 1);
    SDL_RWwrite(Handle, &B, sizeof(int), 1);
    SDL_RWclose(Handle);
  }

  void Load(const std::string& path) {
    SDL_RWops* Handle{SDL_RWFromFile(
      path.c_str(), "rb")};
    if (!Handle) {
      std::cout << "Error opening file: "
                << SDL_GetError();
      return;
    }
    SDL_RWread(Handle, &A, sizeof(char), 1);
    SDL_RWread(Handle, &B, sizeof(int), 1);
    SDL_RWclose(Handle);
  }
};
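In a larger program, we'd likely also want to confirm each write succeeded. SDL_RWwrite returns the number of objects it transferred, so we can compare that against what we asked for. The following is a hedged sketch, assuming the MyStruct definition above; SaveChecked is a hypothetical free-function variant of Save():

// SaveChecked.h (hypothetical helper, not part of the lesson's MyStruct)
#pragma once
#include <iostream>
#include <string>
#include <SDL.h>
#include "MyStruct.h"

inline void SaveChecked(const MyStruct& Object,
                        const std::string& path) {
  SDL_RWops* Handle{SDL_RWFromFile(path.c_str(), "wb")};
  if (!Handle) {
    std::cout << "Error opening file: " << SDL_GetError();
    return;
  }

  // SDL_RWwrite returns how many objects it wrote; anything other
  // than 1 per call here means the write failed part-way
  bool Success{
    SDL_RWwrite(Handle, &Object.A, sizeof(char), 1) == 1 &&
    SDL_RWwrite(Handle, &Object.B, sizeof(int), 1) == 1};

  if (!Success) {
    std::cout << "Error writing file: " << SDL_GetError();
  }
  SDL_RWclose(Handle);
}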
Elsewhere in our program, we can now instruct MyStruct instances to save their state to a file using the Save() method, or load their state from a file using the Load() method:
#include <iostream>
#include "MyStruct.h"

int main(int argc, char** argv) {
  MyStruct MyObject{'A', 42};
  MyObject.Save("example.bin");
  std::cout << "Serialized: A = "
            << MyObject.A << ", B = " << MyObject.B;

  MyObject.Load("example.bin");
  std::cout << "\nDeserialized: A = "
            << MyObject.A << ", B = " << MyObject.B;
  return 0;
}
Serialized: A = A, B = 42
Deserialized: A = A, B = 42
Summary
In this lesson, we've seen how memory alignment affects our C++ programs and why padding is necessary. Understanding these concepts helps us write more efficient code and avoid common pitfalls when working with data serialization.
Key takeaways:
- Alignment requirements come from hardware design
- Padding maintains proper alignment
- Structure layout affects memory usage
- Careful serialization is essential
- Tools exist for custom alignment control