Previously, we've focused on serializing single bytes of data at a time - usually the 8-bit char type. However, when we start to serialize multi-byte objects, such as int and float values, things can get slightly more complex.

We need to pay careful attention to the order of those bytes, especially when working across different systems.

In this lesson, we'll explore how computers store multi-byte values, understand the challenges of different byte orderings, and learn practical techniques for handling these differences using SDL's binary manipulation functions.

Digit Order

Let's first consider how we represent numbers in plain English. Numbers larger than 9 are represented by multiple digits - e.g. 45 requires two digits (4 and 5), whilst 162 requires three (1, 6, and 2).

There are two things to note here that we may take for granted, but we should be conscious of as we build out these concepts:

The order of the digits is important - 45 and 54 are not the same.
We order our digits such that more significant digits come earlier.

For example, in a number like 45, the first digit (4) is more significant than the second digit (5). It is more significant because, if we increment the first digit, we increase the value of our number by 10 (from 45 to 55), but incrementing the second digit only increases the value of the overall number by 1 (from 45 to 46).

This pattern continues for numbers of any length. In a number like 162, the 1 is more significant than the 6, and the 6 is more significant than the 2.
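Written out as a sum of place values, the significance of each digit becomes explicit:

$$162 = 1 \times 10^2 + 6 \times 10^1 + 2 \times 10^0$$

Incrementing the leading 1 changes the total by 100, while incrementing the trailing 2 changes it by just 1.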

Byte Order

These exact same considerations apply when dealing with numeric types comprising multiple bytes, such as 16, 32 and 64 bit integers, which have 2, 4 and 8 bytes respectively.

But unfortunately, unlike our familiar English numeric system, there's no agreed standard here. Some systems store the most significant bytes first, while others use different orderings.

When working in a low-level context, we often also need to deal with multiple byte-order conventions within the same system, and convert data from one to the other as needed. For example, data sent over a network is commonly transferred in most-significant-byte-first order, but many CPU architectures use the exact opposite ordering.

Endianness

The way in which a system orders its bytes is referred to as its endianness. Most fall into one of two categories:

Big-endian systems place the more significant bytes before less significant bytes. Therefore, the way we represent numbers in English is similar to big-endian systems.
Little-endian systems position less significant bytes earlier - that is, at smaller memory addresses.

There are other, less popular possibilities with names like bi-endian, middle-endian and mixed-endian. However, big and little endian are the most common, and they're what we'll focus on for now.
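To make the two orderings concrete, here is a minimal sketch in standard C++ that needs no SDL. The helpers BigEndianOrder() and LittleEndianOrder() are hypothetical names invented for this illustration. They split a 16-bit value using bit shifts, so they should produce the same result whatever the endianness of the machine running them:

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: most significant byte at index 0
std::array<std::uint8_t, 2> BigEndianOrder(std::uint16_t Value) {
  return {
    static_cast<std::uint8_t>(Value >> 8),
    static_cast<std::uint8_t>(Value & 0xFF)
  };
}

// Hypothetical helper: least significant byte at index 0
std::array<std::uint8_t, 2> LittleEndianOrder(std::uint16_t Value) {
  return {
    static_cast<std::uint8_t>(Value & 0xFF),
    static_cast<std::uint8_t>(Value >> 8)
  };
}
```

For the value 0x1234, BigEndianOrder() yields the bytes 12 34, while LittleEndianOrder() yields 34 12.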

Why Endianness Matters

When we read the binary representation of a multi-byte number, such as a Uint32, it's important that we understand and react appropriately to how those bytes are ordered.

For example, if data is serialized in a little-endian format, and then some other system deserializes that data with the assumption that it is big-endian, the deserialized values will not match the originals.

The following program shows the implications this has. To simulate the endianness mismatch, we'll use the SDL_Swap32() function, which reverses the order of the 4 bytes in a 32-bit block of memory:

src/main.cpp

#include <iostream>
#include <SDL3/SDL.h>
#include <SDL3/SDL_main.h>

int main(int, char**) {
  Uint32 Serialized{42};
  std::cout << "Serialized: " << Serialized;

  // Reverse the 4 bytes to simulate an endianness mismatch
  Uint32 Deserialized{SDL_Swap32(Serialized)};
  std::cout << "\nDeserialized: "
    << Deserialized;

  return 0;
}

Serialized: 42
Deserialized: 704643072
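To see where that second number comes from, the following is a hand-written sketch of a 32-bit byte swap using shifts and masks. SwapBytes32() is a hypothetical name used only for illustration; it shows the general technique and is not a claim about how SDL implements SDL_Swap32() internally:

```cpp
#include <cstdint>

// Hypothetical stand-in for a 32-bit byte swap.
// Each mask isolates one byte, and each shift moves
// that byte to the mirrored position.
std::uint32_t SwapBytes32(std::uint32_t Value) {
  return ((Value & 0x000000FFu) << 24)
       | ((Value & 0x0000FF00u) << 8)
       | ((Value & 0x00FF0000u) >> 8)
       | ((Value & 0xFF000000u) >> 24);
}
```

Passing 42 (0x0000002A) returns 0x2A000000, which is 42 × 2^24 = 704643072. Swapping twice returns the original value.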

Checking Endianness

If we need to check the endianness of the platform we're compiling for, SDL provides the SDL_BYTEORDER preprocessor definition.

We can compare this to the SDL_BIG_ENDIAN or SDL_LIL_ENDIAN definitions to understand our environment.

The following example also includes some code to log out how the 16-bit value 0x1234 is stored in memory using reinterpret_cast, which we explain in more detail later in the lesson:

src/main.cpp

#include <cstddef> // for std::byte
#include <iostream>
#include <iomanip>
#include <SDL3/SDL.h>
#include <SDL3/SDL_main.h>

int main(int, char**) {
  std::cout << "System Endianness:\n";

#if SDL_BYTEORDER == SDL_BIG_ENDIAN
  std::cout << "Big Endian (most significant "
    "byte first)";
#elif SDL_BYTEORDER == SDL_LIL_ENDIAN
  std::cout << "Little Endian (least "
    "significant byte first)";
#else
  std::cout << "Unknown byte order";
#endif

  // View the two bytes of the value individually
  Uint16 value{0x1234};
  auto* bytes{
    reinterpret_cast<std::byte*>(&value)
  };

  std::cout << "\nValue 0x1234 is stored as: "
    << std::hex << static_cast<int>(bytes[0])
    << " " << static_cast<int>(bytes[1]);

  return 0;
}

System Endianness:
Little Endian (least significant byte first)
Value 0x1234 is stored as: 34 12
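SDL_BYTEORDER is resolved at compile time. For comparison, here is a sketch of a check that runs at run time using only the standard library. IsLittleEndianHost() is a hypothetical helper, and it assumes the host is either big-endian or little-endian, ignoring the rarer mixed orderings:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns true if the host stores the
// least significant byte of an integer at the lowest address.
// Assumes the host is either big-endian or little-endian.
bool IsLittleEndianHost() {
  const std::uint16_t Value{0x1234};
  unsigned char FirstByte{};
  // Copy only the byte at the lowest address
  std::memcpy(&FirstByte, &Value, 1);
  return FirstByte == 0x34;
}
```

On a machine that prints 34 12 in the example above, this function would return true.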

Serializing with Endianness

To control the endianness of multi-byte values when serializing, SDL provides a series of helpful functions. For example, to write binary data in the little-endian format, we can use one of 3 functions depending on the memory size of the value:

SDL_WriteU16LE(): Writes 16 bits (2 bytes) of data in the little-endian order
SDL_WriteU32LE(): Writes 32 bits (4 bytes) of data in the little-endian order
SDL_WriteU64LE(): Writes 64 bits (8 bytes) of data in the little-endian order

These functions will convert the data if needed. If our system is little-endian, our memory will be written as-is. If our system is big-endian, SDL will write the bytes in the opposite order. Either way, it guarantees that our output is in the little-endian order.

Here's an example in code:

src/main.cpp

#include <iostream>
#include <SDL3/SDL.h>
#include <SDL3/SDL_main.h>
#include <SDL3/SDL_iostream.h>

int main(int, char**) {
  SDL_Init(0);
  SDL_IOStream* Handle{
    SDL_IOFromFile("data.bin", "wb")};

  if (!Handle) {
    std::cout << "Error opening file: "
      << SDL_GetError();
    // Nothing to write to, so stop here
    return 1;
  }

  // Written as 4 bytes, least significant byte first
  Uint32 Content{42};
  SDL_WriteU32LE(Handle, Content);

  SDL_CloseIO(Handle);
  return 0;
}

Big-endian variations of these functions are also available - SDL_WriteU16BE(), SDL_WriteU32BE(), and SDL_WriteU64BE().
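The idea behind these functions can be sketched without SDL by writing into an in-memory buffer instead of a file. AppendU32LE() and AppendU32BE() below are hypothetical helpers for illustration only. Because they extract bytes with shifts rather than copying memory directly, the bytes they emit should not depend on the host's endianness:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: appends Value to Buffer,
// least significant byte first
void AppendU32LE(
  std::vector<std::uint8_t>& Buffer, std::uint32_t Value
) {
  for (int Shift{0}; Shift <= 24; Shift += 8) {
    Buffer.push_back(
      static_cast<std::uint8_t>(Value >> Shift));
  }
}

// Hypothetical helper: appends Value to Buffer,
// most significant byte first
void AppendU32BE(
  std::vector<std::uint8_t>& Buffer, std::uint32_t Value
) {
  for (int Shift{24}; Shift >= 0; Shift -= 8) {
    Buffer.push_back(
      static_cast<std::uint8_t>(Value >> Shift));
  }
}
```

Appending 42 with AppendU32LE() adds the bytes 2a 00 00 00, while AppendU32BE() adds 00 00 00 2a.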

Analysing Binary Output

Binary data isn't inherently designed to be read by humans - even opening a binary file usually requires specialist tools rather than a standard text editor.

However, if we really need to analyse binary data, we still can. The tool we need is commonly called a hex editor. Our IDE is likely to include a hex editor or have one available as a plugin. We can alternatively use a standalone tool or a website such as hexed.it.

If we open our previous output representing the number 42 in a hex editor, we should see our 4 bytes of binary data represented in hexadecimal as 2a 00 00 00.

42 is a relatively small number in the range of what can be stored in a 4-byte integer. As such, its value can be represented entirely in the least significant byte and, because we wrote this data in the little-endian order, the least significant byte comes first.

Converting the hex value 2a to decimal should confirm that our number, 42, was accurately serialized. The $2$ represents $2 \times 16 = 32$ and the $a$ represents $10$, so $2a_{16} = 32 + 10 = 42$.
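If a hex editor isn't to hand, a few lines of standard C++ can produce a similar view. The ToHexString() function below is a hypothetical helper, sketched here to show the formatting a hex editor typically applies to each byte:

```cpp
#include <cstddef>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: formats bytes as space-separated,
// two-digit lowercase hexadecimal values
std::string ToHexString(
  const std::vector<std::uint8_t>& Bytes
) {
  std::ostringstream Stream;
  Stream << std::hex << std::setfill('0');
  for (std::size_t i{0}; i < Bytes.size(); ++i) {
    if (i > 0) Stream << ' ';
    // setw only affects the next output, so we
    // apply it for every byte
    Stream << std::setw(2) << static_cast<int>(Bytes[i]);
  }
  return Stream.str();
}
```

Given the four bytes of our file, this returns the string "2a 00 00 00".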

Error Handling

SDL's endianness-sensitive write functions like SDL_WriteU32LE() return a bool: true if the write succeeded, and false if it failed. We can use this to check if the write was successful, and react accordingly. After a failure, we can call SDL_GetError() for more information.

Below, we attempt to write to a file that we opened only for reading using the rb flag:

src/main.cpp

#include <iostream>
#include <SDL3/SDL.h>
#include <SDL3/SDL_main.h>
#include <SDL3/SDL_iostream.h>

int main(int, char**) {
  SDL_Init(0);
  SDL_IOStream* Handle{
    // We won't be able to write to this
    SDL_IOFromFile("data.bin", "rb")
  };

  if (!Handle) {
    std::cout << "Error opening file: "
      << SDL_GetError();
    // The file could not be opened at all
    return 1;
  }

  Uint32 Content{42};

  if (SDL_WriteU32LE(Handle, Content)) {
    std::cout << "Write Successful";
  } else {
    std::cout << "Write Failed: "
      << SDL_GetError();
  }

  SDL_CloseIO(Handle);
  return 0;
}

Write Failed: Error writing to datastream: Access is denied.

Deserializing with Endianness

Once we know the endianness of the data we're working with, we can choose an appropriate function to read that data into memory. For example, if we know the data follows the little-endian byte order, we can use one of these functions:

SDL_ReadU16LE(): Reads the next 16 bits (2 bytes) of data, treating it as little-endian
SDL_ReadU32LE(): Reads the next 32 bits (4 bytes) of data, treating it as little-endian
SDL_ReadU64LE(): Reads the next 64 bits (8 bytes) of data, treating it as little-endian

These functions will read the data with the assumption that it is in the little-endian format. Then, if our system is also little-endian, SDL will store the value in memory as-is. If our system is not little-endian, SDL will convert the data to our native format before storing it in memory.

Here's an example in code:

src/main.cpp

#include <iostream>
#include <SDL3/SDL.h>
#include <SDL3/SDL_main.h>
#include <SDL3/SDL_iostream.h>

int main(int, char**) {
  SDL_Init(0);
  SDL_IOStream* Handle{
    SDL_IOFromFile("data.bin", "rb")};

  if (!Handle) {
    std::cout << "Error opening file: "
      << SDL_GetError();
    // Nothing to read from, so stop here
    return 1;
  }

  // Read 4 little-endian bytes into native order
  Uint32 Content{0};
  SDL_ReadU32LE(Handle, &Content);
  std::cout << "Content: " << Content;

  SDL_CloseIO(Handle);
  return 0;
}

Content: 42

Big-endian variations of these functions are also available - SDL_ReadU16BE(), SDL_ReadU32BE(), and SDL_ReadU64BE().

When serializing and deserializing data exclusively for our own program, these functions make dealing with endianness quite easy. We simply choose one (little-endian or big-endian) and stick to it.

So, for example, if we choose little-endian, we use the SDL_WriteU32LE() function to write all 4-byte values, and SDL_ReadU32LE() to read them.
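The reading side of that convention can be sketched in standard C++ as well. ReadU32LEFrom() is a hypothetical helper that rebuilds a 32-bit value from little-endian bytes held in memory. It is an illustration of the idea under the assumption that the caller has already checked the buffer is long enough, not a description of SDL's internals:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: reads 4 little-endian bytes starting
// at Offset. The caller must ensure that
// Offset + 4 <= Bytes.size()
std::uint32_t ReadU32LEFrom(
  const std::vector<std::uint8_t>& Bytes,
  std::size_t Offset
) {
  std::uint32_t Result{0};
  for (std::size_t i{0}; i < 4; ++i) {
    // Byte i is worth 256^i, so shift it left by 8 * i bits
    Result |=
      static_cast<std::uint32_t>(Bytes[Offset + i]) << (8 * i);
  }
  return Result;
}
```

Reading the bytes 2a 00 00 00 from offset 0 gives 42 on any host, because the value is assembled arithmetically rather than copied from memory.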

Reordering Bytes Manually

SDL provides some utility functions that allow us to swap byte orders at any time, independently of the SDL_IOStream context. For example, the SDL_Swap32() function byte-swaps 4 bytes of data.

The following program shows an example of this, and also includes a LogBytes() function. We walk through LogBytes() in the next section but, for now, we can just note that it's helpful for visualizing how a value is represented in bytes:

src/main.cpp

#include <cstddef> // for std::byte
#include <iomanip>
#include <iostream>
#include <SDL3/SDL.h>
#include <SDL3/SDL_main.h>

// Print each byte of x as two hexadecimal digits
void LogBytes(Uint32 x) {
  std::byte(*bytes)[4]{
    reinterpret_cast<std::byte(*)[4]>(&x)
  };

  for (const auto& byte : *bytes) {
    std::cout
      << std::hex
      << std::setw(2)
      << std::setfill('0')
      << std::to_integer<int>(byte) << " ";
  }
}

int main(int, char**) {
  Uint32 Original{42};
  std::cout << "Original: ";
  LogBytes(Original);

  std::cout << "\n Swapped: ";
  LogBytes(SDL_Swap32(Original));
  return 0;
}

Original: 2a 00 00 00
 Swapped: 00 00 00 2a

We also have SDL_Swap16() and SDL_Swap64() for byte-swapping 16-bit and 64-bit values respectively, and SDL_SwapFloat() for the float data type.
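All of these perform the same basic operation at different widths. As a rough illustration, the hypothetical ReverseBytes() template below reverses the bytes of any trivially copyable value using only the standard library. It is a sketch of the concept and assumes C++17; it makes no claim about how SDL implements its swap functions:

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical generic helper: returns a copy of Value
// with the order of its bytes reversed
template <typename T>
T ReverseBytes(T Value) {
  static_assert(
    std::is_trivially_copyable_v<T>,
    "ReverseBytes requires a trivially copyable type");

  unsigned char Bytes[sizeof(T)];
  std::memcpy(Bytes, &Value, sizeof(T));
  std::reverse(Bytes, Bytes + sizeof(T));
  std::memcpy(&Value, Bytes, sizeof(T));
  return Value;
}
```

Note that a byte-swapped float is generally not a meaningful number until it is swapped back, so it should be treated as raw data in the meantime.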

Advanced: What is `LogBytes()` Doing?

This function uses some more complex techniques that we haven't covered in this course so far. We cover these techniques in more detail in our advanced course, but we'll provide a brief summary here.

The LogBytes() function receives 32 bits of arbitrary binary data in the form of an unsigned integer - Uint32. It first reinterprets this memory as a C-style array of 4 individual bytes, with each byte represented by the std::byte type:

void LogBytes(Uint32 x) {
  std::byte(*bytes)[4]{
    reinterpret_cast<std::byte(*)[4]>(&x)
  };

  // ...
}

The reinterpret_cast operator allows us to change which type the compiler associates with a block of memory, without changing the contents of that memory. This is generally unsafe and we should only use it if we're sure the underlying memory layout will be correctly understood by the type we're casting it to.

It's safe to use in this context because x is just a blob of binary data. It's originally stored as a Uint32, but that type isn't meaningful - it's just used as a way to transfer some bits. We can safely reinterpret those bits as whatever type makes them most convenient to work with in our function.

Once we have our array of std::bytes, we want to iterate over them and display the contents of each byte. However, there is no overloaded << operator allowing us to stream a std::byte.

So, to represent the contents of each byte visually, we convert each one to a basic int using std::to_integer, a function that is provided alongside std::byte for exactly this purpose.

We then stream these integer representations to std::cout:

void LogBytes(Uint32 x) {
  std::byte(*bytes)[4]{
    reinterpret_cast<std::byte(*)[4]>(&x)
  };

  for (const auto& byte : *bytes) {
    std::cout
      // ...
      << std::to_integer<int>(byte) << " ";
  }
}

We also apply three "IO manipulators" from the standard library to control how these integers are displayed. We have std::hex, which ensures the output is in hexadecimal. We also have std::setw(2) and std::setfill('0') from the <iomanip> header, which ensure each output is at least two characters wide, with 0 being used to fill gaps where needed.

void LogBytes(Uint32 x) {
  std::byte(*bytes)[4]{
    reinterpret_cast<std::byte(*)[4]>(&x)
  };

  for (const auto& byte : *bytes) {
    std::cout
      << std::hex
      << std::setw(2)
      << std::setfill('0')
      << std::to_integer<int>(byte) << " ";
  }
}

This means the integer 4, for example, would be output as 04 instead.

We cover IO manipulation in more detail in a dedicated lesson.
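If we'd rather avoid reinterpret_cast, one commonly used alternative is to copy the value's bytes into a separate array with std::memcpy(). The ToBytes() function below is a hypothetical helper sketching that approach; it is offered as an option rather than as a replacement for the LogBytes() implementation above:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical alternative to the cast: copy the object's
// bytes, in memory order, into an array of std::byte
std::array<std::byte, 4> ToBytes(std::uint32_t Value) {
  std::array<std::byte, 4> Bytes{};
  std::memcpy(Bytes.data(), &Value, sizeof(Value));
  return Bytes;
}
```

The returned array can be iterated with a range-based for loop and printed using std::to_integer, exactly as LogBytes() does.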

Reordering Bytes to Native Order

SDL also provides a range of functions for converting data from a known endianness to the system's native endianness.

For example, if we have 4 bytes of big-endian data, and we want to ensure it is in the system's native endianness, we can use the SDL_Swap32BE() function.

If our system is also big-endian, this function will just return the data without modification. However, if our system is little-endian, the function will return it with its byte order reversed:

src/main.cpp

#include <iomanip>
#include <iostream>
#include <SDL3/SDL.h>
#include <SDL3/SDL_main.h>

void LogBytes(Uint32 x) {/*...*/}

int main(int, char**) {
  Uint32 BigEndian{SDL_Swap32(42)};
  std::cout << "Big-Endian: ";
  LogBytes(BigEndian);

  std::cout << "\n    Native: ";
  LogBytes(SDL_Swap32BE(BigEndian));
  return 0;
}

Big-Endian: 00 00 00 2a
    Native: 2a 00 00 00

We can also handle 2 bytes of big-endian data using SDL_Swap16BE(), 8 bytes using SDL_Swap64BE(), and a big-endian float using SDL_SwapFloatBE().

And, if we know our data is little-endian, we can convert it to our native order using SDL_Swap16LE(), SDL_Swap32LE(), SDL_Swap64LE(), and SDL_SwapFloatLE().
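The "swap only if needed" behaviour can be sketched in standard C++ without checking the host's endianness at all. The hypothetical BigEndianToNative32() helper below inspects the bytes in memory order and reassembles them arithmetically, which should give an unchanged value on a big-endian host and a reversed one on a little-endian host. This illustrates the idea only, and is not SDL's implementation:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: treats the 4 bytes of Raw, in memory
// order, as a big-endian number and returns its native value
std::uint32_t BigEndianToNative32(std::uint32_t Raw) {
  unsigned char Bytes[4];
  std::memcpy(Bytes, &Raw, sizeof(Raw));

  // The byte at the lowest address is the most significant
  return (static_cast<std::uint32_t>(Bytes[0]) << 24)
       | (static_cast<std::uint32_t>(Bytes[1]) << 16)
       | (static_cast<std::uint32_t>(Bytes[2]) << 8)
       |  static_cast<std::uint32_t>(Bytes[3]);
}
```

If the memory holds the bytes 00 00 00 2a, this returns 42 regardless of the platform it runs on.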

Summary

Binary data handling across different platforms requires understanding and managing byte order differences.

We've explored the concept of endianness, learned about SDL's binary manipulation functions, and practiced implementing stable data serialization techniques. Key takeaways:

Endianness affects how multi-byte values are stored in memory
Different systems may use different byte orders (big-endian or little-endian)
SDL provides functions for handling byte order conversion
Always consider endianness when serializing binary data
Use appropriate SDL functions for reading and writing binary data
