Byte Order and Endianness
Learn how to handle byte order using SDL3's endianness functions
Previously, we've focused on serializing single bytes of data at a time - usually the 8-bit char type. However, when we start to serialize multi-byte objects, such as int and float values, things can get slightly more complex.
We need to pay careful attention to the order of those bytes, especially when working across different systems.
In this lesson, we'll explore how computers store multi-byte values, understand the challenges of different byte orderings, and learn practical techniques for handling these differences using SDL's binary manipulation functions.
Digit Order
Let's first consider how we represent numbers in plain English. Numbers larger than 9 are represented by multiple digits - e.g. 45 requires two digits (4 and 5), whilst 162 requires three (1, 6, and 2).
There are two things to note here that we may take for granted, but we should be conscious of as we build out these concepts:
- The order of the digits is important - 45 and 54 are not the same
- We order our digits such that more significant digits come earlier.
For example, in a number like 45, the first digit (4) is more significant than the second digit (5). It is more significant because, if we increment the first digit, we increase the value of our number by 10 (from 45 to 55), but incrementing the second digit only increases the value of the overall number by 1 (from 45 to 46).
This pattern continues for numbers of any length. In a number like 162, the 1 is more significant than the 6, and the 6 is more significant than the 2.
Byte Order
These exact same considerations apply when dealing with numeric types comprising multiple bytes, such as 16, 32 and 64 bit integers, which have 2, 4 and 8 bytes respectively.
But unfortunately, unlike our familiar English numeric system, there's no agreed standard here. Some systems store the most significant bytes first, while others use different orderings.
When working in a low-level context, we often also need to deal with multiple different byte-order conventions within the same system, and convert data from one to the other as needed. For example, data transferred over a network commonly uses the most-significant-byte-first order, but many CPU architectures expect the exact opposite ordering.
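To make byte significance concrete, here is a minimal sketch that pulls individual bytes out of a 32-bit value using shifts. It uses std::uint32_t from the standard library as a stand-in for SDL's Uint32, and ByteBySignificance() is a hypothetical helper name rather than an SDL function. Because shifts operate on the value rather than on memory, the result should be the same on every system, whatever byte order it uses:

```cpp
#include <cstdint>

// Hypothetical helper (not part of SDL): returns the byte at the
// given significance, where 0 is the least significant byte and
// 3 is the most significant byte of a 32-bit value
std::uint8_t ByteBySignificance(
  std::uint32_t Value, int Significance
) {
  // Each step up in significance is worth 8 more bits
  return static_cast<std::uint8_t>(
    (Value >> (Significance * 8)) & 0xFF
  );
}
```

For the value 0x12345678, ByteBySignificance(Value, 3) yields 0x12 and ByteBySignificance(Value, 0) yields 0x78. Incrementing the most significant byte changes the overall value by 16,777,216, while incrementing the least significant byte changes it by 1.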
Endianness
The way in which a system orders its bytes is referred to as its endianness. Most fall into one of two categories:
- Big-endian systems place the more significant bytes before less significant bytes. Therefore, the way we represent numbers in English is similar to big-endian systems.
- Little-endian systems position less significant bytes earlier - that is, at smaller memory addresses.
There are other, less popular possibilities with names like bi-endian, middle-endian and mixed-endian. However, big and little endian are the most common, and they're what we'll focus on for now.
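As a rough illustration of these two categories, the sketch below copies the bytes of a value out of memory in address order, so we can see which byte the system placed first. It relies only on the standard library, and NativeBytes() and IsLittleEndianHost() are hypothetical helper names chosen for this example:

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies the 4 bytes of a value out of
// memory, from the smallest address to the largest
std::array<std::uint8_t, 4> NativeBytes(std::uint32_t Value) {
  std::array<std::uint8_t, 4> Bytes{};
  std::memcpy(Bytes.data(), &Value, sizeof(Value));
  return Bytes;
}

// Hypothetical helper: true if the least significant byte of a
// value sits at the smallest memory address
bool IsLittleEndianHost() {
  return NativeBytes(1)[0] == 1;
}
```

On a little-endian system, NativeBytes(0x12345678) holds 78 56 34 12, while a big-endian system would hold 12 34 56 78.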
Why Endianness Matters
When we read the binary representation of a multi-byte number, such as a Uint32, it's important that we understand and react appropriately to how those bytes are ordered.
For example, if data is serialized in a little-endian format, and then some other system deserializes that data with the assumption that it is big-endian, the values will mismatch.
The following program shows the implications this has. To simulate the endianness mismatch, we'll use the SDL_Swap32() function, which reverses the order of the 4 bytes in a 32-bit block of memory:
src/main.cpp
#include <iostream>
#include <SDL3/SDL.h>
#include <SDL3/SDL_main.h>
int main(int, char**) {
  Uint32 Serialized{42};
  std::cout << "Serialized: " << Serialized;
  Uint32 Deserialized{SDL_Swap32(Serialized)};
  std::cout << "\nDeserialized: "
    << Deserialized;
  return 0;
}
Serialized: 42
Deserialized: 704643072
Checking Endianness
If we need to check the endianness configured by our compiler, SDL provides the SDL_BYTEORDER preprocessor definition.
We can compare this to the SDL_BIG_ENDIAN or SDL_LIL_ENDIAN definitions to understand our environment.
The following example also includes some code to log out how the value 0x1234 is stored in memory using reinterpret_cast, which we explain in more detail later in the lesson:
src/main.cpp
#include <cstddef> // for std::byte
#include <iostream>
#include <iomanip>
#include <SDL3/SDL.h>
#include <SDL3/SDL_main.h>
int main(int, char**) {
  std::cout << "System Endianness:\n";
#if SDL_BYTEORDER == SDL_BIG_ENDIAN
  std::cout << "Big Endian (most significant "
    "byte first)";
#elif SDL_BYTEORDER == SDL_LIL_ENDIAN
  std::cout << "Little Endian (least "
    "significant byte first)";
#else
  std::cout << "Unknown byte order";
#endif
  Uint16 value{0x1234};
  // View the two bytes of value in memory order
  auto* bytes{
    reinterpret_cast<std::byte*>(&value)
  };
  std::cout << "\nValue 0x1234 is stored as: "
    << std::hex << static_cast<int>(bytes[0])
    << " " << static_cast<int>(bytes[1]);
  return 0;
}
System Endianness:
Little Endian (least significant byte first)
Value 0x1234 is stored as: 34 12
Serializing with Endianness
To control the endianness of multi-byte values when serializing, SDL provides a series of helpful functions. For example, to write binary data in the little-endian format, we can use one of 3 functions depending on the memory size of the value:
- SDL_WriteU16LE(): Writes 16 bits (2 bytes) of data in the little-endian order
- SDL_WriteU32LE(): Writes 32 bits (4 bytes) of data in the little-endian order
- SDL_WriteU64LE(): Writes 64 bits (8 bytes) of data in the little-endian order
These functions will convert the data if needed. If our system is little-endian, our memory will be written as-is. If our system is big-endian, SDL will write the bytes in the opposite order. Either way, it guarantees that our output is in the little-endian order.
Here's an example in code:
src/main.cpp
#include <iostream>
#include <SDL3/SDL.h>
#include <SDL3/SDL_main.h>
#include <SDL3/SDL_iostream.h>
int main(int, char**) {
  SDL_Init(0);
  SDL_IOStream* Handle{
    SDL_IOFromFile("data.bin", "wb")};
  if (!Handle) {
    std::cout << "Error opening file: "
      << SDL_GetError();
    // Nothing to write to, so stop here
    return 1;
  }
  Uint32 Content{42};
  SDL_WriteU32LE(Handle, Content);
  SDL_CloseIO(Handle);
  return 0;
}
Big-endian variations of these functions are also available - SDL_WriteU16BE(), SDL_WriteU32BE(), and SDL_WriteU64BE().
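To illustrate the difference between the two families of functions, the following sketch encodes a 32-bit value into 4 bytes by hand, in each order. This is not how SDL implements its write functions. It is a standard-library-only approximation of the result they produce, and Bytes4, EncodeU32LE() and EncodeU32BE() are hypothetical names introduced for this example:

```cpp
#include <array>
#include <cstdint>

using Bytes4 = std::array<std::uint8_t, 4>;

// Hypothetical helper: least significant byte first
Bytes4 EncodeU32LE(std::uint32_t Value) {
  return {
    static_cast<std::uint8_t>(Value & 0xFF),
    static_cast<std::uint8_t>((Value >> 8) & 0xFF),
    static_cast<std::uint8_t>((Value >> 16) & 0xFF),
    static_cast<std::uint8_t>((Value >> 24) & 0xFF)
  };
}

// Hypothetical helper: most significant byte first
Bytes4 EncodeU32BE(std::uint32_t Value) {
  return {
    static_cast<std::uint8_t>((Value >> 24) & 0xFF),
    static_cast<std::uint8_t>((Value >> 16) & 0xFF),
    static_cast<std::uint8_t>((Value >> 8) & 0xFF),
    static_cast<std::uint8_t>(Value & 0xFF)
  };
}
```

Because the bytes are extracted with shifts, the output does not depend on the system running the code: EncodeU32LE(42) gives 2a 00 00 00 and EncodeU32BE(42) gives 00 00 00 2a.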
Analysing Binary Output
Binary data isn't inherently designed to be read by humans - even opening a binary file usually requires specialist tools rather than a standard text editor.
However, if we really need to analyse binary data, we still can. The tool we need is commonly called a hex editor. Our IDE is likely to include a hex editor or have one available as a plugin. We can alternatively use a standalone tool or a website such as hexed.it.
If we open our previous output representing the number 42 in a hex editor, we should see our 4 bytes of binary data represented in hexadecimal as 2a 00 00 00.
42 is a relatively small number in the range of what can be stored in a 4-byte integer. As such, its value can be represented entirely in the least significant byte and, because we wrote this data in the little-endian order, the least significant byte comes first.
Converting the hex value 2a to decimal should confirm that our number, 42, was accurately serialized. The 2 represents 2 x 16 = 32 and the a represents 10, so 32 + 10 = 42.
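If we'd rather not do this conversion by hand, it can be sketched in a few lines of code. HexDigitValue() and HexPairToValue() are hypothetical helpers written for this example, and they assume their input contains only valid hexadecimal digits:

```cpp
// Hypothetical helper: converts one hex digit to its value,
// or returns -1 if the character is not a hex digit
int HexDigitValue(char Digit) {
  if (Digit >= '0' && Digit <= '9') return Digit - '0';
  if (Digit >= 'a' && Digit <= 'f') return Digit - 'a' + 10;
  if (Digit >= 'A' && Digit <= 'F') return Digit - 'A' + 10;
  return -1;
}

// Hypothetical helper: combines the two hex digits of a byte.
// The first digit is worth 16 times as much as the second.
// Assumes both characters are valid hex digits.
int HexPairToValue(char High, char Low) {
  return HexDigitValue(High) * 16 + HexDigitValue(Low);
}
```

Calling HexPairToValue('2', 'a') returns 42.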
Error Handling
SDL's endianness-sensitive write functions like SDL_WriteU32LE() return true if the write succeeded, and false otherwise. We can use this to check if the write was successful, and react accordingly. If it failed, SDL_GetError() tells us why.
Below, we attempt to write to a file that we opened only for reading using the rb flag:
src/main.cpp
#include <iostream>
#include <SDL3/SDL.h>
#include <SDL3/SDL_main.h>
#include <SDL3/SDL_iostream.h>
int main(int, char**) {
  SDL_Init(0);
  SDL_IOStream* Handle{
    // We won't be able to write to this
    SDL_IOFromFile("data.bin", "rb")
  };
  if (!Handle) {
    std::cout << "Error opening file: "
      << SDL_GetError();
    // The file could not be opened at all
    return 1;
  }
  Uint32 Content{42};
  if (SDL_WriteU32LE(Handle, Content)) {
    std::cout << "Write Successful";
  } else {
    std::cout << "Write Failed: "
      << SDL_GetError();
  }
  SDL_CloseIO(Handle);
  return 0;
}
Write Failed: Error writing to datastream: Access is denied.
Deserializing with Endianness
Once we know the endianness of the data we're working with, we can choose an appropriate function to read that data into memory. For example, if we know the data follows the little-endian byte order, we can use one of these functions:
- SDL_ReadU16LE(): Read the next 16 bits (2 bytes) of data.
- SDL_ReadU32LE(): Read the next 32 bits (4 bytes) of data.
- SDL_ReadU64LE(): Read the next 64 bits (8 bytes) of data.
These functions will read the data with the assumption that it is in the little-endian format. Then, if our system is also little-endian, SDL will store it in memory as-is. If our system is not little-endian, SDL will convert the data to our native format before storing it in memory.
Here's an example in code:
src/main.cpp
#include <iostream>
#include <SDL3/SDL.h>
#include <SDL3/SDL_main.h>
#include <SDL3/SDL_iostream.h>
int main(int, char**) {
  SDL_Init(0);
  SDL_IOStream* Handle{
    SDL_IOFromFile("data.bin", "rb")};
  if (!Handle) {
    std::cout << "Error opening file: "
      << SDL_GetError();
    // Nothing to read from, so stop here
    return 1;
  }
  Uint32 Content{0};
  SDL_ReadU32LE(Handle, &Content);
  std::cout << "Content: " << Content;
  SDL_CloseIO(Handle);
  return 0;
}
Content: 42
Big-endian variations of these functions are also available - SDL_ReadU16BE(), SDL_ReadU32BE(), and SDL_ReadU64BE().
When serializing and deserializing data exclusively for our own program, these functions make dealing with endianness quite easy. We simply choose one (little-endian or big-endian) and stick to it.
So, for example, if we choose little-endian, we use the SDL_WriteU32LE() function to write all 4-byte values, and SDL_ReadU32LE() to read them.
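To see why the reader and writer must agree on a byte order, the sketch below rebuilds a value from 4 raw bytes under each assumption. It only approximates what the read functions do, using the standard library alone, and DecodeU32LE() and DecodeU32BE() are hypothetical names introduced for this example:

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: treats the first byte as least significant
std::uint32_t DecodeU32LE(
  const std::array<std::uint8_t, 4>& Bytes
) {
  return static_cast<std::uint32_t>(Bytes[0])
    | (static_cast<std::uint32_t>(Bytes[1]) << 8)
    | (static_cast<std::uint32_t>(Bytes[2]) << 16)
    | (static_cast<std::uint32_t>(Bytes[3]) << 24);
}

// Hypothetical helper: treats the first byte as most significant
std::uint32_t DecodeU32BE(
  const std::array<std::uint8_t, 4>& Bytes
) {
  return (static_cast<std::uint32_t>(Bytes[0]) << 24)
    | (static_cast<std::uint32_t>(Bytes[1]) << 16)
    | (static_cast<std::uint32_t>(Bytes[2]) << 8)
    | static_cast<std::uint32_t>(Bytes[3]);
}
```

Given the bytes 2a 00 00 00 from our file, DecodeU32LE() returns 42, while DecodeU32BE() returns 704643072, the same mismatch we simulated at the start of the lesson.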
Reordering Bytes Manually
SDL provides some utility functions that allow us to swap byte orders at any time, independently of the SDL_IOStream context. For example, the SDL_Swap32() function byte-swaps 4 bytes of data.
The following program shows an example of this, and also includes a LogBytes() function. We walk through LogBytes() in the next section but, for now, we can just note that it's helpful for visualizing how a value is represented in bytes:
src/main.cpp
#include <cstddef> // for std::byte
#include <iomanip>
#include <iostream>
#include <SDL3/SDL.h>
#include <SDL3/SDL_main.h>
void LogBytes(Uint32 x) {
  std::byte(*bytes)[4]{
    reinterpret_cast<std::byte(*)[4]>(&x)
  };
  for (const auto& byte : *bytes) {
    std::cout
      << std::hex
      << std::setw(2)
      << std::setfill('0')
      << std::to_integer<int>(byte) << " ";
  }
}
int main(int, char**) {
  Uint32 Original{42};
  std::cout << "Original: ";
  LogBytes(Original);
  std::cout << "\n Swapped: ";
  LogBytes(SDL_Swap32(Original));
  return 0;
}
Original: 2a 00 00 00
 Swapped: 00 00 00 2a
We also have SDL_Swap16() and SDL_Swap64() for byte-swapping 16-bit and 64-bit values respectively, and SDL_SwapFloat() for the float data type.
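For intuition, a byte swap can be written by hand using shifts and masks. The sketch below is not SDL's implementation, and ManualSwap16() and ManualSwap32() are hypothetical names chosen to avoid any confusion with the real SDL functions:

```cpp
#include <cstdint>

// Hypothetical helper: exchanges the two bytes of a 16-bit value
std::uint16_t ManualSwap16(std::uint16_t Value) {
  return static_cast<std::uint16_t>(
    (Value << 8) | (Value >> 8)
  );
}

// Hypothetical helper: reverses the four bytes of a 32-bit value
std::uint32_t ManualSwap32(std::uint32_t Value) {
  return ((Value & 0x000000FFu) << 24)  // lowest byte to the top
    | ((Value & 0x0000FF00u) << 8)
    | ((Value & 0x00FF0000u) >> 8)
    | ((Value & 0xFF000000u) >> 24);    // highest byte to the bottom
}
```

ManualSwap32(42) returns 704643072, matching the mismatch example from earlier in the lesson, and swapping a value twice returns the original.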
Advanced: What is LogBytes() Doing?
This function uses some more complex techniques that we haven't covered in this course so far. We cover these techniques in more detail in our advanced course, but we'll provide a brief summary here.
The LogBytes() function receives 32 bits of arbitrary binary data in the form of an unsigned integer - Uint32. It first reinterprets this memory as a C-style array of 4 individual bytes, with each byte represented by the std::byte type:
void LogBytes(Uint32 x) {
  std::byte(*bytes)[4]{
    reinterpret_cast<std::byte(*)[4]>(&x)
  };
  // ...
}
The reinterpret_cast operator allows us to change which type the compiler associates with a block of memory, without changing the contents of that memory. This is generally unsafe and we should only use it if we're sure the underlying memory layout will be correctly understood by the type we're casting it to.
It's safe to use in this context because x is just a blob of binary data. It's originally stored as a Uint32, but that type isn't meaningful - it's just used as a way to transfer some bits. We can safely reinterpret those bits as whatever type makes them most convenient to work with in our function.
Once we have our array of std::bytes, we want to iterate over them and display the contents of each byte. However, there is no overloaded << operator allowing us to stream a std::byte.
So, to represent the contents of each byte visually, we convert each one to a basic int using std::to_integer, a function that is provided alongside std::byte for exactly this purpose.
We then stream these integer representations to std::cout:
void LogBytes(Uint32 x) {
  std::byte(*bytes)[4]{
    reinterpret_cast<std::byte(*)[4]>(&x)
  };
  for (const auto& byte : *bytes) {
    std::cout
      // ...
      << std::to_integer<int>(byte) << " ";
  }
}
We also apply three "IO manipulators" from the standard library to control how these integers are displayed. We have std::hex, which ensures the output is in hexadecimal. We also have std::setw(2) and std::setfill('0') from the <iomanip> header, which ensure each output is at least two characters wide, with 0 being used to fill gaps where needed.
void LogBytes(Uint32 x) {
  std::byte(*bytes)[4]{
    reinterpret_cast<std::byte(*)[4]>(&x)
  };
  for (const auto& byte : *bytes) {
    std::cout
      << std::hex
      << std::setw(2)
      << std::setfill('0')
      << std::to_integer<int>(byte) << " ";
  }
}
This means the integer 4, for example, would be output as 04 instead.
We cover IO manipulation in more detail in a dedicated lesson.
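The same manipulators work with any output stream, not just std::cout. As a small variation, the sketch below formats bytes into a std::string using std::ostringstream, which can be convenient when we want to compare the result in a test rather than print it. FormatBytes() is a hypothetical helper, and it takes an array of bytes directly rather than reinterpreting a Uint32:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: formats 4 bytes as two-digit hex values
// separated by spaces, e.g. "2a 00 00 00"
std::string FormatBytes(
  const std::array<std::uint8_t, 4>& Bytes
) {
  std::ostringstream Stream;
  // std::hex and std::setfill stay active once applied
  Stream << std::hex << std::setfill('0');
  for (std::size_t i{0}; i < Bytes.size(); ++i) {
    if (i > 0) Stream << ' ';
    // std::setw only affects the next output, so we repeat it.
    // The cast stops the byte being printed as a character.
    Stream << std::setw(2) << static_cast<int>(Bytes[i]);
  }
  return Stream.str();
}
```

Passing the bytes 2a 00 00 00 returns the string "2a 00 00 00", and a byte with the value 4 is formatted as 04.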
Reordering Bytes to Native Order
SDL also provides a range of functions for converting data from a known endianness to the system's native endianness.
For example, if we have 4 bytes of big-endian data, and we want to ensure it is in the system's native endianness, we can use the SDL_Swap32BE() function.
If our system is also big-endian, this function will just return the data without modification. However, if our system is little-endian, the function will return it with its byte order reversed:
src/main.cpp
#include <iomanip>
#include <iostream>
#include <SDL3/SDL.h>
#include <SDL3/SDL_main.h>
void LogBytes(Uint32 x) {/*...*/}
int main(int, char**) {
  Uint32 BigEndian{SDL_Swap32(42)};
  std::cout << "Big-Endian: ";
  LogBytes(BigEndian);
  std::cout << "\n Native: ";
  LogBytes(SDL_Swap32BE(BigEndian));
  return 0;
}
Big-Endian: 00 00 00 2a
 Native: 2a 00 00 00
We can also handle 2 bytes of big-endian data using SDL_Swap16BE(), 8 bytes using SDL_Swap64BE(), and a big-endian float using SDL_SwapFloatBE().
And, if we know our data is little-endian, we can convert it to our native order using SDL_Swap16LE(), SDL_Swap32LE(), SDL_Swap64LE(), and SDL_SwapFloatLE().
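Conceptually, these functions either return their argument unchanged or byte-swap it, depending on the system. One way to sketch that behaviour without checking the system's endianness at all is to read the value's bytes in memory order and reassemble them under the known convention. This is an illustrative approximation using only the standard library, not SDL's implementation, and BigEndianToNative() and LittleEndianToNative() are hypothetical names:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: Stored holds 4 bytes laid out in
// big-endian order. Returns the value they represent.
std::uint32_t BigEndianToNative(std::uint32_t Stored) {
  unsigned char Bytes[4];
  std::memcpy(Bytes, &Stored, sizeof(Stored));
  // The byte at the smallest address is the most significant
  return (static_cast<std::uint32_t>(Bytes[0]) << 24)
    | (static_cast<std::uint32_t>(Bytes[1]) << 16)
    | (static_cast<std::uint32_t>(Bytes[2]) << 8)
    | static_cast<std::uint32_t>(Bytes[3]);
}

// Hypothetical helper: Stored holds 4 bytes laid out in
// little-endian order. Returns the value they represent.
std::uint32_t LittleEndianToNative(std::uint32_t Stored) {
  unsigned char Bytes[4];
  std::memcpy(Bytes, &Stored, sizeof(Stored));
  // The byte at the smallest address is the least significant
  return static_cast<std::uint32_t>(Bytes[0])
    | (static_cast<std::uint32_t>(Bytes[1]) << 8)
    | (static_cast<std::uint32_t>(Bytes[2]) << 16)
    | (static_cast<std::uint32_t>(Bytes[3]) << 24);
}
```

On a little-endian system, LittleEndianToNative() returns its argument unchanged and BigEndianToNative() returns it byte-swapped. On a big-endian system, the roles are reversed.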
Summary
Binary data handling across different platforms requires understanding and managing byte order differences.
We've explored the concept of endianness, learned about SDL's binary manipulation functions, and practiced implementing stable data serialization techniques. Key takeaways:
- Endianness affects how multi-byte values are stored in memory
- Different systems may use different byte orders (big-endian or little-endian)
- SDL provides functions for handling byte order conversion
- Always consider endianness when serializing binary data
- Use appropriate SDL functions for reading and writing binary data