Managing Libraries and Dependencies

Learn to manage third-party libraries in C++, covering include/library paths, static vs. shared libraries, versioning, and platform differences.

Greg Filak
Published

In the last lesson, we established a recommended directory structure for our projects. This organization brings order to our code, but modern applications are rarely built in isolation. They stand on the shoulders of giants: third-party libraries that provide everything from low-level graphics rendering to complex data analysis.

Integrating these external dependencies can be a challenge. How do you tell your compiler where to find a library's header files? How do you tell your linker which library files to use? And how do you manage this across different platforms, where file names and conventions vary wildly?

This lesson will cover the concepts needed to manage these external dependencies. We'll explore include paths and library paths, and confront the issues of versioning and binary compatibility.

Header-Only Libraries

When we download a C++ library for our project, it will arrive in one of three forms:

  1. One or more header files, alongside a library that has already been compiled (such as a .a or .lib file).
  2. One or more header files, alongside the library's source code. This allows us to compile the library ourselves.
  3. One or more header files, with no additional library files or source code required. This is called a "header-only" library.

We'll mostly focus on scenarios 1 and 2 in this lesson, but let's talk about header-only libraries briefly here. As the name suggests, they're distributed as one or more header files which we can #include anywhere they are needed.

All the code required to use the library is within those header files. As such, all the definitions are available to the compiler after the preprocessor creates the translation unit - no additional linking is required.

This makes the library very easy to use, but it means the preprocessor will be creating larger translation units, making compilation slower. Because of this, the header-only approach is typically only used for small, simple libraries.
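As an illustration, a tiny header-only library might consist of nothing more than a single file of inline definitions (clamp_lib is a hypothetical name, not a real library):

```cpp
// clamp_lib.h — a hypothetical single-header library. Everything the
// library needs is defined right here, marked `inline` so that
// including the header from several translation units doesn't
// violate the one-definition rule.
#pragma once

namespace clamp_lib {

// Clamp `value` into the range [low, high].
inline int clamp(int value, int low, int high) {
  if (value < low) return low;
  if (value > high) return high;
  return value;
}

}  // namespace clamp_lib
```

Any source file that includes this header gets the full definition of clamp, so the compiler can generate the code directly and the linker has nothing extra to find.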

Larger libraries are generally distributed using option 1 or 2, which is what we'll focus on in the rest of this lesson.

Creating a Library

When you use an external library, you're interacting with it at two distinct stages of the build process: compilation and linking. Each stage requires you to provide a specific piece of information: a path.

Let's imagine we've downloaded a simple third-party library called "SuperLog" that provides logging functionality. The library is provided in the form of a header file superlog.h and a static library libsuperlog.a, or source code that we can compile to create libsuperlog.a ourselves.

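As an illustration, here is a sketch of what SuperLog's two files might contain. (SuperLog is the lesson's hypothetical library; the declarations would live in superlog.h and the definitions in superlog.cpp, shown together here for brevity.)

```cpp
#include <iostream>
#include <string>

// --- superlog.h: declarations only; this is what users #include ---
namespace superlog {
  void initialize();
  void print(const std::string& message);
}

// --- superlog.cpp: definitions, compiled into libsuperlog.a ---
namespace superlog {
  void initialize() {
    std::cout << "SuperLog is initialized\n";
  }
  void print(const std::string& message) {
    std::cout << message << '\n';
  }
}
```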

We covered the library compilation process in the previous chapter. For example, using GCC, our command to create our object file would be something like this:

g++ -c superlog.cpp -o superlog.o

Followed by a call to ar to archive our object file into a library:

ar rcs libsuperlog.a superlog.o

Let's create a program that wants to use this library.

Using a Library

Once we have our library ready, we'd place it in the designated place within our project, such as a libs/ directory:

my_project/
├─ libs/
│  └─ SuperLog/
│     ├─ include/
│     │  └─ superlog.h
│     └─ lib/
│        └─ libsuperlog.a
└─ src/
   └─ main.cpp

Source files in our project that want to use the library would #include the appropriate header, and call upon the library features in the usual way:

src/main.cpp

// This probably won't work yet
#include "superlog.h"

int main() {
  superlog::initialize();
  superlog::print("Hello from the library!");
  return 0;
}

If we try to compile this directly, we'll likely hit two roadblocks - one at the compilation stage, and one at the linking stage.

Compilation: The Include Path

First, the compiler needs to find superlog.h. When the preprocessor sees #include "superlog.h", it looks for that file. If it's not in a known location, the build will fail immediately. Let's try it:

g++ -o my_app src/main.cpp

Note we're calling our output my_app in these examples, but if you're on Windows, you would include the .exe extension, as in my_app.exe. Either way, our compilation will likely fail at our #include directive:

src/main.cpp:1:10: fatal error: superlog.h: No such file or directory
    1 | #include "superlog.h"
      |          ^~~~~~~~~~~~
compilation terminated.

To fix this, we need to tell the compiler where to search for header files. We do this by providing an include path using the -I flag.

g++ -o my_app src/main.cpp -Ilibs/SuperLog/include

The -I flag adds libs/SuperLog/include to the list of directories the compiler will search when it encounters an #include directive. This allows it to find superlog.h, and the compilation phase now succeeds.

However, we're not done. The command will still fail, but at the next stage.

Linking: The Library Path

Even though main.cpp compiled successfully into main.o, we now have a linker error that will look something like this:

main.cpp: undefined reference to `superlog::initialize()'
main.cpp: undefined reference to `superlog::print()'

The compiler was happy because superlog.h declared the functions, but the linker is unhappy because it can't find their definitions. The machine code for superlog::initialize() and superlog::print() lives inside libsuperlog.a, and we haven't told the linker that this file exists.

We need to provide two more pieces of information:

  1. Library Path: The directory where the library file (.a, .lib, etc.) is located. We use the uppercase -L flag for this.
  2. Library Name: The name of the library to link against. We use the lowercase -l flag for this.

Let's add these flags to our command:

g++ -o my_app src/main.cpp -Ilibs/SuperLog/include -Llibs/SuperLog/lib -lsuperlog
  • -Llibs/SuperLog/lib: Tells the linker to add libs/SuperLog/lib to its search paths for libraries.
  • -lsuperlog: Tells the linker to find and link a library named superlog. The linker is smart enough to prepend lib and append .a (or .so) automatically, so it looks for libsuperlog.a.

With these flags, the linker finds libsuperlog.a, extracts the compiled code for the superlog functions, and successfully builds our my_app executable:

./my_app
SuperLog is initialized
Hello from the library!

As we're starting to see, managing these -I, -L, and -l flags manually is already tedious and getting difficult to follow, even for this incredibly simple program. This is precisely one of the core jobs of a build system like CMake.

Versioning and Compatibility

There is additional complexity with dependencies. Libraries are constantly being updated to fix bugs, add features, or improve performance. This introduces the concepts of versioning and compatibility.

To understand this, we first need to distinguish between two types of "interfaces":

  • The API (Application Programming Interface) is the source-level interface: the function signatures, classes, and types declared in the library's headers. Your source code depends on it at compile time.
  • The ABI (Application Binary Interface) is the binary-level interface: how classes are laid out in memory, how functions receive their arguments, and how symbol names are mangled. Your compiled machine code depends on it at link time and run time.

An ABI break occurs when a change to the library's source code results in incompatible machine code, even if the API looks the same. Examples of ABI-breaking changes include:

  • Adding a virtual function to a class.
  • Changing the order of non-static member variables.
  • Changing a function's parameter types (e.g., from int to long).
  • Compiling the library with a different compiler or even different optimization flags.
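The member-reordering case can be made concrete with a toy illustration: the field offsets that calling code bakes into its machine code change, even though the class declares exactly the same members. (WidgetV1 and WidgetV2 are hypothetical names for two versions of the same library class.)

```cpp
#include <cstddef>  // offsetof

// Version 1.0 of a hypothetical library class.
struct WidgetV1 { int id; double weight; };

// Version 2.0 reorders the members. The API is unchanged — the same
// fields exist with the same names and types — but the layout differs.
struct WidgetV2 { double weight; int id; };

// Code compiled against V1 accesses `id` at V1's offset. If the
// process actually loads a library built with the V2 layout, those
// accesses read the wrong bytes.
static_assert(offsetof(WidgetV1, id) != offsetof(WidgetV2, id),
              "reordering members changed the memory layout");
```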

Why ABI Stability Matters

The type of library you use dictates how much you need to worry about ABI stability.

With Static Libraries: ABI incompatibility is not a major issue. If you get a new version of a static library that has an ABI break, you simply recompile your entire application. Since all the code is bundled into your executable, everything is guaranteed to be consistent.

With Shared Libraries: ABI incompatibility is the root cause of "DLL Hell." Imagine your application was built and linked against libsuperlog.so version 1.0. Later, a user installs another program that replaces it with libsuperlog.so version 2.0, which has an ABI break.

The next time you run your application, the OS loader will load the new, incompatible library. When your code tries to call a function, it might jump to the wrong address or misinterpret the memory layout of a class, leading to a crash.

Semantic Versioning

To manage this, the software community developed Semantic Versioning (SemVer) as a convention. Software using the semantic versioning convention is identified by three numbers, such as 2.1.5. These numbers correspond to:

  • the MAJOR version (2 in this example)
  • the MINOR version (1 in this example)
  • the PATCH version (5 in this example).

When we release an update to our library, the way in which we adjust these numbers communicates the nature of the change we made in that new version. For example, if the current version of our library is 2.1.5 and we want to release a new version, our version number would be:

  • 2.1.6 if the new version contains only backward-compatible bug fixes (a PATCH release).
  • 2.2.0 if it adds functionality in a backward-compatible way (a MINOR release).
  • 3.0.0 if it contains changes that are not backward-compatible (a MAJOR release).

A MAJOR version bump is a clear signal: if you update to this version, you will need to recompile your code, and you may need to change it to accommodate API changes.

Changes that necessitate a major version change are sometimes called breaking changes, and a goal of good software management and interface design is to minimize the frequency with which breaking changes need to be made.
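These rules can be captured in a small compatibility check, similar in spirit to what dependency managers perform. This is a simplified sketch of the SemVer rules, not any particular tool's algorithm (real tools have extra rules, e.g. for 0.x versions):

```cpp
struct Version {
  int major, minor, patch;
};

// Can `available` safely stand in for `requested` under SemVer?
// The MAJOR version must match exactly (MAJOR bumps are breaking),
// and the available version must otherwise be at least as new.
bool isCompatible(Version requested, Version available) {
  if (available.major != requested.major)
    return false;                              // e.g. 3.x.x can't replace 2.x.x
  if (available.minor != requested.minor)
    return available.minor > requested.minor;  // a newer MINOR only adds features
  return available.patch >= requested.patch;   // a newer PATCH only fixes bugs
}
```

For example, a program requesting 2.1.5 can accept 2.1.6 or 2.2.0, but not 3.0.0 (breaking) or 2.1.4 (too old).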

Platform Differences

As if managing paths and versions wasn't enough, different operating systems have their own conventions for naming and using libraries.

Linux / macOS

Static Library: A file named lib<name>.a (e.g., libsuperlog.a).

Shared Library: A file named lib<name>.so (e.g., libsuperlog.so). On macOS, this is .dylib.

Linking: The -l<name> flag works for both. The linker typically prefers the shared version (.so) if both are found in the same directory.

Windows (with MSVC)

Windows has a more complex system for dynamic linking.

Static Library: A file named <name>.lib (e.g., superlog.lib). This contains all the object code, just like a .a file.

Shared Library: A dynamic library consists of two files:

  1. The DLL (<name>.dll): This is the actual library that contains the executable code and is needed at runtime.
  2. The Import Library (<name>.lib): This is a special, small .lib file that you link against at build time. It doesn't contain the actual code. Instead, it contains information that tells the linker how to create the necessary stubs to call into the .dll when the program runs.

This dual-file system for DLLs is a frequent source of confusion. When linking against a DLL on Windows, you link against its import .lib file, but you must distribute the .dll file with your application.
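Library headers typically hide this machinery behind a macro: the same declaration expands to __declspec(dllexport) while the DLL itself is being built, to __declspec(dllimport) for consumers, and to nothing for static or non-Windows builds. A common sketch of the pattern (SUPERLOG_API, SUPERLOG_SHARED, and SUPERLOG_BUILDING are hypothetical macro names, and the definition shown inline would really live in the DLL's .cpp file):

```cpp
#include <cstring>

// superlog_api.h — a common pattern for cross-platform DLL headers.
#if defined(_WIN32) && defined(SUPERLOG_SHARED)
  #ifdef SUPERLOG_BUILDING              // defined when compiling the DLL itself
    #define SUPERLOG_API __declspec(dllexport)
  #else                                 // consumers importing from the DLL
    #define SUPERLOG_API __declspec(dllimport)
  #endif
#else
  #define SUPERLOG_API                  // static builds and non-Windows platforms
#endif

// Returns the number of characters in the logged message.
SUPERLOG_API int superlog_print(const char* message);

// Definition — in a real project this would live in superlog.cpp,
// compiled into the DLL. This sketch just reports the message length.
SUPERLOG_API int superlog_print(const char* message) {
  return static_cast<int>(std::strlen(message));
}
```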

Manually managing these platform-specific conventions is another major headache that CMake eliminates.

Summary

Managing external dependencies is a multi-faceted problem that sits at the intersection of compiling, linking, and deployment. Without a good build system, it's a manual, error-prone, and non-portable process.

  • Compiler and Linker Paths: You must tell the compiler where to find headers (-I for include paths) and the linker where to find library files (-L for library paths) and which libraries to use (-l for library names).
  • Usage Differences: Using a static library results in a large, self-contained executable. Using a shared library results in a smaller executable but requires the library file to be present at runtime.
  • Compatibility is Key: API changes break your source code. ABI changes break your running program. Understanding the difference is important for working with shared libraries, and Semantic Versioning helps communicate these changes.
  • Platform Divergence: Windows, Linux, and macOS have different naming conventions (.lib/.dll vs. .a/.so) and different mechanisms for finding shared libraries at runtime. The Windows "import library" is a particularly important platform-specific detail.

We have now explored the major components of a C++ project: the source code organization, the build pipeline, and the management of external libraries.

We've also seen the manual, command-line approach to handling them and the pain points that arise. In the next chapter, we'll start introducing tools that help us manage a lot of this complexity.

Next Lesson
Lesson 7 of 12

Automated Build Systems and Their Limitations

An introduction to traditional build tools like Makefiles and IDE projects, highlighting the maintenance, scalability, and cross-platform challenges that result.
