Tutorial: Easy dependency management for C++ with CMake and Git

07 Jul 2016 by Jonathan

C++ dependency management is a controversial topic with many alternatives and lots of third-party tools. The following reddit comment describes it well:

Comment from the discussion “Is there a C++ package manager? If not, how do you handle dependencies?”.

This tutorial explains a relatively simple solution using CMake - the de-facto standard build tool - and git - the de-facto source code version control system. It doesn’t require any external tools, works on all platforms, is relatively easy to set up and is flexible for the user. This is the same system I’m currently using for standardese, my C++ documentation generator.

Since writing this post, CMake has added FetchContent, a better way to fetch dependencies than submodules. I’ve published a blog post about it here.

The goals

Let’s say you are developing a C++ library that uses some external dependencies. A library is different from a “normal” program because clients need the external dependencies as well in order to work with the library. So when installing the library you also need to make sure that its dependencies are installed as well.

This is also true if you are deploying a binary only but need shared libraries. I have mixed feelings for those.

Furthermore, while some of the external dependencies are header-only, some aren’t and some take a really long time to build.

There are two different approaches you can take - and every package manager uses one of them:

1. Download the sources and build the dependency.
2. Download a pre-compiled binary.

Neither of those approaches is perfect.

1. Building from source has the disadvantage that some projects are huge and take a really long time to build. So package managers often cache a binary once it is built - something we cannot do in this scope.
2. Downloading a pre-compiled binary seems way better but runs into a problem due to three letters - ABI. The Application Binary Interface, the way your interfaces look once compiled, is not standardized. You cannot use the same binary for different platforms, compilers, standard library implementations, build types (debug vs release), moon phases and a myriad of other factors. If you want a pre-compiled binary, it must have been built with the exact same configuration as your system.

Now there is one situation where downloading a pre-compiled binary is enough: when using the package manager of your system. All the libraries are built with one compiler and one standard library under one system, so they can all work together. I really wish I could just delegate package management to the OS and simply state that you should install version X of library Y, but not everyone is using ArchLinux or a similar Linux distribution which has the current version of everything as a package.

And don’t get me started on Windows. I cannot understand how people can program there.

Thus I decided to go with a mix of 1) and 2): first look for a pre-compiled binary on the system and only if none is found, fetch the sources and build. Users who already have the library installed don’t pay a penalty for compilation, only those who don’t have it. And if someone doesn’t have it and sees that it is going to be compiled, they can look for a different way to get it.

So let’s look at each step in more detail and how to implement it in CMake.

Step 0: Look for a pre-compiled binary

The easy way

CMake provides the find_package() command to look for a package installed on your computer. A package is basically a CMake file that sets up a target that you can use just as if it was defined in your CMakeLists.txt itself. For a target that is properly set up, all you need should be something like this:

find_package(dependency 1.42) # the version argument is optional
target_link_libraries(my_target PUBLIC dependency_target)
# for a proper library this also sets up any required include directories or other compilation options
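
Since the rest of this tutorial is about what to do when the package is missing, it can help to see how that case is detected. The following is only a sketch: dependency, dependency_target and my_target are the placeholder names from above, and it assumes the package follows the usual convention of setting dependency_FOUND.

```cmake
# QUIET: don't print an error if the package is missing, we handle that ourselves
find_package(dependency 1.42 QUIET)

if(dependency_FOUND)
    # the package was found, so its target is available
    target_link_libraries(my_target PUBLIC dependency_target)
else()
    message(STATUS "dependency not found on the system, will build it from source")
    # ... fetch and build it, as described in the cases below
endif()
```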

The hard way

But not every CMake project supports find_package().

If your project doesn’t, check out my tutorial on how to do it.

For those, CMake provides a more manual set of functions: find_file(), find_library(), find_path() and find_program(). Those functions try to find a file, a library, a path or a program (duh). They can be used as follows:

find_XXX(VARIABLE_FOR_RESULT "stuff-you-are-looking-for" locations-where-it-might-be)

For find_path() “stuff-you-are-looking-for” is a file inside the folder whose path you want.

For example, to look for a library called foo on a Unix system:

find_library(FOO_LIBRARY "foo" "/usr/lib" "/usr/local/lib")

Yes, this doesn’t work under Windows because I have no idea where Windows users put their stuff.

If what you are searching for isn’t found, the variable will be set to “<VAR>-NOTFOUND” (e.g. FOO_LIBRARY-NOTFOUND), which can be detected through an if(NOT VARIABLE). Note that users can override the value in the cache to “help” CMake find the required stuff.
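
As a small sketch of that check, reusing the foo example from above (the message texts and the example path in them are made up for illustration):

```cmake
find_library(FOO_LIBRARY "foo" "/usr/lib" "/usr/local/lib")

if(NOT FOO_LIBRARY)
    # FOO_LIBRARY is "FOO_LIBRARY-NOTFOUND" here, which if() treats as false
    message(STATUS "foo not found; pass -DFOO_LIBRARY=/path/to/libfoo.so to help CMake")
else()
    message(STATUS "Found foo: ${FOO_LIBRARY}")
endif()
```

Because the result is stored in the cache, a value passed with -D on the command line is kept and the search is skipped.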

The find_XXX() family of functions also has a much more advanced syntax with tons of options to control exactly where to look and in which order, in case you need more control.
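
To give an impression of the longer form, here is a sketch using a few of those options. The alternative name foo2, the FOO_PREFIX variable and the /opt/foo/lib path are assumptions for illustration only.

```cmake
find_library(FOO_LIBRARY
             NAMES foo foo2              # accept either library name
             HINTS "${FOO_PREFIX}/lib"   # searched before the system locations
             PATHS /opt/foo/lib          # searched after the system locations
             PATH_SUFFIXES foo           # also try the subdirectory foo/ of each location
             DOC "Path to the foo library")
```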

For convenience, you can also create a “fake” target that can be used as if the library was set up properly:

find_path(FOO_INCLUDE_DIR ...)
find_library(FOO_LIBRARY ...)

if(FOO_INCLUDE_DIR AND FOO_LIBRARY)
	add_library(foo INTERFACE)
	target_include_directories(foo INTERFACE ${FOO_INCLUDE_DIR})
	target_link_libraries(foo INTERFACE ${FOO_LIBRARY})
else()
	... # read on
endif()

An INTERFACE library is a library that doesn’t really exist, but you can set its INTERFACE properties, which will be passed on if someone links to the library.

Now, if you’ve found a pre-compiled binary and did something to ensure that it is the right version, you’re done. You can just use it.
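
How to ensure that it is the right version depends on the library. One possible sketch, assuming foo ships a hypothetical header foo/version.hpp containing a line like #define FOO_VERSION "1.42.0" (your dependency may expose its version differently, or not at all):

```cmake
# read the line that defines the version macro (hypothetical header and macro name)
file(STRINGS "${FOO_INCLUDE_DIR}/foo/version.hpp" foo_version_line
     REGEX "#define[ \t]+FOO_VERSION[ \t]+\"")

# extract the dotted version number from that line
string(REGEX MATCH "[0-9]+(\\.[0-9]+)*" FOO_VERSION "${foo_version_line}")

if(FOO_VERSION VERSION_LESS "1.42")
    message(STATUS "Installed foo ${FOO_VERSION} is too old, need at least 1.42")
    # ... treat it as not found
endif()
```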

Otherwise things get interesting.

Case 1: A header-only library

If you have a header-only library that isn’t installed on your system, you simply need to download the header files and make them available.

You can also avoid the search part in that case because using it is so straightforward. But I suggest that you still have something so the user can override it.

Step 1: Get the sources

Now you could just have the library bundled with your own sources, but I wouldn’t do that. You’re probably using Git or some other version control system. It should be used to manage your changes and not those of your dependencies. Polluting the diffs with noise coming from an update of an external library, where you’ve just copy&pasted the new release, feels wrong.

There is a better solution for Git though: git submodules. A submodule can be compared to a pointer to a commit in a different repository. The sources aren’t stored in your history, just a link to them. And if needed, the link will be dereferenced and you have the external library available in your working tree.

You can also download the sources with CMake’s file(DOWNLOAD) but this often runs into issues with company proxies or similar, while git should work.
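
For completeness, a file(DOWNLOAD) version could look roughly like this. The URL and the destination path are placeholders, not a real location.

```cmake
# hypothetical URL and destination, for illustration only
file(DOWNLOAD "https://example.com/foo/foo.hpp"
     "${CMAKE_CURRENT_BINARY_DIR}/external/foo/foo.hpp"
     STATUS foo_download_status)

# STATUS is a list of two elements: a numeric code (0 means success) and a message
list(GET foo_download_status 0 foo_download_code)
if(NOT foo_download_code EQUAL 0)
    list(GET foo_download_status 1 foo_download_message)
    message(FATAL_ERROR "Downloading foo failed: ${foo_download_message}")
endif()
```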

To create a new submodule run git submodule add <repository-url>. This will initialize the “pointer” to the head of the default branch of the repository. It will also clone it in your working directory, so I suggest doing it in a subdirectory named external or similar. The sources of a repository foo will then be available in external/foo just as if it was cloned normally.

But when a user clones your repository, the submodule will not be cloned (by default). It will be cloned once the user issues a git submodule update --init -- external/foo (with the example above). And this can be leveraged inside CMake:

# step 0
find_path(FOO_INCLUDE_DIR ...)

if((NOT FOO_INCLUDE_DIR) OR (NOT EXISTS "${FOO_INCLUDE_DIR}"))
    # we couldn't find the header files for FOO or they don't exist
    message("Unable to find foo")

    # we have a submodule setup for foo, assume it is under external/foo
    # now we need to clone this submodule
    execute_process(COMMAND git submodule update --init -- external/foo
                    WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR})

    # set FOO_INCLUDE_DIR properly
    # (FORCE is needed because find_path() already created the cache entry)
    set(FOO_INCLUDE_DIR ${CMAKE_CURRENT_SOURCE_DIR}/external/foo/path/to/include
        CACHE PATH "foo include directory" FORCE)

    # also install it
    install(DIRECTORY ${FOO_INCLUDE_DIR}/foo DESTINATION ${some_dest})

    # for convenience setup a target
    add_library(foo INTERFACE)
    target_include_directories(foo INTERFACE
                               $<BUILD_INTERFACE:${FOO_INCLUDE_DIR}>
                               $<INSTALL_INTERFACE:${some_dest}>)

    # need to export target as well
    install(TARGETS foo EXPORT my_export_set DESTINATION ${some_dest})
else()
    # see above, setup target as well
endif()

If we couldn’t find the dependency, we need to clone the submodule. This is done by execute_process() after outputting a message. After that is done, we have the sources and can set the include directory variable again.
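
The snippet above ignores the outcome of the git command. If you want configuration to stop when the clone fails (no network, git not installed), execute_process() can report the result. A sketch, where the variable name foo_submodule_result is my own choice:

```cmake
execute_process(COMMAND git submodule update --init -- external/foo
                WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}
                RESULT_VARIABLE foo_submodule_result)

# the result is the exit code of git, or an error string if it could not be run at all
if(NOT foo_submodule_result EQUAL 0)
    message(FATAL_ERROR "git submodule update failed with: ${foo_submodule_result}")
endif()
```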

Also note that we now need to install the headers as well, because they must be available for your installed target. For that we need to call install(DIRECTORY). Note that it keeps the last folder name, i.e. install(DIRECTORY /some/path) will put the folder path itself at the destination. That is why I’ve appended the hypothetical foo directory to the path (foo’s headers are thus under path/to/include/foo).

Finally, a convenience target is created as described in step 0. Note that we need the generator expressions when we set the include directories: when building the library the headers are in ${FOO_INCLUDE_DIR}, but once it is installed the headers are at the install destination.

We also need to export the target so that it can be re-created when included via find_package().
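
The install(TARGETS ... EXPORT my_export_set) call only adds the target to the export set; the file that re-creates it is written by a separate install(EXPORT) call. A minimal sketch, where the file name and the destination are assumptions you should adapt to your project:

```cmake
# writes and installs a CMake file that re-creates all targets of my_export_set
install(EXPORT my_export_set
        FILE my_library-targets.cmake
        DESTINATION lib/cmake/my_library)
```

The package configuration file of your library would then include() that file, so that find_package() makes foo available alongside your own targets.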

Step 2: … We’re done!

Assuming that we create the same target in the other case, where we’ve found the headers, we can use it like so:

target_link_libraries(my_target PUBLIC foo)

Case 2: A library that must be built by CMake

It is actually less work if the library isn’t header-only and has a “proper” CMake setup.

Step 1: Get the sources

Exactly like in the header-only case. Clone the submodule if a pre-compiled binary isn’t found.

Step 2: Build the library

Because the library uses CMake we can just use the add_subdirectory() command to make all the targets available:

if((NOT FOO_LIBRARY) OR ...)
    ...

    # build it
    add_subdirectory(external/foo)
else()
    ...
endif()

Thanks to the add_subdirectory() command the library will be built automatically by CMake and you have all the targets available. If the target is set up properly you only need to call target_link_libraries() again. Otherwise I suggest “amending” the target properties after the add_subdirectory() call.
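
What “amending” means depends on what the dependency got wrong. As one hypothetical example, assume foo’s CMakeLists.txt creates a target foo but forgets to publish its include directory, and that the headers live under external/foo/include:

```cmake
add_subdirectory(external/foo)

# hypothetical fix-up: add the include directory foo forgot to set on its target
target_include_directories(foo PUBLIC
    $<BUILD_INTERFACE:${CMAKE_CURRENT_SOURCE_DIR}/external/foo/include>)

target_link_libraries(my_target PUBLIC foo)
```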

One thing you unfortunately cannot amend is the installation. If your target gets installed via CMake, all targets it links to must be exported, so they can be re-created later on. You can only export targets in the same subdirectory where they’re created, however. So you might need to submit a PR and hope that it gets merged.

Case 3: A library that must be built by another buildsystem

This is the most work but it can be done in a seamless way as well. After fetching the sources like in the other cases, you also need to issue commands to build it.

But you can simply “fake” the commands a user would enter in order to build the library, as done with the git submodules. execute_process() runs a command at configure time (i.e. cmake -D... -G.. path/to/source), while add_custom_command() and add_custom_target() run a command at build time (i.e. cmake --build path/to/build).

Then you can also create a fake target to make integration very easy and hope that they’ll switch to CMake someday.
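
A rough sketch of the build-time variant, under several assumptions: foo is built by running a plain make in its source directory, that build produces libfoo.a right there, and the headers are under include/. None of this is specific to a real library.

```cmake
set(FOO_SOURCE_DIR ${CMAKE_CURRENT_SOURCE_DIR}/external/foo)
set(FOO_LIBRARY_FILE ${FOO_SOURCE_DIR}/libfoo.a)

# run foo's own buildsystem at build time, but only if libfoo.a doesn't exist yet
add_custom_command(OUTPUT ${FOO_LIBRARY_FILE}
                   COMMAND make
                   WORKING_DIRECTORY ${FOO_SOURCE_DIR}
                   COMMENT "Building foo with its own buildsystem")
add_custom_target(foo_build DEPENDS ${FOO_LIBRARY_FILE})

# the "fake" target wrapping the result
add_library(foo STATIC IMPORTED)
set_target_properties(foo PROPERTIES
                      IMPORTED_LOCATION ${FOO_LIBRARY_FILE}
                      INTERFACE_INCLUDE_DIRECTORIES ${FOO_SOURCE_DIR}/include)

# make sure foo is actually built before anything that links to it
add_dependencies(foo foo_build)
```

An IMPORTED target is used here instead of an INTERFACE one because it lets CMake know where the library file is; either works for target_link_libraries(my_target PUBLIC foo).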

Case 4: A library that takes really long to build

That’s the problematic case. The Case 2 and 3 solutions will build the dependency as well. But if the dependency is a huge project with looong build times, this might not be feasible.

Sometimes you’re lucky however and the dependency has a C API. Then you don’t have most of the ABI issues and can simply fetch a pre-compiled binary for your OS and compiler.

But sometimes you aren’t lucky. In this case you have to bite the bullet and require the user to have the dependency installed by themselves.

ahem - Boost - ahem.
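
In CMake terms, requiring the user to install it just means searching for it and giving up if the search fails. A sketch with the placeholder names bigdep and bigdep_target:

```cmake
# REQUIRED stops configuration with an error if the package cannot be found
find_package(bigdep REQUIRED)
target_link_libraries(my_target PUBLIC bigdep_target)
```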

Conclusion

The system I’ve presented here is quite simple to set up (provided that the dependencies are set up properly…) and is completely transparent to the user:

They just need to issue the normal three commands (git clone ..., cmake ... and cmake --build .); everything else is done by the build system. This makes CI in particular very easy.

I’ve used this kind of system in standardese; you can find the source here. If you haven’t read it already, I also recommend my installation tutorial.

This blog post was written for my old blog design and ported over. If there are any issues, please let me know.