Modern CMake in RetDec

CMake has become the de facto standard for C/C++ build process management. Unfortunately, using it is seldom straightforward, especially if you want to do it right (i.e. use modern CMake) in a non-trivial project. There is a constant stream of new features, standard practices keep evolving, loads of existing projects don’t adopt them, the documentation is convoluted, usage examples are scarce, and so on. All of this can make you bang your head against the table asking: What the hell am I supposed to do here?

Well, this article doesn’t give you an answer to that question. Instead, it shows you how we use CMake in a fairly complex project called RetDec. The article is meant as a practical example of modern CMake usage, not necessarily a guide on how you should do it. Pick and choose what works for you and your project.

Basics

Before continuing with our specific example, you might wanna have a look at more general resources on modern CMake usage:

Build system objectives

The RetDec project contains multiple standalone libraries, executables, scripts, and other components. As such, we would like our build system to support the following:

  1. Build and install only the selected components and their dependencies (#510).
  2. Make the selected components easily usable in other projects (#648).
  3. Test everything.

Project structure

The RetDec repository has the following basic structure:

retdec
├─── CMakeLists.txt
├─── cmake/
│    └─── CMake scripts
├─── deps/
│    └─── Dependencies
├─── doc/
│    └─── Documentation
├─── include/
│    └─── retdec/
│         └─── Components' public headers
├─── scripts/
│    └─── Helper scripts
├─── src/
│    └─── Components' sources
├─── support/
│    └─── Other resources
└─── tests/
     └─── Unit tests

There is one design decision that we have made that is quite significant to the subject of this article. RetDec builds all its dependencies by itself, apart from the most basic ones that are likely to be a part of a typical OS installation. There are two reasons for it:

  1. RetDec often uses modified forks of these projects.
  2. RetDec tries to minimize its dependencies so it doesn’t bother its users with stuff they need to install, and its users don’t bother the developers when they fail to do so.

Intelligent build component selection

If the user doesn’t want to build and install the entire project, they should be able to specify only the desired component(s). For example, the RetDec project contains a library named Fileformat which provides an object-file-format-agnostic representation of binary files. If the user is only interested in parsing such files, there is no point in building and installing the entire RetDec project with its many modules unrelated to the task. Component selection should be as easy as providing a list of required components at the configuration step.

There is, however, one fundamental problem with such a task: targets’ transitive dependencies. What we would like to achieve is to make all the unnecessary components invisible to CMake, but treat the selected components and their transitive dependencies as if nothing happened. Unfortunately, we haven’t been able to achieve this easily using built-in CMake features like the COMPONENT argument of install(). In the end, we have decided to use the following custom solution.

All the components get an option RETDEC_ENABLE_<component>. By default, all such options are off and option RETDEC_ENABLE_ALL (building the entire repository) is on. If the user sets one or more component options, RETDEC_ENABLE_ALL is disabled and all the transitive dependencies of set options are enabled. All across the project, we use a custom cond_add_subdirectory(dir condition) macro to add directories instead of the traditional add_subdirectory(). You guessed it, individual enable options are used as conditions in adding their associated components. This way, unselected components become truly invisible to CMake.
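The mechanism described above could be sketched roughly as follows. This is a minimal illustration, not the actual RetDec script: the component list is a hypothetical subset, RETDEC_ENABLE_ALL is modeled as a plain variable rather than an option, and the body of cond_add_subdirectory() is an assumption based on its description.

```cmake
# Hypothetical subset of components, for illustration only.
set(components FILEFORMAT SERDES COMMON UTILS)

# Every component gets its own option, off by default.
foreach(c IN LISTS components)
  option(RETDEC_ENABLE_${c} "Build the ${c} component." OFF)
endforeach()

# Build everything unless at least one component was selected explicitly.
set(RETDEC_ENABLE_ALL ON)
foreach(c IN LISTS components)
  if(RETDEC_ENABLE_${c})
    set(RETDEC_ENABLE_ALL OFF)
  endif()
endforeach()

# Add a directory only if the variable named by 'condition' is true.
# A macro expands ${condition} textually, so if() evaluates the variable.
macro(cond_add_subdirectory dir condition)
  if(${condition})
    add_subdirectory(${dir})
  endif()
endmacro()

# Usage: the fileformat/ directory is skipped entirely unless
# RETDEC_ENABLE_FILEFORMAT ends up enabled.
cond_add_subdirectory(fileformat RETDEC_ENABLE_FILEFORMAT)
```

In this sketch, RETDEC_ENABLE_ALL alone does not switch the individual options on; that propagation is a separate step.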

One downside, from the implementation point of view, is hidden in the statement: all the transitive dependencies of set options are enabled. The problem is that there is no automated way to derive these relations from the information already provided to CMake in the form of targets’ link libraries. In fact, the relations are kinda inverted. Whereas target_link_libraries(A B C) says: library A links to (i.e. depends on) libraries B and C, we need to say: enable libraries B and C if library A is enabled. Therefore, we must hardcode these inverted relations into our build system. Fortunately, the ugliness is confined to a single CMake script, the build system itself is regularly tested, and the solution does exactly what we want.

To demonstrate with an example, let’s look at the implementation for these dependent libraries.

                  ┌──────┐    ┌─────┐
            ┌─────┤common├────┤utils│
            │     └──────┘    └─────┘
┌──────┐    │
│serdes├────┤
└──────┘    │ 
            │    ┌─────────┐ 
            └────┤rapidjson│
                 └─────────┘

Their link dependencies are as follows:

add_library(whereami)
add_library(rapidjson)
add_library(utils)

add_library(common)
target_link_libraries(common utils)

add_library(serdes)
target_link_libraries(serdes common rapidjson)

Whereas their enable dependencies would be (the order matters!):

set_if_at_least_one_set(RETDEC_ENABLE_COMMON
		RETDEC_ENABLE_ALL
		RETDEC_ENABLE_SERDES)

set_if_at_least_one_set(RETDEC_ENABLE_RAPIDJSON
		RETDEC_ENABLE_SERDES)

set_if_at_least_one_set(RETDEC_ENABLE_UTILS
		RETDEC_ENABLE_ALL
		RETDEC_ENABLE_COMMON)

Now, if the user sets RETDEC_ENABLE_SERDES=ON, the options RETDEC_ENABLE_COMMON, RETDEC_ENABLE_RAPIDJSON, and RETDEC_ENABLE_UTILS are set as well. If no component option such as RETDEC_ENABLE_SERDES were set, RETDEC_ENABLE_ALL would make sure all the RetDec modules are enabled (rapidjson isn’t a native RetDec module, but a dependency). For a full implementation, see this CMake script.
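To make the semantics concrete, here is one way set_if_at_least_one_set() could be written. The body is an assumption inferred from how the helper is used above, not a copy of the RetDec implementation. The script can be run on its own with cmake -P.

```cmake
# Sketch (assumed implementation): turn 'var' ON in the caller's scope
# if at least one of the listed variables evaluates to true.
function(set_if_at_least_one_set var)
  foreach(dep IN LISTS ARGN)
    if(${dep})
      set(${var} ON PARENT_SCOPE)
      return()
    endif()
  endforeach()
endfunction()

# The user selected only serdes.
set(RETDEC_ENABLE_SERDES ON)

# COMMON must be resolved before UTILS, because UTILS is enabled
# through the already computed value of RETDEC_ENABLE_COMMON.
set_if_at_least_one_set(RETDEC_ENABLE_COMMON
    RETDEC_ENABLE_ALL
    RETDEC_ENABLE_SERDES)
set_if_at_least_one_set(RETDEC_ENABLE_UTILS
    RETDEC_ENABLE_ALL
    RETDEC_ENABLE_COMMON)

message(STATUS "common: ${RETDEC_ENABLE_COMMON}, utils: ${RETDEC_ENABLE_UTILS}")
```

Swapping the two calls would leave RETDEC_ENABLE_UTILS unset, because RETDEC_ENABLE_COMMON would not have been computed yet. That is why the order of the declarations matters.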

Modern component build & installation

Based on their CMake-script similarity, we can divide RetDec components into three distinct groups:

  • RetDec dependencies
  • RetDec libraries
  • RetDec executables

Although there are some special cases, CMake scripts within these groups are highly typified. We aim to use modern CMake features to build and install our targets in a way that makes them easily usable by other projects. The objective is to get a relocatable install directory that looks like this:

CMAKE_INSTALL_PREFIX
├─── bin/
│    └─── Executable binaries and scripts
├─── include/
│    └─── retdec/
│         └─── RetDec components' headers
├─── lib/
│    └─── RetDec components' static libraries
└─── share/
     └─── retdec/
          ├─── cmake/
          │    └─── RetDec components' CMake package files
          ├─── support/
          │    └─── Other RetDec data
          └─── LICENSE, README, etc.

It should be possible to take this directory, move it around the machine it was created on, or even to a different machine with a compatible environment (OS, system libraries, etc.), and use it from other CMake projects as easily as:

find_package(retdec 4.0 REQUIRED
    COMPONENTS 
        fileformat 
)
add_executable(fileformat-example 
    fileformat-example.cpp
)
target_link_libraries(fileformat-example 
    retdec::fileformat 
)

1. Dependencies

Dependencies are the most complicated of the three. They are a diverse bunch of projects. Some use CMake, others don’t, and even those that do often aren’t easy to work with. The goal here is to abstract all of them in a uniform way that makes them convenient to use from native RetDec components.

We do so by a combination of External Projects (we may move to Fetch Content in the future) and Interface Libraries. The External-project mechanism downloads and builds the 3rd-party project, and the interface library encapsulates it into a CMake target. A representative example is the Capstone component which we will use for further demonstration.

Download, build, and get results using External Project:

ExternalProject_Add(capstone-project
  # Download and build a 3rd party project.
)

# Get project's properties.
ExternalProject_Get_Property(capstone-project source_dir)
ExternalProject_Get_Property(capstone-project binary_dir)

# Use properties to get headers and libraries.
set(CAPSTONE_LIB      ...)
set(CAPSTONE_INCLUDES ...)
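For readers unfamiliar with External Projects, here is a generic sketch of the same pattern. Everything specific in it is hypothetical: the foo project, its placeholder URL, and the assumed layout (headers in include/, a static library named foo in the root of the build tree, a single-config generator). It does not show how RetDec builds Capstone. The property names passed to ExternalProject_Get_Property() correspond to the ExternalProject_Add() keywords SOURCE_DIR and BINARY_DIR.

```cmake
include(ExternalProject)

# Hypothetical third-party project; the URL is a placeholder.
ExternalProject_Add(foo-project
  URL             "https://example.com/foo-1.0.tar.gz"
  CMAKE_ARGS      -DCMAKE_BUILD_TYPE=Release
  # Skip the install step and use the artifacts from the build tree.
  INSTALL_COMMAND ""
)

# Each requested property is stored in a variable of the same name.
ExternalProject_Get_Property(foo-project source_dir binary_dir)

# Assumed layout of the hypothetical project.
set(FOO_INCLUDES "${source_dir}/include")
set(FOO_LIB
  "${binary_dir}/${CMAKE_STATIC_LIBRARY_PREFIX}foo${CMAKE_STATIC_LIBRARY_SUFFIX}"
)
```

Using CMAKE_STATIC_LIBRARY_PREFIX and CMAKE_STATIC_LIBRARY_SUFFIX keeps the library file name portable across platforms.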

Encapsulate the results into a retdec::deps::capstone target:

add_library(capstone INTERFACE)
add_library(retdec::deps::capstone ALIAS capstone)
add_dependencies(capstone capstone-project)

target_include_directories(capstone
  SYSTEM INTERFACE
    $<BUILD_INTERFACE:${CAPSTONE_INCLUDES}>
    $<INSTALL_INTERFACE:${INSTALL_DEPS_INCLUDE_DIR}>
)

target_link_libraries(capstone INTERFACE
    $<BUILD_INTERFACE:${CAPSTONE_LIB}>
    $<INSTALL_INTERFACE:retdec::deps::capstone-libs>
)

Note the use of Generator Expressions $<BUILD_INTERFACE:...> and $<INSTALL_INTERFACE:...>. The former specifies where the headers and libraries are at build time, the latter where they will be after installation. Simple enough? Well, not quite. As it often happens, there is a complication. While the target_include_directories() documentation says: Relative paths are allowed within the INSTALL_INTERFACE expression and are interpreted relative to the installation prefix (i.e. they are set properly in the context of the installation directory structure), target_link_libraries() makes no such statement (although generator expressions are allowed). This prevents us from simply writing something like this, which wouldn’t generate a correct path in retdec-capstone-targets.cmake:

$<INSTALL_INTERFACE:${INSTALL_LIB_DIR}/${CAPSTONE_LIB}>

Instead, we use a workaround. We say that the installed target links against retdec::deps::capstone-libs, and then create this new target in retdec-capstone-config.cmake which we need to provide anyway. CAPSTONE_LIB_INSTALLED contains an absolute path to the installed Capstone library. The configure_package_config_file() macro makes sure that the resulting path is relative and the whole package relocatable:

# Provides configure_package_config_file().
include(CMakePackageConfigHelpers)

configure_package_config_file(
  "retdec-capstone-config.cmake"
  "${CMAKE_CURRENT_BINARY_DIR}/retdec-capstone-config.cmake"
  INSTALL_DESTINATION ${INSTALL_CMAKE_DIR}
  PATH_VARS CAPSTONE_LIB_INSTALLED
)

retdec-capstone-config.cmake then looks like this:

@PACKAGE_INIT@

if(NOT TARGET retdec::deps::capstone-libs)
  add_library(retdec::deps::capstone-libs STATIC IMPORTED)
  set_target_properties(retdec::deps::capstone-libs PROPERTIES
    IMPORTED_LOCATION @PACKAGE_CAPSTONE_LIB_INSTALLED@
  )
endif()

if(NOT TARGET retdec::deps::capstone)
  include(${CMAKE_CURRENT_LIST_DIR}/retdec-capstone-targets.cmake)
endif()
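To see why this makes the package relocatable, it helps to look at what the configured file roughly contains after installation. The following is a simplified approximation, not actual generated output. It assumes the CMake files are installed to share/retdec/cmake (three levels below the prefix), and the library file name lib/libcapstone.a is a guess.

```cmake
# Simplified approximation of the code that @PACKAGE_INIT@ expands to:
# the install prefix is computed relative to this file's own location.
get_filename_component(PACKAGE_PREFIX_DIR
  "${CMAKE_CURRENT_LIST_DIR}/../../../" ABSOLUTE)

if(NOT TARGET retdec::deps::capstone-libs)
  add_library(retdec::deps::capstone-libs STATIC IMPORTED)
  set_target_properties(retdec::deps::capstone-libs PROPERTIES
    # @PACKAGE_CAPSTONE_LIB_INSTALLED@ becomes a path under the prefix
    # computed above (the file name here is assumed).
    IMPORTED_LOCATION "${PACKAGE_PREFIX_DIR}/lib/libcapstone.a"
  )
endif()
```

Because the prefix is derived from CMAKE_CURRENT_LIST_DIR at the time the config file is loaded, moving the whole install directory moves the resolved library path with it.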

All that is left to do is to actually install all the files:

# Install includes.
install(
  DIRECTORY   ${CAPSTONE_INCLUDES}
  DESTINATION ${INSTALL_DEPS_INCLUDE_DIR}
)
# Install libs.
install(
  FILES       ${CAPSTONE_LIB}
  DESTINATION ${INSTALL_LIB_DIR}
)
# Install target.
install(TARGETS capstone
  EXPORT capstone-targets
)
# Export target.
# This will generate retdec-capstone-targets.cmake
install(EXPORT capstone-targets
  FILE        "retdec-capstone-targets.cmake"
  NAMESPACE   retdec::deps::
  DESTINATION ${INSTALL_CMAKE_DIR}
)
# Install CMake files.
install(
  FILES       "${CMAKE_CURRENT_BINARY_DIR}/retdec-capstone-config.cmake"
  DESTINATION "${INSTALL_CMAKE_DIR}"
)

Inside the RetDec build system, other components can use the Capstone dependency as:

target_link_libraries(user
  retdec::deps::capstone
)

When installed, 3rd-party projects can use it like so:

find_package(retdec 4.0 REQUIRED
  COMPONENTS
    capstone
)
target_link_libraries(user
  retdec::deps::capstone
)

2. Libraries

Given the long discussion in the previous section, we already covered pretty much everything needed to build and install a RetDec library component. All the RetDec dependencies are libraries themselves and therefore the procedure for native RetDec libraries is mostly the same, just without the extra step of getting and building sources via the External Project mechanism. The process is further simplified by the fact that native components aren’t Interface targets and so we don’t need to manually install (and hack linking of) actual libraries. In fact, installing the target will install and set up the associated library as well. An example for the capstone2llvmir component (full source):

# Install libs.
install(TARGETS capstone2llvmir
  EXPORT capstone2llvmir-targets
  ARCHIVE DESTINATION ${INSTALL_LIB_DIR}
  LIBRARY DESTINATION ${INSTALL_LIB_DIR}
)

Unfortunately, this doesn’t work for an Interface target and so we couldn’t use it for RetDec’s dependency libraries.

A typical RetDec library uses other native and dependency libraries. For the build itself, this is of course expressed by target_link_libraries():

target_link_libraries(capstone2llvmir
  PUBLIC
    retdec::common
    retdec::deps::capstone
    retdec::deps::llvm
)

At installation, this dependency will be generated for the installed target as well (in target files such as retdec-capstone2llvmir-targets.cmake). It is however not enough to work out of the box. The information which components are needed is there, but the knowledge where these components are is missing. What we need to do is to find them. This happens in config files associated with targets (e.g. retdec-capstone2llvmir-config.cmake). There, we use find_package() to locate the target’s required components. These can be system libraries or other RetDec components. An example for the capstone2llvmir component:

if(NOT TARGET retdec::capstone2llvmir)
  find_package(retdec 4.0 REQUIRED
    COMPONENTS
      common
      capstone
      llvm
  )
  include(${CMAKE_CURRENT_LIST_DIR}/retdec-capstone2llvmir-targets.cmake)
endif()

Note: It is generally advised to use the find_dependency() macro instead of find_package() in this context. Regrettably, it turns out it doesn’t reliably work for packages with components in some older CMake versions.
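For comparison, this is roughly what the same config file would look like with find_dependency(). It is only a sketch of the generally advised alternative, not what RetDec ships, and whether it behaves correctly for component packages depends on the CMake version in use.

```cmake
include(CMakeFindDependencyMacro)

if(NOT TARGET retdec::capstone2llvmir)
  # find_dependency() forwards its arguments to find_package() and
  # passes on REQUIRED/QUIET from the outer find_package() call.
  find_dependency(retdec 4.0
    COMPONENTS
      common
      capstone
      llvm
  )
  include(${CMAKE_CURRENT_LIST_DIR}/retdec-capstone2llvmir-targets.cmake)
endif()
```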

The search procedure will locate the main RetDec Config file (retdec-config.cmake) installed to the same directory as other CMake scripts. It will, in turn, include the Config files of the required components, which will then find their dependencies as well as include components’ Target files containing the necessary usage information:

foreach(component ${retdec_FIND_COMPONENTS})
  include(${CMAKE_CURRENT_LIST_DIR}/retdec-${component}-config.cmake)
endforeach()
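The loop above assumes that every requested component is installed. A more defensive variant might look like the sketch below. It is an assumption about one possible hardening, not part of RetDec. It relies only on the standard variables that find_package() defines for config files (retdec_FIND_REQUIRED_<component>, retdec_FOUND, retdec_NOT_FOUND_MESSAGE); the helper variable name is made up.

```cmake
foreach(component ${retdec_FIND_COMPONENTS})
  # Hypothetical helper variable holding the component's config path.
  set(_retdec_component_config
      "${CMAKE_CURRENT_LIST_DIR}/retdec-${component}-config.cmake")

  if(EXISTS "${_retdec_component_config}")
    include("${_retdec_component_config}")
    set(retdec_${component}_FOUND TRUE)
  else()
    set(retdec_${component}_FOUND FALSE)
    # Fail the whole package only if the component was required.
    if(retdec_FIND_REQUIRED_${component})
      set(retdec_FOUND FALSE)
      set(retdec_NOT_FOUND_MESSAGE
          "RetDec component '${component}' is not installed.")
    endif()
  endif()
endforeach()
unset(_retdec_component_config)
```

With this variant, a missing required component produces a find_package() error that names the component, instead of a failing include().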

3. Executables

Compared to dependencies and native libraries, executable build scripts are truly straightforward. A representative example for the retdec-capstone2llvmir executable can be seen here.

Testing

There is considerable room for bugs in all this CMake madness. Like everything else in software development, even the build system should be tested. We of course test the complete RetDec build in our Continuous Integration. But what about all the different configurations that RETDEC_ENABLE_<component> options give us? And what about all this installation business? Surely a successful RetDec build doesn’t guarantee that its components will be usable after installation.

That is why we have created another repository specialized for build system testing. Coincidentally, it is also a useful demo on how to use the individual components.

It contains a module for every native RetDec component. These come in two flavors. Modules for executable components only enable the build of the said component and check whether it has succeeded. Modules for library components implement a short example using the said component. For example, capstone2llvmir-example uses RetDec’s capstone2llvmir library to translate a given assembly instruction into an LLVM IR sequence:

find_package(retdec 4.0 REQUIRED 
  COMPONENTS
    capstone2llvmir
)
add_executable(capstone2llvmir-example 
  capstone2llvmir-example.cpp
)
target_link_libraries(capstone2llvmir-example
  retdec::capstone2llvmir
)

Before every testing module is built, we use the retdec_install() macro to get, build, and install the necessary RetDec components. It uses the already discussed External Project mechanism. To make sure installation is relocatable and doesn’t rely on any build artifacts, we delete the build directory and move the install directory.
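The relocation step can be illustrated with a small standalone script. This is a hypothetical helper, not the retdec_install() macro itself; the script name and the three variables are invented for the example.

```cmake
# relocate-check.cmake (hypothetical helper). Run it as:
#   cmake -DBUILD_DIR=<dir> -DINSTALL_DIR=<dir> -DMOVED_DIR=<dir> \
#         -P relocate-check.cmake
foreach(var BUILD_DIR INSTALL_DIR MOVED_DIR)
  if(NOT DEFINED ${var})
    message(FATAL_ERROR "${var} must be defined")
  endif()
endforeach()

# Make sure nothing can be picked up from the build tree anymore.
file(REMOVE_RECURSE "${BUILD_DIR}")

# Move the installation to a different location.
file(RENAME "${INSTALL_DIR}" "${MOVED_DIR}")

message(STATUS "Installation moved to: ${MOVED_DIR}")
```

A test module would then be configured with CMAKE_PREFIX_PATH pointing at the moved directory. If its find_package(retdec ...) call and the subsequent build succeed, the installation neither depends on build artifacts nor on its original location.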

At the time of writing, over 40 components are being tested in this way. All of them fully fetch their RetDec instance and build the necessary subset of it. For many of them, this means building LLVM (our largest dependency) as well as a good share of other components. This makes the testing process quite a slow endeavor. But there is nothing like peace of mind brought by the knowledge that somewhere out there, there is a powerful server tirelessly compiling all the possible configurations of your project.

Conclusion

We’ve shown how RetDec uses modern CMake to build, install, and expose its components. We’ve also shown how we continuously test every aspect of our build system. Hopefully, this article along with the full RetDec sources will help you to implement your own perfect build system. Create an issue, or even better a pull request, on our GitHub repository if you have ideas for further improvements.

Disclaimer

CMake is evolving. So is RetDec. This article is not!

It is therefore quite possible that by the time you read this, some things in the RetDec repository will be different. Surely for the better, and hopefully not so much that the article becomes completely useless anytime soon. All the source code links used here point at a particular commit relevant to this article. The current master might differ.