C++#
This section has notes on working on C++ projects, where you will typically interact with the following tooling:
a compiler (and linker)
a build configuration tool
a package manager
Compilers#
How They Work#
C++ is a compiled language, meaning its source files are transformed into executable machine code ahead of time, before the program is launched.
When building a smaller C++ project, the project source files (often with a .cpp or .cc extension), with the support of header (.h) files, are converted to machine code, which is stored in a binary ‘executable’ file. On Unix systems this binary file is often in the Executable and Linkable Format (ELF).
C++ programs typically depend on functionality in external libraries, including the C++ Standard Library. When compiling C++ code the headers from the external library are used to give information on interfaces, types and their sizes that will be consumed from the library. When linking C++, to produce a binary, either the binary code from the external library is copied into the binary being produced (static linking) or a symbol name and path to the external library are copied (shared linking) into the binary - so the executable code can be found at runtime.
In larger C++ projects compiled C++ code is typically stored in one object file (.o) per translation unit (often a .cpp file) as an intermediate build product. If a library is being produced to allow static linking then the object files are collected together in an archive (.a file), which is also known as a static library. If a library is being produced to allow shared linking then the object files are linked together, with duplication removed, into a shared library file (.so).
C++ programs can thus be made up of an executable containing code pulled from an external static library, code from its own compiled source files and symbols and paths to code in external shared libraries. The generation of the program is handled by the compiler and linker, which are often exposed via a single program.
There are many ways to go from C++ source code to executable machine code. Modern compilers can optimize the code during machine code generation, for example by unrolling loops or avoiding copies of objects. This comes with the trade-off of longer compile times and the code execution deviating from how it appears verbatim in the source file. Compilers offer different optimization levels to give the user control over this. In addition many other compiler options are available, to control memory safety and security, mathematical operations and code introspection.
When machine code is added to produced binaries some information from the original source is lost, since it is not needed for the execution and would make the binary larger. However this information may be useful for troubleshooting or debugging a program, so you can check what line an issue happens on, or what the value of a variable is. This is known as ‘debug information’ and can be kept or added to produced binaries through a compiler flag.
Available Compilers#
The two most commonly used C++ compilers on Linux are g++ from the GNU Compiler Collection (GCC) and clang++ from the LLVM project. Often the compiler is also symlinked on Linux systems as c++.
The architecture of the clang compiler differs greatly from that of GCC: clang acts as a front end that translates C++ into an intermediate representation (IR) for the Low Level Virtual Machine (LLVM), and LLVM then generates machine code for specific architectures. Clang is also more extensible than GCC, allowing for a wide variety of tools and plugins to be used with it. It can be useful to be familiar with both the GCC and clang compilers.
Basic Use#
To compile some C++ code with (for example) GCC you can do:
g++ my_app.cpp -o my_app
which will compile my_app.cpp and link it into my_app, which can then be run as an executable. On most systems the compiler will automatically find standard library headers and link with the standard library as needed.
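Often you will want to extend your application by linking to additional libraries. This is done by adding the directories that contain the library headers with the -I flag, adding the directories that contain the library binaries with the -L flag, and naming each library to link with the -l flag (for example -lmylib to link against libmylib.so or libmylib.a).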
You will often want your users to be able to easily build your application with their chosen compiler and build settings. It would be tedious for them to remember ideal settings and include and library paths each time. For this reason most C++ projects include this information when they are distributed, in a format understood by various build tools.
System Setup#
On macOS, Apple ship their own version of clang by default. They also alias it to g++/gcc, meaning calling them actually runs clang. This version of clang can have integration issues with tooling and may be outdated relative to your needs. To use an up-to-date vanilla clang you can do:
brew install llvm
then add the following to your ~/.zshrc and refresh your shell:
export HOMEBREW_PREFIX="$(brew --prefix)" # find homebrew install location
export LLVM_PATH=$HOMEBREW_PREFIX/opt/llvm
export PATH=$LLVM_PATH/bin:$PATH # add LLVM bin to system path
export LDFLAGS="-L$LLVM_PATH/lib/c++ -Wl,-rpath,$LLVM_PATH/lib/c++ $LDFLAGS" # flags are separated by spaces, not colons
export CPPFLAGS="-I$LLVM_PATH/include $CPPFLAGS"
export CC=$LLVM_PATH/bin/clang # Set the default compilers to clang
export CXX=$LLVM_PATH/bin/clang++
Build Tools#
A build tool is an application that takes some project configuration from a text file and a collection of user preferences and runs the compiler and linker (possibly multiple times) with suitable inputs to produce desired build outputs, such as an application executable or library.
make is an early and widely used build tool. Project and user configuration are specified in a makefile, with outputs defined as a collection of targets. Targets can depend on each other. Running the make command in a directory with a makefile will lead to the execution of any steps needed to create the targets in a suitable order. When building programs, which is make’s primary purpose, targets are usually executables or shared or static libraries. make can also handle installation - running make install will install the built output into a pre-specified location. You can run make in parallel with the -j flag, e.g. -j 4 to run up to four jobs at once.
Different build tools with similar functionality to make exist, including ninja, which can give better parallel performance and more flexible handling of different build configurations.
Even though a makefile can include project and user configuration, the syntax can become quite verbose and build options can vary a lot across machines and platforms. As a result, another layer of tooling above them has been developed - which can be regarded as ‘generators’ of makefiles and similar. The most common are:
GNU Autotools - which will often involve running a configure script
CMake - which is launched with a cmake binary using a CMakeLists.txt input
meson
These tools generate a makefile or ninja specification based on project and user configurations, often in a more user-friendly way than ‘raw’ make, with make or ninja still driving the build underneath. When working with C++ projects you are likely to encounter cmake and autotools at least.
CMake#
CMake is a meta-build tool (it generates output for use by other build tools) that is commonly used to build C++ projects. It is a cross-platform tool, which can also generate build configurations on Windows, Mac and mobile platforms.
C++ projects that support cmake will have a CMakeLists.txt file near the top of their project tree. It is typical to do an ‘out of source’ build with CMake, which involves creating a build directory outside of the source directory and then pointing back to the source directory containing the CMakeLists.txt file:
mkdir build
cd build
cmake ../src
On Linux a CMake project configuration will typically support at least make, so after running the cmake binary you can build the project by running make in the build directory.
You will often end up running CMake to build projects from source, either because you don’t have admin privileges or because you want to instrument or modify the code in some way. In these cases you don’t want to install the resulting build outputs into system locations. You can avoid this by creating an install directory:
mkdir install
and setting the CMake variable CMAKE_INSTALL_PREFIX to that directory. You can set CMake variables with the -D option when calling cmake:
cmake -DCMAKE_INSTALL_PREFIX=/path/to/install ../src
or using the visual ccmake tool - which opens a text-based UI (TUI) in the terminal.
GNU Autotools#
GNU Autotools, or the GNU Build System, is a set of tools for building portable C (and C++) applications. It consists of the autoconf, automake and libtool tools, which are used with make, pkg-config and GCC.
A project based on GNU Autotools will either come with a configure script in its repo, or a configure.ac file. The latter can be used to generate a configure script by running autoconf on it (or autoreconf --install, which also runs automake and the other tools as needed).
Running the configure script will generate the project makefiles, which can be run with make. The configure script takes options; a useful one is --prefix, which can be used to change the installation prefix for the make install command.
Package Managers#
Unlike PyPI and pip in Python, C++ doesn’t have established or centralized build dependency management. CMake provides some dependency management functionality through its ExternalProject and FetchContent modules. Another popular dependency management tool is conan - which has two distinct major versions ‘1’ and ‘2’. You can install conan on Mac with brew:
brew install conan
Conan 1 can be a bit slow to update its supported compiler versions. If you are using clang from brew as documented above you may need to modify your conan config in ~/.conan/settings.yml to add your compiler version to the clang section.