Build System Problems
Okay, build systems are the greatest thing since Microsoft Bob. However, they still don't always live up to their potential:
- They Don't Do Enough (Not Fully Automated). This is one of the most common problems. A build system that is not fully automated can compile the software, create documentation, and package the final binary, but it requires a lot of user intervention to run various scripts, wait for previous stages to finish, check error reports, and so on.
- Requires a Lot of Discipline to Use Properly. Some build systems fail inexplicably if you don't follow a slew of obscure steps, like logging into the test server with a specific user, removing directory A, renaming directory B, and performing step X only if the report generated by step Y says okay.
- Requires Too Much Configuration. Some build systems are very powerful and flexible, but are almost unusable due to excessive configuration. You have to define six different environment variables, modify three local config files, and pass eight different command-line options to the main build script. The end result is that 99% of the users use a single default configuration that probably doesn't fit their needs.
- Caters Mainly To a Sole Stakeholder. Another common problem is that a build system is often suitable for just one kind of stakeholder. For example, if the build system was developed mainly by the programmers who compile, link, and unit test all day, then it will have good support for these activities, but running integration tests or generating documentation may be poorly supported, if at all. On the other hand, if the build system was developed mainly by a release engineering team, then it will have good support for packaging final executables and will generate good reports about the percentage of passing tests, but it may not be possible for developers to run just a single unit test and its dependencies. They will either have to run the full-fledged build every time or hack the build system in a quick and dirty way (which might lead to errors).
- Intractable Error Messages When Something Is Wrong. Build systems perform many activities that involve external tools. The errors generated by these tools are often swallowed by the build system, which much later generates its own error message that doesn't point to the root cause. This is a serious problem that hurts productivity and causes people to revert to manual but understandable build practices.
- Inextensible and Undebuggable Franken-Code. Build systems are often one of the earliest tools created at project initiation. The requirements of this early build system are usually minimal. As time goes by and the project grows, the demands from the build system grow too. Since the build system is an internal tool, less effort is dedicated to making it high quality code. More often than not, it is just a bunch of scripts slapped together and extended to support additional requirements by the tried and true practice of copy and paste. Such build systems quickly become a maintenance nightmare and can't be extended easily to accommodate new requirements.
- Not Integrated With Developer's IDE. Most build systems that don't come with an IDE built-in don't support IDEs. They are command-line based only and if a developer wants to work in an IDE, the IDE project files must be maintained and synchronized with the build system build files. For example, the build system may be Makefile-based, and a developer that uses Visual Studio has to maintain a .vcproj file for each project, and any additional files must be added to the Makefile as well.
The Perfect Build System
The build system I present in this series is open ended and can be used to automate any software process that is mainly file-based. However, the focus is on a cross-platform build system for large-scale C++ projects because these are often the most complicated to build. The perfect build system solves or minimizes the problems associated with existing build systems.
"Convention over configuration" is a principle that has proven successful in domains like web frameworks, reducing the learning curve and increasing developer productivity. It demands that you organize your project in a consistent way (which is good practice in any event):
- Regular directory structure. This is the key principle on which the entire build system rests. Even in the most complicated systems, there is usually a relatively small high-level directory structure that contains a potentially huge number of similar directories. For example, a project may have a libs directory that contains all the C++ static libraries. The contents of the libs directory may grow and change, but it always contains a single type of entity.
- Well-known locations. The build system should be aware of the location and names of the top-level directories and "understand" what they mean. For example, it should know that the directories under libs generate static libraries that should later be linked into executables and dynamic libraries that depend on them.
- Automatic discovery of files based on extension. Each directory usually contains a small number of file types. Again, in the libs example, it should contain .h and .c/.cpp files and potentially a couple of other metadata files. The build system should know what files to expect and how to handle each file type. Once you have the regular directory structure in place, the build system "knows" a lot about your system and can do many tasks on your behalf automatically. In particular, it doesn't need a build file in each directory that tells it what files are in it, how to build them, etc.
- Capitalize on the small variety of sub-project types. In the C/C++ world, there are really only three types of subprojects: a static library, a dynamic library, and an executable. Static libraries (a compiled set of files bundled together) are the simplest. They are later linked into dynamic libraries and executables. Dynamic libraries and executables are similar from a build point of view. They both have source files and depend on precompiled static libraries to link against. It is important to build the dependent dynamic libraries and executables after building all the required static libraries. Many libraries (both static and dynamic) and executables use the same set of compiler and linker flags. Placing these groups under a parent directory informs the build system of these common flags and automatically builds all the subprojects.
- Generate build files from templates for any IDE. Different IDEs, as well as command-line based tools like Make, use different build files to represent the meta information needed to build the software. The build system I present here maintains the same information via its inherent knowledge combined with the regular directory structure and can generate build files for any other build system by populating the appropriate templates. This approach lets developers build the software via their favorite IDE (like Visual Studio) without the hassle involved in adding files, setting dependencies, and specifying compiler and linker flags.
- Automatic dependency management based on #include analysis. Managing dependencies can be simple or complicated depending on the project. In any case, missing a dependency leads to linking errors that are often hard to resolve. This build system analyzes the #include statements in the source files and recursively creates a complete dependency tree. The dependency tree determines what static libraries a dynamic library or executable needs to link against.
- Automatic discovery of added/removed/renamed files and directories. The regular directory structure, combined with knowledge of file types (e.g., .cpp or .h files), allows the build system to figure out what files it needs to take into account, so developers just need to make sure the right files are in the right directory.
- Support static libraries, dynamic libraries, executables, and custom artifacts. All possible build artifacts are supported including custom ones like code generators, preprocessors, and documentation generators. The ability to put similar files and subprojects under top-level directories in the regular directory structure is open to any subproject type.
- Control the level of error messages. The build system is designed to support different users, such as QA, developers, and managers. Each type of user may be interested in different error messages.
- Generate custom artifacts like language bindings. The build system is focused on building C/C++ code, but using the same practices and mechanisms it is possible to extend it to support additional artifacts, while maintaining all the existing benefits.
- Allow overriding defaults. While the build system is intended to provide a hands-free experience, where all the necessary build information is derived automatically from the directory structure, it is possible to override it for special purposes, such as a single library that needs different flags.
Integrated Build System
- Build phases are executed from the same program. The build system is a cohesive program that operates on a set of templates and source files. This one-stop shop approach is very powerful for keeping the build process manageable.
- Invoke external programs as a last resort. Ideally, the build system contains the entire logic of each build step. External programs are invoked only when the effort to implement the logic in the build system itself is deemed too costly. For example, the compiler and linker are invoked as external programs.
- Full debugging of the build system. The fact that the build system is a single program allows users to debug the build process in real time, including setting breakpoints, viewing the current state, and finding build system bugs as they happen. This is very different from standard declarative build files that usually only provide obscure error messages at a much later stage.
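
To make the convention-based ideas above more concrete, here is a minimal Python sketch of two of them: discovering subprojects from a regular directory structure and deriving link dependencies from #include analysis. It is an illustration under stated assumptions, not the implementation presented in this series. The top-level directory names other than libs (dlls, apps), all function names, and the rule that headers are included by a path relative to the project root (such as `#include "libs/net/net.h"`) are hypothetical choices made for the example.

```python
import re
from pathlib import Path

# Assumed convention (hypothetical names): each well-known top-level
# directory holds subprojects of exactly one type.
PROJECT_KINDS = {"libs": "static_lib", "dlls": "dynamic_lib", "apps": "executable"}
SOURCE_EXTS = {".c", ".cpp"}
HEADER_EXTS = {".h", ".hpp"}

# Matches quoted includes only, e.g. #include "libs/net/net.h"
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*"([^"]+)"', re.MULTILINE)


def discover_projects(root):
    """Find every subproject by location and classify its files by extension."""
    root = Path(root)
    projects = {}
    for top, kind in PROJECT_KINDS.items():
        top_dir = root / top
        if not top_dir.is_dir():
            continue
        for d in sorted(p for p in top_dir.iterdir() if p.is_dir()):
            files = sorted(f for f in d.iterdir() if f.is_file())
            projects[f"{top}/{d.name}"] = {
                "kind": kind,
                "sources": [f for f in files if f.suffix in SOURCE_EXTS],
                "headers": [f for f in files if f.suffix in HEADER_EXTS],
            }
    return projects


def direct_lib_deps(project_name, info):
    """Static libraries named by includes of the form "libs/<name>/..."."""
    deps = set()
    for f in info["sources"] + info["headers"]:
        for inc in INCLUDE_RE.findall(f.read_text(errors="ignore")):
            parts = inc.split("/")
            if len(parts) >= 3 and parts[0] == "libs":
                dep = f"libs/{parts[1]}"
                if dep != project_name:
                    deps.add(dep)
    return deps


def link_deps(project_name, projects):
    """Transitive set of static libraries a project must link against."""
    seen = set()
    stack = [project_name]
    while stack:
        current = stack.pop()
        for dep in direct_lib_deps(current, projects[current]):
            # Skip unknown libraries, already-visited ones, and cycles
            # that lead back to the starting project.
            if dep in projects and dep not in seen and dep != project_name:
                seen.add(dep)
                stack.append(dep)
    return sorted(seen)
```

With this in place, a call such as `link_deps("apps/server", discover_projects("."))` would return the static libraries that a hypothetical apps/server executable needs, without any per-directory build file. Because everything runs inside one ordinary program, each step can also be inspected in a debugger, which is the point of the integrated approach. A real system would additionally have to handle angle-bracket includes, include paths, and conditional compilation, all of which this sketch ignores.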