Staged CI is a practice where a tight loop is used for the typical CI build and one or more additional loops are used to automate a more thorough determination of the code quality. The reason for the multiple loops is that you want to keep the CI loop tight to provide feedback to developers as quickly as possible. In the tight loop, you're willing to sacrifice accuracy of the quality determination for speed. But once you have the quick feedback, you can take a little more time to get more detailed feedback from the additional loops. The end result is that for each project, you have multiple build loops in your system, one loop for each stage. Each stage is progressively more thorough and thus includes longer sets of tests.
The nice thing about this approach is that it makes it easy to deal with limited hardware and test resources. Since each stage runs in a loop, there's never more than a single instance of a stage running at any one time. If that weren't the case and you could have multiple instances of a stage running at any one time, then you'd need a scheduler that knows about the available hardware resources and manages the allocation of hardware to stage instances. Furthermore, there would need to be a way to balance the pace of stage instance creation against the throughput of the available hardware. One common mechanism for doing this is request coalescing, in which requests that pile up while a build is running are merged into a single follow-up build. But the point is that by keeping each stage in a loop, you can avoid a lot of these complexities.
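To make the single-instance property concrete, here is a minimal sketch of one stage loop in Python. It is an illustration under stated assumptions, not the implementation of any particular CI server: the StageLoop class, its submit and run_once methods, and the commit identifiers are all hypothetical names. The idea is that commits arriving while the stage is busy simply queue up, and the next iteration of the loop picks up all of them in one run.

```python
from collections import deque


class StageLoop:
    """One stage that runs at most one build at a time (hypothetical sketch)."""

    def __init__(self, name):
        self.name = name
        self.pending = deque()  # commits that arrived since the last run started
        self.runs = []          # history: one list of commits per completed run

    def submit(self, commit_id):
        """Record a new commit; it waits for the next iteration of the loop."""
        self.pending.append(commit_id)

    def run_once(self):
        """Run one iteration, covering every pending commit in a single build."""
        if not self.pending:
            return None         # nothing new, so the loop idles
        batch = list(self.pending)
        self.pending.clear()
        self.runs.append(batch)  # a real system would build and test here
        return batch


# Usage: three commits arrive before the stage is free, then one more.
loop = StageLoop("long-tests")
for commit in ["c1", "c2", "c3"]:
    loop.submit(commit)
first = loop.run_once()    # one run covers c1, c2, and c3
loop.submit("c4")
second = loop.run_once()   # the next iteration covers only c4
idle = loop.run_once()     # no pending commits, so nothing runs
```

Because the loop itself serializes the runs, no scheduler is needed to decide how many instances of the stage the hardware can support; the cost is that a failing run may cover several commits at once.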
In a Staged CI setup, it is typical to have a tight CI loop that takes 15 minutes or less to run, followed by a longer loop that takes an hour or two to run, followed by a nightly build that can take five or more hours to run; see Figure 1. The tight CI loop runs quite often, as indicated by the runs CI1, CI2...CI6. The second and fourth runs of the CI loop failed, as indicated by the red color. The longer loop that includes long-running tests is depicted by runs L1 and L2. Notice that this longer loop runs concurrently with the tight CI loop. The entire system being used for CI and the other loops may be made up of multiple machines that include a central server and many agent machines. All the heavy work is typically performed on the agent machines so that the system scales horizontally. The nightly build (N1) takes the longest to run.
Staged CI is not really any different from nightly builds, although this depends on the reason for the nightly build and on what happens during the nightly build. If the reason for the nightly build is simply that the team does not see any value in having a tighter feedback loop, then there are some differences. But if the reason for the nightly build is to let the build run over several hours doing extensive testing of the code base to arrive at a very detailed quality determination, then the nightly build becomes an example of Staged CI. Rather than running in a continuous loop, the nightly build runs once every 24 hours, at night. Typically, the decision to run at night is related to the usage of resources. Presumably, if the nightly build were to run during the day, it would require access to some resources that are unavailable or being used for other purposes during the day.
Staged CI makes use of multiple build types. Let's take a look at what we mean by build types, then look at why Staged CI uses them.
To understand build types, you need to understand that most of the time when we use the term "build" we're not exact in what we mean. Usually, when we talk about a "build" we are actually talking about something more than just a build. For example, when we talk about a "continuous integration build," we're talking about a process that extracts source code from the source-code management system (SCM), compiles it, packages it, and then runs some tests on the resulting artifacts. In contrast, a nightly build may extract source code from the SCM, compile it, package it, then deploy the artifacts to a QA server and run functional tests. The CI build and the nightly build are just two examples of build types.
The defining feature of a build type is that it is a combination of multiple processes. Of those multiple processes, one is a build process and the rest are secondary processes, of which there is at least one. So what do we mean by a build process? A build process takes source code, dependencies, environment settings, and configuration as input, and transforms them into output. The typical output of the build process is made up of artifacts (typically a compiled binary), log files, and reports. The transformation of the input into the output typically involves compilation and packaging. However, this varies with the technology being used: languages that compile to native code include a linking step, whereas scripting languages don't have a compilation step.
Let's take a look at a CI build type and a nightly build type in light of this definition of the build process. Recall that a CI build type extracts source code from the SCM, compiles it, packages it, and then runs some tests on the resulting artifacts. I can now restate that: the CI build type extracts source code from the SCM, performs a build, then runs some tests on the resulting artifacts. And the nightly build type extracts source code from the SCM, performs a build, then deploys the artifacts to a QA server and runs functional tests on them (Figure 2). Each build type is a combination of a build process along with one or more additional (secondary) processes.
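The definition above can be sketched as a small data structure. This is only an illustration, written in Python; the BuildType class, the BUILD constant, and the step names are hypothetical labels chosen for this example rather than terms from any specific tool. Following the description above, the source-code checkout is treated here as one of the secondary processes. The sketch enforces the two parts of the definition: exactly one build process, plus at least one secondary process.

```python
from dataclasses import dataclass
from typing import List

BUILD = "build"  # the build process: compilation and packaging


@dataclass(frozen=True)
class BuildType:
    """A build process combined with one or more secondary processes."""
    name: str
    steps: List[str]  # ordered process names; the names are illustrative only

    def __post_init__(self):
        if self.steps.count(BUILD) != 1:
            raise ValueError("a build type contains exactly one build process")
        if len(self.steps) < 2:
            raise ValueError("a build type needs at least one secondary process")

    @property
    def secondary(self) -> List[str]:
        """Every process in the build type other than the build itself."""
        return [step for step in self.steps if step != BUILD]


# The two build types described in the text, with assumed step names.
ci = BuildType("ci", ["checkout", BUILD, "tests"])
nightly = BuildType(
    "nightly", ["checkout", BUILD, "deploy-to-qa", "functional-tests"]
)
```

Both build types share the same build process and differ only in their secondary processes, which is the distinction the two definitions draw.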
One of the defining properties of Staged CI is that each loop (or stage) is a different build type. This means that each stage builds the source code in addition to running one or more secondary processes. This may seem like a natural thing and you may wonder why it is worth pointing out. The reason is that this is very different from the two approaches I address next.