Extending Continuous Integration Into ALM

Traditional Continuous Integration has been constrained so that it provides only a partial picture of software quality. Maciej suggests alternatives.


September 24, 2008
URL:http://www.drdobbs.com/jvm/extending-continuous-integration-into-al/210603696

Maciej is president of Urbancode. He can be contacted at [email protected].


Continuous integration (CI), the practice made popular by Agile methodologies, has seen tremendous adoption by development teams in the last few years. This wildfire-like spread of the practice has been the result of a common-sense approach to automation and information sharing. But traditional CI has been constrained so that it is localized within the lifecycle and provides only a partial picture of software quality. In this article, I examine the reason for these constraints and suggest approaches for working beyond them.

Continuous Integration Foundation

Continuous integration is a practice made up of two components:

- Integrating developers' work frequently.
- Ensuring that each integration does not degrade code quality.

The practice is rooted in the observation that the longer developers go without integrating their work, the more painful the eventual integration. However, frequent integration is only one part of continuous integration and is not sufficient on its own to constitute the practice. To practice continuous integration, the second part of our definition must be met: Each integration should not degrade code quality. Thus, critical to the implementation of continuous integration is being able to determine code quality. This is typically accomplished via testing; determining quality is not black and white, but rather comes in shades of grey. The more tests we run, the more accurate our determination of the code quality will be.

There is a force that counterbalances the desire for a clearer picture of code quality. Within continuous integration, it is important to determine quickly whether quality has been degraded. Consider what must happen when we find that the latest integration decreased the code quality—the offending code change must be either backed out or corrected. In either case, we're talking about another integration and the longer we delay that integration, the more painful it is.

So we find that a balance is needed when practicing continuous integration. On the one hand, the more tests we run, the more accurate our determination of code quality will be. On the other hand, more tests mean longer test times and a longer wait before correcting any offending code change. Typically, the balance is struck by running unit (or fast running) tests as part of the continuous integration build process. This leaves a lot of testing on the table (functional tests, performance tests, regression tests, integration tests, and the list goes on). Can anything be done with these remaining tests?

What About the Remaining Tests?

Any tests not performed as part of the CI loop are typically performed manually or with the aid of scripts or tools. Even when scripts or tools are used, they usually perform only a portion of the work and have to be coordinated manually, leading to a semi-automated process. The end result is that there is a long delay between the time the code changes enter the additional testing stage (beyond CI) and the time that the results of those tests are available and acted upon. Presumably, the closer in time the discovery of a bug is to its introduction, the smaller the effort and cost required to fix it. Thus, there is value in reducing the feedback loop in the tests beyond continuous integration.

If we didn't have to worry about keeping the CI loop so tight, we could include additional tests as part of the CI process. Continuous integration has the promise of providing the automation framework that is needed to decrease the turnaround time on the longer running tests. More importantly, many teams that have started out with continuous integration and a fast CI loop have implemented automation of additional tests to provide a more thorough view of the quality of their code base. Running these additional tests is not as fast as running tests used in the tight CI loop and does take a substantial amount of time. The turnaround time can range anywhere from two hours to more than eight hours. In addition to running slow unit tests, some teams automate functional tests (which require that the application be deployed into a test environment), integration tests, system tests, and more. The possibilities to a large extent depend on the approach taken toward the automation and the model for automating these processes outside the tight CI loop.

In the rest of this article, I present some alternative ways in which this can be accomplished. I will cover a build-centric approach in the section about Staged CI, a process-centric approach in the section about Chaining Processes, and a lifecycle-centric approach in the section about Build and Release Pipelines.

Staged CI

Staged CI is a practice where a tight loop is used for the typical CI build and one or more additional loops are used to automate a more thorough determination of the code quality. The reason for the multiple loops is that you want to keep the CI loop tight to provide feedback to developers as quickly as possible. In the tight loop, you're willing to sacrifice accuracy of the quality determination for speed. But once you have the quick feedback, you can take a little more time to get more detailed feedback from the additional loops. The end result is that for each project, you have multiple build loops in your system, one loop for each stage. Each stage is progressively more thorough and thus includes longer sets of tests.
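
To make this concrete, here is a minimal sketch in Java of how two such loops might be driven for a single project. The stage contents and intervals are hypothetical; the point is that each stage runs on its own cadence, and scheduleWithFixedDelay starts the next run only after the previous one finishes, so runs of the same stage never overlap.

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class StagedLoops {
    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(2);

        // Tight CI stage: fast feedback, unit tests only.
        Runnable tightCiLoop = () ->
            System.out.println("checkout, build, run unit tests");

        // Second stage: slower, more thorough determination of quality.
        Runnable longLoop = () ->
            System.out.println("checkout, build, run functional and integration tests");

        // Each stage runs in its own loop; the next run of a stage starts only
        // after the previous run of that stage has completed.
        scheduler.scheduleWithFixedDelay(tightCiLoop, 0, 15, TimeUnit.MINUTES);
        scheduler.scheduleWithFixedDelay(longLoop, 0, 2, TimeUnit.HOURS);
    }
}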

The nice thing about this approach is that it makes it easy to deal with limited hardware and test resources. Since each stage runs in a loop, there's never more than a single instance of a stage running at any one time. If that weren't the case and you could have multiple instances of a stage running at any one time, then you'd need to have a scheduler that has knowledge about available hardware resources and manages the allocation of hardware to stage instances. Furthermore, there would need to be a way to balance the pace of stage instance creation to the throughput of the available hardware. One common mechanism to do this is request coalescing. But the point is that by keeping each stage in a loop, you can avoid a lot of these complexities.
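
For illustration, the following sketch (with hypothetical class and method names) shows one way request coalescing might look: a stage keeps at most one pending request, and a newer request simply supersedes an older one that has not started yet.

import java.util.concurrent.atomic.AtomicReference;

public class CoalescingStageTrigger {
    // At most one pending request; a newer request replaces an older one.
    private final AtomicReference<String> pendingBuild = new AtomicReference<>();

    // Called whenever an upstream loop wants this stage to run against a build.
    public void request(String buildId) {
        pendingBuild.set(buildId); // coalesce: only the latest request survives
    }

    // Called by the stage's own loop when it is idle and ready for more work.
    public String takeNext() {
        return pendingBuild.getAndSet(null); // null means nothing is pending
    }

    public static void main(String[] args) {
        CoalescingStageTrigger trigger = new CoalescingStageTrigger();
        trigger.request("build-41");
        trigger.request("build-42");            // build-41 is superseded
        System.out.println(trigger.takeNext()); // prints build-42
        System.out.println(trigger.takeNext()); // prints null; nothing pending
    }
}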

In a Staged CI type of setup, it is typical to have a tight CI loop that takes 15 minutes or less to run, followed by a longer loop that takes an hour or two to run, followed by a nightly build that can take five or more hours to run; see Figure 1. The tight CI loop runs quite often as indicated by the runs CI1, CI2...CI6. The second and fourth runs of the CI loop failed as indicated by the red color. The longer loop that includes long-running tests is depicted by runs L1 and L2. Notice that this longer loop runs concurrently with the tight CI loop. The entire system being used for CI and the other loops may be made up of multiple machines that include a central server and many agent machines. All the heavy work is typically performed on the agent machines so that the system scales horizontally. The nightly build (N1) takes the most amount of time to run.

Figure 1: Staged CI.

Is Staged CI really any different from nightly builds? That depends on the reason for the nightly build and on what happens during it. If the reason for the nightly build is simply that the team does not see any value from having a tighter feedback loop, then there are some differences. But if the reason for the nightly build is to let the build run over several hours doing extensive testing of the code base to arrive at a very detailed quality determination, then the nightly build becomes an example of Staged CI. Rather than running in a continuous loop, the nightly build runs once during a 24-hour period during the night. Typically, the decision to run at night is related to the usage of resources. Presumably, if the nightly build were to run during the day, it would require access to some resources that are unavailable or being used for other purposes during the day.

Staged CI makes use of multiple build types. Let's take a look at what we mean by build types, then look at why Staged CI uses them.

To understand build types, you need to understand that most of the time when we use the term "build" we're not exact in what we mean. Usually, when we talk about a "build" we are actually talking about something more than just a build. For example, when we talk about a "continuous integration build," we're talking about a process that extracts source code from the source-code management (SCM) system, compiles it, packages it, and then runs some tests on the resulting artifacts. In contrast, a nightly build may extract source code from the SCM, compile it, package it, then deploy the artifacts to a QA server and run functional tests. The CI build and the nightly build are just two examples of build types.

The defining feature of a build type is that it is a combination of multiple processes. Of those multiple processes, one is a build process and the remainder is made up of one or more secondary processes. So what do we mean by a build process? A build process takes source code, dependencies, environment settings, and configuration as input, and transforms them into the output. The typical output of the build process is made up of artifacts (typically a compiled binary), log files, and reports. The transformation of the input into the output typically involves compilation and packaging. However, this varies with the technology being used, as native languages include a linking step, whereas scripting languages don't have the compilation step.
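
As a rough sketch, the build process described above can be modeled as a single transformation from inputs to outputs. The type and field names below are illustrative, not taken from any particular tool.

import java.util.List;
import java.util.Map;

public class BuildProcessSketch {
    // Inputs: source code, dependencies, environment settings, and configuration.
    record BuildInputs(String sourcePath,
                       List<String> dependencies,
                       Map<String, String> environment,
                       Map<String, String> configuration) {}

    // Outputs: artifacts (typically compiled binaries), log files, and reports.
    record BuildOutputs(List<String> artifacts,
                        List<String> logFiles,
                        List<String> reports) {}

    // The build process: compile and package the inputs into the outputs.
    // (Native languages add a linking step; scripting languages may skip compilation.)
    static BuildOutputs build(BuildInputs inputs) {
        // ... compile and package here ...
        return new BuildOutputs(List.of("app.jar"),
                                List.of("build.log"),
                                List.of("unit-test-report.xml"));
    }

    public static void main(String[] args) {
        BuildInputs inputs = new BuildInputs("src/", List.of("junit.jar"),
                                             Map.of("JAVA_HOME", "/usr/lib/jvm"),
                                             Map.of("target", "dist"));
        System.out.println(build(inputs).artifacts()); // prints [app.jar]
    }
}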

Let's take a look at a CI build type and a nightly build type in light of this definition of the build process. Recall that a CI build type extracts source code from the SCM, compiles it, packages it, and then runs some tests on the resulting artifacts. I can now restate that so that the CI build type extracts source code from the SCM, performs a build, then runs some tests on the resulting artifacts. And the nightly build type extracts source code from the SCM, performs a build, then deploys the artifacts to a QA server and runs functional tests on them (Figure 2). Each build type is a combination of build process along with one or more additional (secondary) processes.

Figure 2: Build types.
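
The distinction can also be sketched in code: a build type is just a build process followed by its secondary processes. The step names below are hypothetical placeholders for the processes shown in Figure 2.

import java.util.List;

public class BuildTypeSketch {
    interface Step { void run(String label); }

    static class BuildType {
        private final List<Step> steps;
        BuildType(List<Step> steps) { this.steps = steps; }
        void execute(String label) { steps.forEach(step -> step.run(label)); }
    }

    public static void main(String[] args) {
        Step build           = label -> System.out.println(label + ": checkout, compile, package");
        Step quickTests      = label -> System.out.println(label + ": run unit tests");
        Step deployToQa      = label -> System.out.println(label + ": deploy artifacts to QA server");
        Step functionalTests = label -> System.out.println(label + ": run functional tests");

        // CI build type = build process + quick tests.
        new BuildType(List.of(build, quickTests)).execute("CI build");

        // Nightly build type = build process + deploy to QA + functional tests.
        new BuildType(List.of(build, deployToQa, functionalTests)).execute("nightly build");
    }
}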

One of the defining properties of Staged CI is that each loop (or stage) is a different build type. This means that each stage builds the source code in addition to running one or more processes. This may seem like a natural thing and you may wonder why this is worth pointing out. The reason is that this is very different from the two approaches I address next.

Chaining Processes

The Staged CI approach results in multiple stages where each stage is a different build type. As such, each stage performs a build as well as one or more secondary processes. A more process-oriented (as opposed to build-centric) approach draws sharp boundaries between the processes, so that there are no overlaps. One process retrieves the source code from source control, compiles it, packages it; in other words, "builds" the software. A second process runs the quick tests. Additional processes may provide even more functionality. This separation of processes offers greater efficiency and allows the same build artifacts to undergo progressively more exhaustive testing.

The idea is that all of these separate processes are executed in a chain. First, the build process is invoked. Once the build process completes successfully, the next process (Quick Tests in Figure 3) is invoked. It is at this point that we run into our first complication. The secondary processes, such as the Quick Tests process, are not build types and do not produce their own artifacts to be tested. This separation between the build and the test processes is the key to the Chained Processes approach and the main differentiator between it and the Staged CI approach. And since the secondary processes do not produce their own artifacts for testing, they need to get the artifacts externally. The artifacts are produced by the Build process. And typically, when the Build process runs, it places the artifacts it produces in a well-known location. This can be an agreed-upon directory on the filesystem, an SCM, or an artifact repository. The secondary processes need to obtain the artifacts from the well-known and agreed-upon location.

Figure 3: Chained Processes.

This type of artifact passing is trickier than it sounds. It is easiest to have the artifacts from the latest build overwrite the artifacts of any previous build, although a secondary process may then pick up artifacts from a newer build than the one that triggered it. The alternative, keeping a separate location for the artifacts of each build, requires that secondary processes be able to locate their intended artifacts. A simple naming convention where the artifacts of a build are stored in a directory with the build number as its name would require that each secondary process be passed the build number so that it can find the artifacts. While this is not difficult to do, it is another detail to keep in mind.
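
A minimal sketch of the build-number convention, assuming a shared filesystem location (the path and method names are hypothetical): the Build process publishes into a directory named for the build number, and a secondary process that receives the build number as a parameter can locate the same artifacts.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class ArtifactHandoff {
    // The agreed-upon, well-known location for build artifacts.
    static final Path REPOSITORY = Paths.get("/shared/artifacts");

    // Build process: publish an artifact under a directory named for the build number.
    static Path publish(int buildNumber, Path artifact) throws IOException {
        Path buildDir = REPOSITORY.resolve(Integer.toString(buildNumber));
        Files.createDirectories(buildDir);
        return Files.copy(artifact, buildDir.resolve(artifact.getFileName()),
                          StandardCopyOption.REPLACE_EXISTING);
    }

    // Secondary process: given the build number as a parameter, locate its artifacts.
    static Path locate(int buildNumber, String artifactName) {
        return REPOSITORY.resolve(Integer.toString(buildNumber)).resolve(artifactName);
    }
}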

Now that we know how the secondary process (such as Quick Tests) is going to locate the build artifacts, we can continue the walk-through. When the Quick Tests process is invoked after the completion of the Build process, it retrieves the build artifacts and runs the quick tests on them. These results can then be communicated to the development team in order to provide the fast feedback required by continuous integration.

After a successful execution of the Quick Tests process, we would like to run the Deploy to QA process. However, if you recall from our discussion of Staged CI, a limit in the available hardware resources may mean that you can only run a single combination of the Deploy to QA and Functional Tests processes at a time. The scheduler used to schedule the execution of these secondary processes should be robust enough to provide the desired behavior.

This approach does a good job of facilitating a common Build process that both provides the steady CI feedback and feeds longer-running processes such as tests. One of the benefits of this approach is that downstream processes always have a build that successfully completed all preceding processes. To illustrate this point, consider that at the time we invoke the Deploy to QA process D1 in Figure 4, the latest run of the Quick Tests process is Q2 corresponding to build B3. But since Q2 has failed, and we would like our execution of the Deploy to QA process to deploy the artifacts of the latest build without any detected problems, the D1 process is linked to Q1 and B1 instead of the latest available Q2.

Figure 4: Build and Release Pipeline.
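
The selection rule can be sketched as follows (hypothetical record and method names): scan the history of Quick Tests results and pick the newest build whose run succeeded, which is exactly why D1 links to B1 and Q1 rather than to the failed Q2.

import java.util.List;
import java.util.OptionalInt;

public class LatestGoodBuild {
    record QuickTestsRun(int buildNumber, boolean passed) {}

    // Pick the newest build whose Quick Tests run succeeded.
    static OptionalInt latestGood(List<QuickTestsRun> history) {
        return history.stream()
                      .filter(QuickTestsRun::passed)
                      .mapToInt(QuickTestsRun::buildNumber)
                      .max();
    }

    public static void main(String[] args) {
        // Q1 passed against B1; Q2 failed against B3 -- Deploy to QA should use B1.
        List<QuickTestsRun> history =
            List.of(new QuickTestsRun(1, true), new QuickTestsRun(3, false));
        System.out.println(latestGood(history)); // prints OptionalInt[1]
    }
}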

The Chained Processes approach does present the challenge of traceability. As each process stands alone, the linkage between successive process executions is based on parameters being passed in. If we want to determine the build and code responsible for a failed Quick Tests execution, we need to get the build number that was passed into our process execution as a parameter. Process-oriented systems built around this approach typically do not provide any built-in traceability mechanism for navigating the process chains. Likewise, process-based systems typically do not provide any built-in way of passing artifacts from one process to another in a traceable manner.

Build and Release Pipelines

While the Build and Release Pipelines approach is similar to the Chained Processes approach in that both rely on non-overlapping processes, the difference comes down to the central concept. The Build and Release Pipelines approach uses the pipeline as the primary concept, while it is the process that is the central concept in the Chained Process approach. In the Build and Release Pipelines approach, the processes are executed within the context of a pipeline.

As with a simple continuous integration system, a commit to the SCM can trigger the build process after the quiet period is met. At this point, a new pipeline is created and the invoked build process runs within the scope of the pipeline. In this approach, all processes are executed within the scope of a pipeline. The artifacts produced by the build process are stored within the context of the pipeline. It is the pipeline that provides traceability to the artifacts (meaning that the artifacts are traceable to the pipeline and via the pipeline to the build process invocation that created them).

A successful completion of the build process invokes the Quick Tests process within the context of the same pipeline. The Quick Tests process has no problem locating the build artifacts to be tested since they are available from the pipeline context; it is the pipeline that is responsible for storage and retrieval of all artifacts and other data related to the pipeline. The results of the Quick Tests process can be communicated to the members of the development team to provide the fast feedback required by continuous integration.
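
A sketch of the pipeline as the unit of scope (the class and method names are illustrative, not taken from any specific product): the build process stores artifacts into the pipeline, and later processes in the same pipeline retrieve them from it.

import java.util.HashMap;
import java.util.Map;

public class PipelineSketch {
    static class Pipeline {
        final int id;
        private final Map<String, byte[]> artifacts = new HashMap<>();
        private final Map<String, Boolean> results = new HashMap<>();

        Pipeline(int id) { this.id = id; }

        void storeArtifact(String name, byte[] contents) { artifacts.put(name, contents); }
        byte[] fetchArtifact(String name)                { return artifacts.get(name); }
        void recordResult(String process, boolean ok)    { results.put(process, ok); }
        boolean passed(String process)                   { return results.getOrDefault(process, false); }
    }

    public static void main(String[] args) {
        Pipeline pipeline = new Pipeline(42);                  // created when the SCM trigger fires

        pipeline.storeArtifact("app.jar", new byte[] {1, 2});  // Build publishes into the pipeline
        pipeline.recordResult("Build", true);

        byte[] jar = pipeline.fetchArtifact("app.jar");        // Quick Tests finds the artifacts
        pipeline.recordResult("Quick Tests", jar != null);     // via the pipeline context

        System.out.println(pipeline.passed("Quick Tests"));    // prints true
    }
}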

As in the Chained Processes approach, the process scheduler should be responsible for ensuring that the hardware is not saturated by too many concurrent instances of the Deploy to QA process. One common strategy employed by schedulers to ensure this is to combine requests for the Deploy to QA process. If two separate pipelines made requests for the Deploy to QA process but only a single process could run, the scheduler would honor the most recent request and discard the older one. But any such scheduler needs to be robust enough to provide all desired functionality. For example, the scheduler should probably not discard process requests made by users manually.
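
One way such a scheduler might distinguish the two kinds of requests is sketched below (hypothetical names): automatically triggered requests coalesce down to the most recent one, while requests made manually by users are queued and never discarded.

import java.util.ArrayDeque;
import java.util.Deque;

public class DeployToQaScheduler {
    private String latestAutomaticRequest;                           // coalesced: newest wins
    private final Deque<String> manualRequests = new ArrayDeque<>(); // never discarded

    public synchronized void requestAutomatic(String pipelineId) {
        latestAutomaticRequest = pipelineId;   // an older automatic request is superseded
    }

    public synchronized void requestManual(String pipelineId) {
        manualRequests.addLast(pipelineId);    // manual requests are all preserved
    }

    // Called when hardware for a Deploy to QA run becomes available.
    public synchronized String next() {
        if (!manualRequests.isEmpty()) {
            return manualRequests.pollFirst();
        }
        String next = latestAutomaticRequest;
        latestAutomaticRequest = null;
        return next;                           // may be null if nothing is pending
    }

    public static void main(String[] args) {
        DeployToQaScheduler scheduler = new DeployToQaScheduler();
        scheduler.requestAutomatic("pipeline-7");
        scheduler.requestAutomatic("pipeline-8"); // pipeline-7 is superseded
        scheduler.requestManual("pipeline-5");    // manual request is kept
        System.out.println(scheduler.next());    // prints pipeline-5
        System.out.println(scheduler.next());    // prints pipeline-8
    }
}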

As with the Chained Processes approach, the current approach ensures that downstream processes always have a pipeline that successfully completed all preceding processes. For example, if at the time of invocation of the Deploy to QA process, the latest pipeline contained Build B3 and Quick Tests Q2, the Deploy to QA process would run within the context of the latest pipeline to successfully pass both preceding processes, the pipeline containing B1 and Q1.

The Build and Release Pipeline approach brings us the efficiency and flexibility strengths of the Chained Processes approach and the traceability advantages of the Staged CI approach. There is more complexity in this approach; however, this complexity can be safely hidden in a system or tool so that the end result is easier and more intuitive to use.

Conclusion

Given that the more tests we run, the clearer our picture of quality becomes, and that the sooner we get feedback, the more valuable it is, there are plenty of reasons to extend automation beyond the development team. With the approaches I've discussed here, we've seen QA and even release management teams derive the same benefits that development teams are getting. I hope that some of the ideas in this article will allow you to obtain similar results.
