Dr. Dobb's | Think Globally, Code Locally

Think Globally, Code Locally

Damon shares tried-and-true best practices that help global teams improve the development process by thinking globally and coding locally.

January 11, 2007
URL:http://www.drdobbs.com/architecture-and-design/think-globally-code-locally/196900182

Damon is founder and CTO of AccuRev (www.accurev.com), a best of breed application lifecycle management (ALM) vendor with a focus on software configuration management (SCM). Damon's blog can be found at damonpoole.blogspot.com.

How frequently have you merged your code with changes from the mainline, only to find that the result doesn't build, or it builds but it doesn't work? Monthly? Weekly? Daily? Hourly? Or worse, how often have you made changes that broke the build, requiring you to quickly correct the problem while getting flames from your team members?

Perhaps your team has recently expanded, added distributed team members, or started doing outsourcing with a team half way around the world only to find that code churn has not only gotten worse but now it is also harder to coordinate with others to track down and correct the problem. Don't despair. You are not alone, and this problem has straightforward solutions using extensions of practices you are probably already familiar with.

I've looked at the development process at hundreds of shops, from small collocated shops all the way up to 10,000-user shops doing global development, including ISVs, financial institutions, networking vendors, and government organizations. There are fewer differences between the development processes employed at different shops than you might imagine. The bottom line is that source code moves from a state of immaturity to a state of maturity through a set of stages and integration points that is similar to an organization chart.

In my travels I have found that many tried-and-true best practices originally developed for collocated teams can be applied as-is or extended to global teams. Usually, global teams have the exact same trouble spots as collocated teams, they are just amplified in proportion to the amount of physical separation. In other words, improving your process is a good thing in general and even more so when every weak area is amplified by distance.

Mainline Chaos

Before describing best practices for global teams, first take a look at some common problems. If everybody in development is making changes against the mainline, introducing a bad change affects everybody until it is detected and corrected. Waiting for someone to fix a change that is impacting everybody is bad enough when the change was committed locally, but it becomes severe when the committer won't even be awake for another eight hours.

You may determine that getting the latest files, which included that bad change, caused you problems. If you have your own changes in progress, you may conclude that it is the combination of your changes and other people's changes that are causing the problem and temporarily solve the problem by backing out the update. However, backing out is generally harder if you have your own changes in progress.

Another way to think of the bad changes problem is that bad changes are a form of unwanted changes. Clearly, nobody wants bad changes. But not everybody wants all changes from all developers as soon as they are committed, either. Perhaps the change includes changes to the API, or perhaps you just don't want to take the time to rebuild the affected files. One way to handle this is to update just parts of your work area when you do an update. The problem with a partial update is that you run the risk of missing dependent changes.

Limiting change is especially problematic at integration and release time. The best way for a team to get their functionality stable is for nobody else to make changes to the mainline. Unfortunately, all teams want to be the only team to make changes while all other teams wait. A typical integration practice is to create a schedule and let each team take their turn integrating their changes with the mainline and stabilizing their functionality. This approach serializes the work and introduces artificial delays. It also means that the other teams have to find something else to do with their time while they wait. A common way of handling the scheduling is via token-passing. Only the team who has the token can commit changes. I've seen people use inflatable dinosaurs, traffic cones, and various other tokens. This practice may work fine for collocated teams, but it is more difficult to use an inflatable dinosaur when one of the teams is halfway across the world.

Whatever method is used, once the first round of integration has completed, it is likely that the functionality of one or more of the teams will need further adjustment, initiating yet another serialized integration round.

Checkpoints

In the face of all this potential chaos, it is comforting to have safety nets. If something goes wrong, you want to have an easy way to revert to a known good state. "Known good" doesn't mean "known to be perfect in every way," it just means a state that is both a well-known state and a state that has a quantifiable level of goodness. For instance, if you know your project built as of one hour ago with no warnings, then the previous state is a known good state. You know that if you revert to that state, then at least it will build again. Even if what is built won't pass all of your automated tests, at least it is a step in the right direction.

One of the best known forms of checkpoints are check-ins to private branches. By creating a private branch, you get all of the benefits of software configuration management (SCM) for your own personal use.

Check-in Early and Check-in Often

Everybody puts off checking-in their changes, sometimes for days. You put it off because you don't want to affect other people too early and you don't want to get blamed for breaking the build. But this leads to other problems such as losing work or not being able to go back to previous versions of your work. Sure, you can copy files aside in temporary directories, but then you are doing what an SCM system can do for you.

Once you reach a new known good state, create a checkpoint. In other words, do a check-in to your private branch. The checkpoint ensures that your work is safe and that you can return to the checkpoint at any time. Your changes are stored on the server and available to other users if they need them, but don't cause problems the way that making them public can. This frees you from worrying about losing progress due to mistakes or dead-end changes.

Development Hierarchy

A bigger picture solution to the chaos of mainline development is to use a development hierarchy. A development hierarchy is simply a hierarchical representation of the dependencies between groups that includes process steps such as integration, quality assurance, and code reviews. It is a natural extension of private branching. When you have a private branch, you have created a hierarchy. You check-in your changes on your branch first, then when you feel comfortable that your changes will have a positive impact and not break the build, you move them up the hierarchy by merging them into the mainline.

This can be extended recursively from there. Each team can think of itself as an individual that updates its private branch from the next step in the hierarchy and checks its changes back in via merging.

In Figure 1, there are four development teams as well as an area for accepting third-party code drops. Each of the teams are located in a different geographical area. Another level of the hierarchy that is not shown are the individual developers that are working in each team. The hierarchy represents the normal flow of changes through development from stage to stage. For instance, changes in the GUI team flow from individuals to the GUI stage to the UI_int stage, to QA, and finally to the release stage. Conversely, changes merged from the GUI stage to the UI_int stage are also applied to the Web stage and changes merged from the Web stage to the UI_int stage are also merged into the GUI stage.

Figure 1: Typical development scenario.

Using a development hierarchy provides more opportunities for checkpointing. Every time changes flow through the system you can have a checkpoint. If it turns out that a change was undesirable, you can return both the source stage and the destination stage back to a previous checkpoint. By contrast, mainline development only offers you a single opportunity for checkpointing, the state of the main codeline itself. The chance of ascertaining the current state of the mainline is very low unless you prevent people from making changes to it for a period of time long enough to do a build, do a smoketest, or whatever level of validation you wish to do. Since the high level of churn in mainline development makes checkpointing so difficult, any available checkpoint is most likely very old and thus of limited value.

Another benefit of combining a development hierarchy with checkpointing is that each step up the hierarchy that a change makes adds another level of maturity to that change and each step up the hierarchy represents a higher level of maturity for the system as a whole.

Gate Keepers

Using a development hierarchy requires gate keepers. A gate keeper is responsible for coordinating the merging of changes in and out of a stage. For a team stage, the gate keeper is usually the team leader.

The gate keeper practice comes in handy when you have a new team and you want to have somebody outside of that team do a code review of all changes prior to merging the changes into the next stage in the hierarchy. Simply add a code review stage in the hierarchy between the remote team and the mainline and assign a local person as the gatekeeper.

One Virtual Site

When making decisions about your development process, always consider the worst case—would the choice work in the worst case. For example, which works best in the worst case, a whiteboard with sticky notes on it or a modern issue-tracking system? In the best case of a handful of developers all working together in an open area on a small project, the whiteboard will work fine. But if even one member of the team works in a different building, then the whiteboard will no longer work. On the other hand, the issue-tracking system will work where a whiteboard will work and it will also work even if a member of the team is halfway around the world.

Make as many of your development processes the same as possible at all sites. This simplifies the interactions between sites. For instance, if the same build mechanism is used everywhere, then the quality associated with the fact that a configuration builds at a particular site can be transferred to another site as opposed to having to be reestablished using a different build mechanism.

Atomic Transactions

Imagine that somebody on a remote team is checking-in a large batch of changes over a slow and flaky connection. If you update your work area during this time and those changes affect you, there's a very good chance that post-update your work area either won't build or the resulting executables won't work properly. Of course, you can update your work area again, but how do you know when to stop? Just waiting for an empty update could take a while, and is no guarantee that you have a cohesive set of changes.

Most of the newer SCM systems available today feature atomic transactions. This feature has many benefits for collocated development, but should be considered a minimum requirement for global development.

Atomic transactions either complete or fail; there's no in-between. This has the added benefit of grouping operations on multiple objects into a logical unit. Now, if there is a long-running check-in in progress, you are protected. When you update your work area, it will be updated to a particular transaction level. Only the transactions in the system up to the time you start your update will be brought into your work area.

Agile Development Practices

There are many well-publicized benefits to Agile development (www.agilealliance.org). In the context of a global team, there are a number of practices that scale well to a global team. One of the most applicable practices is short iterations, a central practice of Agile development. One of the useful side-effects is that you can get information more quickly from teams that are far away.

Let's say you have just added an offshore development team that you haven't worked with before. In a traditional project, you may get frequent status reports saying that everything is going well, but how can you really tell? With the short iterations of an Agile project, you can expect to get working software on a regular basis. If you don't, or if it has less functionality than promised, that's another indication of trouble. In a more positive light, getting working code on a regular basis that is demonstrated to have the promised functionality provides confidence that you can focus your attention elsewhere.

Open-Source Development Practices

You don't have to release your software under an open-source license to benefit from the open-source development process. I believe there is a great deal of benefit from adopting practices from this model for any development project. The open-source development model has been used by very large distributed projects, including Apache and Linux, with great success.

A key concept of the open-source process is a practice that the Apache Group calls "Meritocracy" (www.apache.org/foundation/how-it-works.html). Basically, you start out with read-only access to the project. You can propose changes via patches, and if you propose a change that is approved, your credibility increases in the area that you contributed. Over time, you can earn write access to more and more of the system and become one of the people doing the approvals.

In a global environment where you are incorporating new team members into an existing project, this is an excellent way to mitigate risk. If you are using an SCM system, you can leverage the SCM system itself to facilitate this process. Instead of mailing around patches, proposed changes can be made on branches and then merged into the project when approved.

Putting It All Together

Each of the recommended practices was originally designed primarily for either a collocated team or a global team, but all of them have proven to work well for both. While they can all be applied independently, they all complement each other as well.

Take the first step in putting an end to code churn today. Pick a practice that resonates with a problem you are currently facing and advocate for it. The best candidate is a problem that many people agree is a problem, there is enough pain to motivate a change, and a sufficiently small scope such that the change can be made in a relatively short amount of time. Once you've established the idea that process improvement is worthwhile, doable, and worth doing on a regular basis, you'll be well on your way towards thinking globally and coding locally!