For several years, Pablo was a blogger for Dr. Dobb's covering Agile technologies and best practices. With this article, he returns as a regular blogger. His next post will appear in the left panel of our home page, along with our regular bloggers. Ed.
When I run a demo of our source code merge program, a developer occasionally raises a hand and asks, "What do you mean by 'three versions' of the file being merged?" To answer this and explain how three-way merges work and why they are important, let's start by taking a look at the traditional two-way merge.
What Is a Two-Way Merge?
Suppose we're modifying the same file concurrently. You go and make some changes to the file and then I make some more changes.
At some point in time, someone looks at the two copies of the file and they see something like Figure 1:
This third person looking at the files sees there's a difference on line 30 but:
- How can he tell whether you modified line 30 or if I modified it?
- What if we both modified the line? How can he tell?
He will have to call both of us and trust our memories to find out who modified what. And yes, fortunately we programmers never forget, right? ;-)
This "third person" is likely to actually be the version control system trying to do a simple two-way merge, which just compares two versions of a file and tries to merge them. But here, the VCS will require user intervention because it won't be able to figure out what to do.
It is better to avoid concurrent modifications on a file if you have to rely on two-way merge because it will be a slow manual process. Imagine 300 files requiring a simple merge…it would take ages to complete!
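To make the limitation concrete, here is a minimal Python sketch of a two-way comparison built on the standard difflib module. The function name two_way_diff and the sample lines are made up for illustration (they are not the contents of Figure 1). The output shows where the two copies differ, but nothing in it says which of us made the change.

```python
import difflib


def two_way_diff(yours, mine):
    """Return (tag, yours_chunk, mine_chunk) for each region where the copies differ."""
    matcher = difflib.SequenceMatcher(a=yours, b=mine, autojunk=False)
    return [
        (tag, yours[i1:i2], mine[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Hypothetical contents: the two copies disagree on the middle line.
yours = ["header", "print('bye')", "footer"]
mine = ["header", "print('hello')", "footer"]

for tag, yours_chunk, mine_chunk in two_way_diff(yours, mine):
    # We learn that the lines differ, but not which side is the original.
    print(tag, yours_chunk, mine_chunk)
```

With only these two inputs, a tool has no basis for choosing one side over the other, which is why it has to ask a human.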
The Three-Way Merge
Let's forget for a second about version control, and go back to this "third person" looking into the two files we modified: How can he figure out what happened by himself?
He can look into the original version of the file we both used as starting point (Figure 2):
Then, looking at how the file originally was (the "base common ancestor," or simply the "base"), he can figure out what to do.
Based on Figure 2, only one developer actually changed line 30, so the conflict can be automatically resolved: Just keep "Yours" as the solution, so it will say:
This is how three-way merge helps: It turns a manual conflict into an automatic resolution. Part of the magic here relies on the VCS locating the original version of the file. This original version is better known as the "nearest common ancestor." The VCS then passes the common ancestor and the two contributors to the three-way merge tool that will use all three to calculate the result.
With two-way merge, every difference requires manual intervention. With three-way merge, only the lines modified by both developers require manual intervention, while everything else is merged automatically. That makes it possible to run a painless merge involving hundreds of files.
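The decision the merge tool makes for a single line can be sketched in a few lines of Python. This is only an illustration of the rule described above, not the code of any particular merge tool; the name merge_line and the sample strings are assumptions.

```python
def merge_line(base, yours, mine):
    """Resolve one line from its three versions.

    Returns (resolved_line, is_conflict). resolved_line is None on a conflict.
    """
    if yours == mine:
        # Both contributors agree (either nobody changed it, or both made the same change).
        return yours, False
    if yours == base:
        # Only "mine" differs from the base, so "mine" holds the change.
        return mine, False
    if mine == base:
        # Only "yours" differs from the base, so "yours" holds the change.
        return yours, False
    # Both sides changed the line in different ways: a human has to decide.
    return None, True


# Hypothetical example: only "yours" changed the line, so it wins automatically.
print(merge_line("print('hello')", "print('bye')", "print('hello')"))
```

The base is what makes the first three branches possible. Without it, every pair of differing lines would land in the last branch.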
A Manual Merge Scenario
Let's now check a slightly more complex case, as shown in Figure 3:
Comparing the two files side by side, we can see that there are three lines with differences:
- Line 30: with the same conflict we had before.
- Line 51: with a for loop being modified.
- Line 70: where we don't know whether "yours" removed some code or "mine" added it.
Let's now look at the common ancestor to be able to properly solve the conflicts (Figure 4):
- The conflict on line 30 can be automatically solved, and the "yours" version (source contributor) will be kept as the result because only one contributor modified the line.
- The conflict on line 70 can also be automatically solved to "mine" (destination contributor) because it is clear now that the line has been added and it wasn't there before.
- The conflict on line 51 needs manual resolution: You need to decide whether you want to keep one of the contributors, the other, or even modify it manually.
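The three outcomes above can be reproduced with a small Python sketch. It assumes the three versions have already been aligned slot by slot, with None marking a line that does not exist in a given version; a real merge tool computes that alignment with a diff algorithm. The function name three_way_merge, the conflict markers, and the sample lines are all hypothetical stand-ins for the contents of Figures 3 and 4.

```python
def three_way_merge(base, yours, mine):
    """Merge three pre-aligned lists of lines.

    None marks a line that is absent in that version.
    Returns (merged_lines, conflicting_slot_indexes).
    """
    merged, conflicts = [], []
    for slot, (b, y, m) in enumerate(zip(base, yours, mine)):
        if y == m:
            result = y          # both sides agree
        elif y == b:
            result = m          # only "mine" changed (or added) this slot
        elif m == b:
            result = y          # only "yours" changed (or removed) this slot
        else:
            # Both sides changed it differently: emit both and flag the slot.
            conflicts.append(slot)
            merged.append("<<<<<<< yours")
            if y is not None:
                merged.append(y)
            merged.append("=======")
            if m is not None:
                merged.append(m)
            merged.append(">>>>>>> mine")
            continue
        if result is not None:
            merged.append(result)
    return merged, conflicts


# Slot 0 stands in for line 30, slot 1 for line 51, slot 2 for line 70.
base = ["print('hello')", "for i in range(10):", None]
yours = ["print('bye')", "for i in range(20):", None]
mine = ["print('hello')", "for i in range(5):", "log('done')"]

merged, conflicts = three_way_merge(base, yours, mine)
print("\n".join(merged))
print("slots needing manual resolution:", conflicts)
```

In this run, slot 0 resolves to "yours", slot 2 resolves to "mine" because the line is absent from both the base and "yours", and only slot 1 is left for a person to decide.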
And this is basically how three-way merge works.