Three-Way Merging: A Look Under the Hood



For several years, Pablo was a blogger for Dr. Dobb's covering Agile technologies and best practices. With this article, he returns as a regular blogger. His next post will appear in the left panel of our home page, along with our regular bloggers. —Ed.


When I run a demo of our source code merge program, a developer occasionally raises a hand and asks, "What do you mean by 'three versions' of the file being merged?" To answer this and explain how three-way merges work and why they are important, let's start by taking a look at the traditional two-way merge.

What Is a Two-Way Merge?

Suppose we're modifying the same file concurrently. You go and make some changes to the file and then I make some more changes.

At some point in time, someone looks at the two copies of the file and they see something like Figure 1:

Figure 1.

This third person looking at the files sees there's a difference on line 30 but:

  • How can he tell whether you modified line 30 or if I modified it?
  • What if we both modified the line? How can he tell?

He can't.

He will have to call both of us and trust our memories to find out who modified what. And yes, fortunately we programmers never forget, right? ;-)

This "third person" is likely to actually be the version control system trying to do a simple two-way merge, which just compares two versions of a file and tries to merge them. But here, the VCS will require user intervention because it won't be able to figure out what to do.
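To see why a two-way comparison gets stuck, here is a minimal Python sketch. The file contents and the function name `two_way_differences` are hypothetical, made up for illustration; they are not taken from Figure 1 or from any particular VCS. The sketch also assumes both copies have the same number of lines.

```python
def two_way_differences(yours, mine):
    """Return the 0-based indices of lines that differ between two versions.

    Assumes both versions have the same number of lines, which keeps the
    sketch simple; a real tool would run a proper diff instead.
    """
    return [i for i, (y, m) in enumerate(zip(yours, mine)) if y != m]


# Hypothetical file contents, for illustration only.
yours = ["void Greet()", "{", '    Print("hello");', "}"]
mine = ["void Greet()", "{", '    Print("bye");', "}"]

differing = two_way_differences(yours, mine)
print(differing)  # [2]
```

The comparison reports where the two copies disagree, but nothing in its inputs says which copy holds the newer text, so the choice is left to a person.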

It is better to avoid concurrent modifications on a file if you have to rely on two-way merge because it will be a slow manual process. Imagine 300 files requiring a simple merge…it would take ages to complete!

The Three-Way Merge

Let's forget for a second about version control, and go back to this "third person" looking into the two files we modified: How can he figure out what happened by himself?

He can look at the original version of the file that we both used as a starting point (Figure 2):

Figure 2.

Then, by looking at the original file (the "base common ancestor," or simply the "base"), he can figure out what to do.

Based on Figure 2, only one developer actually changed line 30, so the conflict can be resolved automatically: Just keep "Yours" as the solution, so the line will say: Print("hello");
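The decision the third person makes for a single line can be sketched in Python as follows. The function name `merge_line` and the base text `Print("bye");` are assumptions for illustration; only `Print("hello");` comes from the example above.

```python
def merge_line(base, yours, mine):
    """Apply the three-way rule to one line.

    Returns (result, needs_manual). `result` is None when a person has to decide.
    """
    if yours == mine:
        return yours, False   # both copies agree, nothing to decide
    if yours == base:
        return mine, False    # only "mine" changed the line
    if mine == base:
        return yours, False   # only "yours" changed the line
    return None, True         # both changed it differently: real conflict


# Hypothetical base text; "yours" is the only contributor that changed it.
result, needs_manual = merge_line(
    base='Print("bye");',
    yours='Print("hello");',
    mine='Print("bye");',
)
print(result, needs_manual)  # Print("hello"); False
```

Without the `base` argument, the second and third branches cannot be told apart, which is exactly the information a two-way merge lacks.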

This is how three-way merge helps: It turns a manual conflict into an automatic resolution. Part of the magic here relies on the VCS locating the original version of the file. This original version is better known as the "nearest common ancestor." The VCS then passes the common ancestor and the two contributors to the three-way merge tool that will use all three to calculate the result.
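How a VCS locates the common ancestor depends on the tool, and real systems use their own algorithms. As a rough sketch only, assume the history is available as a hypothetical dictionary mapping each revision to its parents, and take "nearest" to mean the shared ancestor with the smallest combined distance from both contributors. The revision names and both function names below are invented for illustration.

```python
from collections import deque


def ancestors_by_distance(history, start):
    """Breadth-first walk: map every ancestor of `start` (itself included)
    to its distance in parent links."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        rev = queue.popleft()
        for parent in history.get(rev, []):
            if parent not in distance:
                distance[parent] = distance[rev] + 1
                queue.append(parent)
    return distance


def nearest_common_ancestor(history, yours, mine):
    """Return the shared ancestor closest to both revisions, or None."""
    d_yours = ancestors_by_distance(history, yours)
    d_mine = ancestors_by_distance(history, mine)
    common = set(d_yours) & set(d_mine)
    if not common:
        return None
    # Smallest combined distance wins; the name breaks ties deterministically.
    return min(common, key=lambda rev: (d_yours[rev] + d_mine[rev], rev))


# Hypothetical history: r3 and r5 both descend from r2.
history = {
    "r1": [],
    "r2": ["r1"],
    "r3": ["r2"],        # "yours"
    "r4": ["r2"],
    "r5": ["r4"],        # "mine"
}
print(nearest_common_ancestor(history, "r3", "r5"))  # r2
```

The revision returned here would play the role of the "base" passed to the merge tool together with the two contributors.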

With a two-way merge, every line that differs between the two files requires manual intervention. With a three-way merge, only the lines modified by both developers need it, while everything else is merged automatically, so it is possible to run a painless merge involving hundreds of files.

A Manual Merge Scenario

Let's now check a slightly more complex case, as shown in Figure 3:

Figure 3.

Comparing the two files side by side, we can see that there are three lines with differences:

  • Line 30: with the same conflict we had before.
  • Line 51: with a for loop being modified.
  • Line 70: where we don't know whether "yours" removed some code or "mine" added it.

Let's now look at the common ancestor to be able to properly solve the conflicts (Figure 4):

Figure 4.

  • The conflict on line 30 can be solved automatically: "yours" (the source contributor) is kept as the result because only one contributor modified the line.
  • The conflict on line 70 can also be solved automatically in favor of "mine" (the destination contributor) because it is now clear that the line was added; it wasn't there before.
  • The conflict on line 51 needs manual resolution: You need to decide whether to keep one contributor, the other, or even edit the result by hand.

And this is basically how three-way merge works.

