Three-Way Merging: A Look Under the Hood


Three-Way Merge Tool Layout

A three-way merge tool typically uses the layout shown in Figure 5:

Three-Way Merge
Figure 5.

Good three-way merge tools show four panels:

  • "Theirs" (the source of the merge; see the branch diagram in Figure 7), the base, and "Yours" (the destination of the merge) in the three upper panels.
  • The result of the merge in the lower panel.

To me, this four-panel representation of the three-way merge is the most intuitive, but some tools use an alternative layout with only three panels (Figure 6):

Three-Way Merge
Figure 6.

In this layout, the "destination/yours" (or working copy) and the result of the merge are displayed together.

The Importance of Merge Tracking

To run effective three-way merges, you not only need a good three-way merge tool, you need an effective merge engine in your version control tool.

In fact, part of the mission of version control should be to correctly calculate the common ancestor/base for any three-way merge. When people say "git is very good at merging," what they mean is "git is very good at tracking the merge history, and hence at calculating the common ancestor for each file." In my VCS work, we put a lot of effort into the merge engine and into calculating the nearest common ancestor.

Let's go back to the three-way merge with a manual conflict that we just solved, and let's check out the branching structure (well, at least one very simple branching structure); see Figure 7:

Three-Way Merge
Figure 7.

  • Changeset 3: someone working on the "main" branch changed the Print("hello") line.
  • Changeset 4: meanwhile, on branch "task001," you added the Print(result) call at line 70.
  • You both also modified line 51.

Now, you want to merge the latest changes coming from "main" into your branch "task001." The version control system will walk the graph above to find the nearest common ancestor of changesets "3" and "4." The result in this simple case is changeset "1," so the "base" version for the merge will be retrieved from changeset "1."

Once you solve the manual conflict on line 51, you check in on the branch "task001," creating a new changeset "5" as in Figure 8:

Three-Way Merge
Figure 8.

Now development continues; somebody creates more changes on "main" while you perform a new checkin on "task001." Then you decide you have to merge "task001" back to "main" (Figure 9):

Three-Way Merge
Figure 9.

The version control system will have to calculate the base/common ancestor between "6" and "7."

The common ancestor will be "3" as Figure 10 shows:

Three-Way Merge
Figure 10.

Note that "3" is the common ancestor because the version control system takes into account the merge between "3" and "5," which you completed before.

What is the benefit of this merge tracking?

Well, if the merge link between "3" and "5" wasn't tracked (as used to happen with old version control systems), then the base would be "1" again, and you would have to solve once more the manual conflict you already solved. However, if the version control system does its job correctly, the ancestor will be identified as "3" and you won't have to waste time on conflicts you already solved.
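The effect of the tracked merge link can be sketched in a few lines of Python. This is only an illustration, not how any particular version control system implements its merge engine: the changeset graph is reconstructed from the description in the text, the position of changeset "2" (assumed here to sit on "main" between "1" and "3") is a guess because the figure does not survive in text form, and the function names are hypothetical.

```python
def ancestors(parents, start):
    """Return the set of changesets reachable from `start` (itself included)."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def merge_bases(parents, a, b):
    """Common ancestors of a and b that are not ancestors of another common ancestor."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return sorted(
        c for c in common
        if not any(c != other and c in ancestors(parents, other) for other in common)
    )


# Each changeset maps to its parents. Changeset 5 is the merge of "main" (3)
# into "task001" (4). The placement of changeset 2 is an assumption.
tracked = {1: [], 2: [1], 3: [2], 4: [1], 5: [4, 3], 6: [3], 7: [5]}

# Same history, but the merge link 3 -> 5 was never recorded.
untracked = dict(tracked)
untracked[5] = [4]

print(merge_bases(tracked, 3, 4))    # first merge: base is [1]
print(merge_bases(tracked, 6, 7))    # merge back with tracking: [3]
print(merge_bases(untracked, 6, 7))  # merge back without tracking: [1]
```

With the link recorded, the base for merging "6" and "7" is "3"; without it, the search falls back to "1" and the conflict on line 51 would surface again.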

Now, what would happen here with a two-way merge? You would have to solve every difference manually. Without a base version, the merge facility cannot tell which side changed a given line, so every single modification is a conflict and nothing can be resolved automatically.
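The difference can be made concrete with a small sketch. It assumes, for simplicity, that the three versions have the same number of lines and that the lines are already aligned (a real merge tool runs a diff first); the function names and the sample lines are hypothetical.

```python
def merge_line_three_way(base, theirs, yours):
    """Return (merged_line, is_conflict) for one aligned line."""
    if theirs == yours:
        return yours, False   # both sides agree
    if theirs == base:
        return yours, False   # only "yours" changed the line
    if yours == base:
        return theirs, False  # only "theirs" changed the line
    return None, True         # both changed it differently: manual conflict


def merge_line_two_way(theirs, yours):
    """Without a base, any difference has to be resolved by hand."""
    if theirs == yours:
        return yours, False
    return None, True


# Hypothetical aligned file contents.
base   = ['Print("bye")',   'total = a + b', 'Log()',       'return']
theirs = ['Print("hello")', 'total = a - b', 'Log()',       'return']
yours  = ['Print("bye")',   'total = a * b', 'Log(result)', 'return']

three_way_conflicts = sum(
    merge_line_three_way(b, t, y)[1] for b, t, y in zip(base, theirs, yours)
)
two_way_conflicts = sum(
    merge_line_two_way(t, y)[1] for t, y in zip(theirs, yours)
)

print(three_way_conflicts)  # 1: only the line both sides edited
print(two_way_conflicts)    # 3: every differing line
```

In this sample, the three-way rule resolves the two single-sided changes on its own and leaves only the line that both sides edited, while the two-way comparison flags all three differing lines.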

Conclusion

Questions about three-way merge often come from developers using version control systems that lack good merge tracking, such as CVS, Microsoft Visual SourceSafe, and even old versions of Subversion.

Understanding how three-way merge works, and why it is so important to have a good merge engine like those in newer distributed version control systems, is key when looking for a replacement for an aging SCM.


Pablo Santos is a blogger for Dr. Dobb's and an expert on the operations of version control systems.


