

DITA: The Darwin Information Typing Architecture

Level 1: Topics

At its most basic, DITA is an XML document markup language. Even at this simplest level, though, DITA enforces a topic structure and a reuse architecture that allow DITA documents to reuse content from other, more structured projects. This standardization also sets the stage for topic-level reuse by others as the initial migration of document-oriented content evolves to incorporate better management and authoring practices around topics and maps.


An author for a government agency may need to produce audience-specific versions of a government policy. The author can write all the content in one file and apply conditional processing values to produce different versions of the policy for permanent and contract employees.
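
As a minimal sketch of how that filtering works, the following Python snippet marks paragraphs with the DITA audience attribute and drops the ones whose values are excluded. The policy text, the topic id, and the filter_audience helper are assumptions made for illustration; in practice the publishing toolchain performs this step for you.

```python
import xml.etree.ElementTree as ET

# Hypothetical policy topic; the ids and wording are illustrative only.
POLICY = """
<topic id="leave_policy">
  <title>Leave Policy</title>
  <body>
    <p>All employees must request leave in advance.</p>
    <p audience="permanent">Permanent employees accrue paid leave monthly.</p>
    <p audience="contract">Contract employees take unpaid leave.</p>
  </body>
</topic>
"""

def filter_audience(xml_text, exclude):
    """Return the paragraph text that remains after dropping excluded audiences."""
    root = ET.fromstring(xml_text)
    body = root.find("body")
    for para in list(body):
        # The attribute may hold several space-separated values; drop the
        # element only if every value it carries is excluded.
        values = para.get("audience", "").split()
        if values and all(v in exclude for v in values):
            body.remove(para)
    return [p.text for p in body.findall("p")]

contract_version = filter_audience(POLICY, exclude={"permanent"})
permanent_version = filter_audience(POLICY, exclude={"contract"})
```

Both versions come from the same source file. The unmarked paragraph appears in each, and only the audience-specific paragraph differs.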


The minimum DITA adoption requires that you migrate the current sources of content to XML. You do, however, have the flexibility to decide which sources to migrate when, and how much structure to apply to the migrated content. Many teams have a large amount of legacy information that was authored in a variety of sources. Some teams may choose to migrate only the content that will require updates in the future. Other teams migrate everything, but do not move the content into typed topics; instead they move the content en masse into generic topics, which are the least restrictive topic type and hence require the least amount of content restructuring. However, the generic topic type also provides the least amount of semantic value.

Figure 2: Topics

Another way that teams save time at this level is to defer splitting the content into discrete topics and simply recreate their existing document-focused structure by nesting multiple topics within a single file. For example, recreating chapters as DITA files allows you to continue to store all the chapter content in a single file. While this strategy takes less time than restructuring the content into units based on subject, it does not provide small enough units of information to enable easy reorganization of the content into multiple deliverables.
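
To make the nesting concrete, here is a small Python sketch that walks a chapter stored as one file and lists the topics nested inside it. The chapter content, the ids, and the outline helper are hypothetical, and the sample omits the DOCTYPE declaration a real DITA file would carry.

```python
import xml.etree.ElementTree as ET

# Hypothetical chapter stored as one file with nested topics.
CHAPTER = """
<topic id="installing">
  <title>Installing the Product</title>
  <body><p>Overview of the installation.</p></body>
  <topic id="prerequisites">
    <title>Prerequisites</title>
    <body><p>Check disk space first.</p></body>
  </topic>
  <topic id="running_installer">
    <title>Running the Installer</title>
    <body><p>Start the setup program.</p></body>
    <topic id="silent_install">
      <title>Silent Installation</title>
      <body><p>Pass a response file.</p></body>
    </topic>
  </topic>
</topic>
"""

def outline(topic, depth=0):
    """Yield (depth, id, title) for a topic and each topic nested inside it."""
    yield depth, topic.get("id"), topic.findtext("title")
    for child in topic.findall("topic"):
        yield from outline(child, depth + 1)

# Print an indented outline of the chapter.
for depth, topic_id, title in outline(ET.fromstring(CHAPTER)):
    print("  " * depth + f"{topic_id}: {title}")
```

Each entry in the outline is a candidate for its own file if the team later splits the chapter into discrete topics.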

Because XML separates the formatting from the content, the transform for each deliverable type applies the styles and formatting defined in the Cascading Style Sheets (CSS) when you generate or publish the deliverable. Although the DITA Open Toolkit provides default processing for multiple deliverable types, you must customize the transforms to generate deliverables that meet the style, standard, and branding requirements for your organization.


Even with minimal investment, you can realize returns from adopting DITA. Many teams make the move to DITA to gain greater reuse of their content. Working with their current source, they use conditional processing to generate multiple versions of the same document. Even with non-typed topics or multiple topics in the same file, you can easily specify conditions and generate conditional output with DITA. This remains the primary means for reusing content at the first level of adoption.

However, to make progress toward the goal of additional reuse, you can use DITA to meet the challenge of publishing new or multiple deliverables that contain the same information by single-sourcing the content. The DITA Open Toolkit provides default output processing for a wide variety of popular formats, including HTML files, Eclipse plug-ins, PDFs, and CHM (Microsoft Compiled HTML Help) files. You can easily generate the same information in multiple formats by specifying a different output type when you publish.

When you publish content, the publishing transform applies the specified formatting to each element, which allows you to easily update format styles for large quantities of information. For example, if the style for highlighting the first instance of a term is italics, but later is changed to bold, you simply update the CSS and regenerate the deliverables. This is much more efficient than searching for and updating each instance of a term or style element across the information set.
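
The separation described above can be sketched in a few lines of Python. This is a toy stand-in for a publishing transform, not the DITA Open Toolkit's own processing: the CLASS_FOR mapping, the CSS rule, and the render_paragraph helper are all assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical stylesheet: changing this one rule restyles every term.
CSS = ".term { font-weight: bold; }  /* was: font-style: italic; */"

# Hypothetical mapping from DITA inline elements to HTML class names.
CLASS_FOR = {"term": "term"}

def render_paragraph(xml_text):
    """Render one DITA <p> as HTML, tagging inline elements with CSS classes."""
    p = ET.fromstring(xml_text)
    parts = [escape(p.text or "")]
    for child in p:
        css_class = CLASS_FOR.get(child.tag)
        text = escape(child.text or "")
        # Known inline elements get a class; anything else is emitted as text.
        parts.append(f'<span class="{css_class}">{text}</span>' if css_class else text)
        parts.append(escape(child.tail or ""))
    return "<p>" + "".join(parts) + "</p>"

html_out = render_paragraph("<p>A <term>topic</term> covers one subject.</p>")
```

The source paragraph says only that "topic" is a term. Whether a term appears in italics or bold is decided in the stylesheet, so the content itself never changes when the style does.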

For links between content, most teams use hard-coded cross-references in their current source. At the basic DITA-adoption level, you can continue this practice and link between DITA topics, as well as to external documents or locations, such as Web sites.
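
A short Python sketch shows what such hard-coded cross-references look like and how you might inventory them before reusing a topic elsewhere. The topic content, file names, and the list_xrefs helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical topic containing hard-coded cross-references.
TOPIC = """
<topic id="backup">
  <title>Backing Up Data</title>
  <body>
    <p>See <xref href="restore.dita">Restoring Data</xref> and
       <xref href="restore.dita#restore/prereqs">its prerequisites</xref>.</p>
    <p>Details are on the
       <xref href="http://www.example.com/support" scope="external"
             format="html">support site</xref>.</p>
  </body>
</topic>
"""

def list_xrefs(xml_text):
    """Return (href, kind) pairs for every xref in the topic, in document order."""
    root = ET.fromstring(xml_text)
    results = []
    for xref in root.iter("xref"):
        href = xref.get("href")
        # Treat anything not marked scope="external" as a link to DITA content.
        kind = "external" if xref.get("scope") == "external" else "dita"
        results.append((href, kind))
    return results

links = list_xrefs(TOPIC)
```

The first two links name a specific file, restore.dita, so they resolve only where that file is available alongside this topic.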

Lastly, at this level, most teams utilize minimal or unmanaged metadata and primarily focus on terms, such as index terms.

By migrating the content source to XML and chunking it according to the appropriate topic type, the first level of adoption supports conditionally generating output and positions you for greater reuse and output flexibility at the next level.

DITA Features Used

This adoption level uses the following DITA features:

  • Nested DITA topics. DITA provides the ability to nest topics hierarchically within a single XML file. You can chunk the content by topic type, but you don't have to create separate files.
  • Cross-reference elements. You can create cross-references that link to other topics, to non-DITA files, to Web pages, or to specific sections, including references made from within a bulleted list. These references are hard-coded into the content, which may have implications when you reuse the content.
  • Conditional processing. You must define the processing attribute and its valid values in the .ditaval file in order to conditionally process content. By default, DITA provides three processing values, but you can create additional values by specializing the props attribute.
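
As a rough illustration of the .ditaval file mentioned in the last item, the Python sketch below reads a small set of filter rules and collects the values marked for exclusion. The rule values and the excluded_values helper are assumptions; a real .ditaval file can contain other elements and actions that this sketch ignores.

```python
import xml.etree.ElementTree as ET

# Hypothetical .ditaval content for building the contract-employee version.
DITAVAL = """
<val>
  <prop att="audience" val="contract" action="include"/>
  <prop att="audience" val="permanent" action="exclude"/>
</val>
"""

def excluded_values(ditaval_text):
    """Map each attribute name to the set of values marked action="exclude"."""
    excluded = {}
    for prop in ET.fromstring(ditaval_text).findall("prop"):
        if prop.get("action") == "exclude":
            excluded.setdefault(prop.get("att"), set()).add(prop.get("val"))
    return excluded

rules = excluded_values(DITAVAL)
```

Keeping the rules in a separate file means the same source can be published for a different audience by swapping the rule file rather than editing the content.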
