Dr. Dobb's is part of the Informa Tech Division of Informa PLC

Project of the Month: Xtext DSL Framework

For more on Xtext, see Creating Your Own Domain-Specific Language and Generating Code from DSLs.

Everybody seems to be talking about Domain-Specific Languages (DSLs) these days. Martin Fowler just released his new book on the topic, and popular frameworks such as Spring Roo, Apache Camel, and Grails make heavy use of these specialized little languages that are tailored to a particular purpose. These examples illustrate that there are sweet spots for DSLs in commonly used frameworks. However, the real power of DSLs is unveiled when they are developed for real-world projects and used to describe the main viewpoints of specific systems and problem domains.

An important question to consider is whether developing and maintaining a dedicated language is worth the effort. Isn't language design something complicated that only parser gurus can handle? And what about IDE support? Today's sophisticated programmers demand top-notch editing facilities, so even if we can afford to develop a DSL, can we provide effective IDE integration?

Eclipse Xtext is a framework that makes developing DSLs much simpler. It facilitates agile, iterative evolution, and the output is not only the infrastructure to parse the files of your newly created language, but also a sophisticated IDE with code highlighting, code completion, and error checking for the new DSL. Xtext is the framework used by Maven 3, Eclipse, and other products for the generation of their DSLs.

In his book, Martin Fowler uses the example of a small DSL that handles the commands for operating a hidden compartment in a house. In this article, we will demonstrate how to implement Fowler's secret-compartment state machine using Xtext.

Internal and External DSLs

A domain-specific language can be defined in two ways. One approach is to make use of the syntactic flexibility of a programming language to define an API, where the client code looks like it is written in a completely different language. This is fairly hard to do with rigid languages (like Java); some languages (such as Ruby) provide enough syntactic flexibility so that code can really look like it was written in a language of its own. Such DSLs are known as internal DSLs.
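To make the contrast concrete, here is a minimal sketch of what an internal DSL for a state machine could look like in Java, using a fluent builder. The `MachineBuilder` class and its `state`, `on`, and `goTo` methods are hypothetical names invented for this illustration; they are not part of Xtext or of Fowler's book. Note how the host language's punctuation (dots, parentheses, string quotes) stays visible in the client code:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical fluent builder: an internal DSL hosted in plain Java. */
class MachineBuilder {
    // state name -> (event name -> target state name)
    private final Map<String, Map<String, String>> transitions = new LinkedHashMap<>();
    private String currentState;
    private String pendingEvent;

    static MachineBuilder machine() {
        return new MachineBuilder();
    }

    /** Starts (or reopens) the description of a state. */
    MachineBuilder state(String name) {
        currentState = name;
        transitions.computeIfAbsent(name, k -> new LinkedHashMap<>());
        return this;
    }

    /** Names the event for the transition that goTo() will complete. */
    MachineBuilder on(String event) {
        if (currentState == null) {
            throw new IllegalStateException("on() must follow state()");
        }
        pendingEvent = event;
        return this;
    }

    /** Completes the transition started by on(). */
    MachineBuilder goTo(String target) {
        if (pendingEvent == null) {
            throw new IllegalStateException("goTo() must follow on()");
        }
        transitions.get(currentState).put(pendingEvent, target);
        pendingEvent = null;
        return this;
    }

    Map<String, Map<String, String>> build() {
        return transitions;
    }

    public static void main(String[] args) {
        // The "DSL script" is ordinary Java method chaining.
        Map<String, Map<String, String>> table = machine()
            .state("idle").on("doorClosed").goTo("active")
            .state("active").on("drawerOpened").goTo("waitingForLight")
                            .on("lightOn").goTo("waitingForDrawer")
            .build();
        System.out.println(table);
    }
}
```

The builder gives some structure to the description, but mistakes such as a misspelled state name only surface at run time, which is one reason a rigid host language makes internal DSLs awkward.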

A second approach is to use an external DSL that is not embedded in another host language but explicitly defined using a parser generator or something similar. DSLs developed with Xtext are all external DSLs.

Both approaches have advantages and disadvantages and it is a good idea to know about both. However, most of the commonly cited disadvantages of external DSLs have been eliminated by Xtext.

An Example

Imagine you are running a company that specializes in developing security systems for secret compartments. Because you have many customers, you want to be able to easily define and reconfigure the security systems for different compartments. You need a concise way to define such security systems, such as a DSL.

Mrs. Grant's security system, for instance, might look like this:

events
 doorClosed D1CL
 drawerOpened D2OP
 lightOn L1ON
 doorOpened D1OP
 panelClosed PNCL
end

commands
 unlockPanel PNUL
 lockPanel PNLK
 lockDoor D1LK
 unlockDoor D1UL
end

state idle	
 actions {unlockDoor lockPanel}
 doorClosed => active 

state active
 drawerOpened => waitingForLight
 lightOn => waitingForDrawer 

state waitingForLight
 lightOn => unlockedPanel 

state waitingForDrawer
 drawerOpened => unlockedPanel 

state unlockedPanel
 actions {unlockPanel lockDoor}
 panelClosed => idle 

This code contains a list of the events and commands (with coded names after them), and a list of the various states.

To unlock her panel, Mrs. Grant first needs to close the door. Then she opens a certain drawer and turns on her bed light, in either order. This triggers a secret mechanism that unlocks the panel. Note the reset event doorOpened: if the door is opened, the whole system is reset back into the idle state. The cryptic codes after the events and commands identify the binding to the target platform. Imagine that the controllers can only understand such cryptic names. This kind of name mapping is a typical pattern when integrating DSLs into target software systems, and is very similar to mapping SQL data types to Java or C# types.
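The behavior just described can be sketched as a small hand-written interpreter. This is only an illustration in plain Java, not code produced by Xtext: the class name `SecretCompartment` is made up, and the sketch assumes that a state's actions are issued when the state is entered and that `doorOpened` acts as the reset event from any state.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hand-written sketch of Mrs. Grant's state machine (assumption: not Xtext output). */
class SecretCompartment {
    private static final String INITIAL = "idle";
    private static final String RESET_EVENT = "doorOpened";

    // state name -> (event name -> target state name)
    private final Map<String, Map<String, String>> transitions = new HashMap<>();
    // state name -> commands issued when that state is entered
    private final Map<String, List<String>> actions = new HashMap<>();
    private String current = INITIAL;

    SecretCompartment() {
        addTransition("idle", "doorClosed", "active");
        addTransition("active", "drawerOpened", "waitingForLight");
        addTransition("active", "lightOn", "waitingForDrawer");
        addTransition("waitingForLight", "lightOn", "unlockedPanel");
        addTransition("waitingForDrawer", "drawerOpened", "unlockedPanel");
        addTransition("unlockedPanel", "panelClosed", "idle");
        actions.put("idle", Arrays.asList("unlockDoor", "lockPanel"));
        actions.put("unlockedPanel", Arrays.asList("unlockPanel", "lockDoor"));
    }

    private void addTransition(String from, String event, String to) {
        transitions.computeIfAbsent(from, k -> new HashMap<>()).put(event, to);
    }

    /** Feeds one event to the machine and returns the commands it triggers. */
    List<String> handle(String event) {
        String target;
        if (RESET_EVENT.equals(event)) {
            target = INITIAL; // the reset event works from every state
        } else {
            target = transitions
                .getOrDefault(current, Collections.<String, String>emptyMap())
                .get(event);
        }
        if (target == null) {
            return Collections.emptyList(); // event is ignored in this state
        }
        current = target;
        return actions.getOrDefault(current, Collections.<String>emptyList());
    }

    String currentState() {
        return current;
    }

    public static void main(String[] args) {
        SecretCompartment panel = new SecretCompartment();
        for (String event : Arrays.asList("doorClosed", "drawerOpened", "lightOn")) {
            List<String> commands = panel.handle(event);
            System.out.println(event + " -> " + panel.currentState() + " " + commands);
        }
    }
}
```

Running `main` feeds the three events in order; the last one moves the machine to `unlockedPanel` and returns the `unlockPanel` and `lockDoor` commands. A real controller would then send the corresponding codes (`PNUL`, `D1LK`) to the hardware.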

The Grammar Definition

To begin, we have to create a parser that is capable of reading the cryptic input-files and instantiating an in-memory model, based on the Eclipse Modeling Framework, that represents the parsed state machine. Fowler calls these typed objects a "semantic model." Common parser generators allow you to describe the syntax of documents, but don't provide abstractions to describe strongly typed structures for the semantic representation of their content. Xtext's grammar definition is different. It allows you to describe the concrete syntax as well as the mapping to a semantic model in a clean and concise way:

grammar org.eclipse.xtext.example.FowlerDsl with org.eclipse.xtext.common.Terminals

generate fowlerdsl "http://example.xtext.org/FowlerDsl"

Statemachine:
	'events' (events+=Event)* 'end'
	'commands' (commands+=Command)* 'end'
	(states+=State)*;

Event:
	(resetting?='resetting')? name=ID code=ID;

Command:
	name=ID code=ID;

State:
	'state' name=ID
	('actions' '{' (actions+=[Command])+ '}')?
	(transitions+=Transition)*;

Transition:
	event=[Event] '=>' state=[State];

Even though you may not be familiar with grammars in general, you'll be able to make sense of this. Without going into a detailed explanation (which can be found in the Xtext documentation), there are two especially noteworthy constructs: the assignments and the cross references. The assignments describe how the parsed values should be stored in the semantic model, and the cross references (in square brackets) define how the objects should be interlinked. These constructs are unique features of Xtext's grammar definition language.
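To get a feeling for what the assignments and cross references save you, here is a rough hand-written Java equivalent. It is a sketch under assumptions, not what Xtext generates: the class names (`Statemachine`, `Event`, `Command`, `State`, `Transition`, `FowlerParser`) and the `states` and `transitions` fields are chosen for illustration, and plain Java classes stand in for the EMF-based semantic model. Assignments become field writes, and cross references become name lookups that yield object references.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain Java stand-ins for the typed objects of the semantic model.
class Event { String name; String code; boolean resetting; }
class Command { String name; String code; }
class Transition { Event event; State state; }
class State {
    String name;
    final List<Command> actions = new ArrayList<>();
    final List<Transition> transitions = new ArrayList<>();
}
class Statemachine {
    final List<Event> events = new ArrayList<>();
    final List<Command> commands = new ArrayList<>();
    final List<State> states = new ArrayList<>();
}

/** Hand-written sketch of a parser and linker for the state machine syntax. */
class FowlerParser {
    private final String[] tokens;
    private int pos = 0;

    FowlerParser(String text) {
        // Braces may touch identifiers, so pad them before splitting on whitespace.
        tokens = text.replace("{", " { ").replace("}", " } ").trim().split("\\s+");
    }

    Statemachine parse() {
        Statemachine sm = new Statemachine();
        Map<String, Event> events = new HashMap<>();
        Map<String, Command> commands = new HashMap<>();
        Map<String, State> states = new HashMap<>();
        Map<Transition, String> targets = new HashMap<>(); // unresolved state names

        expect("events");
        while (!peek("end")) {
            Event e = new Event();
            if (peek("resetting")) { pos++; e.resetting = true; } // resetting?='resetting'
            e.name = next();                                      // name=ID
            e.code = next();                                      // code=ID
            sm.events.add(e);                                     // events+=Event
            events.put(e.name, e);
        }
        expect("end");
        expect("commands");
        while (!peek("end")) {
            Command c = new Command();
            c.name = next();
            c.code = next();
            sm.commands.add(c);
            commands.put(c.name, c);
        }
        expect("end");
        while (pos < tokens.length) {
            expect("state");
            State s = new State();
            s.name = next();
            if (peek("actions")) {
                pos++;
                expect("{");
                while (!peek("}")) s.actions.add(resolve(commands, next())); // [Command]
                expect("}");
            }
            while (pos < tokens.length && !peek("state")) {
                Transition t = new Transition();
                t.event = resolve(events, next());                // [Event]
                expect("=>");
                targets.put(t, next());                           // [State], linked below
                s.transitions.add(t);
            }
            sm.states.add(s);
            states.put(s.name, s);
        }
        // Second pass: a transition may point at a state declared later in the file.
        for (Map.Entry<Transition, String> e : targets.entrySet()) {
            e.getKey().state = resolve(states, e.getValue());
        }
        return sm;
    }

    private boolean peek(String token) {
        return pos < tokens.length && tokens[pos].equals(token);
    }

    private String next() {
        if (pos >= tokens.length) throw new IllegalArgumentException("Unexpected end of input");
        return tokens[pos++];
    }

    private void expect(String token) {
        String actual = next();
        if (!actual.equals(token)) {
            throw new IllegalArgumentException("Expected '" + token + "' but found '" + actual + "'");
        }
    }

    private static <T> T resolve(Map<String, T> scope, String name) {
        T target = scope.get(name);
        if (target == null) throw new IllegalArgumentException("Unresolved reference: " + name);
        return target;
    }
}
```

Calling `new FowlerParser(text).parse()` returns a `Statemachine` whose transitions point directly at `Event` and `State` objects rather than at names. Everything in this listing, including the deferred linking of forward references, is what the few grammar lines ask the framework to provide.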

Aside from the parser, there is much more that Xtext derives from the grammar. For example, there is infrastructure for linking and scoping, as well as for static analysis (validation), based on the semantics of the language. You even get an "unparser," which allows you to instantiate or modify semantic models in memory and write them back as text using the very same syntax defined in the grammar. Even whitespace and comments are preserved, and you can also define formatting rules.

One of the best things about Xtext is the rich Eclipse integration that is available for every language. Based solely on the aforementioned grammar definition, Xtext provides a fully functional Eclipse editor. It supports all the features that you'd expect from other mature IDEs, such as on-the-fly validation and error indication, syntax coloring, content assist, and code navigation. You can find references to your semantic elements, look them up in a global index, and share concepts across different languages.

Modularity Concepts

As soon as you build a language to develop a fair-sized system, you'll have to think about modularization. It's a good idea to split logically different things into different physical files, and Xtext supports this. Through the scoping and visibility rules of your language, the framework resolves references across file boundaries based on the names of the semantic elements (instead of concrete file references). As with many other service implementations in Xtext, this follows the lessons learned from programming languages of the past: the defaults are quite similar to the "namespace" concept of C# and Java.

Behind the scenes, Xtext uses an index in the IDE to ensure scalability for languages with a large number of artifacts. This index is used for various other features, too. Like the "Open Type" dialog in Eclipse's Java tooling, which finds and opens Java types anywhere in the workspace, Xtext provides the ability to look up arbitrary semantic elements in your IDE, as well as a global "Find References" capability to identify other objects that refer to a particular element. The index is maintained by an Eclipse builder that automatically picks up any changes in your text files and revalidates the affected files in the background.
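The idea of a name-based index can be sketched in a few lines of Java. This is a deliberately simplified, hypothetical model (the `NameIndex` class, its methods, and the file names are invented here and are not Xtext's API): each file reports the qualified names it exports and the names it refers to, and the index answers "where is this defined?", "who refers to this?", and "which files need revalidation after a change?".

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical, simplified index keyed by qualified names instead of file paths. */
class NameIndex {
    // qualified name -> file that defines the element
    private final Map<String, String> definedIn = new HashMap<>();
    // qualified name -> files that refer to the element
    private final Map<String, Set<String>> referencedFrom = new HashMap<>();

    /**
     * Called when a file was added or changed. Replaces the file's old entries
     * and returns the other files that should be revalidated.
     */
    Set<String> update(String file, Collection<String> exported, Collection<String> referenced) {
        // Every name the file exported before or exports now may resolve differently.
        Set<String> touched = new HashSet<>(exported);
        for (Map.Entry<String, String> e : definedIn.entrySet()) {
            if (e.getValue().equals(file)) {
                touched.add(e.getKey());
            }
        }
        definedIn.values().removeIf(file::equals);
        for (String name : exported) {
            definedIn.put(name, file);
        }

        // Replace the outgoing references recorded for this file.
        for (Set<String> files : referencedFrom.values()) {
            files.remove(file);
        }
        for (String name : referenced) {
            referencedFrom.computeIfAbsent(name, k -> new TreeSet<>()).add(file);
        }

        Set<String> affected = new TreeSet<>();
        for (String name : touched) {
            affected.addAll(referencedFrom.getOrDefault(name, Collections.<String>emptySet()));
        }
        affected.remove(file);
        return affected;
    }

    /** Resolves a name to its defining file, or null if nothing exports it. */
    String resolve(String qualifiedName) {
        return definedIn.get(qualifiedName);
    }

    /** A minimal "Find References": all files that mention the name. */
    Set<String> findReferences(String qualifiedName) {
        Set<String> files = referencedFrom.getOrDefault(qualifiedName, Collections.<String>emptySet());
        return new TreeSet<>(files);
    }
}
```

If a hypothetical `events.sm` exports `security.doorClosed` and `grant.sm` refers to it, then removing the event from `events.sm` makes `update` report `grant.sm` as affected, which is the moment a builder would revalidate it and flag the dangling reference.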

The same data is used to trigger processing components such as compilers or code generators. They can participate in the build process and generate or update other files such as Java code that is derived from your model files. The Xtext framework provides a dedicated customization hook for that purpose and bundles the Xtend language, which is specialized for code generation.

The snippet in Figure 1 illustrates a small template that creates an interactive state machine to test the logic of secret compartments:

Figure 1: Xtend template to generate Java code.

Some people don't like code generation, presumably because it adds complexity to the development process and makes turnarounds longer. Although Xtext works perfectly well with interpreters, the tight integration with the Eclipse infrastructure enables code generation to be done transparently in the background, so a longer development turnaround time is easily absorbed. But again, it's up to you whether you want to use code generation or an interpreter to process your DSL scripts.
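To illustrate the generator side of that choice, here is a rough Java analogue of what a code-generation template does. It is not the Xtend template from Figure 1: the `GeneratorSketch` class, its `generate` method, and the shape of the generated class are assumptions made for this sketch, and a plain transition table stands in for the semantic model.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical generator: turns a transition table into Java source text. */
class GeneratorSketch {

    /**
     * @param className   name of the class to generate
     * @param transitions state name -> (event name -> target state name)
     * @return Java source for a class with a static next(state, event) method
     */
    static String generate(String className, Map<String, Map<String, String>> transitions) {
        StringBuilder out = new StringBuilder();
        out.append("public class ").append(className).append(" {\n");
        out.append("    public static String next(String state, String event) {\n");
        out.append("        switch (state + \"/\" + event) {\n");
        for (Map.Entry<String, Map<String, String>> state : transitions.entrySet()) {
            for (Map.Entry<String, String> t : state.getValue().entrySet()) {
                // One case label per transition in the model.
                out.append("            case \"").append(state.getKey()).append('/')
                   .append(t.getKey()).append("\": return \"")
                   .append(t.getValue()).append("\";\n");
            }
        }
        out.append("            default: return state;\n"); // unknown events are ignored
        out.append("        }\n");
        out.append("    }\n");
        out.append("}\n");
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> table = new LinkedHashMap<>();
        Map<String, String> idle = new LinkedHashMap<>();
        idle.put("doorClosed", "active");
        table.put("idle", idle);
        System.out.println(generate("GrantStateMachine", table));
    }
}
```

An interpreter would walk the same table at run time instead of emitting source text. The generator trades an extra build step for code that can be read, debugged, and compiled like any other Java class.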
