Design

Domain-Specific Languages & DSL Workbench

By Griffin Caprio, April 07, 2006

Domain-specific languages can enable richer development environments than those provided by general-purpose languages.

Griffin is a Microsoft certified consultant with ThoughtWorks in Chicago. He is an active developer on several open-source .NET projects, including Spring.Net and .NET Mock Objects. He can be contacted at [email protected].

Within organizations, software development usually consists of informal domain-modeling sessions with the resulting domain knowledge residing in the heads of key individuals. Trying to translate these vague and intangible concepts (boxes and lines, for instance) into concrete development artifacts (source code) can be problematic not only because of the inherent complexity of the domain and its nuances, but also because implementation details can be left up to the interpretation of the person doing the implementation. Hardcoding this domain knowledge into a domain-specific language (DSL) reduces the chance for incorrect interpretation, while providing a reusable and concrete implementation of domain models and domain concepts.

DSLs are computer languages tailored to specific domains, enabling richer and more flexible development environments than those provided by general-purpose languages. Most applications can be broken into distinct components, ranging from application agnostic (data access) to application specific (specific business rules). Figure 1 illustrates how a typical application can be separated into different slices, both vertically and horizontally. Development languages can be classified in a similar way, depending on the type of problems they are adept at solving. Some languages are suited for horizontal components, affecting multiple types of applications, regardless of the actual domain they are being used in—SQL and HTML, for example. Other languages are suited for vertical components, describing different vertical domains, such as finance, insurance, or advertising.

Figure 1: How a typical application can be separated into different slices.

The process of creating domain-specific languages is expensive and time-consuming for two reasons:

To be valuable, languages must be used in the development of hundreds—sometimes thousands—of applications. Therefore, they must be designed to achieve a high level of reuse. This is the only way to recoup the cost of developing such a language. Because many IT shops are constrained by workload and resources, it is difficult to imagine an IT shop taking on such an undertaking. Also, most IT shops don't have the necessary skill set to design a complete programming language.
Even if an organization were to create a language from scratch, achieving the required levels of reuse would require arming developers with the tools to use the new language. Because most development is done using IDEs, organizations must act not only as language designers, but as IDE designers as well. With the amount of time developers spend in their IDEs, usability is critical, and it is unrealistic to expect a company to meet the needs of its developers with tools that are as easy to use as their existing ones.

High upfront development costs combined with the need for reuse have led to the development of DSLs that describe horizontal (or infrastructure) spaces. A classic example of a DSL targeting horizontal space is Microsoft's Visual Basic. Before VB, developers wrote boilerplate code for even the simplest of GUI interactions, often requiring thousands of lines of code to be copied from one project to the next to achieve reuse.

With the introduction of Visual Basic, Microsoft effectively introduced a DSL for Windows GUI programming. Nouns such as Form, Button, and Control went from being abstract concepts used to describe the end product of development, to being actual language constructs. VB developers could create a form using drag-and-drop in a fraction of the time it took to create one using MFC APIs and handwritten source code. Even though VB still relies on the underlying Windows APIs, their use is abstracted by the DSL layered on top.

Current 3GLs target a low level of domain-specific modeling. At best, these languages describe minor slices and aspects of domain concepts that must then be hand assembled to describe higher level domain-specific concepts. For many types of applications, this level of control and complexity is unnecessary. Microsoft raised the abstraction level of Windows GUI programming because many developers simply did not need to work at the level of granularity provided by MFC.
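To make the shift in abstraction level concrete, here is a minimal sketch in Python of what it means for nouns such as Form and Button to become constructs in their own right. It is not how VB is implemented; the class names mirror the nouns mentioned above, while the methods add and click are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Button:
    """A domain noun promoted to a first-class construct."""
    caption: str
    on_click: Callable[[], str]


@dataclass
class Form:
    """A form owns its controls; no low-level windowing plumbing is exposed."""
    title: str
    controls: List[Button] = field(default_factory=list)

    def add(self, control: Button) -> "Form":
        self.controls.append(control)
        return self  # returning self allows chained, declarative-looking setup

    def click(self, caption: str) -> str:
        # Dispatch is handled by the abstraction, not by hand-written code.
        for control in self.controls:
            if control.caption == caption:
                return control.on_click()
        raise KeyError(f"no button captioned {caption!r}")


# The "program" now reads in the vocabulary of the GUI domain.
login = (
    Form("Login")
    .add(Button("OK", lambda: "accepted"))
    .add(Button("Cancel", lambda: "dismissed"))
)
```

The developer describes a form with buttons and never touches the machinery underneath, which is the same trade the paragraph above describes: less control over low-level detail in exchange for working directly with domain concepts.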

The high level of productivity gained from creating DSLs has led many companies to use UML (or some other graphical form of modeling) to create DSLs. However, these forms of modeling are constrained by their original purpose: each is meant for documentation, not code generation. This means that many aspects of the resulting language cannot be represented using a standard modeling language.

Because languages such as UML are aimed at solving the modeling problems of many different applications and verticals, their use is similar to that of other 3GLs. Many components of each language need to be combined to represent even one domain concept of one company's domain. Despite these shortcomings, many companies use modeling languages that are too granular for typical development, forcing them to reinvent the wheel with every project.

Even if such languages were adequate, there would still be difficulties in translating these models into usable development artifacts because developers are not able to work with the models at design time. Either tool support does not exist for mapping the models to source code, or the tool support exists in a separate tool, with only a one-time generation and import into a development IDE. After the initial import, you must work at a lower level of granularity than was supposed to be abstracted away by the model.

Language Design and Tool Support

Fortunately, many of the shortcomings with current languages are being addressed as vendors are starting to support the notion of high-level DSLs for application development, making it easier for organizations to create their own custom DSLs. In addition, many IDE vendors are creating extension toolkits for their IDEs that let organizations create custom DSLs for their domains, and integrate the resulting language into their IDE; see Table 1. Steps such as these address the problems of language design and definition, and tool support for the resulting language. While creating the language requires some language design, the whole process starts resembling a cross between a typical domain-modeling session and VB-form development. With the high level of integration between an IDE and its DSL toolkit, any resulting languages are instantly usable from within an already standard IDE. Not only does this let organizations standardize concepts across teams, departments, and applications, but it also lets them standardize programming style in a consistent representation across applications.

Before getting into the details of a specific DSL implementation, it is important to understand the larger issues involved in enabling systematic reuse within organizations. DSLs are part of a larger software development movement called "software factories." In particular, the concept of a software factory aims to bring traditional factory-like production characteristics to software development by achieving a level of product commoditization similar to that found in manufacturing. Software factories revolve around recognizing the difference between economies of scale and economies of scope.

As I've pointed out before (see http://www.theserverside.net/articles/showarticle.tss?id=SFRefactoringIndustry), economies of scale occur when the initial development of a design and subsequent construction of a prototype result in multiple copies of that prototype being created. This is similar to the fabrication industry when a custom part is created and that part is used to produce hundreds or thousands of similar parts. In the manufacturing industry, it's similar to a machine being built that can produce screws in bulk. In this case, reuse occurs in the later stage of production, namely construction.

Likewise, economies of scope occur when multiple similar (but still unique) designs are produced in groups, as opposed to individually. In "Software Factories: Assembling Applications with Patterns, Models, Frameworks, and Tools" (http://www.oopsla.org/2005/ShowEvent.do?id=158), Jack Greenfield and Steve Cook use the example of a car manufacturing plant that uses existing designs (chassis, body, and interiors) to produce several lines of distinct cars. Each product line is complete with custom features, but all share that same underlying design. Here the reuse occurs earlier in production, namely design and prototyping.

Software factories aim to let you better utilize economies of scope during the development process. Three major components make up a software factory:

Models and patterns.
Domain-specific languages.
Software product lines.

During the development of a software factory, organizations mine and create models and patterns to represent their domain. These models and patterns can then be represented within a DSL. Finally, this DSL can be used to create one or several software product lines, encapsulating a company's software artifacts. Interestingly, Microsoft's DSL Workbench (http://msdn.microsoft.com/vstudio/teamsystem/workshop/DSLTools/default.aspx), the suite of DSL tools that I focus on in this article, fits into the larger software factory model by integrating with Visual Studio Team System to provide many of the reuse and manageability features required to create software factories.

Microsoft DSL Workbench

There are several components that make up Microsoft's DSL Workbench for Visual Studio.NET:

A new project wizard for Visual Studio. This wizard creates new projects in Visual Studio that let users create a domain model and corresponding designer for their language.
A graphical designer for editing/creating domain models.
An XML designer format for defining a graphical designer for working with your domain model.
Code generators that take in a domain model and designer definition, and produce code as output.
A framework for producing code-generation templates.

Figure 2 illustrates the workflow for creating your own DSL using the DSL Workbench. The same general steps apply when creating a DSL with any of the DSL toolkits in Table 1.

Figure 2: Steps required to create a DSL using DSL Workbench.

Vendor	IDE	Toolkit	URL
IBM	Eclipse	Eclipse Modeling Framework	http://www.eclipse.org/emf/
Sun	NetBeans	Project Ace	http://research.sun.com/features/ace/
JetBrains	IntelliJ	Meta Programming System	http://www.jetbrains.com/mps/
Microsoft	Visual Studio.Net	DSL Workbench	http://msdn.microsoft.com/vstudio/teamsystem/workshop/DSLTools/default.aspx

Table 1: DSL vendors.

The first step in creating a custom DSL is to use the New Project Wizard from within Visual Studio.NET 2005 (step 1 in Figure 2), and select the Domain Language-Specific Designer from within the available project types. Using a combination of classes, relationships, and properties, the structure of your DSL can be designed using graphical notation (step 2 in Figure 2). Figure 3 is an example of a domain model defined using the DSL Workbench.

Figure 3: Defining a domain model.
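The DSL Workbench captures a domain model graphically, but the underlying idea of classes, relationships, and properties can be sketched in a few lines of code. The following Python fragment is only an illustration of that structure, not the Workbench's own format; the type names (DomainClass, Relationship, DomainModel) and the insurance example with Policy and Claim are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DomainClass:
    name: str
    # Maps a property name to the name of its type.
    properties: Dict[str, str] = field(default_factory=dict)


@dataclass
class Relationship:
    name: str
    source: str
    target: str
    multiplicity: str = "0..*"


@dataclass
class DomainModel:
    name: str
    classes: Dict[str, DomainClass] = field(default_factory=dict)
    relationships: List[Relationship] = field(default_factory=list)

    def add_class(self, cls: DomainClass) -> None:
        self.classes[cls.name] = cls

    def relate(self, rel: Relationship) -> None:
        # Reject relationships that point at classes the model does not define.
        for end in (rel.source, rel.target):
            if end not in self.classes:
                raise ValueError(
                    f"unknown class {end!r} in relationship {rel.name!r}"
                )
        self.relationships.append(rel)


model = DomainModel("Insurance")
model.add_class(DomainClass("Policy", {"number": "str", "premium": "float"}))
model.add_class(DomainClass("Claim", {"amount": "float"}))
model.relate(Relationship("PolicyHasClaims", source="Policy", target="Claim"))
```

Validating relationships when they are added is one possible design choice; a graphical designer enforces the same rule implicitly, because a connector can only be drawn between shapes that already exist.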

A designer definition lets you craft how the DSL is rendered for its users. Shapes, icons, and component properties can all be configured (step 3 in Figure 2). (In the release of the DSL Workbench at this writing, a domain model's designer structure can only be edited in a textual manner; GUI editing of a designer definition is not allowed.)

While crafting a DSL designer definition, custom resources referenced from within the designer definition can be added for custom graphics, icons, and bitmaps. All the custom resources are then packaged up with the language definition when compiled (step 4 in Figure 2).

Once a domain model and designer are created, code templates can be added to enrich and augment the DSL (step 5 in Figure 2). These code templates are used to generate the actual code from the actual instances of the domain model that users create. (This item, arguably the most important, will be available in future releases.)
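Because the template framework was not yet available at this writing, the following is only a rough sketch of the general idea behind template-based code generation, written in Python with the standard library's string.Template. The template text, the generate_class function, and the Policy example are all hypothetical and do not reflect the Workbench's actual template syntax.

```python
from string import Template

# Hypothetical templates that emit C#-style source text.
CLASS_TEMPLATE = Template("public class $name\n{\n$members}\n")
PROPERTY_TEMPLATE = Template("    public $type $prop { get; set; }\n")


def generate_class(name, properties):
    """Render one class from a model instance: a name plus {property: type}."""
    members = "".join(
        PROPERTY_TEMPLATE.substitute(type=type_name, prop=prop_name)
        for prop_name, type_name in properties.items()
    )
    return CLASS_TEMPLATE.substitute(name=name, members=members)


source = generate_class("Policy", {"Number": "string", "Premium": "decimal"})
print(source)
```

The point of the separation is that the templates change only when coding or architectural standards change, while the model instances change with every application built using the DSL.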

Once all aspects of the DSL have been created, the resulting DSL is compiled into a package that can be deployed into target Visual Studio.NET 2005 installs (step 6 in Figure 2). The DSL can be debugged from a prepackaged Visual Studio.NET 2005 experimental build. This allows DSL designers to test and debug their DSL under regular usage scenarios (step 7 in Figure 2). Once a DSL is ready, users can deploy the DSL into their Visual Studio.NET install (step 8 in Figure 2) and begin generating artifacts and templates with the DSL.

When your language has been created, it is registered with Visual Studio.NET. This lets you create a new project of your language type. Figure 4 presents an example use of the language defined in Figure 3. Optionally, an installer for the new DSL can be created to allow for easy installation on developers' machines (step 9 in Figure 2).

Figure 4: Utilizing the language defined in Figure 3.

Once you have the DSL installed on your Visual Studio.NET instances, you can begin to create applications with the DSL in a graphical fashion, akin to creating GUI applications using a RAD-style environment.

DSLs and Software Product Lines

A software product line is typically defined as "a set of software intensive systems sharing a common, managed set of features that satisfy the specific needs of a particular market segment or mission and that are developed from a common set of core assets in a prescribed way" (see Software Product Lines: Practices and Patterns, by Paul Clements; Addison-Wesley Professional, 2002). In other words, software product lines are development environments that let you create applications that are assembled from a set of common components. These components are typically mined from previous development efforts, repackaged, and released as reusable assets.

There are two main components of development using a software product line (SPL):

The first is Software Product Line Development (SPLD). Within SPLD, the development team responsible for developing the SPL harvests common and reusable components for integration within the SPL. Typically, components designated for reuse within a company's software product line are full-fledged components, with "variation points" clearly defined and documented. These variation points are then exploited by the second half of an SPL, the Product Development staff. (Variation points are aspects of an architecture that can vary from implementation to implementation, thus allowing the core architecture to accommodate any number of implementation-specific requirements. Variation points are created within the stable of core assets by the Product Line Development team, using any number of methods, including inheritance, extension points, and parameterization. These variation points are configured upon application construction by the product development team.)
The second half of an SPL is the Product Development staff, which uses the common assets defined by the SPLD team and builds the final product from them by configuring the assets' variation points and doing any application-specific development that is needed. Constant feedback to the SPLD team is a requirement, as the feedback not only helps refine existing components, but also identifies new components ripe for inclusion within the software product line. Figure 5 is a simple description of the product line creation circle.

Figure 5: The product line creation circle.
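The three variation-point mechanisms named above (inheritance, extension points, and parameterization) can be sketched on a single core asset. The Python example below is an illustration under assumed names: QuoteEngine, CoastalHomeQuoteEngine, and the rates used are hypothetical and stand in for whatever assets a real product line team would harvest.

```python
from typing import Callable, List


class QuoteEngine:
    """Hypothetical core asset owned by the product line development team."""

    def __init__(self, base_rate: float):
        # Variation point (parameterization): each product supplies its rate.
        self.base_rate = base_rate
        # Variation point (extension point): products register adjustments.
        self.adjustments: List[Callable[[float], float]] = []

    def register_adjustment(self, adjustment: Callable[[float], float]) -> None:
        self.adjustments.append(adjustment)

    def risk_factor(self) -> float:
        # Variation point (inheritance): subclasses may override this hook.
        return 1.0

    def quote(self, insured_value: float) -> float:
        premium = insured_value * self.base_rate * self.risk_factor()
        for adjust in self.adjustments:
            premium = adjust(premium)
        return round(premium, 2)


class CoastalHomeQuoteEngine(QuoteEngine):
    """A product-specific variant built by the product development staff."""

    def risk_factor(self) -> float:
        return 1.5


standard = QuoteEngine(base_rate=0.01)
coastal = CoastalHomeQuoteEngine(base_rate=0.01)
coastal.register_adjustment(lambda premium: premium + 25.0)  # flat policy fee
```

The core quote method never changes between products. Everything that differs is confined to the three documented variation points, which is what lets the product development staff assemble a new product without modifying the shared asset.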

DSLs are utilized within an SPL by enabling the creation and encapsulation of domain-specific concepts and frequently used domain abstractions.

Components can be documented and packaged using domain-specific terminology. Variation points and customization aspects can be described using the domain language used by domain experts, allowing for a clearer mapping between business requirements and the resulting application.

Conclusion

There are, of course, DSL toolkits for other languages and IDEs. Currently, most of the work is centering on the two mainstream enterprise programming environments: Java and .NET. Although Java has support for DSL creation using the various available IDEs, aside from Sun's Project Ace (http://research.sun.com/projects/ace/), it is unclear whether the other IDE projects will make an attempt at filling other areas of the software factory paradigm, such as enabling systematic reuse for other development artifacts (test plans, requirements, and the like).

There are several nontechnical benefits to be had by adopting DSLs for use within a company's development. For one thing, a second level of reuse occurs when creating a DSL: domain knowledge reuse. Domain-specific concepts can be utilized by developers without requiring a deep understanding of the intricacies and nuances of the domain. Second, developers can become instantly productive within the software development team by utilizing a preconfigured instance of an IDE, tailored to a company's specific development process and standards. This also lets a company enforce development standards at a low level, covering not only coding standards, but architectural standards as well.

Domain-specific languages are already commonplace in developer toolboxes. Languages such as SQL (relational data manipulation), SGML (data markup), and HTML (hypertext publishing) are already widely used across a variety of applications and domains. These languages currently operate in a horizontal fashion. To begin to achieve the same style of productivity and efficiency, the software industry needs to begin to push this type of abstraction upwards, into the vertical space. To do this, developers need an efficient means of not only creating a DSL, but also integrating the resulting DSL into their everyday development tools and IDEs. Recent developments and toolkits by leading IDE vendors will meet these requirements for many major .NET and Java IDEs. Companies that begin to standardize their products, and increase reuse across their enterprise by moving toward a product line-based development architecture, will find their development efforts much more agile and responsive to customer demands. This will enable quicker time to market and a tangible competitive edge. DSLs will become the bridge from abstract domain models and patterns to concrete software product line implementations.

DDJ
