Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


XML and Software Configuration


Jul00: XML and Software Configuration

Tony is a Sun Certified Java Developer specializing in telecommunications consulting for ObjectWave Corp. You can reach him at [email protected].


XML provides a platform-independent description of data and promises to revolutionize the way any data-driven application manipulates data. To fully realize the benefits of XML, therefore, you must see everything in terms of data. This even includes nonobvious things such as software configuration. (Configuration is data, after all.)

Currently, Java provides the Properties class for software configuration. This approach works well for static configurations, such as user preferences and GUI settings. However, Properties may not be the best choice for software that has a dynamic or otherwise complex configuration. Luckily, XML, combined with the flexible tree structure of the Document Object Model (DOM), is a good match for both dynamic and static configurations.

In this article, I'll explore software configuration and how it relates to XML. In the process, I will present an XML configuration markup language and Java framework that you can use as the basis for any Java configuration engine (available electronically; see "Resource Center," page 5).

What is Software Configuration?

Software configuration is a broad topic with a definition that changes with domain. As a result, formulating a single, inclusive definition is difficult. Everyday PC users may know configuration as it relates to setting the background color of the desktop or setting their customer profile at an online store. To PC users, configuration lets them change the look, feel, and behavior of an application. This is an important use of configuration since it lets users tailor applications to their own personal tastes. Users expect and demand such features.

However, as we switch domains, the meaning of software configuration changes. In the telecommunications world, for instance, software often runs on large-scale distributed systems. As a node in a system comes online, it must discover the other nodes in the system and download run-time configuration information. This information may include port information, run-time application mixes, and communication links. Because the information and its structure may change at run time, it is not sufficient for the new node to simply read a local configuration file. Instead, the node must download the newest configuration and discover its contents dynamically at run time.

Using the two previous domains, we can formulate a generic definition of software configuration. Configuration describes any aspect of a piece of software that is variable. These include:

  • Behavior, such as timeouts, poll time, and max connections.
  • Structure, which identifies objects to serve.
  • Appearance, including background, color, and font size.
  • Data, such as batch data set to process.

Because they are variable, configuration values are never hard-coded inside the application. Instead, the values are defined somewhere separate from the application. This separate place can be as simple as a file, or something as complicated as a remote database or global software registry.

Based on the previous examples, we can make one more generalization. Independent of domain, you can break configuration into static configuration and dynamic configuration.

Static Configuration

The name/value pairing of data characterizes static configuration. The application reading the configuration expects a specific name/value pair to appear in the configuration. When the application needs to retrieve a specific value, it looks up the value by its name. If the name is not found in the configuration, the application either falls back onto a default value or raises some kind of error. Since the configuration is rigidly defined, its structure rarely changes. If the structure does change, the application normally requires a corresponding change in its source code.

Currently, Java addresses static configurations through the Properties class found in the java.util package. This class reads name/value pairings from a stream. So, for example, a GUI application could define a pair "background=white" in a file. When the application starts up, it creates a Properties object and streams the configuration file into it. When the application needs the background value, it retrieves the value by calling the Properties object's getProperty method and passing it the string "background".
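As a minimal sketch of this static lookup, the following example loads the "background=white" pair and reads it back with getProperty. The class name StaticConfigDemo, the helper load, and the font.size key are assumptions made for illustration; the text is read from a string here, where a real application would open the configuration file as a stream.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical demo class; not part of the article's framework.
class StaticConfigDemo {

    // Streams name/value pairs into a Properties object.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = load("background=white\n");

        // Look up a value by its name.
        System.out.println(props.getProperty("background"));

        // A missing name falls back onto a default value.
        System.out.println(props.getProperty("font.size", "12"));
    }
}
```

The two-argument form of getProperty covers the "fall back onto a default" case; the one-argument form returns null for a missing name, which the application can turn into an error.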

Generally speaking, static configuration is a simple mechanism. Luckily, most application programmers will face only this configuration scheme. The challenges posed by static configuration mainly lie in finding an efficient and reusable retrieval mechanism. This mechanism must also prevent the inadvertent corruption of configuration data by outside entities.

Dynamic Configuration

Change characterizes dynamic configuration. Dynamic configuration values are not based on name/value pairs. Instead, the information contained in the configuration can change over time without requiring changes in the corresponding application. The application must adapt to the changes in the contents of the configuration. Of course, dynamic configurations are not magical. The application will expect a certain configuration format. However, the format is not restricted to known name/value pairings.

Generally speaking, dynamic configuration is not a simple mechanism. Instead, dynamic configurations pose a significant challenge to the developer. Currently, Java does not provide a mechanism specifically suited to the demands of a dynamic configuration. You must roll a custom solution using a variety of Java technologies including Properties, Serialization, and Reflection.

You could attempt to use name/value or name/list pairs to address dynamic configuration. However, this approach is not clean. First, you must update the name/value pair parsing code each time the list of values becomes deeper. Second, such an approach would prove error prone, and manual configuration would be difficult. You must develop a dynamic configuration scheme that is flexible enough to address any combination of configuration data without a rewrite.

So, what would be an example of a dynamic configuration? Take a web server Servlet plug-in. When the plug-in comes online it will need to load a certain mix of Servlet beans. Each bean will also have its very own configuration information. The bean's configuration may even include other nested beans, and those beans may contain beans, and so on. Simple Properties just don't cut it for such a complicated configuration. A Servlet plug-in would need a way to dynamically discover its configuration at run time.

XML to the Rescue

So, how can XML address the challenges posed by dynamic configuration? Simply put, XML describes data. Configuration is data. Therefore, configuration and XML are a natural match. But XML delivers much more.

First, it turns out that XML is good at expressing dynamic configurations. When parsing, a parser converts the XML tags and data into a Document Object Model (DOM) representation. Because DOM represents the XML data in a tree structure, it is possible to dynamically discover a configuration by simply traversing a tree. As a result, XML naturally lends itself to dynamic configurations. As an added bonus, any dynamic configuration can describe static configurations. Thus, XML works for both the static and dynamic cases.
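To make the idea of discovering a configuration by traversal concrete, here is a small sketch that parses an XML string and records the path of every element it finds, without knowing the element names in advance. It assumes the standard JAXP DocumentBuilderFactory API rather than any particular parser from the article, and the class name TreeWalkDemo and the plugin/bean/timeout tags are invented for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of the article's framework.
class TreeWalkDemo {

    // Parses the XML and returns the path of every element, in document order.
    static List<String> discover(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> paths = new ArrayList<>();
            walk(doc.getDocumentElement(), "", paths);
            return paths;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse configuration", e);
        }
    }

    // Recursive walk: the depth of nesting is discovered, not assumed.
    private static void walk(Node node, String parentPath, List<String> paths) {
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return; // skip text, comments, and so on
        }
        String path = parentPath + "/" + node.getNodeName();
        paths.add(path);
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            walk(child, path, paths);
        }
    }

    public static void main(String[] args) {
        String xml = "<plugin><bean><bean/></bean><timeout>30</timeout></plugin>";
        for (String path : discover(xml)) {
            System.out.println(path);
        }
    }
}
```

Because the walk is recursive, adding another level of nested beans to the input requires no change to the code.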

Second, XML is both application and platform independent. Combined with Java, which is also platform independent, XML lets you express configuration data that is truly portable between platforms and applications. Once you have a configuration markup, you can bring that configuration scheme anywhere. Any application that understands the markup language can process or produce the configuration data. This means that you can easily write applications for scripting and maintaining the configuration data in the language of your choice. Furthermore, the data can be easily persisted into a database that understands the configuration's markup. Really, the possibilities that an XML-based configuration description opens up are endless.

Finally, any configuration scheme should be easy to maintain. A properly designed markup language should be easy both to learn and to read, which lessens user error in human-generated configuration files. In an ideal world, dedicated programs would generate all configuration files through a user-friendly graphical interface. Unfortunately, such a utility is not always readily available. Some XML parsers also employ validation. A validating parser quickly points out any errors in the configuration file, which forces the configuration author to follow the markup definition completely. It also prevents an application from processing an improperly written configuration.
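The effect of validation can be sketched with the standard JAXP API, which is an assumption here since the article does not tie validation to a specific API. The example declares a tiny DTD in the document's internal subset, turns validation on, and collects the validity errors that the parser reports. The class name ValidationDemo and the two-element Parameter markup are invented for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical demo class; not part of the article's framework.
class ValidationDemo {

    // A tiny DTD, embedded so the example needs no external file.
    static final String DTD =
            "<!DOCTYPE Parameter [\n"
          + "<!ELEMENT Parameter (Name, Value)>\n"
          + "<!ELEMENT Name (#PCDATA)>\n"
          + "<!ELEMENT Value (#PCDATA)>\n"
          + "]>\n";

    // Returns the validity problems found; an empty list means the document is valid.
    static List<String> validate(String xml) {
        List<String> problems = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // validate against the DTD
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignored in this sketch */ }
                public void error(SAXParseException e) { problems.add(e.getMessage()); }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            Document ignored = builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            problems.add("fatal: " + e.getMessage());
        }
        return problems;
    }

    public static void main(String[] args) {
        String good = DTD + "<Parameter><Name>timeout</Name><Value>30</Value></Parameter>";
        String bad = DTD + "<Parameter><Name>timeout</Name></Parameter>"; // Value is missing
        System.out.println("good: " + validate(good));
        System.out.println("bad:  " + validate(bad));
    }
}
```

An application that refuses to continue when the returned list is non-empty never sees an improperly written configuration.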

The Dynamic Configuration Markup Language (DCML)

A configuration markup language needs to address the following design requirements:

  • The markup must support static configuration through name/value pairing.
  • The value of a name/value pair may be either a simple string or nested tags.
  • The markup must support dynamic configurations of n-depth.

Because I come from an object-oriented background, I chose to model all configuration data as objects. To support nesting, each object may have any number of subobjects. Like real objects, each DCML object may have a list of parameters, a list of behaviors, a type, and a name.

To facilitate organization, each configuration has a type attribute. Each DCML entity also has a screen name and a short description. The actual values of these fields are user dependent. A configuration author can use the type attribute to designate the type of configuration held within the description. Or a database may use this type to store and retrieve configuration descriptions. The screen name provides a user-friendly name to display with a tool that visually introspects the configuration.

Figure 1 models the markup's object structure. Notice that an object's structure is recursive. An object may have any number of subobjects. Also notice that the markup's definition takes advantage of XML's tree-like nature. Any application can discover each node of its configuration by walking the configuration's DOM tree.

One other point of interest lies in the ValueTypes object. A Value's ValueType is user defined. This means that the content of a value could be a simple string or a user-defined set of tags. This way, a complex data type can be expressed in a user-defined XML markup. Of course, as new tags are added you will need to update DCML's document type definition (DTD). However, a proper implementation of the DCML parser should allow such modifications to the DTD without needing to be updated itself.

Listing One is DCML's full DTD definition, and Listing Two is an example DCML file. The example DCML file deals with security and uses the user-defined Hosts value type, which Listing One declares in its ValueTypes entity. For the purpose of this article, I tried to keep the example simple. The configuration could have been much more complicated. For example, there may have been a second subobject that defines users and their permission levels. Or I may have defined Objects that a security manager would need to instantiate at run time.

A Java DCML Implementation

To use DCML in a Java application, you need a Java implementation. At minimum, you need a Java XML parser that will convert the XML into the DOM equivalent (the IBM XML4J parser, for instance, is available free at http://alphaworks.ibm.com/). Once you have the DOM tree, the application can discover the configuration directly by walking the tree.
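As a rough sketch of this direct approach, the following example parses a DCML fragment and reads each Parameter's Name and the DataType attribute of its Value straight from the DOM. It assumes the standard JAXP API in place of a specific parser such as XML4J, parses without validation, and uses a hypothetical class name, DcmlDomDemo. The element and attribute names come from Listing One.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of the article's framework.
class DcmlDomDemo {

    // Maps each Parameter's Name to the DataType attribute of its Value.
    static Map<String, String> parameterTypes(String dcml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(dcml)));
            Map<String, String> result = new LinkedHashMap<>();
            NodeList params = doc.getElementsByTagName("Parameter");
            for (int i = 0; i < params.getLength(); i++) {
                Element param = (Element) params.item(i);
                // Name precedes Value in the DTD, so item(0) is the parameter's own Name.
                String name = param.getElementsByTagName("Name").item(0).getTextContent().trim();
                Element value = (Element) param.getElementsByTagName("Value").item(0);
                result.put(name, value.getAttribute("DataType"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("could not read DCML", e);
        }
    }

    public static void main(String[] args) {
        String dcml =
                "<Parameters><Parameter>"
              + "<Name>security.hosts.allowed</Name>"
              + "<ScreenName>allowed hosts</ScreenName>"
              + "<ShortDescription>the allowed hosts</ShortDescription>"
              + "<Value DataType=\"com.objectwave.hosts.HostList\">"
              + "<Hosts><Host><IP>172.16.1.2</IP></Host></Hosts>"
              + "</Value>"
              + "</Parameter></Parameters>";
        System.out.println(parameterTypes(dcml));
    }
}
```

Note how much DCML and DOM knowledge even this small reader contains: tag names, attribute names, and element order are all wired into application code.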

However, there is a problem with this approach. Each application that uses DCML configurations must now have knowledge of XML and DOM. Suddenly, non-XML applications must understand XML. Any change to the DCML markup definition will require additional changes in every application that uses DCML. To avoid redundant work and a maintenance nightmare, it is best to write a DCML engine that converts the XML to a set of Java interfaces. Instead of programming your applications to manipulate DOM, you can design your applications to use the interfaces. Once expressed as interfaces, the underlying DCML is completely encapsulated. So, the interfaces can hide almost any change from the application. If you take this to an extreme, it is even possible to completely remove the XML. As long as the interfaces can retrieve and persist the data, the application does not need to care about the underlying mechanisms.

Figure 2 illustrates this abstracted architecture. Figures 3, 4, and 5 define the set of interfaces that encapsulate the configuration structure and data. These definitions closely follow the model presented in Figure 1. Keep in mind that interfaces say nothing about implementation. It is entirely possible to have one implementation that simply acts as a structure around the data. Another implementation may actually use a database to retrieve and store the values. Yet another implementation may retrieve these values over the wire from distributed sources. Through the careful use of factories, you will not need to rewrite the parser to use these new implementations. Interfaces truly let you swap in completely unrelated implementations at any time.
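The following sketch illustrates the interface-plus-factory idea in miniature. All names in it (ParameterIF, ParameterFactoryIF, InMemoryParameter, OverridableParameter, FactoryDemo, and the dcml.demo.timeout key) are hypothetical stand-ins and are not the interfaces defined in the article's figures. One implementation simply holds the data, the other looks the value up from a different source (here, a system property), and the calling code is identical for both.

```java
import java.io.Serializable;

// Hypothetical interface; the article's real interfaces are defined in its figures.
interface ParameterIF extends Serializable {
    String getName();
    String getValue();
}

// Hypothetical factory; the engine would create parameters only through this.
interface ParameterFactoryIF {
    ParameterIF create(String name, String value);
}

// Implementation 1: a plain structure around the data.
class InMemoryParameter implements ParameterIF {
    private static final long serialVersionUID = 1L;
    private final String name;
    private final String value;

    InMemoryParameter(String name, String value) {
        this.name = name;
        this.value = value;
    }

    public String getName() { return name; }
    public String getValue() { return value; }
}

// Implementation 2: retrieves the value from another source, with a fallback.
class OverridableParameter implements ParameterIF {
    private static final long serialVersionUID = 1L;
    private final String name;
    private final String fallback;

    OverridableParameter(String name, String fallback) {
        this.name = name;
        this.fallback = fallback;
    }

    public String getName() { return name; }
    public String getValue() { return System.getProperty(name, fallback); }
}

class FactoryDemo {
    // Stands in for the engine: it knows the factory, never a concrete class.
    static String render(ParameterFactoryIF factory) {
        ParameterIF p = factory.create("dcml.demo.timeout", "30");
        return p.getName() + "=" + p.getValue();
    }

    public static void main(String[] args) {
        System.out.println(render(InMemoryParameter::new));
        System.setProperty("dcml.demo.timeout", "45");
        System.out.println(render(OverridableParameter::new));
    }
}
```

Swapping implementations is a one-argument change at the call site; nothing inside render has to be rewritten.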

Of particular note is ValueAttributeIF's use of XMLAwareIF in Figure 5 (Figure 6 defines XMLAwareIF). Figure 1 defines ValueTypes as being user defined. A ValueType, as defined in the DTD, may be a string or a complex tag structure. Thus, as new user-defined value types are added, the types need a new tag markup to describe their structure and data. Unfortunately, parsers are static beasts. Parsers expect a certain structure and nothing else. Without some kind of dynamic extension to the parser, you would need to update the parser whenever you add new tags. Such an "add tag/rewrite parser" cycle makes it prohibitively expensive to add new types to nontrivial configurations. XMLAwareIF was born out of the need to make such additions cheap and painless. Instead of requiring you to update the parser, the parser requires that each value type implement XMLAwareIF. As a result, all parsing of the value's tag is delegated to the type itself. At run time, the parser reads the value, determines its type, instantiates the type, and passes the XML to the type. At this point, it is up to the type to parse its own tags.
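The delegation described above might look roughly like the sketch below. The method signature given to XMLAwareIF here (a single fromXML method taking a DOM Element) is an assumption, since Figure 6 is not reproduced in this text, and the HostList and ValueLoader classes are hypothetical. The loader reads the Value tag's DataType attribute, instantiates that class by reflection, and hands the element to the new instance to parse.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Assumed shape of the interface; the real definition is in the article's Figure 6.
interface XMLAwareIF {
    void fromXML(Element valueElement);
}

// Hypothetical value type that knows how to parse its own <Hosts> markup.
class HostList implements XMLAwareIF {
    final List<String> ips = new ArrayList<>();

    public void fromXML(Element valueElement) {
        NodeList nodes = valueElement.getElementsByTagName("IP");
        for (int i = 0; i < nodes.getLength(); i++) {
            ips.add(nodes.item(i).getTextContent().trim());
        }
    }
}

// Hypothetical stand-in for the parser's handling of a Value tag.
class ValueLoader {
    static XMLAwareIF load(String valueXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(valueXml)));
            Element value = doc.getDocumentElement();
            // 1. Determine the type from the DataType attribute.
            String typeName = value.getAttribute("DataType");
            // 2. Instantiate the type by reflection.
            XMLAwareIF instance =
                    (XMLAwareIF) Class.forName(typeName).getDeclaredConstructor().newInstance();
            // 3. Let the type parse its own tags.
            instance.fromXML(value);
            return instance;
        } catch (Exception e) {
            throw new IllegalStateException("could not load value", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Value DataType=\"" + HostList.class.getName() + "\">"
                + "<Hosts><Host><IP>172.16.1.2</IP></Host>"
                + "<Host><IP>172.16.1.3</IP></Host></Hosts></Value>";
        HostList hosts = (HostList) load(xml);
        System.out.println(hosts.ips);
    }
}
```

ValueLoader never mentions Hosts, Host, or IP. Supporting a new value type means writing one new class and naming it in DataType, with no change to the loader.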

There is one final point of interest. You may notice that all interfaces extend java.io.Serializable. Distributed applications will need to share configuration data over the wire, and making everything Serializable makes it easy to stream the data over a network. You can even use this fact to store the configuration's object representation directly to disk.
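A short sketch shows what Serializable buys you. The HostEntry class and the SerializationDemo helpers are hypothetical; the example writes an object to a byte array and reads it back, and the same ObjectOutputStream could just as well wrap a socket or file stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical configuration object; not part of the article's framework.
class HostEntry implements Serializable {
    private static final long serialVersionUID = 1L;
    final String ip;
    final String subnetMask;

    HostEntry(String ip, String subnetMask) {
        this.ip = ip;
        this.subnetMask = subnetMask;
    }
}

class SerializationDemo {

    // Serializes an object into bytes that could be sent over a network or saved to disk.
    static byte[] toBytes(Serializable value) {
        try (ByteArrayOutputStream buffer = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException("could not serialize", e);
        }
    }

    // Rebuilds the object from its serialized bytes.
    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("could not deserialize", e);
        }
    }

    public static void main(String[] args) {
        HostEntry original = new HostEntry("172.16.1.2", "255.255.255.0");
        HostEntry copy = (HostEntry) fromBytes(toBytes(original));
        System.out.println(copy.ip + " / " + copy.subnetMask);
    }
}
```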

Conclusion

Instead of going through all the implementation's gory details here, I am making the complete source to the implementation, as well as DCML's DTD and example DCML files, available electronically; see "Resource Center," page 5.

Configuration is an important aspect of software development and design that will become more important as software becomes more complex. DCML attempts to meet these future challenges head on by providing a flexible and extendable markup for the description of dynamic configuration data using XML.

DDJ

Listing One

<!ELEMENT DCML (
    Name, 
    ScreenName, 
    ShortDescription, 
    Object 
)>
<!ATTLIST DCML Type CDATA #REQUIRED>
<!ELEMENT Name (#PCDATA)>
<!ELEMENT ScreenName (#PCDATA)>
<!ELEMENT ShortDescription (#PCDATA)>
<!ELEMENT Object (
    Name, 
    ScreenName, 
    ShortDescription, 
    ParameterGroups?, 
    Behaviors?, 
    SubObjects?
)>
<!ATTLIST Object Type CDATA "null">
<!ELEMENT ParameterGroups (ParameterGroup)*>
<!ELEMENT ParameterGroup (
    Name, 
    ScreenName, 
    ShortDescription, 
    Parameters
)>
<!ELEMENT Parameters (Parameter)*>
<!ELEMENT Parameter (
    Name, 
    ScreenName, 
    ShortDescription, 
    Value
)>

<!-- Update this Entity |OR| list to add support for new types. -->
<!ENTITY % ValueTypes "#PCDATA|Hosts">
<!-- Definition for Hosts -->
<!ELEMENT Hosts (Host)*>
<!ELEMENT Host (IP, SubnetMask?)>
<!ELEMENT IP (#PCDATA)>
<!ELEMENT SubnetMask (#PCDATA)>
<!ELEMENT Value (%ValueTypes;)*>
<!ATTLIST Value DataType CDATA #REQUIRED>
<!ELEMENT Behaviors (Behavior)*>
<!ELEMENT Behavior (
    Name, 
    ScreenName, 
    ShortDescription
)>
<!ELEMENT SubObjects (Object)*>

Listing Two

<?xml version="1.0" standalone="no"?>
<!DOCTYPE DCML SYSTEM "dcml.dtd">
<DCML Type="Security">
  <Name>Security</Name>
  <ScreenName>Security Configuration</ScreenName>
  <ShortDescription>This configuration deals with security issues.</ShortDescription>
  <Object>
    <Name>Security</Name>
    <ScreenName>Security</ScreenName>
    <ShortDescription>This is the security object.</ShortDescription>
    <SubObjects>
      <Object>
        <Name>Hosts</Name>
        <ScreenName>Hosts</ScreenName>
        <ShortDescription>this configuration object lists allowed hosts</ShortDescription>
        <ParameterGroups>
          <ParameterGroup>
            <Name>HostList</Name>
            <ScreenName>Allowed Hosts</ScreenName>
            <ShortDescription>this group lists all allowed hosts</ShortDescription>
            <Parameters>
              <Parameter>
                <Name>security.hosts.allowed</Name>
                <ScreenName>allowed hosts</ScreenName>
                <ShortDescription>the allowed hosts</ShortDescription>
                <Value DataType="com.objectwave.hosts.HostList">
                  <Hosts>
                    <Host>
                      <IP>172.16.1.2</IP>
                      <SubnetMask>255.255.255.0</SubnetMask>
                    </Host>
                    <Host>
                      <IP>172.16.1.3</IP>
                    </Host>
                    <Host>
                      <IP>128.174.5.58</IP>
                      <SubnetMask>255.255.255.128</SubnetMask>
                    </Host>
                  </Hosts>
                </Value>
              </Parameter>
            </Parameters>
          </ParameterGroup>
        </ParameterGroups>
      </Object>
    </SubObjects>
  </Object>
</DCML>
