XFDL: The Extensible Forms Description Language


Dec99: XFDL: The Extensible Forms Description Language

John is a software development manager at UWI.com and can be contacted at [email protected].


The Digital Signature Problem


The most important aspect of web-based e-commerce is the formulation of electronic records that achieve transaction nonrepudiation through the proper use of digital signatures. Other quintessential needs of e-commerce include human readability, publicly accessible open standards, precision layout and fine-grain presentation logic typical of dense business and government forms, and the ability to extend the language to express all aspects of records management, such as the representation of server-side business logic.

The Extensible Forms Description Language (XFDL) is an XML extension language that addresses these key issues involved with doing electronic commerce on the Web. A current copy of the XFDL specification appears as a proposal on the W3C web site (http://www.w3.org/TR/NOTE-XFDL/). In this article, I'll discuss how e-commerce needs affected the design of XFDL, and describe the actual structure of XFDL markup. More on XFDL is available in a paper I coauthored with B. Blair entitled, "XFDL: Creating Electronic Transactions Records Using XML" (Proceedings of Eighth Annual World Wide Web Conference, May 1999).

Although UWI.com (the company I work for) was the driving force behind XFDL, the language is an open standard and UWI.com has placed no intellectual property rights on it. UWI.com does provide an API that reads and writes XFDL, performs computes, creates and checks digital signatures, and enables Get/Set/Search capabilities on the nodes of the parse tree.

UWI.com also provides a trial version of its XFDL viewer at http://www.uwi.com/products/. The viewer operates both directly in the web browser and standalone. The page also provides links to many example forms, or you could write your own with either a text editor or a trial version of the XFDL visual forms designer.

Third parties support XFDL insofar as they support XML, or by distributing the viewer and utilizing the API and/or UWI.com's prebuilt database and workflow integration products to create application-specific behaviors. There is usually no need to write much custom code -- just write the XFDL forms. Furthermore, implementors can put custom Java classes right into forms if necessary, and those classes can call the UWI.com API. Hence, XFDL is not limited by XML.

The Origin of XFDL

Work on XFDL began in 1993 with a project known as Masque, which happened to be the name of the University of Victoria computer on which the project started. The language had many structural similarities to XML, but lacked the ability to express computations and deep hierarchy. The language was redesigned in 1996 to address these shortcomings, and it received the more ambitious title "Universal Forms Description Language" (UFDL). XML became a W3C recommendation in 1998, and XFDL represents a subsequent migration of the key ideas of UFDL into an XML syntax. At the time, UFDL addressed all of the e-commerce needs described earlier except that it was not based on a standard. For anyone in the UFDL problem space, the move to an XML syntax represented a significant advantage. Although UFDL was already in a human-readable format with a C++/Java-like syntax, the importance of being based on the XML standard should not be undervalued. Because of the rapidly growing base of software and programming skills for processing XML documents, any XML-based syntax can be processed less expensively and can interoperate with a wider range of software. Furthermore, even if it becomes necessary to migrate to other software or formats in the future, this can be done without relying on a specific vendor's products or highly specialized skill sets.

In turn, methodologies developed for UFDL were quite new, and sometimes controversial, for an XML markup language. These include document centrism, declarative presentation logic, the size and scope of XFDL and its overlap with numerous other W3C efforts, and the fact that a single Document Type Definition (DTD) cannot be created for all XFDL forms.

Document centrism is unusual for web protocols, but it is the key to reliable transaction records. Most web protocols achieve flexibility through a more stratified approach, yet this flexibility is precisely what cannot exist for a transaction record. The full nature of the agreement must be captured, and it must be immutable once signed. Lending credence to XFDL's approach is PDF Version 4.0. Although PDF suffers from structural inaccessibility (opaqueness) and other problems, it does at least apply digital signatures to the whole PDF document, which includes the presentation layer (see the accompanying text box entitled "The Digital Signature Problem").

XFDL has a declarative style for input validation/formatting and for computations. A markup language should be more declarative, yet both HTML forms and PDF rely on JavaScript for these functions. This seems to be a matter of expedience rather than design preference. There is a strong analogy between form calculations and spreadsheet calculations. Imagine how much less effective spreadsheets would be if the computations were expressed in the imperative rather than the declarative. Procedural programming would be no more difficult in small spreadsheets, but business problems don't tend to be small, and assertion-based computations are certainly preferable for spreadsheets that occur in practice because the user doesn't have to rewrite a piece of the inference engine every time the spreadsheet changes. The same is true of forms.

Another controversy surrounding XFDL is its size. One measure of its size is the number of other W3C proposals, works-in-progress, and recommendations that seem to overlap with XFDL. XFDL can be improved in many ways, but it is essentially a complete solution to a large problem. Many prior specifications are smaller because they solve a smaller problem, which is natural because the origin of XFDL predates XML-based work by several years. Moreover, XFDL overlaps certain other specifications, but trying to create an e-commerce solution from the overlapping bits simply creates a patchwork that does not have the design consistency of a work engineered to solve problems in the e-commerce space. One good example is MathML, the Mathematical Markup Language (http://www.w3.org/TR/WD-math-970515/). This is a fine language for complex mathematical formulae found in mathematics, but not appropriate for business applications.

Finally, there is the matter of a document type definition (DTD). Although a DTD could be created for each XFDL form, no single DTD can validate all XFDL forms. DTDs have the expressive power of a regular expression, so infix mathematical expressions (which are recognized by parsers, not lexical analyzers) can only be validated as character data. Other obstacles to creating a single XFDL DTD exist, but by far the largest problem is that XFDL is, itself, extensible. Most XML extension languages have a predefined list of keywords to use as element tags and attributes, but XFDL allows the definition of custom items, options, and compute functions to fill the e-commerce need to express business logic and application-specific functions beyond the core XFDL language. DTDs cannot describe extensible XML languages.

Structural Overview of XFDL

An XFDL form is enclosed by <XFDL> and </XFDL> tags, which delimit the root element. The XFDL element has a mandatory version attribute that indicates the XFDL language version with which the form complies. This controls the XFDL keywords that are available. Listing One is an example of this; its comments also show that the content of the XFDL element can include form global options followed by one or more pages.

Pages and Items

The structure of form global options is the same as regular options. Each page element is surrounded by <page> and </page> tags. Each page has a mandatory attribute called a "sid" (short for "scope identifier") that provides a page name that is unique within the surrounding XFDL element. Each page has a body of zero or more page global options followed by zero or more item elements. Like form globals, page globals have the same structure as regular options. Each item represents a single GUI object or an item that is not directly visible, such as a digital signature or a binary image enclosure (expressed in Base-64; see http://ds.internic.net/rfc/rfc2045.txt/). The tag name of the item element indicates the type of item, such as a field, label, button, checkbox, popup, and so on. Each item also contains a sid attribute that gives it a name that is unique within its page. Listing Two is an example of these ideas.

Each XFDL page, item, and option has a scope identifier. Pages and items specify their scope identifier in the value of the sid attribute. Options do not require the sid attribute because the option's tag name is expected to be unique within the scope of its parent element. According to Tim Bray, coeditor of the XML 1.0 specification, this is a necessary inconsistency as it causes XFDL to have a look-and-feel that XML users have come to expect.
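
The page and item structure just described can be walked with any generic XML parser. The following Python sketch is only an illustration and is not part of XFDL or the UWI.com API: it loads a trimmed copy of Listing Two and checks the scoping rules stated above (a version attribute on the root, page sids unique within the form, item sids unique within their page). The function name outline and the rule "a child of a page with no sid is a page global option" are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Listing Two, with the XML declaration and comments dropped so the
# text can be parsed directly from a Python string.
FORM = """
<XFDL version="4.0.0">
    <page sid="PAGE1">
        <label sid="TITLE"></label>
        <field sid="HOURS"></field>
        <field sid="RATE"></field>
        <field sid="WEEKLYCHECK"></field>
    </page>
</XFDL>
"""

def outline(form_text):
    """Return (version, {page sid: [(item type, item sid), ...]})."""
    root = ET.fromstring(form_text)
    if root.tag != "XFDL" or "version" not in root.attrib:
        raise ValueError("not an XFDL form: missing root element or version")
    pages = {}
    for page in root.findall("page"):
        page_sid = page.get("sid")
        if page_sid is None or page_sid in pages:
            raise ValueError("page sid missing or not unique within the form")
        items, seen = [], set()
        for child in page:
            sid = child.get("sid")
            if sid is None:
                continue  # assumption: no sid means a page global option
            if sid in seen:
                raise ValueError(f"item sid {sid!r} repeated in {page_sid}")
            seen.add(sid)
            items.append((child.tag, sid))
        pages[page_sid] = items
    return root.get("version"), pages

version, pages = outline(FORM)
print(version)
print(pages)
```

The tag name supplies the item type and the sid supplies the name, so the same sid could be reused on another page without conflict.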

Relative Scope

XML contains a facility for uniquely identifying any markup element, but the identifier must be globally unique within the entire document, so it is not scalable to the large forms typically found in business and government. Relative scoping is quite helpful when trying to cut-and-paste groups of items or options while building a form, and it is particularly useful when the cut-and-paste occurs during the form's run time, such as dynamic duplication of a row of items in a purchase order.

Options

The primary role of form global options is to provide defaults that page globals can override (although information about the whole form can also be stored in form global options). Likewise, the primary role of page globals is to provide defaults for the options appearing in each item, which the item can override when necessary. For example, a page or form global option for font information can specify the typeface, point size, and special characteristics (such as bold) of the default font used by all items. This prevents each item from needing to specify options unless special effects are required. Page globals can also carry page-specific information, like the background color to use on the form. Furthermore, global options like bgcolor can double as defaults when it makes sense. For example, the global bgcolor also provides the default background color for static labels, but items that can take input, such as fields and popups, have a default background color of white.
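
One way to picture this cascade is as a lookup that tries the item first, then the page globals, then the form globals. The Python sketch below is a hypothetical illustration, not the algorithm of any XFDL processor: the function name effective_option, the sample option values (gray, ivory, Helvetica), and the set INPUT_ITEMS are all assumptions chosen for the example. It also models the bgcolor exception mentioned above by returning None for input items, leaving the caller to fall back to white.

```python
import xml.etree.ElementTree as ET

# Hypothetical form: option values are invented for illustration.
FORM = """
<XFDL version="4.0.0">
    <bgcolor>gray</bgcolor>
    <fontinfo content="array"><ae>Helvetica</ae><ae>10</ae></fontinfo>
    <page sid="PAGE1">
        <bgcolor>ivory</bgcolor>
        <label sid="TITLE">
            <fontinfo content="array"><ae>Times</ae><ae>24</ae><ae>bold</ae></fontinfo>
        </label>
        <field sid="HOURS"></field>
    </page>
</XFDL>
"""

# Item types that take input; these do not inherit the global bgcolor.
INPUT_ITEMS = {"field", "popup"}

def effective_option(root, page_sid, item_sid, name):
    """Return the option element that applies to an item, or None."""
    page = next(p for p in root.findall("page") if p.get("sid") == page_sid)
    item = next(i for i in page if i.get("sid") == item_sid)
    own = item.find(name)
    if own is not None:
        return own                      # the item overrides every default
    if name == "bgcolor" and item.tag in INPUT_ITEMS:
        return None                     # caller falls back to white
    for scope in (page, root):          # page globals, then form globals
        option = scope.find(name)
        if option is not None:
            return option
    return None

root = ET.fromstring(FORM)
print(effective_option(root, "PAGE1", "TITLE", "bgcolor").text)
print([ae.text for ae in effective_option(root, "PAGE1", "HOURS", "fontinfo")])
```

Here the TITLE label picks up the page's background color, while the HOURS field inherits the form-wide font because neither the field nor the page states one.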

Options come in two basic varieties: those with element content that is simple character data, and those with element content that contains subelements. The latter are called arrays in XFDL, and are distinguished by a content attribute with a value of array. Listing Three shows option examples for the TITLE label from Listing Two.

In the options of Listing Three, it never seems to make sense to have more than one copy of any of these options. Why state the justification or the size more than once? This is generally true of XFDL options, and this is why their tag names suffice as scope identifiers. On the other hand, a form typically requires many fields, labels, and so on, which is why items require the sid attribute.

The other thing to notice about array options is that their subelements do not need to have unique identifiers. Although the size option shows array elements that do have names, it is most often the case in XFDL that array elements do not need specific names because computations can refer to them by ordinal position, just like arrays in other languages. When a unique name is not used, the array element tag is simply ae, as in the fontinfo options in Listing Three. For example, a computation can obtain the typeface of the label's font using TITLE.fontinfo[0], whereas the label width can be referred to using either TITLE.size[width] or TITLE.size[0].
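
To make the two addressing styles concrete, here is a small Python sketch that resolves references of the form ITEM.option, ITEM.option[n], and ITEM.option[name] against the label from Listing Three. It is an illustration only: the regular expression covers just these three shapes, not the full XFDL reference syntax, and the function name resolve is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# The TITLE label from Listing Three, wrapped in a page.
PAGE = ET.fromstring("""
<page sid="PAGE1">
    <label sid="TITLE">
        <value>Salary Calculator</value>
        <fontinfo content="array"><ae>Times</ae><ae>24</ae><ae>bold</ae></fontinfo>
        <size content="array"><width>50</width><height>1</height></size>
        <justify>center</justify>
    </label>
</page>
""")

# ITEM.option with an optional [index]; an assumption covering only the
# reference shapes used in this article.
REFERENCE = re.compile(r"^(\w+)\.(\w+)(?:\[(\w+)\])?$")

def resolve(page, reference):
    """Return the text addressed by a reference within one page."""
    match = REFERENCE.match(reference)
    if match is None:
        raise ValueError(f"unsupported reference: {reference}")
    item_sid, option_name, index = match.groups()
    item = next((i for i in page if i.get("sid") == item_sid), None)
    if item is None:
        raise KeyError(item_sid)
    option = item.find(option_name)
    if option is None:
        raise KeyError(option_name)
    if index is None:
        return option.text                      # simple option
    if option.get("content") != "array":
        raise ValueError(f"{option_name} is not an array option")
    if index.isdigit():
        return list(option)[int(index)].text    # ordinal position
    element = option.find(index)                # named array element
    if element is None:
        raise KeyError(index)
    return element.text

for ref in ("TITLE.fontinfo[0]", "TITLE.size[width]", "TITLE.size[0]"):
    print(ref, "->", resolve(PAGE, ref))
```

Both TITLE.size[width] and TITLE.size[0] reach the same element, because a named array element still has an ordinal position.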

Formatting

XFDL's format option reduces server-side processing as well as network communication of erroneous forms by providing both input validation and input formatting. In Listing Four, the HOURS field contains a format option that constrains the input to be an integer between 1 and 99. The range requires a scope identifier not because it is a subarray, but rather, to alleviate order dependence within the format option. Field input can also be restricted to floating-point values, dollar values, and dates. Templates offer more exotic field input validators for objects such as phone numbers and zip codes. The RATE field in Listing Four shows a modifier (add_ds) that reformats the user input--in this case, to add a dollar sign if one was not provided.
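
The following Python sketch shows roughly what a viewer might do with the two format options in Listing Four. It is not the XFDL validation algorithm: the function name apply_format, the assumption that the first ae element names the data type and later ae elements are modifiers, and the pattern accepted for dollar values are all choices made for this example.

```python
import re
import xml.etree.ElementTree as ET

# The format options of the HOURS and RATE fields from Listing Four.
PAGE = ET.fromstring("""
<page sid="PAGE1">
    <field sid="HOURS">
        <format content="array">
            <ae>integer</ae>
            <range content="array"><ae>1</ae><ae>99</ae></range>
        </format>
    </field>
    <field sid="RATE">
        <format content="array">
            <ae>dollar</ae>
            <ae>add_ds</ae>
        </format>
    </field>
</page>
""")

def apply_format(field, text):
    """Return (accepted, text after any reformatting) for user input."""
    fmt = field.find("format")
    entries = [ae.text for ae in fmt.findall("ae")]
    data_type, modifiers = entries[0], entries[1:]   # assumed layout
    text = text.strip()
    if data_type == "integer":
        if not re.fullmatch(r"[+-]?[0-9]+", text):
            return False, text
        bounds = fmt.find("range")   # found by name, so order is irrelevant
        if bounds is not None:
            low, high = (int(ae.text) for ae in bounds.findall("ae"))
            if not low <= int(text) <= high:
                return False, text
    elif data_type == "dollar":
        # Assumed pattern: optional dollar sign, digits, optional cents.
        if not re.fullmatch(r"\$?[0-9]+(\.[0-9]{1,2})?", text):
            return False, text
        if "add_ds" in modifiers and not text.startswith("$"):
            text = "$" + text
    else:
        raise ValueError(f"data type not covered by this sketch: {data_type}")
    return True, text

hours_field, rate_field = PAGE.findall("field")
print(apply_format(hours_field, "40"))
print(apply_format(hours_field, "100"))
print(apply_format(rate_field, "7.75"))
```

Because the range is looked up by its tag name rather than by position, it could appear anywhere inside the format option, which is the order independence the scope identifier provides.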

Computes

It may seem odd that XFDL arrays require a content attribute since they seem distinguishable by the fact that they contain subelements rather than simple character data. However, the actual reason for the content attribute is that XFDL uses XML element depth for several purposes, including expressing a computation for character data. The field in Listing Five declares the value of WEEKLYCHECK to be the number of hours worked multiplied by the rate of pay, as indicated by user input. The subelement cval contains the current value of the computation, and the compute element contains the infix expression. As the user enters or changes the values in HOURS and RATE, the cval is automatically updated.

There are two additional important properties of computes. First, if a digital signature is specified to include the field WEEKLYCHECK, then once a user affixes a signature, the compute is locked and does not continue updating the cval. This is part of how XFDL captures a snapshot of the transaction. Second, suppose there was another field in the form we have been incrementally building, and suppose that it needed the value of the weekly check, say, to compute the income tax withholding. Even though the current value is in the subelement cval, you would still use the reference WEEKLYCHECK.value. This allows for seamless integration of computed and uncomputed elements.
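
These two properties can be mimicked in a few lines of Python. The sketch below is a simplification under stated assumptions: it hard-codes the product from Listing Five instead of parsing the compute text, represents "signed" as a plain boolean flag, and uses hypothetical helper names (item, value_of, number, recompute_weeklycheck). It is meant only to show that ITEM.value reads the cval of a computed option and the plain text of an uncomputed one, and that a signed compute stops updating.

```python
from decimal import Decimal
import xml.etree.ElementTree as ET

# HOURS, RATE, and the computed WEEKLYCHECK field from Listings Four and Five.
PAGE = ET.fromstring("""
<page sid="PAGE1">
    <field sid="HOURS"><value>40</value></field>
    <field sid="RATE"><value>$7.75</value></field>
    <field sid="WEEKLYCHECK">
        <value content="compute">
            <cval>$310.00</cval>
            <compute>HOURS.value*RATE.value</compute>
        </value>
    </field>
</page>
""")

def item(page, sid):
    return next(i for i in page if i.get("sid") == sid)

def value_of(page, sid):
    """What SID.value yields: the cval if computed, else the plain text."""
    value = item(page, sid).find("value")
    if value.get("content") == "compute":
        return value.find("cval").text
    return value.text

def number(text):
    """Strip an optional dollar sign and convert to an exact decimal."""
    return Decimal(text.strip().lstrip("$"))

def recompute_weeklycheck(page, signed=False):
    """Refresh the cval; a field covered by a signature stays frozen."""
    if signed:
        return
    total = number(value_of(page, "HOURS")) * number(value_of(page, "RATE"))
    item(page, "WEEKLYCHECK").find("value/cval").text = f"${total:.2f}"

item(PAGE, "HOURS").find("value").text = "45"      # user edits HOURS
recompute_weeklycheck(PAGE)
print(value_of(PAGE, "WEEKLYCHECK"))               # $348.75

item(PAGE, "HOURS").find("value").text = "10"      # edit after signing
recompute_weeklycheck(PAGE, signed=True)
print(value_of(PAGE, "WEEKLYCHECK"))               # still $348.75
```

A withholding field would call value_of(PAGE, "WEEKLYCHECK") in exactly the same way it reads HOURS, without needing to know which of the two is computed.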

Decision Logic, Nested Decisions, and CDATA

XFDL supports the standard comparators and logical operators as well as the ternary decision operator. These features can be used to enhance the computation in Listing Five by, for example, adding a calculation for overtime. The compute in Listing Six replaces the compute in Listing Five. It declares that the value of WEEKLYCHECK is either based on a regular time formula or an overtime formula based on a condition that expresses whether overtime was performed.
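
As a worked check of the overtime rule, the same decision can be written in Python. This mirrors the arithmetic of Listing Six only; the function name weekly_check is hypothetical, and Decimal is used here simply to keep the money arithmetic exact.

```python
from decimal import Decimal

def weekly_check(hours, rate):
    """Regular pay up to 40 hours, time and a half beyond that."""
    hours, rate = Decimal(hours), Decimal(rate)
    return (hours * rate
            if hours <= 40
            else 40 * rate + (hours - 40) * Decimal("1.5") * rate)

print(weekly_check("40", "7.75"))   # 310.00
print(weekly_check("45", "7.75"))   # 368.125
```

For 45 hours at $7.75, the first 40 hours contribute $310.00 and the 5 overtime hours contribute 5 x 1.5 x $7.75 = $58.125, which a dollar format option would then round for display.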

Whitespace can appear anywhere in the compute expression. As a matter of style, I tend to put the conditional on one line, then start the consequent on a new line with the question mark, and I put the else part on another line starting with the colon. This style is based on writing nested decision statements, where question marks and colons are indented two spaces.

The use of the XML CDATA feature is required for many decision-based computes. The reason is that "<" and "&" are normally forbidden characters in XML element content because they denote the beginning of start or end tags or the beginning of entity references. This conflicts with the less-than and less-than-or-equal-to operators (<,<=) and the logical and operator (&&).
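
The conflict is easy to reproduce with a generic XML parser. In this Python sketch (the two compute strings are shortened, made-up expressions, and compute_text is a hypothetical helper), the unwrapped compute is rejected as malformed XML, while the CDATA version is accepted and yields the expression text unchanged.

```python
import xml.etree.ElementTree as ET

# A compute containing "<=" with and without a CDATA wrapper.
RAW = '<compute>HOURS.value <= "40" ? "1" : "0"</compute>'
WRAPPED = '<compute><![CDATA[HOURS.value <= "40" ? "1" : "0"]]></compute>'

def compute_text(xml_text):
    """Return the element's character data, or None if the XML is malformed."""
    try:
        return ET.fromstring(xml_text).text
    except ET.ParseError:
        return None

print(compute_text(RAW))       # None: "<" starts what looks like a tag
print(compute_text(WRAPPED))   # HOURS.value <= "40" ? "1" : "0"
```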

Parsing Requirements of Computes

The syntax rules for XFDL state that the compute element content can either contain a mathematical expression or a decision statement, yet the starting conditional expression can also begin with an arbitrarily long mathematical expression. For example, a compute could contain x.value+y.value, or it could contain x.value+y.value<"0" ? x.value : y.value. This inability to choose the language rule based on the left-most symbol implies that no recursive descent parser can be created for XFDL computes; an SLR(1) parser or better is required. However, note that not all XML processors require an auxiliary compute parser; they can derive useful information from XFDL forms by simply treating the computes as character data.

Events, Functions, and Extensibility

XFDL also supports events such as focused, activated, and mouseover as well as functions like toggle() and set() for detecting or causing these events. Many other functions are defined for string manipulations, math and financial calculations, and so on. Finally, the XFDL computation system continues to work in custom item and option elements that hold application-specific (often server-side) business logic.

Digital Signature Filters

A digital signature must be able to omit specific parts of the document, especially in multiple signature scenarios. A simple example of this is omitting the "office use only" section of a form from the signature of the person who fills out the form (and the "office" could sign the remaining section plus the first signature). Another example would be code signing a form; that is, using a signature to guarantee the operation of the form by signing its GUI layout and computational expressions while omitting the computations' current values and the tags that will store user input. For these purposes, XFDL includes signature filters that allow the form developer to define precisely which form elements are kept in or omitted from a signature.
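
The effect of a filter can be imitated with a hash. The Python sketch below is a loose illustration, not XFDL's filter syntax or signing procedure: a SHA-256 digest stands in for a real digital signature, the serialization is whatever ElementTree produces rather than a defined canonical form, and the OFFICE_NOTES field and the function digest_without are invented for the example.

```python
import copy
import hashlib
import xml.etree.ElementTree as ET

# Hypothetical form with an "office use only" field.
FORM = ET.fromstring("""
<XFDL version="4.0.0">
    <page sid="PAGE1">
        <field sid="HOURS"><value>40</value></field>
        <field sid="RATE"><value>$7.75</value></field>
        <field sid="OFFICE_NOTES"><value></value></field>
    </page>
</XFDL>
""")

def digest_without(form, omitted_sids):
    """Hash the form after dropping the items named in omitted_sids."""
    filtered = copy.deepcopy(form)          # never alter the live form
    for page in filtered.findall("page"):
        for child in list(page):
            if child.get("sid") in omitted_sids:
                page.remove(child)
    return hashlib.sha256(ET.tostring(filtered)).hexdigest()

omit = {"OFFICE_NOTES"}
before = digest_without(FORM, omit)

# The office fills in its own section: the applicant's digest still matches.
FORM.find("page/field[@sid='OFFICE_NOTES']/value").text = "approved"
office_edit = digest_without(FORM, omit)

# Someone alters a signed field: the digest no longer matches.
FORM.find("page/field[@sid='HOURS']/value").text = "45"
applicant_edit = digest_without(FORM, omit)

print(before == office_edit)       # True
print(before == applicant_edit)    # False
```

The office's own signature would use a different filter, one that keeps OFFICE_NOTES and also covers the first signature.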

Conclusion

Hopefully, design features of XFDL such as document centrism, assertion-based computations, and digital signature filters will become the expected standard in all future transaction processing applications.

DDJ

Listing One

<?xml version="1.0"?>
<XFDL version="4.0.0">
    <!-- Form Global Options -->
    <!-- One or more pages -->
</XFDL>

Listing Two

<?xml version="1.0"?>
<XFDL version="4.0.0">
    <!-- Form Global Options -->
    <page sid="PAGE1">
        <!-- Page Global Options -->
        <label sid="TITLE">
            <!-- Options -->
        </label>
        <field sid="HOURS">
            <!-- Options -->
        </field>
        <field sid="RATE">
            <!-- Options -->
        </field>
        <field sid="WEEKLYCHECK">
            <!-- Options -->
        </field>
    </page>
</XFDL>

Listing Three

<label sid="TITLE">
    <value>Salary Calculator</value>
    <fontinfo content="array">
        <ae>Times</ae>
        <ae>24</ae>
        <ae>bold</ae>
    </fontinfo>
    <size content="array">
        <width>50</width>
        <height>1</height>
    </size>
    <justify>center</justify>
</label>

Listing Four

<field sid="HOURS">
    <label>Number of Hours:</label>
    <value>40</value>
    <format content="array">
        <ae>integer</ae>
        <range content="array">
            <ae>1</ae>
            <ae>99</ae>
        </range>
    </format>
</field>
<field sid="RATE">
    <label>Pay Rate:</label>
    <value>$7.75</value>
    <format content="array">
        <ae>dollar</ae>
        <ae>add_ds</ae>
    </format>
</field>

Listing Five

<field sid="WEEKLYCHECK">
    <editstate>readonly</editstate>
    <value content="compute">
        <cval>$310.00</cval>
        <compute>
            HOURS.value*RATE.value
        </compute>
    </value>
    <format content="array">
        <ae>dollar</ae>
        <ae>add_ds</ae>
    </format>
</field>

Listing Six

<compute>
<![CDATA[
  HOURS.value <= "40"
  ? HOURS.value*RATE.value
  : "40"*RATE.value +
    (HOURS.value-"40")*"1.5"*RATE.value
]]>
</compute>


Copyright © 1999, Dr. Dobb's Journal
