The Libero Development Environment

Libero is a language-independent visual-development environment that lets you design programs as state diagrams. Pieter uses Perl with Libero to develop an HTML preprocessor.

August 01, 1996
URL:http://www.drdobbs.com/web-development/the-libero-development-environment/184410069

Visual Tools 96: The Libero Programming Environment

Visually building an HTML preprocessor in Perl

Pieter Hintjens

Pieter, who is a programmer in Belgium, is the author of Libero. He can be contacted at [email protected].

Libero is a language-independent visual-development environment that lets you design programs as state diagrams. Once you've visually designed a finite-state machine description of the program logic, the Libero programmable code-generator engine generates code in a specific language. The Libero environment supports a number of languages, including ANSI C, UNIX shells (Korn shell, BASH, Bourne shell, C shell), Awk, Perl, Visual Basic, Microsoft Test Basic, Cobol 74/85, and Microsoft 80x86 Assembler. To specify the language, you write a new "schema" (a script or program) using Libero's schema language.

Libero is particularly useful if you are designing systems that other programmers will implement. If your program analyzes text and you don't have access to tools such as yacc, or if your program is a communications handler with incoming and outgoing messages, Libero is a good choice. (I've used Libero for TCP/IP servers in this way.) The tool is also useful for designing programs that control a device. In this article, however, I'll use Perl with Libero to develop the HTML preprocessor that I used to prepare the Libero documentation.

When I put Libero online (http://www.imatix.com), I discovered the mixed blessings of HTML. It is a portable and convenient format for documentation, yet it takes work to maintain a large HTML documentation kit. The most bothersome thing about HTML was editing lots of small files and keeping track of references scattered throughout the documents.

My solution is an HTML preprocessor called "htmlpp" (available electronically) that works somewhat like the C preprocessor. Htmlpp (see Figure 1) lets you do the following:

Define symbols, then use these symbols in the HTML text instead of hard-coded values.
Centralize symbols and other definitions by using include files.
Manage cross references between documents.
Break long documents into a series of real HTML pages.
Combine documents into "super documents."

Typically, I'll write a section of documentation as a text file, maybe 20 to 50 pages long, using HTML tags. I insert preprocessor commands to divide the file into HTML pages, usually at every H2 level. Figure 2 shows such a file. Preprocessor commands start with a "." in the first column. Commands like .include and .define work like their C preprocessor counterparts. The .page command starts a new output HTML page. Symbols in the text are enclosed by $( and ).

I wrote htmlpp in Perl, which is nearly perfect for this kind of work. For instance, parsing a .define line and storing the result in a symbol table takes just two Perl statements. The first statement matches the current line against a regular expression, and extracts the "interesting" parts (the symbol name and its value). The second statement stores the value in an associative array called %symbols (individual elements are written $symbols {...}); see Listing One.
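
As a minimal sketch of the same idea, written in current Perl 5 style (strict mode, lexical variables) rather than the style of the listings, the following wraps the two statements in a small subroutine. The name parse_define is an assumption made for illustration and is not part of htmlpp.

```perl
use strict;
use warnings;

my %symbols;    # symbol table: name => value

# Hypothetical helper: parse one ".define name value" line.
# Returns 1 if the line matched and the symbol was stored, 0 otherwise.
sub parse_define {
    my ($line) = @_;
    if ($line =~ /^\.\w+\s+([A-Za-z0-9._-]+)\s+(.*)/) {
        $symbols{$1} = $2;      # $1 is the name, $2 is the value
        return 1;
    }
    return 0;
}

parse_define('.define version 2.12');
print "version = $symbols{version}\n";
```

Returning a status lets the caller decide how to report a malformed line, which is roughly the role the syntax-error handling plays later in the article.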

Getting Down to Work

When working with Libero, you first design a dialog, then generate a skeleton for the program. You gradually refine these two components into a complete program. Here, I'll present a simplified version of htmlpp that recognizes just the .define and .include commands, and lets you insert symbols in the text. Listing Two(a) shows how the dialog starts, while Listing Two(b) shows how the program starts.

The dialog always begins by calling initialise_the_program to get an initial event. In this case, this is either Ok (proceed as usual) or Error (the user did not specify an argument). The Perl code sets $the_next_event to $ok_event or to $error_event. The After-Init state takes the event and branches accordingly. The Ok branch calls three "code modules" (Perl subroutines), then goes to another state. The Error branch ends the program. This is how a Libero program works. Each state is like a C switch statement, testing the value of the event, and branching accordingly.
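
The state-and-event mechanism can be sketched as a small table-driven interpreter. This is only an illustration of the idea, not the code that Libero generates: the table layout, the run_dialog subroutine, and the stub code modules are all assumptions.

```perl
use strict;
use warnings;

my @trace;              # records which code modules ran
my $the_next_event;     # set by code modules, read by the interpreter

# Stub code modules (hypothetical); a real module would do actual work.
sub open_main_document     { push @trace, 'open'; }
sub get_next_document_line { push @trace, 'get'; $the_next_event = 'Finished'; }

# Dialog table: state => { event => [ next state, code modules... ] }.
# An undefined next state ends the program.
my %dialog = (
    'After-Init' => {
        'Ok'    => [ 'Have-Line', \&open_main_document, \&get_next_document_line ],
        'Error' => [ undef ],
    },
    'Have-Line' => {
        'Finished' => [ undef ],
    },
);

# Run the dialog from a starting state and event.
# Returns the name of the state in which the dialog stopped.
sub run_dialog {
    my ($state, $event) = @_;
    while (defined $state) {
        my $action = $dialog{$state}{$event}
            or die "no action for event $event in state $state\n";
        my ($next_state, @modules) = @$action;
        $_->() for @modules;                # call the code modules in order
        return $state unless defined $next_state;
        ($state, $event) = ($next_state, $the_next_event);
    }
}

my $last = run_dialog('After-Init', 'Ok');
print "stopped in $last after: @trace\n";
```

Each inner hash plays the part of the switch statement: the event selects one branch, the branch runs its modules, and the last module to run supplies the event for the next state.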

In Listing Three(a), I use an associative array called keyword. The index into the array is a word; the value of an array element is an event. I'll use this array in Listing Three(b) to parse a command line. The .include command uses a document stack and an "indirect filehandle" to read from the current file. To open and read a file in Perl, you normally use the code in Listing Four.

To use an indirect filehandle, replace HANDLE with a variable name, which can take any distinct value. It is good practice to use the actual filename as the value for the variable. This may be confusing at first, but it works well. In htmlpp, the variable $document holds the current input filename and filehandle.
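
Using a string as a filehandle name relies on symbolic references, which strict mode rejects. A hedged alternative that keeps the "one handle per document name" idea is to store lexical filehandles in a hash keyed by filename. This is a swapped-in technique, not the method htmlpp uses; the names %handle_for, open_document, and next_line are assumptions.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my %handle_for;     # filename => open lexical filehandle

# Hypothetical helper: open a document and remember its handle by name.
# Returns 1 on success, 0 if the file cannot be opened.
sub open_document {
    my ($name) = @_;
    open(my $fh, '<', $name) or return 0;
    $handle_for{$name} = $fh;
    return 1;
}

# Hypothetical helper: read the next line from a named, open document.
sub next_line {
    my ($name) = @_;
    return scalar readline($handle_for{$name});
}

# Demonstration with a temporary file standing in for a real document.
my ($tmp, $document) = tempfile(UNLINK => 1);
print {$tmp} "first line\nsecond line\n";
close $tmp;

open_document($document) or die "can't open $document: $!\n";
my $line = next_line($document);
chomp $line;
print "read: $line\n";
```

As in htmlpp, the document name alone is enough to find the right handle, so a stack of names is all that is needed to resume a partly read file.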

Listing Five reads the next line and checks whether or not it starts with a preprocessor command. This code shows some useful Perl techniques. You use regular expressions, like if (/^$/), to determine what the line looks like. This is familiar to awk and sed programmers. I use the defined function to test if the command is recognized. The code in Listing Five results in one of the following events: Blank-Line, Comment, Body-Text, Include, Define, or Finished. You handle each of these events accordingly in the state Have-Line; see Listing Six. The state takes the event and branches accordingly. For instance, if you read a line starting with .define, you branch to Listing Seven. You can read this as "when you get a Define event in this state, call these three code modules, then go to the next state: Have-Line."
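
The classification step can also be sketched as a function that returns an event name, which makes it easy to try on sample lines. The name classify_line and the 'Syntax-Error' result are assumptions for illustration; the article's code raises an exception for unknown commands instead.

```perl
use strict;
use warnings;

# Recognised preprocessor keywords and the events they produce.
my %keyword = (define => 'Define', include => 'Include');

# Hypothetical classifier: maps one input line (undef at end of file)
# to the name of an event.
sub classify_line {
    my ($line) = @_;
    return 'Finished' unless defined $line;
    chomp $line;
    return 'Blank-Line' if $line =~ /^$/;
    return 'Comment'    if $line =~ /^\.-/;
    if ($line =~ /^\.(\w+)/) {              # word after the leading dot
        return exists $keyword{$1} ? $keyword{$1} : 'Syntax-Error';
    }
    return 'Syntax-Error' if $line =~ /^\./;    # a dot with no command word
    return 'Body-Text';
}

for my $line ('', '.- note', '.define version 2.12', '<H2>Summary</H2>') {
    printf "%-24s => %s\n", "'$line'", classify_line($line);
}
```

The comment test has to come before the general dot-command test, because a comment line also starts with a dot.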

The Perl code in Listing Eight parses a .define line. The regular expression is easy to understand when you break it into pieces; see Table 1. I put parentheses around the pieces I need to use; after a successful match, Perl makes them available as $1, $2, and so on. The code in Listing Nine replaces symbols by their value. The special Perl variables $` and $' hold everything before and after the last regular-expression match. This is a nice way to replace part of a string: you match the part you want to replace, then glue $` and $' around the replacement string.
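
The match-and-glue technique can be tried in isolation with a short sketch. The subroutine name expand_symbols and the sample symbol table are assumptions; the sketch also assumes that symbol values never contain a reference to themselves, since that would loop forever.

```perl
use strict;
use warnings;

my %symbols = (version => '2.12', TITLE => 'Libero');

# Hypothetical helper: expand every $(name) in a string by gluing the
# prematch ($`) and postmatch ($') around the symbol's value.
# Undefined symbols are replaced by the text 'UNDEF'.
sub expand_symbols {
    my ($text) = @_;
    while ($text =~ /\$\(([A-Za-z0-9._-]+)\)/) {
        my $value = exists $symbols{$1} ? $symbols{$1} : 'UNDEF';
        $text = $` . $value . $';
    }
    return $text;
}

print expand_symbols('<LI>Current version: $(version).</LI>'), "\n";
```

Because the loop matches again after each replacement, a value that itself contains a $(name) reference is expanded on a later pass.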

A good way to handle errors is to "raise an exception" and provide the dialog with an "exception event." When you raise an exception, the dialog stops executing the list of code modules and handles the exception event in the current state, like a normal event. Listing Ten(a), for instance, shows how to handle a (fatal) syntax error. Note that you call &raise_exception to signal an exception event. I usually like to use an event called Exception, and stick this in the Defaults state; see Listing Ten(b). All states "inherit" the events in the Defaults state, unless they define the events themselves. The name Defaults is standard.
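
The inheritance rule amounts to a two-step lookup: try the current state first, then fall back to Defaults. The sketch below shows only that lookup. The table layout, the lookup_transition name, and the 'End' marker standing for the end of the program are assumptions, not Libero's own representation.

```perl
use strict;
use warnings;

# Hypothetical dialog table: state => { event => next state }.
my %dialog = (
    'Have-Line' => { 'Body-Text' => 'Have-Line', 'Finished' => 'Doc-Unstacked' },
    'Defaults'  => { 'Exception' => 'End' },
);

# Find the transition for an event. The state's own entry wins;
# otherwise the Defaults state is consulted. Returns undef if neither
# defines the event.
sub lookup_transition {
    my ($state, $event) = @_;
    for my $candidate ($state, 'Defaults') {
        my $events = $dialog{$candidate} or next;
        return $events->{$event} if exists $events->{$event};
    }
    return;
}

print "Exception in Have-Line leads to: ",
      lookup_transition('Have-Line', 'Exception'), "\n";
```

A state that needs special handling simply defines Exception itself, and the first step of the lookup then hides the Defaults entry.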

You use the associative array file_is_open to detect circular .includes. When you have a valid .include command, you stack the current document name and take the new document name from the .include line; see Listing Eleven(a). When you reach the end of a document, you close the document, then pop the previous document name off the stack. Listing Eleven(b) shows the "?:" operator, which is good as a short If/Else construct. You don't need to reopen the previous file; just continue reading. When the last file is finished, the dialog halts; see Listing Twelve. The complete htmlpp program (available at http://www.imatix.com) is more complex: it builds tables of contents, creates hyper-reference tags, and so on. It can be slow and memory hungry, but it does a great job.
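
The stacking and circular-include check can be sketched without real files by keeping the documents in memory. Everything here is an assumption made for illustration: the %files table, the flatten subroutine, and the use of a read position in place of an open filehandle.

```perl
use strict;
use warnings;

# Hypothetical in-memory "files": name => array of lines.
my %files = (
    'main.txt'    => [ '.include prelude.def', '<H1>Title</H1>' ],
    'prelude.def' => [ 'prelude text' ],
    'loop.txt'    => [ '.include loop.txt' ],
);

# Flatten a document, following .include lines with an explicit stack.
# Dies if a document includes one that is still open (circular include).
sub flatten {
    my ($name) = @_;
    my (@stack, @output, %is_open);
    my $current = { name => $name, pos => 0 };
    $is_open{$name} = 1;
    while ($current) {
        my $lines = $files{ $current->{name} }
            or die "can't open $current->{name}\n";
        if ($current->{pos} >= @$lines) {       # end of this document
            delete $is_open{ $current->{name} };
            $current = pop @stack;              # undef when the stack is empty
            next;
        }
        my $line = $lines->[ $current->{pos}++ ];
        if ($line =~ /^\.include\s+(\S+)/) {
            die "$1 is already open\n" if $is_open{$1};
            push @stack, $current;              # resume here later
            $current = { name => $1, pos => 0 };
            $is_open{$1} = 1;
        } else {
            push @output, $line;
        }
    }
    return @output;
}

print join("\n", flatten('main.txt')), "\n";
```

The saved position does the job of the still-open filehandle: popping the stack resumes the including document exactly where it left off, with nothing to reopen.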

Table 1: Parsing regular expressions.

     Expression           Description

     ^\.                  Start of line followed by a dot.
     \w+\s+               A word followed by whitespace.
     [A-Za-z0-9-\._]+     A name consisting of letters, digits, -._.
     \s+                  Whitespace.
     .*                   The rest of the line.

Figure 1: Developing htmlpp in Libero.

Figure 2: Sample HTML file to be processed.

.include prelude.def
.define version 2.12
<H1>$(TITLE)</H1>
.include contents.def
<H2>Summary</H2>
<UL>
<LI>Libero is a Programmer's Tool and Code Generator.</LI>
<LI>It supports lots of languages.</LI>
<LI>It runs on lots of systems.</LI>
<LI>Current version: $(version).</LI>
</UL>

Listing One

/^\.\w+\s+([A-Za-z0-9-\._]+)\s+(.*)/;
$symbols {$1} = $2;

Listing Two

(a)
-schema=lrschema.pl

After-Init:
    (--) Ok                                 -> Have-Line
          + Initialise-Program-Data
          + Open-Main-Document
          + Get-Next-Document-Line
    (--) Error                              ->
          + Terminate-The-Program
(b)
require 'htmlpp1.d';                    #   Include dialog interpreter

sub initialise_the_program
{
    print "htmlpp 1.0 - by Pieter Hintjens\n";

    if ($#ARGV == 0) {                  #   Exactly 1 argument, in $ARGV[0]?
        $main_document = $ARGV [0];
        $the_next_event = $ok_event;
    } else {
        print "syntax: htmlpp <filename>\n";
        $the_next_event = $error_event;
    }
}

Listing Three

(a)
sub initialise_program_data
{
    #   These are the preprocessor keywords that we recognise
    $keyword {"define"}  = $define_event;
    $keyword {"include"} = $include_event;
}
(b)
sub open_main_document
{
    $document = $main_document;
    &open_the_document;
}

sub open_the_document
{
    #   We use an indirect filehandle, whose name is the document name.
    #   To read from the file, we use <$document>
    if (open ($document, $document)) {
        $file_is_open {$document} = 1;  #   Keep track of open documents
    } else {
        print "htmlpp E: ($document $.) can't open $document: $!\n";
        &raise_exception ($exception_event);
    }
}

Listing Four

open (HANDLE, "somefile.txt");
$line = <HANDLE>;

Listing Five

sub get_next_document_line
{
    if ($_ = <$document>) {             #   Get next line of input
        chomp;                          #   Remove trailing newline
        if (/^$/) {                     #   Blank lines
            $the_next_event = $blank_line_event;
        }
        elsif (/^\.-/) {                #   Comments
            $the_next_event = $comment_event;
        }
        elsif (/^\./) {                 #   Line starts with a dot
            /^\.(\w+)/;                 #   Get word after dot
            if (defined ($keyword {$1})) {
                $the_next_event = $keyword {$1};
            } else {
                &syntax_error;
            }
        } else {
            $the_next_event = $body_text_event;
        }
    } else {
        $the_next_event = $finished_event;
    }
}

Listing Six

Have-Line:
    (--) Body-Text                          -> Have-Line
          + Expand-Symbols-In-Line
          + Get-Next-Document-Line
    (--) Blank-Line                         -> Have-Line
          + Get-Next-Document-Line
    (--) Comment                            -> Have-Line
          + Get-Next-Document-Line
    (--) Define                             -> Have-Line
          + Expand-Symbols-In-Line
          + Store-Symbol-Definition
          + Get-Next-Document-Line
    (--) Include                            -> Have-Line
          + Expand-Symbols-In-Line
          + Take-Include-File-Name
          + Open-The-Document
          + Get-Next-Document-Line
    (--) Finished                           -> Doc-Unstacked
          + Close-The-Document
          + Unstack-Previous-Document

Listing Seven

    (--) Define                             -> Have-Line
          + Expand-Symbols-In-Line
          + Store-Symbol-Definition
          + Get-Next-Document-Line

Listing Eight

sub store_symbol_definition
{
    #   Symbol name can consist of letters, digits, -._
    #   We re-parse the line to extract the symbol name and value:

    if (/^\.\w+\s+([A-Za-z0-9-\._]+)\s+(.*)/) {
        #   Stick name and value into associative array '$symbols'
        $symbols {$1} = $2;
    } else {
        &syntax_error;
    }
}

Listing Nine

sub expand_symbols_in_line
{
    #   Expands symbols in $_ variable
    #
    #   Repeatedly expand symbols like this:
    #   $(xxx) - value of variable
    #
    #   Note that the entire symbol must be on one line; if the symbol or
    #   its label is broken over two lines it won't be expanded.

    for (;;) {
        if (/\$\(([A-Za-z0-9-_\.]+)\)/) {
            $_ = $`.&valueof ($1).$';
        } else {
            last;
        }
    }
}
#   Subroutine returns the value of the specified symbol; it issues a
#   warning message and returns 'UNDEF' if the symbol is not defined.
#
sub valueof
{
    local ($symbol_name) = "@_";        #   Argument is symbol name

    defined ($symbols {$symbol_name}) && return $symbols {$symbol_name};

    print "$_\n";
    print "htmlpp E: ($document $.) undefined symbol \"$symbol_name\"\n";
    $symbols {$symbol_name} = "UNDEF";
    return $symbols {$symbol_name};
}

Listing Ten

(a)
sub syntax_error {
    print "$_\n";
    print "htmlpp E: ($document $.) syntax error\n";
    &raise_exception ($exception_event);
}
(b)
Defaults:
    (--) Exception                          ->
          + Terminate-The-Program

Listing Eleven

(a)
sub take_include_file_name
{
    #   Get filename after .include
    if (/^\.\w+\s+(\S+)/) {
        if ($file_is_open {$1}) {
            print "$_\n";
            print "htmlpp E: ($document $.) $1 is already open\n";
            &raise_exception ($exception_event);
        }
        #   Save current document name and switch to new document
        push (@document_stack, $document);
        $document = $1;
    } else {
        &syntax_error;
    }
}
(b)
    (--) Finished                           -> Doc-Unstacked
          + Close-The-Document
          + Unstack-Previous-Document

sub close_the_document
{
    close ($document);
    undef $file_is_open {$document};
}

sub unstack_previous_document
{
    $document = pop (@document_stack);

    $the_next_event = $document eq ""? $finished_event: $ok_event;
}

Listing Twelve

Doc-Unstacked:
    (--) Ok                                 -> Have-Line
          + Get-Next-Document-Line
    (--) Finished                           -> Have-Line
          + Terminate-The-Program