Dr. Dobb's is part of the Informa Tech Division of Informa PLC




A Forth HTML Generator


Craig A. Lindley is a degreed hardware engineer who, until recently, had been writing large scale Java applications. Craig has authored over 40 technical articles on various programming topics and has published five books dealing mostly with multimedia. He can be contacted via email at: [email protected].


It's been a long time since I've programmed in Forth, but for my current project it seemed like the correct choice of programming language. My current project is the addition of Common Gateway Interface (CGI)-like functionality to my Webster2 web server (http://www.ddj.com/embedded/211300170). Why Forth was chosen and how it was used in support of a CGI interface will be discussed in a future article. Here I would like to discuss a component of my current project that may be of interest to Dr. Dobb's readers -- an HTML generator written entirely in Forth.

If you're like me, you find manual coding of HTML pages a tedious and error-prone activity. Many times over the years I have written code similar to the following:


printf("<html><head><title>HTML Page Title</title></head><body>");
 ...
 ... other HTML content ...
 ...
printf("</body></html>");

The problems I have encountered when manually coding HTML include: improper nesting of HTML tags, improper embedding of special characters in text strings, forgotten closing tags, badly formatted tag attributes and, in general, the generation of bad HTML that is understood by some browsers and not by others.

For my current project I decided to build in HTML generator functionality so that the pages returned from my web server would be of consistently higher quality. The challenge was that the HTML generator had to be written in Forth.

In the Java world, I have used numerous HTML generator programs. One of the simplest and easiest to use was written by Cyrille Artho and is the Java HTML Generator. I have ported the concepts of this program to Forth -- MinForth actually -- and made some extensions along the way. The Forth code for the HTML generator is available here.

The Forth HTML generator has the following features:

  • Allows creation of any kind of HTML tag, with or without associated attributes.
  • Handles nesting of HTML tags automatically.
  • Automatically handles the inclusion of closing tags where appropriate.
  • Handles the proper formatting of attribute strings, including double quoting of attribute values and removal of redundant space characters.
  • Handles replacement of special characters in text strings, such as &, < and >, with their appropriate replacements: &amp;, &lt; and &gt;.
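The special-character replacement described in the last feature can be sketched in standard Forth as follows. This is a minimal illustration and not the code from htmlgen.f: the words emit-escaped, type-escaped and escape-demo are hypothetical names, and the sketch writes the escaped text to the terminal instead of storing it in a tag.

```forth
\ Hypothetical sketch, not taken from htmlgen.f.

\ Emit one character, replacing HTML special characters with entities
: emit-escaped	\ ( c -- )
	case
		[char] & of s" &amp;" type endof
		[char] < of s" &lt;" type endof
		[char] > of s" &gt;" type endof
		dup emit	\ default: emit the character unchanged
	endcase
;

\ Type a string, escaping each character in turn
: type-escaped	\ ( addr u -- )
	over + swap ?do i c@ emit-escaped loop
;

: escape-demo s" Craig & Heather: 10 < 20" type-escaped cr ;

escape-demo
```

Running escape-demo should print Craig &amp;amp; Heather: 10 &amp;lt; 20, which a browser then displays as the original text.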

Table 1 shows the public API for the Forth HTML generator.

| Forth Word | Stack Diagram | Comments |
| --- | --- | --- |
| newtag | (addrn un -- addrt) | Create a new tag with the name given at address addrn with length un. The returned address points at the tag data structure in memory and can be thought of as the id of the tag for all subsequent operations. Tags created with this method have opening and closing structures. |
| newtagnoclose | (addrn un -- addrt) | Same as above except that the tag is self-contained (it has no closing structure). |
| newtexttag | (addrt ut -- addrt) | Special type of tag for text. Here address addrt points at the text and ut is the text length. The text is automatically parsed and any special characters found are replaced before storage. |
| addattribtotag | (addra ua addrt --) | Adds the attribute string specified by addra ua to the tag identified by addrt. The attribute string is parsed and properly formatted before storage. |
| newtag+attrib | (addra ua addrn un -- addrt) | Creates a new tag with the specified attribute string. addra ua identifies the attribute string; addrn un identifies the tag name. |
| addtagtotag | (addrf addrt --) | Connects the from tag addrf to the to tag addrt, nesting the from tag inside the to tag. |
| renderpage | (addrt -- addrsb) | After the complete HTML tag network is created and linked together, this method is called to render the HTML. addrt identifies the top level tag in the network. The rendered HTML is returned in a str (see text). After the HTML is used, the str should be freed. |
| free-tags | (addrt --) | After the tag network has been rendered into a str, the dynamic memory containing the tag data structures should be freed with this method. |

Table 1: The Public API for the Forth HTML Generator.
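As a rough illustration of how the words in Table 1 fit together, the sketch below creates a div with newtag and attaches its attribute afterwards with addattribtotag, which, going by the table's descriptions, should give the same result as a single newtag+attrib call. It also nests a self-contained hr tag created with newtagnoclose. The word attrib-demo and the value names are my own; the sketch assumes htmlgen.f is loaded, that it behaves as the table describes, and that renderpage accepts any tag as the top of the network.

```forth
\ Sketch only: assumes htmlgen.f behaves as described in Table 1.
requires htmlgen.f

0 value divtag
0 value ruletag
0 value texttag

: attrib-demo	\ ( -- divid)
	\ Create the tag first, then attach the attribute string separately
	s" div" newtag to divtag
	s" align=center" divtag addattribtotag

	\ A self-contained tag and some text
	s" hr" newtagnoclose to ruletag
	s" Attribute demo" newtexttag to texttag

	\ Nest the text and the rule inside the div
	texttag divtag addtagtotag
	ruletag divtag addtagtotag

	divtag
;

: test-attrib attrib-demo dup renderpage dup str-get type str-free free-tags ;
```

The test-attrib word follows the same render, print and free sequence as the test words in the examples: renderpage returns a str, str-get exposes its contents for type, and the str and the tag network are then freed.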

Examples of How the Forth HTML Generator Is Used


{
Example 1 Test code for the Forth HTML Generator
Illustrates basic HTML page generation
Written by: Craig A. Lindley
}

requires htmlgen.f

0 value htmltag
0 value headtag
0 value bodytag
0 value divtag
0 value titletag
0 value text1tag
0 value text2tag

: html			\ ( -- htmlid) 

	\ Generate static tags
	s" html" newtag to htmltag
	s" head" newtag to headtag
	s" body" newtag to bodytag
	s" align=center" s" div" newtag+attrib to divtag
	s" title" newtag to titletag

	\ Generate static strings
	s" Example One Page" newtexttag to text1tag
	s" Example One Page Generated by Forth HTML Generator" newtexttag to text2tag
  
	\ Connect the tags
	text1tag titletag addtagtotag 
	titletag headtag addtagtotag
	headtag htmltag addtagtotag
	bodytag htmltag addtagtotag
	divtag bodytag addtagtotag
	text2tag divtag addtagtotag

	htmltag		\ ( -- htmlid)
;

: test html dup renderpage dup str-get type str-free free-tags ;
Example 1: Simple HTML page.

When the test word of Example 1 is executed, it writes the HTML code to standard output. Saving that output to a file and displaying it with a browser results in Figure 1.

Figure 1: Page generated by Example 1.

Nothing much to get excited about. Example 2 is a bit more complex and shows how special characters in text are handled.


{
Example 2 Test code for the Forth HTML Generator
Illustrates special character replacement in text tags
Written by: Craig A. Lindley
}

requires htmlgen.f

0 value htmltag
0 value headtag
0 value bodytag
0 value divtag
0 value titletag
0 value breaktag
0 value h1tag
0 value text1tag
0 value text2tag
0 value text3tag
0 value text4tag
0 value text5tag


: html	\ ( -- htmlid) 
	\ Create static tags
	s" html" newtag to htmltag
	s" head" newtag to headtag
	s" body" newtag to bodytag
	s" align=center" s" div" newtag+attrib to divtag
	s" title" newtag to titletag
	s" br" newtagnoclose to breaktag

	\ Generate static strings
	s" Example Two Page" newtexttag to text1tag
	s" Craig & Heather" newtexttag to text2tag
	s" 10 < 20" newtexttag to text3tag
	s" 20 > 10" newtexttag to text4tag
	s" Example Two Page Generated by Forth HTML Generator" newtexttag to text5tag

	\ Connect the tags
	text1tag titletag addtagtotag 
	titletag headtag addtagtotag
	headtag htmltag addtagtotag
	bodytag htmltag addtagtotag
	divtag bodytag addtagtotag

	s" h1" newtag to h1tag
	text2tag h1tag addtagtotag
	h1tag divtag addtagtotag
	breaktag divtag addtagtotag

	s" h1" newtag to h1tag
	text3tag h1tag addtagtotag
	h1tag divtag addtagtotag
	breaktag divtag addtagtotag

	s" h1" newtag to h1tag
	text4tag h1tag addtagtotag
	h1tag divtag addtagtotag
	breaktag divtag addtagtotag
	breaktag divtag addtagtotag

	text5tag divtag addtagtotag

	htmltag
;

: test html dup renderpage dup str-get type str-free free-tags ;
Example 2: Treatment of Special Characters in Text.

The code in Example 2 results in the browser display in Figure 2.

Figure 2: Page generated by Example 2.

Example 3 shows a calendar for September 2008 generated by the Forth HTML generator. The code is too long to show here, but is available here for download. This example shows how various headings are used, how tables are generated and populated, and how hypertext links are made. The rendered output is shown below:

Figure 3: Page generated by Forth HTML generator.

These three examples show the generation of static HTML pages but I hope you understand that the content of the generated pages can be as dynamic and dramatic as your application demands and your Forth programming skills allow.
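To give a flavor of dynamic content, here is a small sketch that builds an unordered list of squares whose length is decided at run time. The words add-item, squares and test-squares are hypothetical and not part of htmlgen.f. The sketch assumes that newtexttag copies its text before returning (Table 1 says the text is processed "before storage"), because the number-formatting buffer used by <# #> is transient.

```forth
\ Sketch only: assumes htmlgen.f is loaded and that newtexttag copies its text.
requires htmlgen.f

0 value ultag

\ Hypothetical helper: wrap the text addr u in an li tag and nest it in listtag
: add-item	\ ( addr u listtag -- )
	>r newtexttag		\ ( texttag )  listtag saved on the return stack
	s" li" newtag		\ ( texttag litag )
	tuck addtagtotag	\ ( litag )  text nested inside li
	r> addtagtotag		\ ( )  li nested inside the list
;

\ Build a list holding the squares of 1..n
: squares	\ ( n -- ulid)
	s" ul" newtag to ultag
	1+ 1 ?do
		i i * s>d <# #s #> ultag add-item
	loop
	ultag
;

: test-squares 5 squares dup renderpage dup str-get type str-free free-tags ;
```

Because the number of list items depends only on the argument passed to squares, the same pattern extends to content pulled from sensors, files or request parameters.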

