An Embeddable HTTP Server


Oct01: An Embeddable HTTP Server

Tim has designed and developed software ranging from OS kernels for geostationary spacecraft to a variety of terrestrial embedded network applications. He can be reached at mtj@mtjones.com.


Web browsers have become the standard method for communicating with and managing remote embedded devices. The web browser is a common appliance on networked desktops and provides a rich set of functionality for communication and presentation of data from remote devices.

These days, it's commonplace to find HTTP servers on a variety of small embedded devices. Unfortunately, an HTTP server places requirements on the design of an embedded device that may adversely affect its cost. For example, adding a filesystem to the device (for HTTP server content) may incur both hardware and software licensing costs.

In this article, I'll discuss the construction of an embeddable HTTP server that not only obviates the need for a filesystem, but also provides support for dynamic content along with an API to bridge the HTTP server to the data sources on the device. All that is needed on the target device is a socket library. The source code for the server is available electronically; see "Resource Center," page 5. The test version of the software (that lets you verify that all of your content is in order before deploying on the embedded device) runs on Linux, and I've included make instructions as well as sample content and a binary that can be run on a typical Linux distribution (RedHat, SuSE, and the like).

Design Requirements

Any good development project outlines some of the key requirements that are to be achieved. In this project, I'll focus on four basic requirements:

  • Provide a minimal HTTP server protocol (GET/HEAD requests) for standard file types such as HTML, CLASS, JAR, JPEG, and so on.

  • Provide an internal filesystem for content storage (requiring no device filesystem).

  • Support dynamic content in HTML files with an API to provide the content.

  • Provide a compressed system log (to reduce memory requirements).

Additionally, a governing focus of development will be to minimize resource usage and avoid dynamic allocation of resources outright.

HTTP Server Protocol

HTTP, as described in RFC 2068 (see http://www.landfield.com/rfcs/rfc2068.html), is a straightforward ASCII-based protocol. It uses a synchronous request/response design over TCP/IP, in the manner of a classical client/server architecture. When a client makes a request to an HTTP server, it sends an HTTP request message that includes the request itself as well as information about the client's capabilities. A single blank line terminates the request message; see Example 1(a). The server's response message adheres to the same structure: a response header is generated, followed by any data the client requested. Example 1(b) is a sample response to the prior request.

That's it! Though capability headers are useful, I'll largely ignore them in this application because the server will be very lightweight. Despite this limitation, this example demonstrates an impressive set of features.
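As a rough illustration of the blank-line terminator described above, the following sketch scans a receive buffer for the end of a request. The sample request text, the host address, and the function name request_length are assumptions made for illustration; they are not taken from the article's Example 1 or its source code.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative request only (hypothetical host and user agent). */
static const char sample_request[] =
    "GET /index.html HTTP/1.1\r\n"
    "Host: 192.168.1.10\r\n"
    "User-Agent: ExampleBrowser/1.0\r\n"
    "\r\n";

/*
 * Returns the number of bytes up to and including the blank line that
 * ends the request, or 0 if the terminator has not arrived yet.
 */
size_t request_length(const char *buf, size_t len)
{
    size_t i;

    for (i = 0; i + 3 < len; i++) {
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0)
            return i + 4;
    }
    return 0;
}
```

A handler could keep reading from the client socket until request_length returns a nonzero value, then parse the first line of the buffer.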

HTTP Server Design

Now that you have an understanding of how HTTP works, I'll focus on the implementation. For the sake of simplicity, I'll adopt a single-threaded model that allows a single request at a time (after all, this is a constrained embedded system).

The main function provides a simple server socket. When a request is received, the resulting client socket is passed to a client handler function that implements the HTTP message protocol. The first job in the handler is to read in the request message (a variable number of characters followed by a blank line). The first line of the request message will follow a specific format, such as GET <filename> HTTP/1.1 that represents a request to return the named file.
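Parsing that first line might look like the sketch below. It assumes only GET and HEAD are accepted, as the design requirements state; the function name parse_request_line, the enum values, and the choice to reject over-long names are hypothetical and not drawn from the article's source.

```c
#include <stddef.h>
#include <string.h>

enum { METHOD_UNSUPPORTED = -1, METHOD_GET = 0, METHOD_HEAD = 1 };

/*
 * Extracts the filename from a request line such as
 * "GET /index.html HTTP/1.1" into a caller-supplied buffer.
 * Returns the method, or METHOD_UNSUPPORTED on any parse failure.
 */
int parse_request_line(const char *line, char *filename, size_t size)
{
    int method;
    size_t n = 0;

    if (strncmp(line, "GET ", 4) == 0) {
        method = METHOD_GET;
        line += 4;
    } else if (strncmp(line, "HEAD ", 5) == 0) {
        method = METHOD_HEAD;
        line += 5;
    } else {
        return METHOD_UNSUPPORTED;
    }

    /* Copy up to the next space or line end, leaving room for '\0'. */
    while (*line != '\0' && *line != ' ' && *line != '\r' && *line != '\n') {
        if (n + 1 >= size)
            return METHOD_UNSUPPORTED;   /* name too long for the buffer */
        filename[n++] = *line++;
    }
    if (n == 0)
        return METHOD_UNSUPPORTED;       /* no filename present */

    filename[n] = '\0';
    return method;
}
```

The fixed caller-supplied buffer keeps the sketch in line with the goal of avoiding dynamic allocation.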

The filename is parsed from the request, then located in the internal filesystem. Once the file is found, it's returned with the HTTP response message via the client socket. In some cases, a request can be made for a special file that is generated dynamically. For example, a request for the file named "log" represents a request for the internal system log of the HTTP server.

The HTTP response is similar to the request except that it can be composed of two parts. The first part is the response header and the second the response body that represents the file result of the initial request. A single blank line separates the response header and body.

One important element of the response header is the content type. This particular element specifies the media type of the attached data. For example, when responding with an HTML file, a content type of "text/html" is returned. The internal function determineContentType identifies the type of content to be returned and constructs this header.
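One plausible way to choose the content type is a small extension table, sketched below. This is not the article's determineContentType; the names guess_content_type and build_response_header, the table contents, the case-sensitive match, and the "Connection: close" header are all assumptions made for the example.

```c
#include <stdio.h>
#include <string.h>

struct mime_entry {
    const char *ext;
    const char *type;
};

/* Assumed mapping for the file types the article mentions. */
static const struct mime_entry mime_table[] = {
    { ".html",  "text/html" },
    { ".htm",   "text/html" },
    { ".txt",   "text/plain" },
    { ".jpg",   "image/jpeg" },
    { ".jpeg",  "image/jpeg" },
    { ".gif",   "image/gif" },
    { ".jar",   "application/java-archive" },
    { ".class", "application/octet-stream" }
};

/* Maps a filename to a media type by its extension (case-sensitive). */
const char *guess_content_type(const char *filename)
{
    const char *dot = strrchr(filename, '.');
    size_t i;

    if (dot != NULL) {
        for (i = 0; i < sizeof mime_table / sizeof mime_table[0]; i++) {
            if (strcmp(dot, mime_table[i].ext) == 0)
                return mime_table[i].type;
        }
    }
    return "application/octet-stream";   /* fallback for unknown types */
}

/*
 * Writes a response header, ending with the blank line that separates
 * it from the body. Returns what snprintf returns.
 */
int build_response_header(char *out, size_t size, const char *filename)
{
    return snprintf(out, size,
                    "HTTP/1.1 200 OK\r\n"
                    "Connection: close\r\n"
                    "Content-Type: %s\r\n"
                    "\r\n",
                    guess_content_type(filename));
}
```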

Figure 1 is a straightforward architectural drawing of the module hierarchy.

Internal Filesystem Design

Although filesystems are taken for granted (even in higher-end embedded systems), they are, not surprisingly, absent from many traditional embedded systems. A filesystem is made up of a storage medium, a format by which data is stored on the medium, and an API to enable access.

To provide typical HTTP services, some kind of filesystem is necessary. The approach of this design is to aggregate the files (the content) as a compilable data structure. Then providing a simple way to read files from the structure satisfies the requirement.

The buildfs utility takes a directory path as an argument and uses this as the root of the content tree. The content tree is then traversed, and each file is accumulated into our internal file system structure and written out to the compilable file filedata.c. This file can be viewed and includes a hex translation of each file's contents along with a textual description of the file and its size.
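The hex translation step could be sketched as follows. This is not the buildfs source: the function name format_hex_bytes and the exact "0xNN," output style are assumptions, and a real tool would presumably write to the output file rather than to a buffer.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Formats each input byte as "0xNN," into out, suitable for pasting
 * into a C array initializer. Returns the number of characters
 * written, or -1 if out is too small.
 */
int format_hex_bytes(const unsigned char *data, size_t len,
                     char *out, size_t size)
{
    size_t i, used = 0;

    if (size == 0)
        return -1;
    out[0] = '\0';

    for (i = 0; i < len; i++) {
        if (size - used < 6)             /* "0xNN," plus terminator */
            return -1;
        snprintf(out + used, size - used, "0x%02x,", (unsigned)data[i]);
        used += 5;
    }
    return (int)used;
}
```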

The internal filesystem is a simple sequential filesystem that stores the files in the order of their appearance in the source filesystem. A header appears with each file to permit it to be read. This header has the structure described in Table 1.

This is repeated for each file that is to be included. Example 2 is a sample file created using the buildfs tool. It consists of two files (/testfile and /file2). Elements are color coded from the prior table.

Locating a file is then a simple process of walking through the file headers, and comparing the source file name with the header file name. When a file is matched, the file size is then used to determine and return the actual contents of the stored file.
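The header walk can be sketched as shown below. The layout here is an assumption made for illustration and may differ from the article's Table 1: each entry is taken to be a null-terminated name, a 4-byte big-endian size, then the file data, with an empty name marking the end of the image. The names sample_fs and find_file are hypothetical, as are the file contents.

```c
#include <stddef.h>
#include <string.h>

/* ASSUMED layout: name\0, size (4 bytes, big-endian), data; 0 ends. */
static const unsigned char sample_fs[] = {
    '/','t','e','s','t','f','i','l','e', 0,   /* name */
    0, 0, 0, 5,                               /* size */
    'h','e','l','l','o',                      /* data */
    '/','f','i','l','e','2', 0,
    0, 0, 0, 2,
    'o','k',
    0                                         /* end of filesystem */
};

/*
 * Walks the entries in order, comparing names. On a match, stores the
 * file size in *size and returns a pointer to the data; otherwise
 * returns NULL.
 */
const unsigned char *find_file(const unsigned char *fs, const char *name,
                               unsigned long *size)
{
    while (*fs != 0) {
        const char *entry_name = (const char *)fs;
        const unsigned char *p = fs + strlen(entry_name) + 1;
        unsigned long len = ((unsigned long)p[0] << 24) |
                            ((unsigned long)p[1] << 16) |
                            ((unsigned long)p[2] << 8)  |
                             (unsigned long)p[3];
        const unsigned char *data = p + 4;

        if (strcmp(entry_name, name) == 0) {
            *size = len;
            return data;
        }
        fs = data + len;                 /* skip to the next header */
    }
    return NULL;
}
```

Because the image is a constant array, it can live in ROM or flash alongside the code, and the lookup needs no copying.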

Dynamic Content

Support for dynamic content is surprisingly simple and utilizes the tag concept commonly found in HTML. A new tag has been added to interface to the embedded HTTP server to support this capability. As an HTML file is served, it is parsed to search for the new dynamic content tag "<DATA x>," where x is the string name of the dynamic content to insert into the stream.

<P>

The current temperature is <DATA temperature>

<P>

The parser searches for the "<DATA" keyword, then uses the embedded variable name to retrieve the actual content.
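A minimal sketch of that search-and-replace step follows. It expands into a buffer for ease of testing, whereas the server described here would more likely write straight to the client socket. The names expand_dynamic, resolver_fn, and demo_resolver, the 32-byte name limit, and the sample value are all assumptions.

```c
#include <stddef.h>
#include <string.h>

typedef const char *(*resolver_fn)(const char *name);

/* Hypothetical data source used only for this example. */
static const char *demo_resolver(const char *name)
{
    return strcmp(name, "temperature") == 0 ? "72F" : NULL;
}

/* Appends n bytes of s to out; returns -1 if they do not fit. */
static int append(char *out, size_t size, size_t *used,
                  const char *s, size_t n)
{
    if (*used + n + 1 > size)
        return -1;
    memcpy(out + *used, s, n);
    *used += n;
    out[*used] = '\0';
    return 0;
}

/* Copies src to out, replacing each <DATA name> tag with its value. */
int expand_dynamic(const char *src, char *out, size_t size,
                   resolver_fn resolve)
{
    size_t used = 0;

    if (size == 0)
        return -1;
    out[0] = '\0';

    while (*src != '\0') {
        const char *tag = strstr(src, "<DATA ");
        const char *end = tag ? strchr(tag, '>') : NULL;
        char name[32];
        size_t name_len;
        const char *value;

        if (tag == NULL || end == NULL)  /* no more complete tags */
            return append(out, size, &used, src, strlen(src));

        if (append(out, size, &used, src, (size_t)(tag - src)) != 0)
            return -1;

        name_len = (size_t)(end - (tag + 6));
        if (name_len >= sizeof name)
            return -1;
        memcpy(name, tag + 6, name_len);
        name[name_len] = '\0';

        value = resolve(name);           /* unknown names expand to nothing */
        if (value != NULL &&
            append(out, size, &used, value, strlen(value)) != 0)
            return -1;

        src = end + 1;
    }
    return 0;
}
```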

Numerous designs were considered for serving dynamic content. The final design implemented here was chosen for its flexibility. When you wish to provide dynamic content, simply build an HTML file that includes the dynamic tags that are to be resolved. Then write the code that provides the dynamic data as a function that returns a null-terminated string. This function is then installed along with the string name that appears in the HTML file; see Example 3(a). The function implementation simply returns a string representation of the dynamic data; see Example 3(b).

That's it. The current implementation allows for up to 20 dynamic variables, though adjusting a symbolic constant can easily increase this.
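The registration scheme might be sketched as a fixed-size table of name/function pairs, as below. This is not the article's Example 3: the names MAX_DYNAMIC_VARS, register_dynamic, lookup_dynamic, and get_temperature, along with the hard-coded reading, are hypothetical. The table is statically allocated, in keeping with the goal of avoiding dynamic allocation.

```c
#include <stdio.h>
#include <string.h>

#define MAX_DYNAMIC_VARS 20   /* the symbolic constant to adjust */

typedef const char *(*dynamic_fn)(void);

struct dynamic_entry {
    const char *name;
    dynamic_fn fn;
};

static struct dynamic_entry dynamic_table[MAX_DYNAMIC_VARS];
static int dynamic_count;

/* Installs a content function under a tag name; -1 if the table is full. */
int register_dynamic(const char *name, dynamic_fn fn)
{
    if (dynamic_count >= MAX_DYNAMIC_VARS)
        return -1;
    dynamic_table[dynamic_count].name = name;
    dynamic_table[dynamic_count].fn = fn;
    dynamic_count++;
    return 0;
}

/* Calls the function registered for name; NULL if none is installed. */
const char *lookup_dynamic(const char *name)
{
    int i;

    for (i = 0; i < dynamic_count; i++) {
        if (strcmp(dynamic_table[i].name, name) == 0)
            return dynamic_table[i].fn();
    }
    return NULL;
}

/* Example content function; a real one would read a sensor. */
static const char *get_temperature(void)
{
    static char buf[16];

    sprintf(buf, "%d", 72);
    return buf;
}
```

Because the function runs each time its tag is served, the value is sampled at request time rather than cached.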

I chose this particular implementation because the string representation allows for the greatest flexibility in data types (anything can be represented including embedded HTML). Also, by calling the user function instead of storing content in an intermediate array, you have the greatest flexibility in data management. For example, your function knows when data is used because that data's function is called. This permits synchronization on the available data and its presentation.

Compressed Log

Logs are important to understand server usage, but they can also be used for debugging. Unfortunately, logs can be resource hogs and, given strict design requirements, you have to identify a more efficient way to store them.

In traditional systems, a log is a file written by a function that processes a format string and a variable set of arguments. Since we don't have a filesystem, or much room to store the log, a new approach is required.

To meet these space-constrained requirements, I'll make a few concessions. I won't allow the output of run-time defined strings (all log strings will be known to the logger at compile time). All log strings will be defined as scalar indices into a logString character array that defines the actual log string output. Supporting a variable number of arguments for a log string is also important and should be supported with a minimum amount of work for users.

A log string contains not only the template of the text to emit, but also a declaration of the arguments to embed within it. An example log string is: Received request for ^. The special caret character "^" is a replacement symbol that instructs the log constructor to insert a stored argument in its place.

You build the log using two log functions. The first function places a single control byte into the log and the second places a null-terminated string into the log. Take the following code segment as an example of inserting a log entry:

emitByte(PREFIX_BYTE); emitByte(NORMAL_REQUEST);

emitString(filename); emitByte(SUFFIX_BYTE);

This snippet places the following data into the log (where filename points to the string file1.html):

0xfa,0x00,'f','i','l','e','1','.','h','t','m','l',0x00,0xf3

All compressed log entries start with a prefix byte (0xfa) and end with a suffix byte (0xf3). This is used for synchronization purposes when constructing an HTML log output for the user. The log entry type byte (in this case NORMAL_REQUEST) comes after the prefix byte. This byte is an index into the log strings array. Finally, any arguments are emitted to the log using the emitString function. It should be clear from this discussion that the log is simply a circular buffer of bytes that are interpreted as log entries.
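The two emit functions could be sketched over a statically allocated circular buffer as follows. The names emitByte, emitString, PREFIX_BYTE, SUFFIX_BYTE, and NORMAL_REQUEST come from the text, but their signatures, the 256-byte buffer size, and the log_request wrapper are assumptions.

```c
#include <stddef.h>

#define LOG_SIZE       256    /* assumed capacity */
#define PREFIX_BYTE    0xfa
#define SUFFIX_BYTE    0xf3
#define NORMAL_REQUEST 0x00   /* index into the log strings array */

static unsigned char log_buf[LOG_SIZE];
static size_t log_head;       /* next write position */

/* Stores one byte, wrapping around and overwriting the oldest data. */
void emitByte(unsigned char b)
{
    log_buf[log_head] = b;
    log_head = (log_head + 1) % LOG_SIZE;
}

/* Stores a string argument, including its null terminator. */
void emitString(const char *s)
{
    while (*s != '\0')
        emitByte((unsigned char)*s++);
    emitByte(0);
}

/* Hypothetical wrapper that records one request entry. */
void log_request(const char *filename)
{
    emitByte(PREFIX_BYTE);
    emitByte(NORMAL_REQUEST);
    emitString(filename);
    emitByte(SUFFIX_BYTE);
}
```

Once the buffer wraps, the oldest entry may be partly overwritten, which is where the prefix and suffix bytes help a reader resynchronize.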

So how does a log entry result in an ASCII string in the log? This is a simple matter of extracting log entries from the log, then interpreting them based upon their types. The type is an index into the log strings array. The log string is then emitted, one character at a time, until a replacement symbol (^) is found. When a replacement symbol is found, the log is consulted again to retrieve a null-terminated string that is promptly emitted. Emission of the log string then continues until another replacement symbol or the null terminator is found.
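The expansion of one entry can be sketched as below. For simplicity, the sketch assumes the entry has already been copied out of the circular buffer into a linear array; the names log_strings and decode_entry and the single-template table are hypothetical.

```c
#include <stddef.h>

#define PREFIX_BYTE 0xfa
#define SUFFIX_BYTE 0xf3

/* Assumed template table; index 0 corresponds to NORMAL_REQUEST. */
static const char *const log_strings[] = {
    "Received request for ^"
};
#define NUM_LOG_STRINGS (sizeof log_strings / sizeof log_strings[0])

/*
 * Expands one compressed entry into text. Returns the length of the
 * resulting string, or -1 if the entry is malformed or out is too small.
 */
int decode_entry(const unsigned char *entry, size_t len,
                 char *out, size_t size)
{
    const unsigned char *p = entry;
    const unsigned char *end = entry + len;
    const char *tmpl;
    size_t used = 0;

    if (size == 0 || len < 3 || p[0] != PREFIX_BYTE)
        return -1;
    if (p[1] >= NUM_LOG_STRINGS)
        return -1;
    tmpl = log_strings[p[1]];
    p += 2;

    for (; *tmpl != '\0'; tmpl++) {
        if (*tmpl == '^') {
            /* Copy the stored null-terminated argument. */
            while (p < end && *p != 0) {
                if (used + 1 >= size)
                    return -1;
                out[used++] = (char)*p++;
            }
            if (p >= end)
                return -1;               /* argument lacks terminator */
            p++;                         /* skip the terminator */
        } else {
            if (used + 1 >= size)
                return -1;
            out[used++] = *tmpl;
        }
    }

    if (p >= end || *p != SUFFIX_BYTE)
        return -1;                       /* entry out of sync */
    out[used] = '\0';
    return (int)used;
}
```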

This simple architecture reduces highly repetitive logs to a handful of bytes that are dynamically created upon user request.

Conclusion

The HTTP server I've described here is targeted towards low-end embedded systems. Although it is designed for minimal HTTP functionality, it includes advanced features such as support for dynamic content and log compression, all in under 400 lines of C. Again, this code is available electronically; see "Resource Center," page 5.

DDJ

