CGI and the World Wide Web

By G. Dinesh Dutt, February 01, 1996

The Common Gateway Interface (CGI) makes it possible for Web servers to interact with external programs. Dinesh presents a program that reports gateway-execution errors.

Dinesh is an engineer with Hinditron-Tektronix Instruments Limited, Bombay, India. He can be contacted at [email protected].

Much of the usefulness of the World Wide Web stems from the ability of Web servers to interact with external programs. The technology that currently makes this possible is the Common Gateway Interface (CGI). A common application of CGI, for instance, might involve a user querying a database via a form. Once the form is filled out, a CGI script (or program) passes the request from the Web server to the external database, gets the database output, and sends it back to the user.

More specifically, CGI identifies how the Web server should supply input to the external programs, along with the format of output to be returned. The server, in turn, gets inputs from a client such as a Web browser. Since inputs are normally made available via the clients, I'll assume for this article that the interaction occurs between the client and the gateway (external programs), instead of between the server and gateway. Currently, HTTP servers are the only Web servers that support CGI. This means that CGI is supported on all familiar platforms--UNIX, Macintosh, and Windows.

The basic tools you need to use CGI are a language that produces executables (shell scripts, Tcl, Perl, or C, for instance) and access to a CGI-enabled HTTP server. This article is based on UNIX, Perl 4.036, the NCSA HTTPD 1.4 and CERN HTTPD 3pre6 servers, and CGI 1.1, but it applies to other servers, languages, and platforms. For the sake of example, I'll build a simple form-based application (for information on forms, see "Coding with HTML Forms," by Andrew Davison, Dr. Dobb's Journal, June 1995) that uses the form in Figure 1. Listing One presents the HTML code that generates this form. As you can see, this form consists of a text box and two radio buttons, only one of which can be selected. This provides a gateway with Author and Title inputs. Input is passed to the gateway when the user clicks the Submit button.

Data Input

There are several ways in which input is passed to gateways, including forms-based methods and ISINDEX. While each method supplies inputs differently, they all use environment variables to pass information. In Perl, the environment variables are available via the %ENV associative array. REQUEST_METHOD is one such variable; it indicates the method used to submit the input. (See Table 1 for a description of additional environment variables CGI uses.)

In Example 1(a), the form has an ACTION METHOD set to GET, so the input is available to the gateway in the environment variable QUERY_STRING. With forms where the ACTION METHOD is set to POST, input is available via stdin (standard input). The CGI specification states that the server need not supply an end-of-file (EOF) for the input available via stdin. Instead, the HTTP server provides the size of the input in the environment variable CONTENT_LENGTH. The first gotcha comes here, when gateways try to read this input. Since the input is not terminated with an EOF marker, the gateway must never read more than the CONTENT_LENGTH or it will wait for further input that will never come. This hangs the gateway and the client awaiting the results. Example 1(b) reads stdin to secure the input.
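A slightly more defensive way to read POST input can be sketched as follows. The helper name read_post_body and the 64-KB cap are assumptions made for illustration; neither comes from the CGI specification or from the listings in this article. The loop is there because a single read may return fewer bytes than requested, and the helper never asks for more than CONTENT_LENGTH bytes.

```perl
use strict;
use warnings;

# Hypothetical helper: read exactly $length bytes of POST data from $fh.
# Returns undef if the length is missing, malformed, or over $max_bytes.
sub read_post_body {
    my ($fh, $length, $max_bytes) = @_;
    return unless defined $length && $length =~ /^\d+$/;
    return if $length > $max_bytes;

    my $body = '';
    while (length($body) < $length) {
        # Ask only for what is still missing; never wait for an EOF.
        my $got = read($fh, $body, $length - length($body), length($body));
        last unless $got;    # 0 = input ended early, undef = read error
    }
    return $body;
}

# Typical use inside a gateway (the 64-KB cap is an arbitrary assumption).
if (($ENV{'REQUEST_METHOD'} || '') eq 'POST') {
    my $input = read_post_body(\*STDIN, $ENV{'CONTENT_LENGTH'}, 64 * 1024);
    print "Content-type: text/plain\n\n";
    print defined $input ? "Read " . length($input) . " bytes\n"
                         : "Error: bad or missing CONTENT_LENGTH\n";
}
```

Passing the filehandle as a parameter keeps the helper testable outside the server, since any readable handle can stand in for stdin.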

When input is supplied via the ISINDEX interface, the input is made available via the command-line arguments array ARGV. You can use standard command-line-argument parsing code to extract the inputs. The REQUEST_METHOD variable is set to GET in this case, too. Information supplied via ISINDEX is sent as part of the link, separated by a "?"; for example, http://amadeus.org/play?Jupiter, where Jupiter is made available via command-line arguments ($ARGV[0], in this case). But remember, the arguments cannot contain blank spaces--even if you "quote" the spaces. To pass arguments with spaces in them, replace each space with its equivalent hexadecimal ASCII code prefixed by %; for example, http://amadeus.org/play?41st%20Symphony will make the gateway receive "41st Symphony" as its first argument.
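The escaping rule just described can be sketched in a few lines of Perl. The helper name escape_arg is hypothetical, and the set of characters left unescaped (letters, digits, "-", "_", and ".") is a conservative assumption rather than something the article specifies.

```perl
use strict;
use warnings;

# Hypothetical helper: escape one ISINDEX argument for use after the "?".
# Every character outside A-Z, a-z, 0-9, "-", "_" and "." becomes %XX.
sub escape_arg {
    my ($arg) = @_;
    $arg =~ s/([^A-Za-z0-9\-_.])/sprintf("%%%02X", ord($1))/ge;
    return $arg;
}

# Prints http://amadeus.org/play?41st%20Symphony
print "http://amadeus.org/play?", escape_arg("41st Symphony"), "\n";
```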

Making Sense of the Input

If you were to simply print out the input, the result would be gibberish. One way to see what the gateway receives is to use the test-cgi program that's supplied with the NCSA HTTPD. For example, to examine what QUERY_STRING looks like to the gateway (when the GET submission method is used), you can set the ACTION of the example form to "test-cgi" (prefixed by the proper http://your-server:port/cgi-bin/). Searching for the title "The Fountainhead" then yields name=The+Fountainhead&keyword=title.

The inputs are presented to the server as a list of name=value pairs, with each pair separated by a "&" character. This format also converts blank spaces to "+" and converts all special characters to their hexadecimal ASCII code. There are standard libraries available in many languages to decode the input and present it in an understandable format. Listing Two presents CGIGetInput, one such decoder written in Perl.

CGIGetInput understands both GET and POST methods and returns the inputs in an associative array (name supplied by the caller), with each input-field name being the key and the value of the field being the value of the array element. You can use the decoded output for processing. For instance, the debugcgi.cgi script of Listing Three uses this routine.
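To make the decoding rules concrete, here is a minimal sketch in present-day Perl. It is not the article's CGIGetInput: the helper name decode_pairs is hypothetical, it returns a hash reference instead of filling a caller-named array, and it keeps only the last value when a field name repeats.

```perl
use strict;
use warnings;

# Hypothetical helper (not the article's CGIGetInput): decode a
# name=value&name=value string into a hash reference.
sub decode_pairs {
    my ($encoded) = @_;
    my %field;
    foreach my $pair (split /&/, $encoded) {
        next if $pair eq '';
        my ($name, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        foreach ($name, $value) {
            tr/+/ /;                                # "+" stands for a blank
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;    # %XX stands for one byte
        }
        $field{$name} = $value;
    }
    return \%field;
}

my $in = decode_pairs('name=The+Fountainhead&keyword=title');
print "$in->{'name'} ($in->{'keyword'})\n";
```

The "+" translation has to come before the %XX substitution; otherwise an escaped plus sign (%2B) would wrongly turn into a blank.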

However, in case of ISINDEX, the data format is different from the aforementioned cases. The name part is entirely absent and the data is as submitted by the user with all hexadecimal symbols converted to their ASCII equivalents. For instance, in the aforementioned example, the gateway gets "41st Symphony" even though we said "41st%20Symphony."

Talking Back

When the gateway needs to communicate with the server after processing, it can return the type of the forthcoming output via the Content-type header. For an HTML document, this would be "Content-type: text/html." If you do not specify the type of the data returned, the server returns a "500 Server Error" message to the user, and its error log gives the reason as "malformed header from script." This happens because HTTP first sends some metainformation about the object that it's about to return--type, size, title, expiration date, and so on. If this information isn't forthcoming, the server is unable to parse the script's output and returns the aforementioned error message. The valid types that can be used in place of text/html are those supported by both the browser and the HTTP server; the types the browser accepts are given by the HTTP_ACCEPT environment variable. (Remember, not all browsers support all environment variables.) For plain ASCII text, text/html can be replaced by text/plain.

Another effect of the processing could be a request to fetch another document. To do this, the URL of the document is returned in the format print "Location: http://amadeus.org/Mozarts_Life.html\n\n"; which tells the server that it must retrieve the supplied URL and return that to the client. You could also use the PrintHeader routine supplied in cgi-parse.pl; see Listing Two. Example 2 provides typical calling sequences.

Yet Another Input Method

Another method for obtaining input information makes use of the PATH_INFO environment variable. To illustrate, assume that you have a document that is available in both French and English. Depending on the user's choice of language, the correct document must be served. If you have a CGI script called "document-disher," for instance, a link could be specified as:

http://mymachine.org/cgi-bin/document-disher/French/Mon_Document
http://mymachine.org/cgi-bin/document-disher/English/My_Document

In this case, the CGI script could use the extra path information at the end of the pathname to retrieve the document in the correct language.

The server also provides an environment variable PATH_TRANSLATED, which contains a complete, legal filename based on PATH_INFO. Consequently, this "multilingual" document gateway could simply print the contents of the file specified in PATH_TRANSLATED if the paths are configured properly.
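As a rough sketch of how such a gateway might pick apart the extra path information, consider the following. The helper name parse_path_info and the restriction to plain word characters are assumptions made for this illustration; a real gateway would map the result to files however its site is laid out.

```perl
use strict;
use warnings;

# Hypothetical helper: split PATH_INFO such as "/French/Mon_Document"
# into a language and a document name. Returns an empty list on a mismatch.
sub parse_path_info {
    my ($path_info) = @_;
    return unless defined $path_info;
    # Allow only plain word characters so "../" cannot appear in either part.
    return unless $path_info =~ m{^/(\w+)/(\w+)$};
    return ($1, $2);
}

my ($language, $document) = parse_path_info($ENV{'PATH_INFO'});
if (defined $language) {
    print "Content-type: text/plain\n\n";
    print "Would serve $document in $language\n";
}
```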

Which Input Method?

The input method you use depends on your application. GET can be used when there's little information to be supplied; for instance, a form like the one in Figure 1, which supplies only a keyword and the type of the keyword. If your form involves more data, the contents of the environment variables may be truncated. Consequently, you should use the POST method for large inputs. (The Mosaic forms tutorial recommends use of the POST method only.) The ISINDEX approach to input, on the other hand, lends itself to querying and works well when you don't know if forms are being used or when you have to support browsers that don't support forms.

Also, keep in mind that you can mix different input methods to some extent:

http://mymachine.org/cgi-bin/document-disher/French/Mon_Document?Speak
http://mymachine.org/cgi-bin/getfc?791+793

When forms are used as the method to submit input, people might want to pass information not modifiable by the user (the form name, for example). You can do this by adding a ? followed by the form name to the URL of the action link (or via PATH_INFO). While it's okay to do so, your script must "know" that the input would be available in two different ways and read both of them; for example, by manually changing the REQUEST_METHOD variable from within the script. However, the correct way to pass information that's not modifiable by the user (again, the name of the form) when using forms is via hidden fields, specified via the TYPE="hidden" attribute in the form field.

Debugging Gateways

A gateway is like any other program in that you will need to be able to debug it. One of the basic problems with CGI is that the scripts seem to work when used normally, but fail when called from within a Web browser. The lack of error messages makes this doubly confusing.

One way to debug gateways is to simulate the behavior of the HTTP server by setting all the relevant environment variables (QUERY_STRING with the METHOD set to GET, for instance) and executing the script to see if the decoding of information is correct. However, this does not test the changed environment under which the gateway works once it's invoked by the WWW server.
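This kind of simulation can be sketched as a small Perl driver. The helper name run_as_get is hypothetical, the list form of a piped open assumes a UNIX system and Perl 5, and the one-line stand-in gateway exists only so the example runs by itself.

```perl
use strict;
use warnings;

# Hypothetical helper: run a gateway the way a server would for a GET
# request and return everything the gateway writes to stdout.
sub run_as_get {
    my ($query, @command) = @_;
    # local() restores the previous values when the sub returns.
    local $ENV{'REQUEST_METHOD'}    = 'GET';
    local $ENV{'QUERY_STRING'}      = $query;
    local $ENV{'GATEWAY_INTERFACE'} = 'CGI/1.1';

    open(my $pipe, '-|', @command) or die "cannot run @command: $!";
    my @output = <$pipe>;
    close($pipe);
    return join('', @output);
}

# Stand-in gateway for the demonstration: a one-line Perl program that
# echoes its QUERY_STRING. Replace it with the path to a real script.
my @gateway = ($^X, '-e',
    'print "Content-type: text/plain\n\n$ENV{QUERY_STRING}\n"');
print run_as_get('name=The+Fountainhead&keyword=title', @gateway);
```

Note that the child still inherits the rest of your own environment (PATH, user id, and so on), which is exactly the limitation described above.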

Consequently, I've written a program that reports errors in your gateway's execution, including those caused by wrong assumptions about the environment. Using this program (which has its own forms-based interface; see Figure 2), it should be fairly easy for you to debug your gateways, and the gateway need not even be written in Perl.

The test script/form as shown in Listing Four works as follows:

Listing Five

2. Change the paths to the actual location for Perl (replace /usr/local/bin/perl).

3. If you use forms to supply input, strip off the <FORM> and </FORM> lines from your form and attach the resulting body of the form to the debug form provided. For ISINDEX interfaces, supply the arguments in the ISINDEX area of the form.

4. Bring up a browser on this form.

5. Enter the full pathname of the script invoked by the form to test and also supply the METHOD used to submit information.

6. Supply the input to your form.

7. Click on Submit.

If everything is okay with your form, you are notified accordingly, and the output from the form is displayed. (The program currently lacks support to handle pure image outputs.) If not, an error message is displayed and the cause of the error reported, including any parsing errors for a script. All errors resulting from a change in the environment (user to the server) are trapped and reported. Let's take a look at some of the common errors encountered in writing gateways.

One of the more common errors is to not provide all of the required environment to the script. When testing, the script is running with its user id set to your id, so it has access to your entire environment, files, and databases. However, when running under the server's control, it runs with the user id set to that of the server, usually "nobody," so it doesn't inherit your environment. Thus, executables accessible during testing might not be found in actual use. Similarly, files readable during testing might suddenly become unreadable. The necessary files should be world readable and world executable, and if they need to be written to, world writable.
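A quick way to spot such permission problems before the server does is sketched below. The helper name world_access_problems is hypothetical, and checking only the "other" permission bits is a deliberately pessimistic assumption: it ignores the case where the server's user happens to own the file or belong to its group.

```perl
use strict;
use warnings;

# Hypothetical helper: list the reasons a file may be unusable by a server
# that runs as an unprivileged user such as "nobody". Only the "other"
# permission bits are examined.
sub world_access_problems {
    my ($path, $need_exec) = @_;
    my @st = stat($path) or return ("$path: $!");
    my $mode = $st[2] & 07777;
    my @problems;
    push @problems, "$path is not world readable" unless $mode & 0004;
    push @problems, "$path is not world executable"
        if $need_exec && !($mode & 0001);
    return @problems;
}

# Check every file named on the command line as if it were a script.
foreach my $file (@ARGV) {
    print "$_\n" foreach world_access_problems($file, 1);
}
```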

Another common error occurs when you do not send the Content-type line as the first line of the output returned by the gateway. Make sure the first two lines of the output are the Content-type line followed by a blank line; otherwise, the "malformed header" error appears. The Content-type field needs to be set to the type of the object being returned.

Also, by printing a Content-type line at the very beginning or before printing an error message, you can redirect the errors to the user; otherwise, error messages end up in the daemon's error log.
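One way to make these two rules hard to forget is to build every reply through a single routine, as in this sketch. The helper names cgi_reply and error_reply and the file name are hypothetical; the HTML escaping in error_reply is an added precaution, not something the article's listings do.

```perl
use strict;
use warnings;

# Hypothetical helper: build a complete gateway reply. The header line and
# the blank line after it always come before any of the body.
sub cgi_reply {
    my ($type, $body) = @_;
    return "Content-type: $type\n\n$body";
}

# Hypothetical helper: report an error to the user as an HTML page.
sub error_reply {
    my ($message) = @_;
    $message =~ s/&/&amp;/g;    # keep the message from being read as HTML
    $message =~ s/</&lt;/g;
    $message =~ s/>/&gt;/g;
    return cgi_reply('text/html', "<H3>Script Error</H3>\n<PRE>$message</PRE>\n");
}

my $file = '/no/such/file';     # placeholder path for the demonstration
if (open(my $fh, '<', $file)) {
    print cgi_reply('text/plain', join('', <$fh>));
}
else {
    print error_reply("cannot open $file: $!");
}
```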

You should also ensure that your HTTP server supports CGI Version 1.1 and that the server is running with the ability to recognize and execute CGI scripts. Many sites turn off CGI since it can be a security hole if not properly configured. In the case of NCSA HTTPD, the directive ScriptAlias gives the paths that can contain scripts. Also, if the AddType directive is defined as AddType application/x-httpd-cgi .cgi, then a script ending in .cgi can be recognized anywhere the server has access. Both directives belong in the srm.conf file. For the CERN server, scripts are configured via the Exec directive.

The script must be placed in the proper directory, and the server must be capable of recognizing a file with a specific suffix, such as ".cgi," as a script to be executed.

A simple problem with the CERN server is caused by its parsing policy. If the path to your script is passed/rejected before it encounters the desired Exec rule, the script will not be executed, but its contents are returned as a document (in case of a reject, an error message is returned, not the script as a document). To prevent this, ensure that the Exec rules come first for those directories containing scripts.

When troubleshooting, view the error log file for the HTTPD server you're using. The location of this file is difficult to predict; it varies from a standard /var/httpd/logs/error_log to /usr/local/dolphin/httpd/logs/error_log. Figure 3 shows the contents of my server's error_log when problems occurred. One problem immediately apparent from this log is that some of the errors do not indicate which script caused them. The "Can't open 1057" error is one such example. These errors are normally system related, such as a call to an external program via system(). To properly trap and report system errors, include the name of the script in the message. It might also make sense to print the contents of the environment variable HTTP_REFERER (if available), which contains the URL specified to get to this script.

Furthermore, when using forms with the POST method, you must not expect an EOF; instead you must access the CONTENT_LENGTH environment variable to get the number of bytes to read and then read only that much. Otherwise, your script will hang. One nice feature of CERN HTTPD Version 3 is the ability to specify a timeout period. If the script doesn't terminate within that time frame, it's killed. This is specified via the ScriptTimeOut directive, and has a default value of five minutes.

If you're mixing output from your script with output from external programs called from within the script, you should unbuffer the output or else the output could be in some nonpredictable order. Also, unbuffering STDOUT seems to improve performance, as the server gets output from the gateway immediately instead of waiting until the buffer is full.
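In Perl, unbuffering comes down to one assignment, as this sketch shows. The child command is only a stand-in for whatever external program a gateway calls. As far as I know, Perl 5.6 and later also attempt to flush output before running a child, so the ordering problem is most visible with older versions such as the Perl 4 used in this article.

```perl
use strict;
use warnings;

$| = 1;    # unbuffer the currently selected handle (STDOUT by default)

print "Content-type: text/plain\n\n";
print "Output of the external program follows:\n";
# Stand-in for any external program. With buffering left on, an older Perl
# may let this output reach the server before the two lines printed above.
system($^X, '-e', 'print "hello from the child process\n"');
print "Done.\n";
```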

Sending Output to the Browser

In every case, the gateway spews out data without having to bother with HTTP reply-header format and conventions. The server looks at this output and adds headers conforming with the HTTP protocol before sending it to the client. If you wish to save the overhead of your server parsing the output, you may do so by prepending the appropriate HTTP response headers.

To prevent the server from parsing the output of such scripts, the scripts should have names that begin with "nph-". For example, NCSA HTTPD comes with a script called "test-cgi." The same script that talks directly to the browser is named "nph-test-cgi."

The main difference between such scripts and ordinary scripts is in the extra two lines that are prepended to the output. The first is the status-code line and the second is the server: line that specifies the server name and version. You need to look at the draft on HTTP to know all the valid status codes. For example, the nph-test-cgi script returns the headers in Figure 4.
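A script that talks directly to the browser might assemble its reply along these lines. The helper name nph_reply is hypothetical, taking the server name from SERVER_SOFTWARE (Table 1) is an assumption, and the CR-LF line endings follow common HTTP practice rather than anything shown in Figure 4.

```perl
use strict;
use warnings;

# Hypothetical helper: build the complete reply an "nph-" script sends,
# since the server adds no headers of its own.
sub nph_reply {
    my ($status, $type, $body) = @_;
    my $server = $ENV{'SERVER_SOFTWARE'} || 'unknown';   # placeholder value
    return "HTTP/1.0 $status\r\n"
         . "Server: $server\r\n"
         . "Content-type: $type\r\n"
         . "\r\n"
         . $body;
}

print nph_reply('200 OK', 'text/plain', "Hello, directly from the gateway\n");
```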

Customized Responses to Problems

The NCSA HTTPD 1.4 server lets you customize the error message returned. For example, you could customize the returned error message for a "500 Server Error" by calling a script that would present a more informative message. To do this, examine the srm.conf file which contains these lines at the end:

ErrorDocument 302 /cgi-bin/redirect.cgi
ErrorDocument 403 /errors/forbidden.html

This means that the redirect.cgi script will be invoked when a redirect error occurs.

Security

Security is a crucial issue when writing CGI scripts because you are in effect allowing other users to execute programs on your machine based on their inputs. Many of the problems encountered in writing CGI scripts in this respect are similar to those encountered when writing UNIX setuid scripts. Consequently, you should always follow the simple rule: Do not trust the client input at all. For example, do not blindly use the client input to construct commands for the system to execute or supply as input to eval. Do not even print the value input to your script (except during testing, of course) as hackers can use clever sequences to break into the system.
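As one illustration of the "do not trust the client" rule, the sketch below accepts input only if it matches a narrow pattern and then runs the external program without a shell. The helper name safe_keyword, the allowed character set, and the 64-character limit are all assumptions for this example; "echo" stands in for a real search tool.

```perl
use strict;
use warnings;

# Hypothetical helper: accept a keyword only if it consists of letters,
# digits, blanks, "." and "-" and is at most 64 characters long.
# Returns the keyword, or undef if it is rejected.
sub safe_keyword {
    my ($raw) = @_;
    return unless defined $raw;
    return unless $raw =~ /\A([A-Za-z0-9 .\-]{1,64})\z/;
    return $1;
}

my $keyword = safe_keyword('The Fountainhead');
if (defined $keyword) {
    # The list form of system() runs the program directly, so no shell
    # ever interprets the user's text.
    system('echo', 'searching for', $keyword);
}
```

Listing what is allowed, rather than trying to strip out what is dangerous, means that an unanticipated character is rejected by default.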

Further security is possible via the authentication mechanisms provided by most servers, which require the user to key in a username and password. Only if this validates is the user allowed to execute the script. Details on how to configure the server to do this are beyond the scope of this article. Refer to your server manuals for details.

Conclusion

The net is a rich source of CGI information and numerous, freely available programs to ease your job of writing and debugging gateway applications. Refer to the list of web sites in Table 2 for more information on CGI.

Figure 1: Typical Mosaic form.

Figure 2: Forms-based interface of the debugcgi program.

Figure 3: Error messages for server's error_log.

panic: realloc at /usr/local/bin/rfc2html line 93, <RFC> line 1003.
Can't open 1057: No such file or directory
/usr/local/bin/rfc2html did not return a true value at /usr/local/etc/httpd/cgi-bin/rfc2html line 52, <> line 14.
[Mon May 22 09:59:15 1995] httpd: malformed header from script

Figure 4: Headers returned by the nph-test-cgi script.

HTTP/1.0 200 OK
Content-type: text/plain
Server: NCSA/1.3

Example 1: (a) Accessing the contents of QUERY_STRING; (b) reading input via stdin.

(a)
if ($ENV{'REQUEST_METHOD'} eq "GET") {
   $input = $ENV{'QUERY_STRING'};   
}

(b)
if ($ENV{'REQUEST_METHOD'} eq "POST") {
  if (!defined ($ENV{'CONTENT_LENGTH'})) {  
     print "Error: CONTENT_LENGTH not set\n";
     exit;
  }
  read (STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
}

Example 2: Typical calling sequences.

# Using cgi-parse.pl
&PrintHeader;                 # Print just the default, text/html
&PrintHeader ("text/plain");  # Print type to be text/plain
# Redirect request to get new document
&PrintHeader ("http://amadeus.org/Mozarts_Life.html",1);

Table 1: Environment variables used by CGI.

Variable                 Description

HTTP_REFERER             Contains the exact URL from which the
                         script was invoked; for example,
                         http://www.halcyon.com/htbin/browser-survey.
                         In some older versions of browsers,
                         this is called REFERER_URL.
                          
HTTP_USER_AGENT          Gives the name of the browser through
                         which the script was invoked. Using
                         this, one could serve different
                         browsers different documents, one with
                         or without netscapisms.
                         
REMOTE_USER              If authentication is enabled, this
                         returns the name for which the
                         authentication succeeded. The server
                         must support it.
                         
REMOTE_ADDR/REMOTE_HOST  Remote machine making the request. If
                         the hostname is unavailable, only the
                         address is set.
                         
SERVER_PROTOCOL          States the protocol and the version of
                         the protocol being used. Currently
                         HTTP 1.0 and HTTP 0.9 for older
                         servers.
                         
GATEWAY_INTERFACE        Name of the gateway interface being
                         used and the version number (currently
                         CGI 1.1).
                         
AUTH_TYPE                Protocol-specific authentication
                         supported by the server. Currently,
                         the only valid value is "Basic."
                         
SERVER_SOFTWARE          Name of the server: NCSA 1.4, for
                         example, for the NCSA server version
                         1.4.

Table 2: Web sites for information on CGI.

NCSA's documentation on CGI

http://hoohoo.ncsa.uiuc.edu/cgi/intro.html

CGI specification

http://hoohoo.ncsa.uiuc.edu/cgi/interface.html

The HTTP draft

http://info.w3.org/hypertext/WWW/Protocols/HTTP/HTTP2.html

CGI FAQ

http://www.halcyon.com/hedlund/cgi-faq/

Yahoo index for CGI material

http://akebono.stanford.edu/yahoo/Computers/World_Wide_Web/CGI___Common_Gateway_Interface/

Virtual library material on CGI

http://www.charm.net/~web/Vlib/Providers/CGI.html

CGI-related newsgroups

news://comp.infosystems.www.authoring.cgi

Currently available gateways

http://www.w3.org/hypertext/WWW/Tools/Filters.html
http://www.nr.no/demo/gateways.html
http://www.halcyon.com/hedlund/cgi-faq/gateways.html
http://www.cis.ohio-state.edu:80/hypertext/about_this_cobweb.html

Language libraries for decoding forms input and other useful things

http://wsk.eit.com/wsk/dist/doc/libcgi/libcgi.html - C
http://www.bio.cam.ac.uk/web/form.html - Perl
http://www.lbl.gov/~clarsen/projects/htcl/http-proc-args.html - Tcl

Survey of which browsers support which variables

http://www.halcyon.com/htbin/browser-survey

Listing One

<FORM ACTION="http://yourmachine:urport/cgi-bin/testcgi.cgi" METHOD="POST"> 
<H1>Illustration Form</H1> 
 <P ALIGN=JUSTIFY> 
This form is used as an illustration for the article on CGI.
<HR> 
<INPUT TYPE="text" NAME="name" VALUE=""> 
Title<BR> 
<OL> 
<LI> <INPUT TYPE="radio" NAME="keyword" VALUE="author"> 
Author 
<LI> <INPUT TYPE="radio" NAME="keyword" VALUE="title" CHECKED> 
Title. 
</OL> 
<HR> 
<INPUT TYPE="submit" VALUE="Submit Form"> 
Submit Button<BR> 
<INPUT TYPE="reset" VALUE="Clear Values"> 
Reset Button. 
 <P> 
</FORM>

Listing Two

###############################################################################
##                                CGI-PARSE.PL                               ##
## A library to read and parse the input available from forms as per the     ##
## CGI 1.1 specification.                                                    ##
## This code is in the public domain for people to do whatever they wish to  ##
## with it. But, maintain this copyright notice and don't say you wrote it.  ##
## This work is distributed in the hope that it's useful. But, the author is ##
## not liable for any incurred damages, directly or indirectly due to        ##
## the use or inability to use this software.                                ##
###############################################################################
###############################################################################
## CGIGetInput                                                               ##
## This is a small function which decodes the forms input. It looks at the   ##
## REQUEST_METHOD environment variable to decide where to get the input from.##
## The user can invoke this subroutine thus :                                ##
##              &CGIGetInput (*cgi_in);                                      ##
## and the input is returned in an associative array called cgi_in, with the ##
## key being the name of field and its value being the value of the field    ##
## as supplied by user. If the field does not have any input, the entry in   ##
## the associative array will be undefined.                                  ##
###############################################################################
sub CGIGetInput {
    local (*input) = @_;
    local ($buffer, @nv_pairs, $nvp, $key, $keyword);
    # Pick the input source according to the submission method.
    if ($ENV{'REQUEST_METHOD'} eq "GET") {
        $buffer = $ENV{'QUERY_STRING'};
    }
    elsif ($ENV{'REQUEST_METHOD'} eq "POST") {
        # Read exactly CONTENT_LENGTH bytes; the server sends no EOF.
        read (STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
    }
    else {
        return -1;
    }
    @nv_pairs = split (/\&/,$buffer);
    foreach $nvp (0..$#nv_pairs) {
        $nv_pairs[$nvp] =~ tr/+/ /;                 # "+" stands for a blank
        ($key, $keyword) = split (/=/, $nv_pairs[$nvp], 2);
        $key =~ s#%(..)#pack("C",hex($1))#ge;       # %XX stands for one byte
        $keyword =~ s#%(..)#pack("C",hex($1))#ge;
        # Multiple values for one field are joined with a NUL character.
        $input{$key} .= "\0" if (defined ($input{$key}));
        $input{$key} .= $keyword;
    }
    return 1;
}
###############################################################################
## &PrintHeader (type/URL, is_it_a_URL)                                      ##
## This function prints the default header. If a type is specified, that is  ##
## printed, else the default text/html is printed. If the second parameter is##
## 1, then the Location header is printed instead of the text/html header.   ##
##                                                                           ##
## Example invocations :                                                     ##
##      &PrintHeader ("text/plain", 0)                                       ##
##              &PrintHeader ("http://www.halcyon.com/hedlund/cgi-faq/",1)   ##
##              &PrintHeader ("",0)                                          ##
###############################################################################
sub PrintHeader {
    local ($toprint, $url_p) = @_;
    if ($toprint eq "") {
        print "Content-type: text/html\n\n";
    }
    elsif ($url_p) {
        print "Location: $toprint\n\n";
    }
    else {
        print "Content-type: $toprint\n\n";
    }
}
1;

Listing Three

###############################################################################
##                               DEBUGCGI.PL                                 ##
## This is a simple script which sets up a test environment for CGI script   ##
## to be executed and then traps the common errors. The PATH is set to the   ##
## minimal set by most systems, for example. All error messages are trapped  ##
## and made available to the user.                                           ##
##                                                                           ##
## This code is in the public domain for people to do whatever they wish to  ##
## with it. But, maintain this copyright notice and don't say you wrote it.  ##
## This work is distributed in the hope that it's useful. But, the author is ##
## not liable for any incurred damages, directly or indirectly due to        ##
## the use or inability to use this software.                                ##
###############################################################################
$tmpdir = "/tmp/";      # The directory under which the error file will
                # be created.
require "cgi-parse.pl";
%cgi_input = ();
&CGIGetInput(*cgi_input);
$script = $cgi_input{'DebugCgi-ScriptName'};
$method = $cgi_input{'DebugCgi-Method'};
$cmdargs = $cgi_input {'DebugCgi-CmdArgs'};
delete ($cgi_input {'DebugCgi-ScriptName'});
delete ($cgi_input {'DebugCgi-Method'});
delete ($cgi_input {'DebugCgi-CmdArgs'});
$inp = "";
# Encode the input in the form used by HTTP.
foreach $elem (keys %cgi_input) {
    $cgi_input{$elem} =~ s# #+#g;
    $cgi_input{$elem} =~ s#([^+A-Za-z0-9])#sprintf("%%%02x",ord($1))#ge;
    $cgi_input{$elem} =~ s#%3d#=#g;
    $inp .= "$elem=$cgi_input{$elem}&";
}
#Turn off the include path. The script must use its own @INC and environment.
if (! -e $script) {
    &PrintErrHeader;
    print "<B>Script <EM>$script</EM> does not exist</B><BR>";
    &PrintErrTrailer;
    exit (2);
}
if (! -r $script && ! -x $script) {
    &PrintErrHeader;
    print "<B>Script <EM>$script</EM> is not readable/executable by server</B><BR>";
    &PrintErrTrailer;
    exit (2);
}
#Set the request method.
$error_file = $tmpdir.$^T;
$ENV{'REQUEST_METHOD'} = $method;
if ($method eq "GET") {
    $ENV{'QUERY_STRING'} = $inp;
    # Send stderr to the same error file that is examined below.
    open (OUTPUT, "$script $cmdargs 2>$error_file |") ||
        &cry ("unable to pipe script $! \n");
}
elsif ($method eq "POST") {
    $ENV{'CONTENT_LENGTH'} = length($inp);
    open (OUTPUT, "echo \"$inp\" | $script $cmdargs 2>$error_file |") ||
        &cry ("unable to pipe script $! \n");
}
else {
    &PrintHeader;
    print "Unknown method: $method\n";
    exit (3);
}
$_ = <OUTPUT>;
if (!/^Content-type: / && !/^Location: /) {
    if (-s $error_file) {
        open (ERRF, "< $error_file") ||
            &cry ("testcgi.cgi - Unable to open error file $!\n");
        @errors = <ERRF>;
        close (ERRF);
        # PrintErrHeader already prints the Content-type header and the
        # opening <HTML><BODY>, so they are not printed separately here.
        &PrintErrHeader;
        print "<B>Script <EM>$script</EM> has an execution error !!!</B><BR><BR>";
        print "@errors \n";
        &PrintErrTrailer;
        unlink ($error_file);
        exit (4);
    }
    &PrintErrHeader;
    print "The script <EM>$script</EM> has an error :<BR><BR>";
    print "It does not output the Content-type/Location header.<BR>";
    print "Here's what it printed as the first line.\n";
    print "<PRE>\n";
    print;
    print "</PRE>\n";
    &PrintErrTrailer;
    exit (3);
}
$format = m#^Content-type:[ \t]*text/html#;
$_ = <OUTPUT>;
if (!/^$/) {
    &PrintErrHeader;
    print "The script <EM>$script</EM> has an error :<BR><BR>";
    print "The second line it outputs must be a blank, instead I got <PRE>\n";
    print;
    print "</PRE>";
    &PrintErrTrailer;
    exit (3);
}
&PrintHeader;
print "<HTML><BODY><H3>Script <I>$script</I> seems OK !</H3> \n";
print "<P ALIGN=Justify> Here is its output:<BR>\n";
print "<PRE>\n" if (!$format) ;
print $ENV{'PATH_INFO'},"\n";
while (<OUTPUT>) {
    print;
}
print "</PRE>" if (!$format);
print "</BODY></HTML>";
exit (0);
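sub cry {
    local ($message) = @_;
    # Pass explicit arguments: a bare "&PrintHeader;" would hand this
    # routine's @_ (the message) to PrintHeader as the content type.
    &PrintHeader ("", 0);
    print "<HTML><BODY><H2>Debugcgi Error !!</H2>";
    print "DebugCGI encountered an error during execution. The error is: ", $message;
    print "\n</BODY></HTML>";
    exit;
}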
sub PrintErrHeader {
    &PrintHeader;
    print "<HTML><BODY><H3>Script Error !!</H3>";
}
sub PrintErrTrailer {
    print "</BODY></HTML>\n";
}

Listing Four

 
###############################################################################
##                                TESTCGI.PL                                 ##
## This is a script which sets up a test environment for the CGI script      ##
## to be executed and then traps the common errors. For example, PATH is set ##
## to the minimal value used by most systems. All error messages are trapped ##
## and made available to the user, who need not guess why a script failed.   ##
## This code is in the public domain for people to do whatever they wish to  ##
## with it. But, maintain this copyright notice and don't say you wrote it.  ##
## This work is distributed in the hope that it is useful, but the author is ##
## not liable for any damages incurred, directly or indirectly, due to use   ##
## or inability to use this software.                                        ##
###############################################################################
 
$tmpdir = "/tmp/";  # Directory under which the error file will be created.
require "cgi-parse.pl"; 
sub Usage { 
  print "Usage: testcgi [-f filename containing input] -m METHOD scriptname\n";
  print "       where METHOD is GET/POST\n"; 
  exit (0); 
} 
%cgi_input = (); 
&CGIGetInput(*cgi_input); 
&PrintHeader; 
 
$script = $cgi_input{'TestCgi-ScriptName'}; 
$method = $cgi_input{'TestCgi-Method'}; 
delete ($cgi_input {'TestCgi-ScriptName'}); 
delete ($cgi_input {'TestCgi-Method'}); 
 
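# Encode the input in the form used by HTTP (application/x-www-form-urlencoded).
$inp = ""; 
foreach $elem (keys %cgi_input) { 
    $value = $cgi_input{$elem}; 
    # Escape everything except unreserved characters, so that a literal '+',
    # '=' or '&' in a value survives; then turn the escaped spaces into '+'.
    $value =~ s#([^A-Za-z0-9_.-])#sprintf("%%%02X",ord($1))#ge; 
    $value =~ s#%20#+#g; 
    $inp .= "$elem=$value&"; 
} 
chop ($inp) if ($inp ne "");    # Drop the trailing '&'.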
 
#Turn off the include path. The script must use its own @INC and environment. 
@INC=(); 
$ENV{'PATH'} = "/bin:/usr/bin/:/etc:"; 
 
#Set the request method. 
$error_file = $tmpdir.$^T; 
$ENV{'REQUEST_METHOD'} = $method; 
if ($method eq "GET") { 
    $ENV{'QUERY_STRING'} = $inp; 
    # Send stderr to $error_file so that the check below can report it.
    open (OUTPUT,"$script 2>$error_file |") ||
        die "unable to pipe script $! \n";
} 
elsif ($method eq "POST") { 
    $ENV{'CONTENT_LENGTH'} = length($inp); 
    open (OUTPUT,"echo \"$inp\" | $script 2>$error_file |") ||
        die "unable to pipe script $! \n";
} 
else { 
    print "Unknown method: $method\n"; 
    exit (3); 
} 
print "<HTML><BODY>\n"; 
$_ = <OUTPUT>; 
if (!/^Content-type: / && !/^Location: /) { 
    if (-s $error_file) { 
        open (ERRF, "< $error_file") ||
            die "testcgi.cgi - Unable to open error file $!\n";
        # The opening <HTML><BODY> tags have already been printed above.
        @errors = <ERRF>; 
        print "<H3>Script $script has an execution error !!!</H3>\n"; 
        print "@errors \n"; 
        unlink ($error_file); 
        exit (4); 
    } 
    print "<H3>Script $script has an error !!!</H3>\n"; 
    print "It does not output the Content-type/Location header.\n"; 
    exit (3); 
} 
$format = m#^Content-type:[ \t]*text/html#; 
$_ = <OUTPUT>; 
if (!/^$/) { 
    print "Your second line must be a blank\n"; 
    exit (3); 
} 
print "<H3>Script $script Seems OK</H3> \n"; 
print "<P ALIGN=Justify> Here is its output \n"; 
print "<PRE>\n" if (!$format) ; 
while (<OUTPUT>) { 
    print; 
} 
print "</PRE>" if (!$format); 
print "</BODY></HTML>"; 
exit (0);

Listing Five

 
<!-- <FORM ACTION="mailto:brat" METHOD="POST"> --> 
<FORM ACTION="http://yourmachine:urport/cgi-bin/testcgi.cgi" METHOD="POST"> 
<H1>Test CGI Form</H1> 
 <P ALIGN=JUSTIFY> 
This form is used as a front-end to testcgi. 
<HR> 
<INPUT TYPE="text" NAME="TestCgi-ScriptName" VALUE=""> 
Script Name<BR> 
<INPUT TYPE="text" NAME="TestCgi-Method" VALUE="POST"> 
Method<BR> 
<!-- Insert the form to be tested minus the FORM header and trailer --> 
<!-- and the Submit and clear buttons --> 
<HR> 
<INPUT TYPE="submit" VALUE="Submit Form"> 
Submit Button<BR> 
<INPUT TYPE="reset" VALUE="Clear Values"> 
Reset Button. 
 <P> 
</FORM>

Figure 1: Typical Mosaic form.

Figure 2: Forms-based interface of the debugcgi program.

Figure 3: Error messages for server's error_log.

Figure 4: Headers returned by the nph-test-cgi script.

Example 1: (a) Accessing the contents of QUERY_STRING; (b) reading input via stdin.

Example 2: Typical calling sequences.

Table 1: Environment variables used by CGI.

Table 2: Web sites for information on CGI.
