Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

Web Development

sGs: A Simple Gopher Server


If you've followed the explosive growth in network activity, you know that three predominant applications--Gopher, Wide Area Information Servers (WAIS), and the World Wide Web (WWW)--have transformed the use of the Internet.

WAIS, developed by Thinking Machines Corp. (Cambridge, MA) is a client/server full-text search system based on a standard library-science network protocol (ANSI Z39.50--1988). Gopher, developed by the microcomputer laboratory of the University of Minnesota, is a menu-based Campus Wide Information System (CWIS) that simplifies the dissemination of information by presenting a uniform user interface to the campus network (and the Internet as a whole). The World Wide Web, developed at the European Laboratory for Particle Physics (CERN) in Geneva, is a distributed hypertext system designed to provide a way for physicists to collaborate on research in high-energy physics.

Along with networking tools such as ftp, telnet, usenet, and mail, Gopher, WAIS, and WWW each have a place in the overall design of an information infrastructure. In this article, we'll present a simple Gopher server written in Perl and discuss the basic Gopher protocol. We'll also show how this server can be extended and combined with other Internet tools to provide more sophisticated network information systems.

About Gopher

The Gopher protocol is a file-system-based model extended to allow "files" to reside on many computers. Clients (often, but not exclusively, represented as user interfaces) run on a local computer connected to a server which manages the information (the files). The Gopher protocol was designed to be human-readable, so all transactions through the protocol are done using the U.S. ASCII character set and consist of lines separated by carriage return/line feed (CR/LF). The first character on the line is a tag that informs the client that the remaining information on the line is of a particular, well-defined type. The types specified by the original protocol were minimal: just enough to provide a simple interface to a distributed file system. This base set included ASCII files (tag=0), directory listings (tag=1), or searchable indexes (tag=7). Table 1 presents the complete list of types.

Type Description

0

A readable ASCII file.

1

A simple directory listing.

2

A CSO phonebook (a special kind of directory listing).

3

An Error was detected by the server.

4

A BinHexed Macintosh file.

5

A DOS binary archive.

6

A UNIX uuencoded file.

7

A keyword searchable index.

8

A telnet session.

9

A binary file.

The Gopher protocol can be extended by defining additional tags, although 0 through Z have been reserved by the original developers. Gopher+, a new Gopher protocol, extends this idea, passing additional type information both through a filename extension (such as movie.mpg), and by appending information to the original Type-1 response returned by the server.

Type 1: A Directory Listing

Much of the work in Gopher is done through Type 1 responses, which consist of a set of lines, each containing a single character type. This is followed by a tab-separated quadruple (4-tuple) of a string to be displayed as the heading by the client, a selector string that can be returned to the server for subsequent processing, a host name (usually the server), and a TCP port.

Figure 1, which is a typical Type 1 response, shows that an item is an ASCII file.

Figure 1: A Type 1 response.

0About this Gopher  
 Article<tab>0/gopher-data/About<tab>gopher.host<tab>70<CR><LF>

* <tab> = ASCII 9
  <CR>  = ASCII 13
  <LF>  = ASCII 10

The client would display the string following the "0" up to the first tab as a menu item (About this Gopher Article). The remaining information (0/About this GopherArticle<tab>g.host<tab>port<CR><LF>) is used to retrieve the next piece of information. The client parses this substring to get the selector, the host, and the port. In this example, the client makes another connection to the host gopher.host via a socket connection at port 70 and sends the string0/About this GopherArticle<CR><LF> to the Gopher server to request the item referred to in the previous response. All requests from the client are performed in a single TCP/IP connection and are terminated by a CR/LF. The server responds to the request and closes the connection.

Why sGs?

Faced with uncertain commercial-use licensing for the University of Minnesota's Gopher server, we decided to develop a simple prototype server to help bootstrap Lockheed onto the information highway. This code is a modification of the Perl program waismail.pl (written by Jonny Goldman), which is a gateway to the WAIS systems through Internet electronic mail.

In providing a systems-engineering organization with a graphical front end to existing, internally written requirements-traceability software, we saw an opportunity to explore solutions to information processing available on the Internet. We chose to implement a client/server system based on the emerging technologies of WAIS and Gopher.

We leveraged our efforts against other pockets of Internet development existing in the corporation. Known as the "Technology Broker System" (TBS), the aim of our project was to simplify and validate the terminology of the Internet within the Lockheed corporate community. sGs quickly took off inside the corporation because it was highly portable and easy to configure. (See Information Week, June 27, 1994, for more information on Lockheed i the Internet.)

sGs Code

As Listing One illustrates, sGs is an excellent example of a hacker's program. It was my second attempt at a real Perl program, with a little help from "the net," especially Jonny Goldman, who was still actively supporting the Public Domain WAIS software at the time. My first program was a modification of Jonny's public-domain waismail.pl program so that it could be used for handling additions, modifications, and deletions to WAIS databases.

Listing One


#!/usr/local/bin/perl
$program="sGs.pl"; # (a simple Gopher server)
$revision="2.0      ";
#
############################################################################
$maintainer_person="Your Name Here";
$maintainer_address="Your Email Address Here";
®ister();
#############################################################################
#                DESCRIPTION
#  A simple gopher server which handles type 0,1, g, and 7 gopher requests.
#
############################## MAIN #########################################
#
&init_program_vars();

#############################################################################
#               RUN AS A DAEMON
fork && exit;
setpgrp(0,$$);

#############################################################################

&init_socket();

&trap_gophers();

#############################  END ##########################################
sub init_program_vars {
 $host=`hostname`;chop($host);$domain=`domainname`;chop($domain);
 if ($domain) { $thishost="$host.$domain"; }
 else { $thishost=$host; }
 $port="1470";$thisport=$port;$gopher_root="/";
 $wais_op="0";$waisq="waisq";$menutype="d";$cachefile=".cache";
 $maxres = 200;
 $errorlog = "./sGs.err";
 $logfile = "./sGs.log";

 # process the command line
 while (@ARGV) {
  $_=shift @ARGV;
  if (/^-c|^-C|^-l|^-h|^-p|^-d|^-m|^-u|^-v|^-w|^-H/) { #good arguments

   if (/-c/) {
    $c_file=shift @ARGV;
    if ( -T "$c_file" ) { &process_config_file(); }
    else { die "-c: improper filename $c_file\n";}
   }

   if (/^-l/) { $logfile=shift @ARGV;}
   if (/^-p/) { $port=shift  @ARGV;$thisport=$port;}
   if (/^-d/) { $gopher_root=shift  @ARGV;
        if (! -d "$gopher_root"){die "Not a valid directory: $gopher_root\n";}}

   if (/^-m/) {         # menu type (d)ynamic (default)  or (s)tatic
    $menutype=shift @ARGV;
    if (!($menutype eq "s" || $menutype eq "d"))
    {die "-m: bad option $menutype  (use d or s) \n"}
   }

   if (/^-C/) { $cachefile=shift @ARGV;}

   if (/^-u/) {         # setuid to user  (default whoever starts it)
    print "-u option not implemented yet\n";
   }

   if (/^-v/) {&print_version(); die "\n";}
   if (/^-w/) {
    $wais_op="1";
    $waisq = shift @ARGV."/waisq";
    if (! -x "$waisq") {die "$waisq... not executable\n"}
   }

   if (/^-H/) {
    $host=shift @ARGV;$thishost=$host;
   }

   if (/^-h|^-\?/) { &print_help(); die "\n";}
  } else {&print_help(); die "\n";} # bad arguments
 }

 if ($logfile){ #we do this once to make sure we can.
  open (LOG,">>$logfile") || die "can't open logfile: $logfile: $!\n";
  close (LOG);
 }
 open(ELOG,">>$errorlog") || die "can't open error log\n";
 close (ELOG);  #just checking...

 $with_options="-h $thishost -p $port -d $gopher_root -m $menutype -w 
                                                                    $wais_op ";
 $start_mess="$timestamp Starting sGs: $with_options";
 &print_version(); sleep 2;
 print "\n";
 print "$start_mess\n";
 sleep 4;system("clear");print "Welcome to sGs....\n";
 &log_request("sGs Started $with_options\n");
}

#############################################################################
sub process_config_file {
#
# a config file is just a bunch of command line options put into a file
# one line at a time.
# Example:
#   -d /users/gopher/gopher-data
#   -p 1500
#   -w /users/wais/w8b5bio/bin
#   -l /your/gopher/log


 open (CONFIG, "<$c_file") || die "cant open $c_file";
 while (<CONFIG>) {
  @op= split(/\s/, $_);
  $_=shift(@op);
  if (/^-H/) { $host=shift(@op); $thishost=$host;}
  if (/^-l/) { $logfile=shift(@op); }
  if (/^-p/) { $port=shift(@op);$thisport=$port } # port
  if (/^-d/) {          # gopher directory
   $gopher_root=shift(@op);
   if (! -d "$gopher_root" ) { die "Not a valid directory: $gopher_root\n";}
  }
  if (/^-m/) {          # menu type (d)ynamic (default)  or (s)tatic
   $menutype=shift(@op);
   $menutype=shift @ARGV;
   if (!($menutype eq "s" || $menutype eq "d"))
   {die "-m: bad option $menutype  (use d or s) \n"}
  }

  if (/^-C/) { $cachefile=shift(@op); } # default is .cache
  if (/^-u/) {          # setuid to user  (default whoever starts it)
   print "-u option\n";
  }
  if (/^-w/) {          # WAIS SEARCH OPTION default nowais
   $wais_op="1";
   $waisq=shift(@op)."/waisq";
   if (! -x "$waisq") {die "$waisq... not executable\n"}
  }
 }
}

###############################################################################
sub print_version {

 system("clear");
 print "\n\n";
 print "
 ##############################################################################
 #                                        
 #                     sGs                
 #                                        
 #                    Gopher                      
 #               simple   server                  
 #                                        
 #              Version: $revision               
 #                                        
 ################################# For Support ################################
 #                                        
     contact: $maintainer_person
      e-mail: $maintainer_address
 #                                        
 ##########################################################################\n";
     }

#############################################################################
sub print_help {

print "
sGs [-c <configfile>] [-p <port>] [-d <gopher-data-dir>] [-l <logfile>]
    [-m <s || d>] (static or dynamic {default} menus)  [-u <user>]
    [-h|-H|-?|?] (prints this file)  [-v] (prints version)  [-w] (allow WAIS)
    [-C <cachefile>] (when running with static menus.  Default .cache)
    \n";
}

#############################################################################
sub init_socket {

 $AF_INET = 2;
 $SOCK_STREAM = 1;
 $sockaddr = 'S n a4 x8';

($name, $aliases, $proto) = getprotobyname('tcp');
 if ($port !~ /^\d+$/) {
  ($name, $aliases, $proto) = getservbyport($port, 'tcp');
 }
 $this = pack($sockaddr, $AF_INET, $port, "\0\0\0\0");

 select(NS); $| = 1; select(stdout);

 socket(S,$AF_INET, $SOCK_STREAM, $proto) || die "socket: $!";
 bind(S,$this) || die "bind: $!";
 listen(S,5) || die "connect:$!";

 select(S); $| = 1; select(stdout);
 $WNOHANG =1;
}

#######################################################################
sub trap_gophers {

 for($con = 1; ; $con++) {
  ($addr = accept(NS,S)) || die $!;
FORK:
  if (($pid = fork()) != 0) {   # parent
   close(NS);
   while (1) { last if (waitpid(-1,$WNOHANG) < 1);}
  } elsif (defined $pid) {  # child

   ($af,$port,$inetaddr) = unpack($sockaddr,$addr);
   @inetaddr = unpack('C4',$inetaddr);
   while (<NS>) {
    if (! &valid_request($_)) {close(NS);exit(-1);}
    if (/^\r/)            {&log_request("CONNECT\n");&senddir();}
    if (/^1/)             {&senddir();}
    if (/^0|^4|^9|^g|^h/) {&sendfile();}
    if (/^7/) {&wa2go();}
    close(NS);
    exit(0);
   }
  } elsif ($! =~ /No more process/) { #EAGAIN is recoverable
   sleep 2;
   redo FORK;
  } else {          # wierd fork error
   die " could not fork child to handle connection!!!: $!\n";
  }
 }
 close(NS);
}

######################################################################
sub sendfile {
 &log_request("FILE:$request");
 open(REPLY, "<$gopher_root/$request");
 while (<REPLY>){send(NS,"$_",0);}
}

######################################################################
sub senddir {           #NEED TO PUT IN A FLAG FOR STATIC/DYNAMIC
 &log_request("DIR:$request");

 if ($menutype eq "d") {
  open(REPLY, "ls -a1 '$gopher_root/$request' |); 
  while (<REPLY>){
   chop $_;
   $file= $_;
   if (/^\./) { &process_link($_);}
   else {
    $type="0" if -T "$gopher_root/$request/$file";
    $type="9" if -B "$gopher_root/$request/$file";  
    $type="1" if -d "$gopher_root/$request/$file";
    $type="7" if "$gopher_root/$request/$file" =~/\.src$/;
    $type="g" if "$gopher_root/$request/$file" =~/\.gif$/;
    $type="4" if "$gopher_root/$request/$file" =~/\.hqx$/;
    $type="h" if "$gopher_root/$request/$file" =~/\.html$/;

    if ($type == 0 || $type == 1 || $type eq "g" || $type eq "9" || 
                                                $type eq "4" || $type eq "h") {
     send(NS,"$type$file\t$type$request/$file\t$thishost\t$thisport\r\n",0);
    }
    $waissourcedir = ""; $ENV{'WAISCOMMONSOURCEDIR'} = $waissourcedir;

    if ($type == 7 && $wais_op) {
     $waissourcedir = "$gopher_root/$request"; #chop $waissourcedir;
     $ENV{'WAISCOMMONSOURCEDIR'} = $waissourcedir;
     send(NS,"$type$file\t$type::search::$waissourcedir::$file::
                                               \t$thishost\t$thisport\r\n",0); 
    }
   }
  }
  send(NS,".\r\n",0);
 } else {           #menutype is static
  open (CACHE, "< $gopher_root/$request/$cachefile") || print "error opening 
                                                              $cachefile $!\n";
  while (<CACHE>){send(NS,"$_",0); }
 }
}

############################################################################
# do a WAIS search
sub wa2go { #Modified from Jonny Goldman's waismail.pl <[email protected]>
 $tmpfile = "/tmp/sGs.$$";
 $sfile   = "sGs.$$.src";
 $outfile = "/tmp/sGs.out.$$";
 $errfile = "/tmp/sGs.err.$$";
 $goph_string=$_;
 ($gophertype, $action, $wais_src_dir, $source, @words) = split(/::/,$_);
 if (/^maxres (\d+)/) { $maxres = $1;}
 
 if (/^7::search|^7::Search|^7::SEARCH/) {
  ($gophertype, $action, $wais_src_dir, $source, @words) = split(/::/,$_);
  $search=1;
  @sources=split(".src",$source);
  $ENV{'WAISCOMMONSOURCEDIR'} = $wais_src_dir;
  $maxres = 200;
  $waissourcedir=$wais_src_dir;
  &dosearch();
 }

 if (/^7::retrieve|^7::Retrieve|^7::RETRIEVE|^[ \t]{0,}DocID: /) {
  ($gophertype, $action, $docid) = split(/::/,$_);
  $retrieve = 1; $indocid = 1; chop($docid); chop($docid);
  &log_request("RETRIEVING: $docid\n");
 }
 
 if ($indocid == 1) {
  $indocid = 0;
  &doretrieve();
 }

 open(RESPONSE,"<$outfile");
 while (<RESPONSE>){
  if ($retrieve) {
   send(NS,"$_",0);
  }

  if ($search) {
   $/ = "";         #paragraph mode
   ($result,$heading,$DOCID) = split(/\n/,$_);
   if ($heading =~/Headline/){

    if ($DOCID =~/GIF/) {
     send(NS,"g$heading\t7::retrieve::$DOCID\t$thishost\t$thisport\r\n",0);
    }
    else {
     send(NS,"0$heading\t7::retrieve::$DOCID\t$thishost\t$thisport\r\n",0);
    }
   }
  }
 }
 send(NS,".\r\n",0);

 unlink $outfile;
 unlink $tmpfile;
 unlink $errfile;
 unlink $sfile;
}

###############################################################################
sub dosearch {
 foreach $source (@sources) {
  if(!(-f "$waissourcedir/$source.src")) {
   &logerror("could not find source: $waissourcedir/$source.src");
  }
 }

 open(TMP, ">$tmpfile");
 printf TMP "(:question :version  2\n :seed-words \"";
 foreach $w (@words) { printf TMP "$w ";};
 printf TMP "\"\n :relevant-documents\n ( ";

 if ($relevant) {
  foreach $rel (@reldocs) {
   $_ = $rel;
   /@/ &&  ($_ = $`) && (/:/) && ($id = $`) && ($db = $');
   printf TMP "\n  (:document-id \n   :document \n   (:document \n    
                                                                  :doc-id \n";
   printf TMP "     (:doc-id \n      :original-database %s \n      
                                                   :original-local-id %s\n)\n";
   &stringtoany($db), &stringtoany($id);
   printf TMP "    :source (:source-id :filename \"$source.src\" )\n";
   printf TMP "    ) )\n";
  }
 }

 printf TMP " )\n";
 printf TMP " :sourcepath \"$waissourcedir/:\" \n";
 printf TMP " :sources (\n";
 
 foreach $source (@sources) {
  printf TMP "  (:source-id :filename \"$source.src\" )\n";
 }

 printf TMP " )\n";
 printf TMP " :maximum-results %d )\n", $maxres;
 close(TMP);
 system("cp $tmpfile /tmp/TESTSEARCH");
 &log_request("WAISSEARCH: @sources, words: @words");

 if ($relevant) {
  foreach $rel (@reldocs) {
   $_ = $rel;
   { &log_request("RelDocID: \"$rel\" ");}
  }
 }
 open (OUT, ">>$outfile");
 printf OUT "Searching: ";
 foreach $source (@sources) {
  printf OUT "$source ";
 }

 printf OUT "\nKeywords: ";
 foreach $w (@words) { printf OUT "$w "; };
 if ($relevant) {
  foreach $rel (@reldocs) {
   $_ = $rel;
   { printf OUT "\nRelDocID: \"$rel\"";}
  }
 }
 printf OUT "\n";
 system("$waisq -f $tmpfile -m $maxres -g >> /dev/null 2> $errfile");
 open(ERR, "$errfile");

 while (<ERR>) {
  if (/Connect to socket did not work:/) {
   &log_request("Error Searching @sources for @inetaddr: Bad connect 
                                                              (source down?)");
   &log_request("Error: $_");
   printf OUT "\n**** Error Searching @sources: not responding ****\n";
   printf OUT "\tPlease send mail to the maintainer.\n";
  }
 }
 close(ERR);
 #unlink($errfile);
 open(TMP, "$tmpfile");
 $inres = 0;

 while(<TMP>) {
  /:result-doc/ && ($inres = 1);
  if ($inres == 1) {
   /:score\s+(\d+)/ && ($score = $1);
   ((/:headline "(.*)"$/ && ($headline = $1)) ||
    (/:headline "(.*)$/ && ($headline = $1))); # one little "" any 
                                                            my formatter dies.
   /:number-of-bytes\s+(\d+)/ && ($bytes = $1);
   /:type "(.*)"/ && ($type = $1);
   /:filename "(.*)"/ && ($sourcename = $1);
   /:original-local-id "(.*)"/ && ($docid = $1);
   /:original-local-id  (\(:any.*\))/ && ($docid = &anytostring($1));
   /:original-database "(.*)"/ && ($database = $1);
   /:original-database  (\(:any.*\))/ && ($database = &anytostring($1));
   /:date "(\d+)"/ && ($date = $1, &docdone);
  }
 }
 printf OUT 
"\n______________________________________________________________________\n\n";
 close(TMP);
 close(OUT);
 $relevant = ''; @reldocs = '';
# unlink($tmpfile);
}

##############################################################################
sub doretrieve {
 $port = "0";
 $_ = $docid;
 s/^DocID: //g;
 if (/%/) {
  $docid = $`;
  $type = $';
  #print "in doretrieve type = :$type:...\n";
 }
 $_ = $docid;
 /:/ && ($id = $`) && ($db = $');
 /@/ &&  ($_ = $`) && (/:/) && ($id = $`) && ($db = $');
 $_ = $docid;
 /@/ &&  ($_ = $') && (/:/) && ($host = $`) && ($port = $');
 open(SRC, ">/tmp/$sfile");
 printf SRC "(:source :version 3 \n";
 printf SRC " :database-name \"$db\"\n";
 if ($port != 0) {
  printf SRC " :ip-name \"$host\" :tcp-port $port\n";
 }
 printf SRC ")\n";
 close(SRC);
 open(TMP, ">$tmpfile");
 printf TMP "(:question :version 2 :result-documents \n";
 printf TMP "  ( (:document-id :document (:document :doc-id\n";
 printf TMP "    (:doc-id :original-database %s\n", &stringtoany($db);
 printf TMP "     :original-local-id %s )\n", &stringtoany($id);
 printf TMP "     :number-of-bytes -1 :type \"$type\"\n";
 printf TMP "     :source (:source-id :filename \"$sfile\") ) ) ) )\n";
 close(TMP);
 $timestamp = &date() . " " . &time() . ":";
 &log_request("WAISSEND:\"$docid%%$type\" to @inetaddr\n");
 open(OUT, ">>$outfile");
# printf OUT "______________________________________________________________________\n" if 
 ! ($type=~/GIF/);
 close(OUT);
 $docid = $docid."%".$type;
 if ($type eq "" || $type eq "TEXT" || $type eq " TEXT" ||$type eq "WSRC" 
      ||$type eq "GIF" || $type eq 
     "HTML" || $type eq "html") {
  $exres = system("$waisq -s /tmp/ -f $tmpfile -v 1 >> $outfile 2> $errfile");
 }
 else {
  $exres = system("($waisq -s /tmp/ -f $tmpfile -v 1 | uuencode WAIS.res >> 
                                                       $outfile) 2> $errfile");
 }
 unlink("/tmp/$sfile");
 open(OUT, ">>$outfile");
 open(ERR, "$errfile");
 while (<ERR>) {
  if (/Missing DocID in request|Could not find Source/) {
   s/done//g;
   printf OUT "Error getting document:\n $_\n";
   printf OUT "(This is usually a bad DocID,\n or the server has deleted the 
                                         document since you ran the search)\n";
   $timestamp = &date() . " " . &time() . ":";
   &log_request("Error Sending \"%s\" to @inetaddr: Bad DocID,\n $docid");
  }
 }
 close(ERR);
 #unlink($errfile);
# printf OUT "______________________________________________________________________\n" if 
 ! ($type=~/GIF/);
 close(OUT);
}

############################################################################
sub docdone {
 open(SRC, "$waissourcedir/$sourcename");
 while(<SRC>) {
  /:ip-name[ \t]{0,}"(.*)"/ && ($ipname = $1);
  /:database-name[ \t]{0,}"(.*)"/ && ($databasename = $1);
  /:tcp-port[ \t]{0,}"(.*)"/ && ($tcpport = $1);
  /:tcp-port[ \t]{0,}(\d+)/ && ($tcpport = $1);
  /:maintainer[ \t]{0,}"(.*)"/ && ($maintainer = $1);
 }
 close(SRC);
 select(OUT); $num++;
 printf "\nResult #%2d Score:%4d lines:%3d bytes:%7d Date:%6d Type: %s\n", 
     $num,  $score,  $lines, 
     $bytes, $date, $type;
 printf "Headline: %s\n", $headline;
 printf "DocID: %s:%s", $docid, $database;
 if ($tcpport != 0) { printf "@%s:%d", $ipname, $tcpport; }
 printf "%%$type\n";
 $score = $headline = $lines = $bytes = $type = $date = '';
 select STDERR;
}

############################################################################
# a couple of WAIS utility functions
sub anytostring {
 local($any) = pop(@_);
 $res = '';
 $_ = $any;
 if (/:bytes            #\((.*)\)(.*)\)/ && ($string = $1)) {
     @chars = split(' ', $string);
     foreach $c (@chars) {
      $res = $res.sprintf("%c", $c);
     }
    }
 $res;
}

sub stringtoany {
 local($str) = pop(@_);
 $len = length($str);
 $res = sprintf("(:any  :size  %d :bytes #(  ", $len);
     for ($i = 0; $i < $len; $i++) {
      $res = $res.sprintf("%d  ", ord(substr($str,$i,1)));
     }
 $res = $res.")  )";
 $res;
}
############################################################################
# error logging
sub logerror {
 $timestamp = &date() . " " . &time() . ":";
 open(ELOG,">>$errorlog") || die "can't open error log\n";
 printf ELOG "$timestamp @_\n"; close (ELOG);
 system("echo \"$timestamp @_\n\" | $maintainer_address");
}
############################################################################
# date and time functions
sub date {
 local ($when) = `date '+%m/%d/%y'`; chop $when;
 return $when;
}

sub time {
 local ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst);
 ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
 $mon = $mon + 1;
 return  sprintf("%02d:%02d:%02d", $hour,$min,$sec);
}

###########################################################################
# general logging
sub log_request {
 local ($request)=@_;
 $timestamp = &date() . " " . &time();
 open (LOG,">>$logfile") || &logerror("can't open logfile: $logfile: $!\n");
 printf LOG 
     "%s::%s.%s.%s.%s::%s",
     $timestamp,
     @inetaddr[0],@inetaddr[1],@inetaddr[2],@inetaddr[3],$request;
 close (LOG);
}

###########################################################################
sub process_link {
 local ($lname,$ltype,$lport,$lpath,$lhost);
 if (-T "$gopher_root/$request/$file") {
  open(LINK,"< $gopher_root/$request/$file") || die "cant open 
                                            $gopher_root/$request/$file: $!\n";
  while (<LINK>) {
   @L=split("="); chop(@L);#print " @L\n";
   if(/^Name|^Type|^Port|^Path|^Host/) {
    if (/^Name/) {$lname=@L[1]}
    if (/^Type/) {$ltype=@L[1]}
    if (/^Port/) {$lport=@L[1]}
    if (/^Path/) {$lpath=@L[1]}
    if (/^Host/) {$lhost=@L[1]}
   }
   else { return }      # funny garbage in link file
  }

  if ($ltype == 0 || $ltype == 1 || $ltype eq "g" ) {
   print "$ltype$lname\t$lpath\t$lhost\t$lport\r\n";
   send(NS,"$ltype$lname\t$lpath\t$lhost\t$lport\r\n",0);
  }
  if ($type == 7 && $wais_op) {
   $waissourcedir = "$gopher_root/@tmp"; #chop $waissourcedir;
   $ENV{'WAISCOMMONSOURCEDIR'} = $waissourcedir;
   send(NS,"$ltype/$lname\t$ltype::search::$waissourcedir::$lpath::
                                                      \t$lhost\t$lport\r\n",0);
  }
 }
}
###########################################################################
# validity check
sub valid_request {
 $request=$_;chop $request;chop $request;
 substr($request,0,1)="";
 if ($request=~/\.\./) { return 0;}
 else { return 1}
}

##############################################################################
# registration functions
sub register {
 $c="./.sgsc";
 if  ( ! -r $c) { &cop();®();&sen();return}
 if (&bf() == 1) {&sen(); &cop(); ®();&sen();return}
}

sub cop {
 print"
                   sGs.pl

                (C) COPYRIGHT
               1993 Bob Kaehms
                [email protected]

This software is provided free, AS IS, and neither the author, nor any person
or entity associated with the author in producing this software is responsible
for the condition of the software, it's use, or any damage to a computer or
the information therein, from using this software.

In short, LET THE USER BEWARE. If you plan on running this software you should
be familar with TCP/IP and network security.

You may do what you want with the program as long as the original copyright
and the following notice remain attached.

      Press <RETURN> when you've read and accept the above caveate";
 $OK=<STDIN>; system("clear");

 print"

 A. BECAUSE THE PROGRAM IS AVAILABLE FREE OF CHARGE, THERE IS NO WARRANTY
    FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW.  EXCEPT WHEN
    OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDER AND/OR OTHER PARTIES
    PROVIDE THE PROGRAM \"AS IS\" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
    OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
    MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  THE ENTIRE RISK AS
    TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.  SHOULD THE
    PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF NECESSARY SERVICING,
    REPAIR OR CORRECTION.

 B. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
    WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
    REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
    INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES
    ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT
    NOT LIMITED TO THE LOSS OF DATA BEING RENDERED INACCURATE OR
    LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO
    OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS
    BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

    Press RETURN if you've read and accept the above.Else Cnt C";
 $OK=<STDIN>; system("clear");
}

sub reg {
 $hdr="/tmp/HDR";
 local($d) = `date`;
 local($h) = `uname -a`;local($da)=`domainname`;local ($w)=`whoami`;
 open(H,"> $hdr");printf H "From: $w@$hTo: $$maintainer_address\nSubject: 
                                                                        Gopher 
Registration\n\n";close H;
 print "Please enter the following:\n\n";
 print "NAME    COMPANY               PHONE         e-mail address\n";
 $user_contact = <STDIN>;
 print "thanks.....\n";
 open(F,">$c");
printf F " sGs.pl\n $revision\n $d HARDWARE\n $h $da $w CONTACT PERSON\n 
$user_contact";
 close F;
}

sub sen {
 system("cat $hdr $c |  /usr/lib/sendmail '$maintainer_address'");
}

###########################################################################
# special check to see if we've registered once
#
sub bf {
 local ($h) = `uname -a`;local ($da)=`domainname`;local ($w)=`whoami`;
 $/ = "";           #Enable paragraph mode
 open (F,"<$c");
 while (<F>){
  if (/$h/ && /$da/ & /$w/){$/ = "\n";close F;return "0";}
  close F;$/ = "\n";return "1"}
}

sGs is built upon the simple client/server socket example in Programming Perl, by Larry Wall (O'Reilly & Associates, 1990). As you examine the code, notice that the subroutine initialize_socket() looks similar to the sample server code of his book. Wall notes how concisely the socket, bind, listen, and accept calls can be written in Perl; see Example 1. For more information on Perl, we recommend both Wall's book and "Networking with Perl," by Oliver Sharp (DDJ, October 1993). The client/server example in Wall's book provides a firm foundation on how sockets work. In Example 1, the subroutine creates and initializes a socket to which clients can connect.

Example 1: This code creates and initializes a socket to which clients can connect. The variable $WNOHANG is used to collect child processes that would otherwise result in "zombie'' processes.

sub init_socket {

 $AF_INET = 2;
 $SOCK_STREAM = 1;
 $sockaddr = 'S n a4 x8';
($name, $aliases, $proto) = getprotobyname('tcp');
 if ($port !~ /^\d+$/) {
  ($name, $aliases, $proto) = getservbyport($port, 'tcp');
 }
 $this = pack($sockaddr, $AF_INET, $port, "\0\0\0\0");

 select(NS); $| = 1; select(stdout);

 socket(S,$AF_INET, $SOCK_STREAM, $proto) || die "socket: $!";
 bind(S,$this) || die "bind: $!";
 listen(S,5) || die "connect:$!";

 select(S); $| = 1; select(stdout);
 $WNOHANG =1;
}

We extended Wall's example in two ways: First, instead of echoing the lines sent by the client as Wall did, the server interprets the string (up to the CR/LF) and acts upon it in some way. Second, we reap the child processes spawned to handle each request that would otherwise become "zombies."

All of this fits neatly into the trap_gophers subroutine which forms the core event loop in the server; see Example 2. trap_gopher simply sits and waits for connections on the open socket. (Imagine a cat sitting patiently on the back lawn waiting for gophers. When it sees one, it has a kitten that chases and eats the gopher. When the kitten is finished, the parent cat consumes the kitten.)

Example 2: The trap_gophers subroutine forms the core event loop in the server.

sub trap_gophers {

 for($con = 1; ; $con++) {
  ($addr = accept(NS,S)) || die $!;
FORK:
  if (($pid = fork()) != 0) {   # parent
   close(NS);
   while (1) { last if (waitpid(-1,$WNOHANG) < 1);}
  } elsif (defined $pid) {  # child

   ($af,$port,$inetaddr) = unpack($sockaddr,$addr);
   @inetaddr = unpack('C4',$inetaddr);
   while (<NS>) {
    if (! &valid_request($_)) {close(NS);exit(-1);}
    if (/^\r/)    {&log_request("CONNECT\n");&senddir();}
    if (/^1/)     {&senddir();}
    if (/^0|^4|^9|^g|^h/) {&sendfile();}
    if (/^7/) {&wa2go();}

    close(NS);
    exit(0);
   }
  } elsif ($! =~ /No more process/) { #EAGAIN is recoverable
   sleep 2;
   redo FORK;
  } else {  # weird fork error
   die " could not fork child to handle connection!!!: $!\n";
  }
 }
 close(NS);
}

This completes the basic mechanics of the server. For requests made to the server by the client, we'll start with the simplest request: a "Type 0" request which is used to retrieve a plain text (ASCII) file. The section of Example 2 beginning with while (<NS>) { _ illustrates how this is implemented. In effect, this code works as follows:

  1. Wait for input on the socket.
  2. Check to make sure the request is valid. This has the side-effect of removing the first character from the input line and putting the result in the variable $request.
  3. Check to see if the first character of the line is the number 0.
  4. If so, execute the subroutine sendfile.

Perl provides an easy mechanism for handling lines. We have created a stream called "NS" from the socket using the accept call and processed it one record at a time with the while(<NS>) call. By default, the end of a record in Perl is CR/LF. Conveniently enough, this is the end of a Gopher request. If this subroutine fails to recognize a valid type--the first character of the line--it simply closes the socket.

This routine does some simple preprocessing of the request string, which is passed to the subroutine through the Perl variable $request. The subroutine send-file (see Example 3) prepends the Gopher root-level data directory to the request, opens the file, and sends it back to the client one line at a time. If the file doesn't exist, the connection simply closes.

Example 3: The subroutine sendfile prepends the Gopher root-level data directory to the request, opens the file, and sends it back to the client one line at a time.

sub sendfile {
 &log_request("FILE:$request");
 open(REPLY, "<$gopher_root/$request");
 while (<REPLY>){send(NS,"$_",0);}
}

The Type 1 request is similar. We prepend the root-level Gopher directory to the request. If the directory is dynamic, we create a directory listing using the UNIX ls command and process the output, tagging the resulting lines with the base Gopher types and processing any link files that might be found. If the directory is static, we just open the "cachefile" and send it back to the client; see Example 4. Note that the server sets the type using filename extensions and the standard UNIX tests for binary, text, and directory.

Example 4: Processing a Type 1 request.

sub senddir {
 &log_request("DIR:$request");

 if ($menutype eq "d") {
  open(REPLY, "ls -a1 '$gopher_root/$request' |);
  while (<REPLY>){
   chop $_;
   $file= $_;
   if (/^\./) { &process_link($_);}
   else {
    $type="0" if -T "$gopher_root/$request/$file";
    $type="9" if -B "$gopher_root/$request/$file";
    $type="1" if -d "$gopher_root/$request/$file";
    $type="7" if "$gopher_root/$request/$file" =~/\.src$/;
    $type="g" if "$gopher_root/$request/$file" =~/\.gif$/;
    $type="4" if "$gopher_root/$request/$file" =~/\.hqx$/;
    $type="h" if "$gopher_root/$request/$file" =~/\.html$/;
    if ($type == 0 || $type == 1 || $type eq "g" || $type eq "9" || 
                                                    $type eq "4" ||
                                                    $type eq "h") {
     send(NS,"$type$file\t$type$request/$file\t$thishost\t$thisport\r\n",0);
    }
    $waissourcedir = ""; $ENV{'WAISCOMMONSOURCEDIR'} = $waissourcedir;

    if ($type == 7 && $wais_op) {
     $waissourcedir = "$gopher_root/$request"; #chop $waissourcedir;
     $ENV{'WAISCOMMONSOURCEDIR'} = $waissourcedir;
     send(NS,"$type$file\t$type::search::$waissourcedir::$file::\t$thishost
                                                        \t$thisport\r\n",0);
    }
   }
  }
  send(NS,".\r\n",0);
 } else {           #menutype is static
  open (CACHE, "< $gopher_root/$request/$cachefile") || print "error opening
                                                           $cachefile $!\n";
  while (<CACHE>){send(NS,"$_",0); }
 }
}

In the Type 7 request section, the server implements a gateway to a search tool, in this case, WAIS. The subroutine wa2go (see Example 5) serves two functions: First, it takes a list of keywords, which it hands to the WAIS server. It receives the list of results from the WAIS server and rewrites them to conform to the Gopher protocol before returning them to the client. Second, it receives a result from a previous search, rewrites it to the WAIS protocol, and passes that back to the WAIS server. It receives the result from the WAIS server and passes that back to the client. It would be relatively simple to replace the WAIS gateway with a gateway to some other search tool (like UNIX grep, for instance).

Example 5: The subroutine ws2go takes a list of keywords, which it hands to the WAIS server and receives a result from a previous search, rewrites it to the WAIS protocol, and passes that back to the WAIS server.

# do a WAIS search
sub wa2go { #Modified from Jonny Goldman's waismail.pl <[email protected]>
 [...]
 if (/^7::search|^7::Search|^7::SEARCH/) {
  [...]
  &dosearch();
 }
 if (/^7::retrieve|^7::Retrieve|^7::RETRIEVE|^[ \t]{0,}DocID: /) {
  [...]
  &doretrieve();
 }
 [...]
}

There are several ways that sGs can be configured. Modifications can be made directly to the code inside the subroutine init_program_vars. Configurable parameters can also be passed to the program as command-line options or through a configuration file. Perl allows for a simple syntax for processing the command line. In Example 6, @ARGV is an array containing the command line. Elements from the array are shifted into the stream variable $_ and processed until the end of the array is reached. The variables themselves can be easily understood by looking at a configuration file such as Example 7, where the server would be started either from a command line, or through an rc file as: sGs -c sGs.cnf.

Example 6: @ARGV is an array containing the command line.

while (@ARGV) {
  $_=shift @ARGV;

  if (/-c/) {
   $c_file=shift @ARGV;
   [...]
  }

  if (/^-l/) { $logfile=shift @ARGV;}
  [...]

Example 7: A configuration file.

# sGs.cnf
# port to run on.
# NOTE: official gopher port is 70, but requires root privilege

-p 1492

# gopher root level data dir
# this is where you put the data you want to publish

-d /my-gopher/data

# where you keep your wais binaries (remove or comment the following
# line if no wais)
-w /wais/bin

# gopher log file
-l ./sGs.log

# how to set up menus... if "-m d" then they are dynamic
#                     if "-m s" then they are static, and cache files must
#                     be created manually. sGsCache.pl can be used

#      NOTE:
#      static menus allow finer control over what a gopher menu will look
#      like, more control and security, but in general may be harder to
#      maintain. Properly constructed gopher filesystems on a unix system
#      utilizing long filenames and whitespace allow for very readable
#      client menus.
-m d

# the hostname you wish to have created in your gopher replies
# normally the default is fine "`hostname`.`domainname`".
-H gopher.chaser.com

If the server is running with dynamic menuing, link files can also be placed in any directory below Gopher root and are identified as "dot" files. The server assumes that any such dot file is a link and tries to process it accordingly. If it finds any data within the file that it doesn't understand, it exits the subroutine, ignoring the file; see Example 8. These link files are what makes Gopher so powerful and allow a Gopher administrator to tunnel far and wide throughout the Internet. Example 9 is an example link file conforming to sGs's format.

Example 8: Gopher link files.

sub process_link {
[...]
 if (-T "$gopher_root/$request/$file") {
  open(LINK,"< $gopher_root/$request/$file") || die "can't...";
  while (<LINK>) {
  [...]
    if(/^Name|^Type|^Port|^Path|^Host/) {
    [...]
   }
   else { return }              # funny garbage in link file
  }

Example 9: An example link file that conforms to sGs's format.

Name=The Original Gopher at UofM
Type=1
Port=70
Path=1/
Host=boombox.micro.umn.edu

Provided that you were "on the net," putting such a link into your Gopher data directory would let you jump to the place where it all started, where you would find plenty of additional information and support.

Conclusion

By utilizing this type of server, an organization can utilize other freely available software--in particular, clients for most hardware platforms. Table 2 is a short list of what's currently available.

Platform Application Author Comments

PC

gopher_p.exe

Martyn Hampson

Gopher+ client

PC

gophbook.exe

Kevin Gamiel

Cute client that uses a book metaphor for displaying data.

Mac

GopherApp

Don Gilbert

Uses MacApp extensible Macintosh programming framework. Highly reliable. Some problems with 32K text-file limits.

UNIX

Xgopher

Don Gilbert

 

Also, keep in mind that the official Gopher software is always available from boombox.micro.umn.edu and should not be overlooked when considering an information architecture. The Gopher team provides excellent support and constant improvements to their code, such as the three-dimensional-space user interface for Gopher that's currently under development.

As a development platform, Perl provides a rich, portable language for developing network-based services. Organizations need access to the huge amounts of information available on the net, and they need to provide customized servers to their users. sGs is a marriage of two useful tools that provide such a solution.

References

Wall, Larry and Randall Schwartz, Programming Perl. O'Reilly & Associates, 1992.

Anklesaria, A. et al. RFC 1436 "The Internet Gopher Protocol" available from ftp.internic.net.

"Guide to Network Resource Tools." EARN Association, May 20, 1994, Document Number: 3.0. Available in electronic form from [email protected] (or [email protected]). Send the command: GET filename where the filename is either: NETTOOLS PS or NETTOOLS TXT.

Kahle, Brewster. "Wide Area Information Servers." April, 1991. One-page overview of Internet release of WAIS. Available via anonymous ftp: /pub/wais/wais-discussion/[email protected] or WAIS server wais-discussion-archive.src.

Sharp, Oliver "Networking With Perl." Dr. Dobb's Journal (October 1993).


The authors are programmers in the San Francisco Bay Area. Bob can be contacted at [email protected] and Jonny at [email protected].


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.