Controlling Background Processes Under Unix

Source Code Accompanies This Article.


DEC90: CONTROLLING BACKGROUND PROCESSES UNDER UNIX

Here's a system that "user-izes" process management

Barr works for Schering-Plough Research, a pharmaceutical company. He can be reached at 60 Orange Street, Bloomfield, NJ 07003.


I undertake scientific computing using a number of networked workstations, compute servers, and a supercomputer. Typically, problems are set up and the results are analyzed on the workstations, while the actual number crunching is performed on the more powerful machines. The approach is straightforward to describe, but the tools necessary to manage jobs that originate from a central workstation and then run on a variety of remote hosts over a network are not generally available.

For instance, with the standard available Unix System V tools, you cannot ask user-oriented questions, such as "What applications and datasets are running on any remote host?" "Which applications are pending?" or "What is the status of a certain job?" without resorting to Unix commands that oftentimes do not look like the user's actual query.

This article presents a system designed to "user-ize" the management of background processes that run both locally and across a network. The system, which I call "shepard," includes a standardized interface that uses menus where appropriate to reduce process management to the essentials: tasks, executable scripts, dataset names, and remote host names. The specifics of executing, monitoring, and retrieving programs that run remotely are hidden in shell scripts. The bulk of the system is written in Bourne shell script and is reasonably portable.

System Overview

The basic idea behind this system is to use a series of scripts to control all activities that occur locally or on a remote host, and to do so in a simple, straightforward, user-oriented manner. Integral to this activity is a queuing system that handles job control, simplifies the selection of hosts and programs, invokes the programs, moves data files over the network, and records all activities. The core of the system consists of five Bourne shell scripts -- run, shepard, shepard_queue, shepard_exec, and run_update -- and a number of application-specific control files that contain either tables or control scripts. When the system runs, it generates status files showing the jobs that are waiting, running, finished, and restartable on each remote host. The system also generates a file on the origin machine to show the status and location of all jobs.

I have commented the code extensively to describe its features and operation. Descriptions of the scripts are presented here to give you a general understanding of the role of each script in the overall system. For specifics, refer to the listings.

The script run is a menu-driven task manager that queries for a task, a script, a dataset, and the host. run then carries out the requested task by invoking shepard on the selected host through a remote shell (rsh) command. run maintains a list of jobs originating from the origin machine as well as a log file, and serves as a hub that coordinates jobs run on several machines. Through run, the user knows which programs are using which datasets and where those programs are running. The tasks managed by run are simple and include starting and stopping the execution of programs, determining the status of jobs running on a specific remote host, and probing running jobs for intermediate results or status information. run performs all of its work, including the generation of most menus, through the other scripts across the network by use of rsh calls.
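The local-versus-remote choice run makes before every shepard call boils down to a single hostname test. Here is a minimal sketch of that pattern; the helper name dispatch and the echoed command strings are inventions of this example (the listings inline the test rather than wrapping it in a function):

```shell
#!/bin/sh
# Echo the command a task would be dispatched as, mirroring the
# pattern run uses: call shepard directly when the selected host is
# this machine, otherwise go through a remote shell (rsh).
# dispatch is an illustrative name, not a script from the system.
dispatch() {
    host=$1; shift
    if test "`hostname`" = "$host"; then
        echo "shepard $*"
    else
        echo "rsh $host shepard $*"
    fi
}
```

Because the same test guards every call, the scripts behave identically whether the selected host is the origin workstation itself or a remote compute server.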

The types of tasks that the shepard system currently performs include:

  • Execute a job, locally or on a remote host over the network
  • Monitor the remote host
  • Probe a running job for intermediate results
  • Kill a running job with extreme prejudice
  • Halt a running job in a controlled manner
  • Restart a job
  • List running, waiting (enqueued), restartable, and finished jobs
  • List general and error log files
The script shepard performs all of the process management for a specific host. shepard takes as its arguments the task, the script, the dataset, the origin machine, and the dataset directory. Execution is handled by shepard_queue. Updating the status file on the origin machine is performed by run_update; all other activities are conducted directly in shepard. Lists of running and waiting jobs, as well as an execution log, are maintained and displayed when requested by run. shepard also adjusts the control files and updates the status file on the origin machine when a job is completed.

The script shepard_queue maintains a FIFO queuing system that regulates both the numbers and the types of jobs allowed to execute concurrently. This approach is a good way to maintain a balance between throughput and maximum system performance. The executing task is started by invoking shepard_exec as a background task. An error log is maintained. Instead of using multiple queue files, a single queue file is used both for the sake of simplicity and because the queue file is used to generate a menu.
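The admission decision shepard_queue makes reduces to comparing a per-script job count against the matching entry in .limits. A sketch of that check, using the file formats of Table 1; the helper name can_run is an invention of this example, and shepard_queue inlines the equivalent logic:

```shell
#!/bin/sh
# can_run HOST SCRIPT RUNNING_FILE LIMITS_FILE
# Succeed if another job of SCRIPT may start on HOST, given a
# .running-format file and a .limits-format file (Table 1).
can_run() {
    # look up maximum-jobs-per-script for this host and script
    limit=`awk '$1 == "'$1'" && $2 == "'$2'" {print $3}' $4`
    # count jobs of this script currently in the running file
    count=`grep -c "^$2 " $3`
    test "$count" -lt "$limit"
}
```

When the check fails, the job simply stays in the single queue file until a completing job triggers the next admission pass.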

The script shepard_exec has embedded in it the specifics for moving data into the execution environment, running the selected program, moving the results back, and calling shepard to update the control files. The specifics of program invocation -- copying data files from origin to host, executing the program, and transferring the result files back to the origin machine -- are hidden in scripts. shepard_exec needs only the name of the script and the specific dataset for the run; everything else is controlled by the specifics in the scripts.

The script run_update updates the status file on the origin machine to reflect the current status of the job. run_update is invoked by remote shell calls from both shepard and shepard_queue.

Networking

Three types of data transfer are supported: remote, nfs, and server. The best choice depends very much upon your particular environment and operating preferences. The choice of network is set in .shepard.ini.
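Because .shepard.ini is sourced as a Bourne shell script, selecting the network mode is just a variable assignment. A sketch follows; SHEPARD_NETWORK and SHEPARD_DIR are the variable names that appear in the listings, but the values shown here are illustrative:

```shell
# .shepard.ini -- sourced by run and shepard (sketch; values are
# examples, not the article's actual configuration)
SHEPARD_NETWORK=server      # one of: remote, nfs, server
SHEPARD_DIR=$HOME/scripts   # where the application-specific scripts live
```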

In remote mode, the data is copied from its directory on the origin machine into the user's login account on the selected host and then executed. The output is transferred back to the origin. The input and the output exist on both machines. This approach provides a more robust operating mode when the network is unreliable, but it is not the best mode to use with very large datasets.

In nfs mode, the data is accessed through remotely mounted network drives using Sun's Network File System (NFS), and is not actually moved. The output in nfs mode is written directly to the network drive. Only one copy of the input and the output exists. This mode is sensitive to network crashes.

In server mode, the data exists, and the execution is performed, entirely on the compute server. Only one copy of the input and the output is generated. This is the best mode to use with large datasets, and is also useful when the network is unreliable because the data never moves over the network.

I generally use the server mode through an NFS-mounted directory. This method provides the benefits of direct access to data without having to move it, and is especially beneficial when the datasets are large.

An Example

The control files in this example are those used in my working environment, and they reflect both the machines and the programs I use in computational medicinal chemistry. The application example is for a version of BATCHMIN (Columbia University), which is the premier program for molecular simulation. The application is dimensioned to handle protein-sized problems, and is run only on a Convex supercomputer. The problems are set up graphically on an Iris workstation and then are transferred to the Convex for simulation. (The simulation process can last longer than a week for a single problem.) The application script is called bmin31lv, and the dataset is referred to as bmintest.

The operation of the user interface is described in the comments. The selection of executable scripts, the performance of status queries for running jobs and the like, and the execution of available tasks are all menu-driven activities. I have included niceties such as the listings of completed jobs when run is started, plus the retention of selected values for scripts, datasets, and hosts as defaults for the next invocation of run.

When a job is intended for launching, run passes the selection information to shepard on the selected host, and then adds the new job to the origin machine's .current file with status STARTED. At this point run is no longer involved with the process.

When it's invoked by run, shepard creates a lock file, .sheplock, which forces all other invocations of shepard to wait and ensures sole access to all control files. .sheplock's control extends through the invocation of shepard_queue. shepard passes the information to shepard_queue, which places the job in the .waiting file and gives the job the highest priority value for that script, so that the newly submitted job will be the last one to execute. The origin machine's .current file is updated to status WAITING.
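The locking discipline can be sketched portably with mkdir, whose atomic create gives the same mutual exclusion guarantee as the small lockon helper of Listing Seventeen (substituting mkdir is this sketch's assumption, not what the listings do):

```shell
#!/bin/sh
# Acquire the shepard lock, do the critical work, release it.
# Only one process can successfully create the directory, so every
# other shepard blocks in the until loop until the owner is done.
lock=${TMPDIR-/tmp}/sheplock.demo
until mkdir $lock 2>/dev/null
do sleep 5; done                     # another shepard owns the files
trap 'rmdir $lock; exit' 1 2 3 15    # release on interrupt, too
# ... sole access to .waiting, .running, .finished, .restart ...
rmdir $lock                          # normal release
```

The trap mirrors the one at the top of Listing Two: a killed shepard must not leave a stale lock behind, or every later invocation would wait forever.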

shepard_queue checks the job count for a particular script against the maximum job count in .limits and proceeds to execution if below the limit, or exits, leaving the job enqueued. To execute the job, the script shepard_exec is passed all the relevant job information and submitted as a background process. shepard_exec removes the job from .waiting and places it in .running along with its process ID (pid). If the job fails, it is placed in .restart. The .current file on the origin machine is updated to status RUNNING. shepard_queue returns to shepard, which removes the lock and exits. A log of all activities is maintained so that the user can later determine what happened.

shepard_exec sources an application-specific file, which in this case is bmin31lv.script. This file defines the generic shell variables in shepard_exec for the specifics of the application. (The use of shell variables to represent the various file extensions makes the basic shepard_exec more flexible.) Data is moved by bmin31lv.getdata, which is named in $getdata_script. The application is executed with the assumption that it needs a standard input, and generates output to standard output and standard error. bmin31lv takes its command input via the standard input from bmintest.com, and generates run output via standard output to bmintest.log. It also uses bmintest.dat, which is moved along with bmintest.com into the execution environment. bmin31lv also generates a bmintest.out, which is returned to the origin machine. The output files are moved back across the network by the script bmin31lv.putdata, which is named in $putdata_script. From the user's point of view, all of the requisite files are named, created, and moved based only on the application script and the dataset name; the specifics are buried in the scripts.
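An application control file in the style of bmin31lv.script might look like the sketch below. The two *_script variable names come from the text; the extension variables are assumptions about how the "generic shell variables" for file extensions could be laid out:

```shell
# bmin31lv.script -- sourced before the run (sketch; only the
# *_script names and the file extensions come from the article,
# the extension variable names are illustrative)
getdata_script=$SHEPARD_DIR/bmin31lv.getdata  # stages .com and .dat in
putdata_script=$SHEPARD_DIR/bmin31lv.putdata  # returns .out to origin
in_ext=com    # $dataset.com is fed to standard input
out_ext=log   # standard output is captured in $dataset.log
```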

Upon completion of this process, shepard_exec calls shepard with the -z option. shepard_exec may have to wait while other processes (such as run) and other completing jobs execute shepard. Once in control, the job is removed from .running and placed in .finished. The origin machine's .current is updated to DONE, the log files are updated, and the job passes into history. shepard_queue is invoked with the -q option to check for waiting jobs, and launch if possible. The final act is to remove .sheplock.

A different example is the killing of a running job. When the kill task is selected in run, shepard is called on the selected host. Once control is established, a menu of running jobs is generated from .running and control is returned to run. The job to be killed is selected by its entry number. shepard is again invoked, and now passes the selection number of the job as the fifth argument. shepard looks up the pid of the selected job, and kills the script associated with that pid. The process listing is searched for child processes (which in this case is the application) and the child processes are also killed. The job is removed from .running and the log files are updated. Control passes back to run.

From the user's point of view, the shepard system achieves its goals of simplifying the network management of jobs, and tells the user which jobs are located where, based upon what is important: the application and the dataset. A benefit of the queue system is that numbers of jobs can be left pending (and easily managed) without fear of "choking" the system if they're improperly enqueued or if they're blocking other types of jobs.

Control Files and Menu Generation

The control files are doubly useful, serving both to generate menus and to build tables of values for lookup. The Unix utility Awk is handy for both tasks. I provide code examples that show how to generate "pick-an-item" menus and how to look up values based upon the selected item number. Although the examples are very simple, I have never seen Awk used in this manner.

The control files use space separation between the fields. Awk has the flexibility to use selected fields (words) out of a single record (line). This flexibility offers the option to use the first several fields on a line as lookup values, and to use the rest of the fields on the line as information displayed in the menu. An example of this method is .hosts, in which the first word on each line represents a possible host, and the remainder of the line describes the host.
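The idiom distills to two one-line Awk programs: one numbers the lines of a control file into a menu, and a second pass maps the chosen number back to the lookup field. A minimal sketch, with illustrative function names:

```shell
#!/bin/sh
# menu FILE      -- print each line of FILE prefixed with its number
# lookup FILE N  -- print the first field (the lookup value) of line N
# This is the same two-pass pattern run and shepard apply to .hosts,
# .tasklist, and the job-status files.
menu() {
    awk 'BEGIN {n=1} {printf "%-3s%s\n",n,$0; n++}' $1
}
lookup() {
    awk 'BEGIN {n=1} {if ("'$2'" == n) print $1; n++}' $1
}
```

Note how the shell variable is spliced into the Awk program between single quotes, exactly as the listings do; Awk's loose typing lets the spliced string compare equal to the numeric counter n.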

The formats of the various control files are shown in Table 1. The other control files (.run.ini and .shepard.ini) and the application-specific scripts are all Bourne shell scripts, and are commented with respect to their functions.

Table 1: Control file formats

  File          Format
--------------------------------------------------------------------

  .runscripts:  hostname script (remainder is descriptive comment)
  .hosts:       hostname (remainder is descriptive comment on host)
  .tasklist:    flag (remainder is description of 'run' task)
  .limits:      hostname script maximum-jobs-per-script
  .current:     script dataset host datadir start finish status
  .waiting:     script dataset host datadir queue-position
  .running:     script dataset host datadir process-id
  .finished:    script dataset host datadir finished-time
  .restart:     script dataset host datadir

Installation

The scripts and the control files should be placed on all desired machines. Give the scripts run, shepard, shepard_queue, shepard_exec, and run_update (Listings One through Five, pages 82-88) execution privilege, and place them in a directory accessible through the path. Place the control files for the script run (.run.ini, .current, .hosts, .tasklist, and .runscripts, Listings Six through Nine, page 90) and for shepard (.limits and .shepard.ini, Listings Ten and Eleven, page 90) in the user's login directory. Create a directory for the application-specific scripts (Listings Twelve through Sixteen, page 90) and move those scripts into it. Finally, compile lockon.c (Listing Seventeen, page 90) and place the executable in a directory accessible through the path.

Modify the control files according to your needs and your network setup. Change .hosts to reflect the accessible machines on your network, and modify .runscripts so that it lists both the accessible machines and the control scripts for your applications. Make the same changes to .limits that are made to .runscripts, and add the numbers that correspond to the desired mix of running jobs under maximum load.
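For example, to allow at most two concurrent BATCHMIN runs on the Convex and one on a workstation, .limits would carry entries like these (hostnames are illustrative):

```
big bmin31lv 2
iris bmin31lv 1
```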

Create the application-specific scripts for your applications and needs, using the example scripts for bmin31lv as a template. If your applications do not have a graceful terminate or probe capability, set the corresponding variable assignments in the master application .script file to empty strings.

Modify the control file .shepard.ini to reflect the applications present on each available platform, your preferred mode of network access, and the directory in which the application-specific scripts are kept. (The application names are given without the .script extension.) Make sure that the network mode is the same in the .shepard.ini files of each pair of platforms. Generally, you cannot mix network modes between a specific pair of machines, although other pairs can use other modes.

Finally, if the machines on your network are not communicating freely, make the necessary changes to the system files that control access. You should have accounts on all desired machines. The /etc/hosts and /etc/hosts.equiv files on each pair of machines must be set up so that the rsh command works properly. If you intend to use nfs mode to move files between machines, make sure that NFS is enabled and that the necessary directories are mounted remotely.

The system works without modifications between Silicon Graphics Iris platforms that use Unix System V, Version 3.2, and a Convex C-220 that uses Berkeley Unix 4.2, when all of the communications requirements are met.

Future Enhancements

The shepard system is functional, but not complete. A facility to check for crashed jobs and then clean them up is under development. Also, the changes necessary in order to support multiple users are straightforward, but they introduce a complication with respect to the queuing limits: The mix of jobs must also include a mix of users.

_CONTROLLING BACKGROUND PROCESSES UNDER UNIX_ by Barr E. Bauer

[LISTING ONE]


origin=`hostname`

# run - the user interface component of the shepard system -- B. E. Bauer 1990
# configuration files associated with run:
#   .run.ini    defaults for script,dataset,host,datadir.
#   .current    jobs originating from workstation environment
#   .hosts      host machines able to run shepard
#   .tasklist   list of tasks. Has flags for shepard
#   .runscripts list of machines and possible scripts
#   these files must be located in the login directory
# flag (and task definitions) definitions, Used in case statement
# and passed as actual flag arguments to shepard:
#    x    executes (submits) a job
#    m    monitors job
#    p    probes job
#    s    status of running jobs on all platforms
#    r    list of running jobs
#    k    kill job (with extreme prejudice)
#    t    terminate job in a controlled manner (script dependent)
#    l    list log on remote machine
#    b    bump a waiting job from the .waiting list
#    d    delete a waiting job
#    f    list finished jobs
#    e    list error log
#    c    change host
#    w    list waiting jobs
#    a    list restart jobs
#    g    restart a restartable job

#place the date/time in day-month-year@time single string format
set - `date`
year=$6 month=$2 day=$3 tm=$4
datetime=$day-$month-$year@$tm

echo 'welcome to run on '$origin' at '$datetime

. $HOME/.run.ini    #source the run-script defaults
. $HOME/.shepard.ini # has network definition

# check for finished jobs, update list, display finished list
# find jobs with status RUNNING, check host for status

if (test -f $HOME/.current) then
    cnt=`grep -c DONE $HOME/.current`
    if (test "$cnt" != "0" ) then
        awk 'BEGIN {
            printf "\njobs recently finished\n"
            printf "\n%-10s %-10s %-8s %-21s %-21s\n\n",\
                "script","dataset","host","start","end"
        } $7 == "DONE" {
            printf "%-16s%-16s%-8s%-20s%-20s\n",$1,$2,$3,$5,$6
        } ' $HOME/.current >tmp
        echo ' '; cat tmp   # display list of completed jobs
        echo 'press any key to continue \c'; read sel
        cat tmp >> $HOME/run.log   # completed job data to runlog
        awk '$7 != "DONE" {
            print $0
        } ' $HOME/.current >tmp
        mv tmp $HOME/.current
    else
        echo "no new finished jobs"
    fi
fi

# set default host. All activities focus on that host until changed
awk 'BEGIN {
    n=1
    printf "\n----- current hosts -------------------------\n\n"
} {
    if ("'$defhost'" == $1)
        printf "%-3s%-16s%s %s %s  (default)\n",n,$1,$2,$3,$4
    else printf "%-3s%-16s%s %s %s\n",n,$1,$2,$3,$4
    n++
}
END {
    printf "\nselect a host machine by number: "
}' $HOME/.hosts
read sel
if (test -z "$sel") then
    host=$defhost
else
    sel=`awk 'BEGIN {n=1}{if ("'$sel'" == n) print $1; n++}' $HOME/.hosts`
    host=$sel; defhost=$sel
fi
loop=YES

# top of loop. exit with <ret>
while (test "$loop" = "YES")
do
    # display menu of tasks
    echo ' '; echo 'current host is ' $host
    echo ' '
    awk ' BEGIN {
        n=1
        printf "\t#      flag          task\n"
        printf "\t----------------------------------------------\n"
    }{
        printf "\t%-3s\t%s\n",n,$0
        n++
    }
    END {
        printf "\ntask selection number [<ret> to exit]: "
    } ' $HOME/.tasklist
    read sel
    # look up value for shepard flag associated with task.
    # Use the flag in the case statement
    task=`awk 'BEGIN {n=1} {if("'$sel'" == n) print $2; n++}' $HOME/.tasklist`
    flag=`awk 'BEGIN {n=1} {if("'$sel'" == n) print $1; n++}' $HOME/.tasklist`

    # if response is <ret>, exit while loop
    if (test -z "$sel") then
        break
    fi

    case $flag in
    -x) # start a job. Queries for script, dataset, datadir
        # list scripts available only on selected host
        awk ' BEGIN {
            n=1
            def=0
            printf "\n# (host) script"
            printf "\n-------------------------------------\n"
        }
        "'$host'" == $1 {
            if ($2 == "'$defscript'") {
                printf "%-2s %s  (default)\n",n,$0
                def = n
            }
            else printf "%-2s %s\n",n,$0
            n++
        }
        END {
            printf "\nselect a script by number [%s]: ",def
        } ' $HOME/.runscripts
        read tmp
        # look up the script selected by number (must be on one line)
        sel=`awk 'BEGIN{n=1} "'$host'"==$1 {if("'$tmp'" == n) print $2; n++} ' $HOME/.runscripts`
        if (test "$sel" = "") then
            script=$defscript
        else
            script=$sel; defscript=$sel
        fi
        echo 'selected script is '$script
        # get the dataset name
        echo ' '; echo 'enter dataset name ['$defdata']: \c'
        read sel
        if (test "$sel" = "") then    # substitute default for <ret>
            dataset=$defdata
        else
            dataset=$sel; defdata=$sel
        fi
        echo 'selected dataset is '$dataset
        # get the directory where the data is located
        # if $SHEPARD_NETWORK is "remote", data moves to the user's
        # home directory on the host machine, then back when done;
        # "nfs" uses a remote mount; "server" keeps data on the server
        echo ' '; echo 'enter directory of data on '
        case $SHEPARD_NETWORK in
            remote) echo $iam': \c';;
               nfs) echo $iam' using nfs mount on '$host': \c';;
            server) echo $host': \c'; defdir='$HOME';;
        esac
        read sel
        if (test "$sel" = "") then    # substitute default for <ret>
            datadir=$defdir
        else
            datadir=$sel; defdir=$sel
        fi
        echo 'selected directory is '$datadir
        # append new job entry to $HOME/.current
        llist="$script $dataset $host $datadir $datetime"
        echo $llist 'out' 'STARTED' >>$HOME/.current
        if (test "$origin" = "$host") then
            shepard $flag $script $dataset $host $datadir
        else
            rsh $host shepard $flag $script $dataset $origin $datadir
        fi;;
    -s) # listing of current file. shows activity on other platforms
        awk ' BEGIN {
            fmt="%-5s %-16s %-16s %-21s %-16s\n"
            dash5="-----"
            dash16="----------------"
            dash21="---------------------"
            n=1
            printf "\n\ncurrent job status\n\n"
            printf fmt,"#","script","dataset","submitted","status"
            printf fmt,dash5,dash16,dash16,dash21,dash16
            printf "\n"
        } {
            printf fmt,n,$1,$2,$5,$7
            n++
        }
        END {
            printf "\npress any key to continue "
        } ' $HOME/.current
        read sel;;
    -[ktpdbg])
        # these are all list processing commands using pick an item menuing
        # the menu is generated by shepard on the selected host
        # the item is picked in run and the selection happens in shepard
        case $flag in
            -[ktp]) lflag='-r';;  # list running jobs
             -[db]) lflag='-w';;  # list waiting jobs
                -g) lflag='-a';;  # list restartable jobs
        esac
        if (test "$origin" = "$host") then
            shepard $lflag dummy2 dummy3 dummy4 dummy5
        else
            rsh $host shepard $lflag dummy2 dummy3 dummy4 dummy5
        fi
        echo ' '; echo 'select number of job to \c'
        case $flag in
            -k) echo 'kill \c';;
            -t) echo 'halt gracefully \c';;
            -g) echo 'restart \c';;
            -d) echo 'remove from waiting queue \c';;
            -b) echo 'bump to top of queue \c';;
            -p) echo 'probe running status \c';;
        esac
        read sel    # select one from list
        arg5=$sel
        if (test "$origin" = "$host") then
            shepard $flag dummy2 dummy3 dummy4 $arg5
        else
            rsh $host shepard $flag dummy2 dummy3 dummy4 $arg5
        fi;;
    -c) # change hosts
        awk 'BEGIN {
            n=1
            printf "----- current hosts -------------------------\n\n"
        } {
            if ("'$defhost'" == $1) {
                printf "%-3s%-16s%s %s %s  (default)\n",n,$1,$2,$3,$4 }
            else printf "%-3s%-16s%s %s %s\n",n,$1,$2,$3,$4
            n++
        }
        END {
            printf "select a new host machine by number: "
        }' $HOME/.hosts
        read sel
        if (test -z "$sel") then
            host=$defhost
        else
            sel=`awk 'BEGIN {n=1}{if ("'$sel'"==n) print $1; n++}' $HOME/.hosts`
            host=$sel; defhost=$sel
        fi;;
    -[rewfalm]) # process listing commands
        if (test "$origin" = "$host") then
            shepard $flag dummy2 dummy3 dummy4 dummy5
        else
            rsh $host shepard $flag dummy2 dummy3 dummy4 dummy5
        fi
        read sel;;
    *)  # woops
        echo $flag 'is not a recognized option, try again'
    esac
done    # bottom of while loop

# write current values to run-script default file
# $HOME/.run.ini is sourced on invocation in effect restoring the
# last values used. Handy for checking on a previously
# started job - values properly default to the previous
echo 'defscript='$defscript >$HOME/.run.ini
echo 'defdata='$defdata >>$HOME/.run.ini
echo 'defhost='$defhost >>$HOME/.run.ini
echo 'defdir='$defdir >>$HOME/.run.ini

echo 'end of run'



[LISTING TWO]

trap 'rm -f $HOME/.sheplock; exit' 1 2 3 15

# shepard - task management component of shepard system -- B. E. Bauer 1990
# Shepard is the action component of the system. When invoked, it
# owns all the associated files (see top of shepard_queue for list)
# and updates the current file on the originator, log and err files.
# Shepard can be invoked from local or remote machines; it senses
# local or remote operation and behaves accordingly.
# Shepard handles all tasks except for job queueing (shepard_queue) and
# application-specific job probing (defined in $probe_script as sourced
# in 'script'.script). Shepard is called by terminating jobs for cleanup.
# Shepard can be present in several executing copies called by run (the
# user interface) and by completing jobs waiting for cleanup. To avoid
# collision between shepards, absolute ownership of all associated files
# is essential, and is accomplished by creating a lock file. All other
# versions of shepard have to wait until the first is done.

# wait until lock file established insures complete ownership
# of all files by only one version of shepard at a time
until lockon .sheplock
do sleep 5; done

iam=`hostname`
. $HOME/.shepard.ini   # source the initialization file
# do not display greeting message if called from terminating process
if (test "$1" != "-z") then
    echo 'shepard on '$iam' at '`date`
fi
# if you see the message, you made it.
# Important verification that remote shell command is functioning

# lookup values from files depending on mode
pass=NO
case $1 in    # select the file name associated with flag
   -[ktp])    fname=$HOME/.running; pass=YES;;
    -[bd])    fname=$HOME/.waiting; pass=YES;;
       -g)    fname=$HOME/.restart; pass=YES;;
esac
if (test "$pass" = "YES") then # do the lookup
     scr=`awk 'BEGIN {n=1} {if ("'$5'" == n) print $1; n++}' $fname`
    dset=`awk 'BEGIN {n=1} {if ("'$5'" == n) print $2; n++}' $fname`
    host=`awk 'BEGIN {n=1} {if ("'$5'" == n) print $3; n++}' $fname`
    ddir=`awk 'BEGIN {n=1} {if ("'$5'" == n) print $4; n++}' $fname`
     sel=`awk 'BEGIN {n=1} {if ("'$5'" == n) print $5; n++}' $fname`
   tname=$host':'$scr'('$dset')'   # compact file name
fi

# no loop in shepard. Does the command then exits
case $1 in
    -x) # runs job through queue manager which handles submission
        shepard_queue $2 $3 $4 $5;;
    -m) # system-dependent code here. "big" is using Berkeley UNIX
        # while all others use SYSTEM V. Options to ps are different
        if (test "`hostname`" = "big") then
            ps -ax | grep -n shepard_exec  # CONVEX specific (for example)
        else
            ps -ef      # SGI IRIS specific (for example)
        fi;;
    -p) # probe job - script-dependent
        # source the file containing application-specific scripts
        . $SHEPARD_DIR/$2.script
        . $probe_script    # defined in sourced file $script.script
        echo 'press <ret> to continue \c';;
    -r) # list running jobs on host
        cnt=`wc -l $HOME/.running | awk '{print $1}'`
        if (test "$cnt" = "0") then
            echo ' '; echo 'no jobs running'; echo ' '
        else
            awk ' BEGIN {
                fmt="\n%-5s %-16s %-16s %-8s\n"
                printf "\n----- running jobs on %s -----\n","'$host'"
                printf fmt,"#","script","dataset","pid"
                printf "----- ---------------- ---------------- --------\n"
                n=1
            } {
                printf fmt,n,$1,$2,$5
                n++
            }
            END {
                printf "\npress any key to continue "
            } ' $HOME/.running
        fi;;
    -w) # list waiting jobs on host
        cnt=`wc -l $HOME/.waiting | awk '{print $1}'`
        if (test "$cnt" = "0") then
            echo ' '; echo 'no jobs waiting'; echo ' '
        else
            awk ' BEGIN {
                fmt="\n%-5s %-16s %-16s %-8s\n"
                printf "\n----- waiting jobs on %s -----\n","'$host'"
                printf fmt,"#","script","dataset","position"
                printf "----- ---------------- ---------------- --------\n"
                n=1
            } {
                printf fmt,n,$1,$2,$5
                n++
            }
            END {
                printf "\npress any key to continue "
            } ' $HOME/.waiting
        fi;;
    -a) # list restartable jobs on host
        cnt=`wc -l $HOME/.restart | awk '{print $1}'`
        if (test "$cnt" = "0") then
            echo ' '; echo 'no jobs in restart'; echo ' '
        else
            awk ' BEGIN {
                fmt="\n%-5s %-16s %-16s %-8s\n"
                printf "\n----- restartable jobs on %s -----\n","'$host'"
                printf fmt,"#","script","dataset","position"
                printf "----- ---------------- ---------------- --------\n"
                n=1
            } {
                printf fmt,n,$1,$2,$5
                n++
            } ' $HOME/.restart
        fi;;
    -g) # restart a job from $HOME/.restart and update
        # the entry to restart is passed as shell argument 5
        # copies the selected entry to $HOME/.waiting with priority=RESTART
        awk ' BEGIN {
            n=1
        } {
            if (n == "'$5'") printf "%s %s %s %s RESTART\n",$1,$2,$3,$4
            n++
        } ' $HOME/.restart >> $HOME/.waiting
        awk ' BEGIN {   # restarted job is purged from $HOME/.restart
             n=1
        } {
            if (n != "'$5'") print $0
            n++
        }' $HOME/.restart > tmp
        mv tmp $HOME/.restart
        echo 'restarting '$tname' at '$datetime >> $HOME/shepard.log
        #update .current on origin machine
        if (test "$host" = "$iam") then
            run_update -g $scr $dset $sel
        else
            rsh $host run_update -g $scr $dset $sel
        fi
        shepard_queue -r;;   # do the restart
    -k) # kill job with extreme prejudice
        # pid passed as shell argument 5, assigned to sel
        # running processes have 2 entries in the process list
        # first = shepard_exec and has the pid stored in running
        # second = the executable application
        # searching the process list for first finds second; both
        # must be killed to stop the application: killing shepard_exec
        # alone leaves the application program still running
        if (test "$iam" = "big") then
            cleanup=`ps -axl | awk ' "'$sel'" == $4 {print $3}'`
        else
            cleanup=`ps -ef | awk ' "'$sel'" == $3 {print $2}'`
        fi
        kill -9 $sel
        kill -9 $cleanup
        if (test "$?" = "0") then
            echo 'killed '$tname' at '$datetime >>$HOME/shepard.log
        else
            echo 'status of kill command nonzero - check log for problems'
        fi
        awk ' $5 != "'$sel'" { print $0 }' $HOME/.running > $HOME/tmp
        mv $HOME/tmp $HOME/.running
        #update .current on origin machine
        if (test "$host" = "$iam") then
            run_update -k $scr $dset $sel
        else
            rsh $host run_update -k $scr $dset $sel
        fi
        shepard_queue -q;; # check for waiting jobs
    -t) # terminate job gracefully pass script and origin variables
        # source the file containing application-specific scripts
        . $SHEPARD_DIR/$2.script
        . $terminate_script # found in scriptname.script
        echo 'terminated '$tname' at '$datetime >> $HOME/shepard.log
        #update .current on origin machine
        if (test "$host" = "$iam") then
            run_update -t $scr $dset $sel
        else
            rsh $host run_update -t $scr $dset $sel
        fi;;  # when the application exits, it will check for waiting jobs
    -l) # list the job log on host
        tail -30 $HOME/shepard.log;;   # only the last entries are generally interesting
    -b) # bump priority of specific job
        # $HOME/.waiting can be in any order, use 2-pass approach
        # pass 1: set desired to zero, increment all others
        # pass 2: change 0 to 1, zero now being easy to spot
        awk ' {
            if ($1 == "'$scr'") {
                if ($5 == '$sel') $5 = 0
                else if ($5 < '$sel') $5 += 1
            }
            printf "%s %s %s %s %s\n",$1,$2,$3,$4,$5
        } ' $HOME/.waiting | awk ' {
                if ($5 == 0) $5 = 1
                printf "%s %s %s %s %s\n",$1,$2,$3,$4,$5
        } ' > $HOME/tmp
        mv $HOME/tmp $HOME/.waiting
        echo 'bumped '$tname' at '$datetime >> $HOME/shepard.log
        if (test "$host" = "$iam") then
            run_update -b $scr $dset $sel
        else
            rsh $host run_update -b $scr $dset $sel
        fi;;
    -d) # delete a waiting job from waiting, selected job passed as shell arg 5
        # jobs with the same script and a higher priority are decremented
        awk ' {
            if ($1 == "'$scr'") {
                if ($5 == '$sel') next   # excise deleted job
                if ($5 > '$sel') $5 = $5 - 1
            }
            print $0
        } ' $HOME/.waiting > $HOME/tmp
        mv $HOME/tmp $HOME/.waiting
        echo 'deleted '$tname' at '$datetime >> $HOME/shepard.log
        #update .current on origin machine
        if (test "$host" = "$iam") then
            run_update -d $scr $dset $sel
        else
            rsh $host run_update -d $scr $dset $sel
        fi;;
    -f) # list finished jobs
        awk ' BEGIN {
            fmt="\n%-16s %-16s %-12s\n"
            printf "\n----- finished jobs on %s -----\n","'$host'"
            printf fmt,"script","dataset","origin"
            printf "---------------- ---------------- ------------\n"
        } {
            printf fmt,$1,$2,$3
        }
        END {
            printf "\npress any key to continue "
        } ' $HOME/.finished;;
    -e) # list error log
        tail -30 $HOME/shepard.err;;
    -z) # go to cleanup routine, $5 has the completed job's pid number
        echo 'finished '$4':'$2'('$3') at '`date` >> $HOME/shepard.log
        # write entry to .finished
        # run on origin will look here for completed jobs
        echo $2 $3 $4 $5 `date` >> $HOME/.finished
        # excise finished job from $HOME/.running list
        awk '{
            if ("'$5'" != $5) print $0
            }' $HOME/.running >tmp
        mv tmp $HOME/.running
        #update .current on origin machine
        if (test "$4" = "$iam") then
            run_update -f $2 $3 $5
        else
            rsh $4 run_update -f $2 $3 $5
        fi
        #check queue for waiting process
        shepard_queue -q;;
esac

rm -f $HOME/.sheplock     # remove locking file

# normal return to run if invoked by remote shell, otherwise terminates
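A recurring idiom in Listing Two is the local-versus-remote dispatch: run_update is invoked directly when the target host is the local machine, and through rsh otherwise. A minimal sketch of the branch follows, with a hypothetical demo_update standing in for run_update:

```shell
# Local-vs-remote dispatch, as used throughout Listing Two.
# demo_update is a hypothetical stand-in for run_update.
iam=`hostname`
host=$iam                     # target host; here, the local machine
demo_update() {
    echo "update: $*"
}
if (test "$host" = "$iam") then
    result=`demo_update -w bmin31lv pla2`
else
    result=`rsh $host run_update -w bmin31lv pla2`
fi
echo "$result"
```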



[LISTING THREE]

trap 'rm -f $HOME/tmp; exit' 1 2 3 15

# shepard_queue - queue manager for shepard system -- B. E. Bauer 1990
# shepard_queue places jobs in a waiting queue and allows a job
# to actually start if the count of similar jobs running is
# below a user-defined threshold. It's like a FIFO queue with a twist.
# This is intended to balance throughput against system demands on
# multiprocessor high-performance computers; alter it for your environment.
# Jobs in $HOME/.waiting have a number giving their place in the
# queue; 1 = next to start, up to the limit defined in .limits
# passed arguments:
#   normal queue submit:    1:  script name
#                           2:  dataset name
#                           3:  originating machine name
#                           4:  dataset directory
#   restart                 1:  -r  (no other values passed)
#   queue check             1:  -q  (no other values passed)
#
#   for restart, $HOME/.waiting has the restart job prepended
iam=`hostname`
. $HOME/.shepard.ini   # source the initialization file
mode=NORMAL
if (test "$1" = "-r") then    # restart entry submitted
    # get the script which has the RESTART code (normally passed as $1)
    scr=`awk 'BEGIN {n=0} $5=="RESTART" {print $1}' $HOME/.waiting`
    # find and replace RESTART with last queue slot for corresponding script
    awk ' BEGIN {
        count = 1
    } {
        if ("'$scr'" != $1) print $0
        else if ($5 != "RESTART") {
            count++
            printf "%s %s %s %s %s\n",$1,$2,$3,$4,$5
        }
        else printf "%s %s %s %s %s\n",$1,$2,$3,$4,count
    } ' $HOME/.waiting  > $HOME/tmp
    mv $HOME/tmp $HOME/.waiting
elif (test "$1" != "-q") then   # new job to submit
    # append new job entry to $HOME/.waiting list
    echo $1 $2 $3 $4 'NEW' >> $HOME/.waiting
    # change NEW label to count of jobs having that script
    # newest entry has the highest number/last to be executed
    awk ' BEGIN {
        count = 1
    } {
        if ("'$1'" != $1) print $0
        else if ($5 != "NEW") {
            count++
            print $0
        }
        else printf "%s %s %s %s %s\n",$1,$2,$3,$4,count
    } ' $HOME/.waiting  > $HOME/tmp
    mv $HOME/tmp $HOME/.waiting
    cnt=`awk 'BEGIN{n=0}"'$1'" == $1 {n++} END {print n}' $HOME/.waiting`
    if (test "$3" = "$iam") then
        run_update -w $1 $2 $cnt
    else
        rsh $3 run_update -w $1 $2 $cnt
    fi
else
    mode=QUEUE    # flag suppresses terminal response when in -q mode
fi

didit=NO  # flag reports job starting status

# loop through scripts available on this host
# available scripts are in the environment variable SHEPARD_SCRIPTS

# The FIFO queue has a twist: differing job types are subqueued, with
# limits for each found in .limits, without maintaining separate queue
# structures. This method is easier to implement and permits a maximum
# load balance consisting of a mix of program types, tailored to one's
# needs. In this way, when the number of jobs of program type 'a' exceeds
# the limit, only the number set in .limits runs, while the others queue,
# leaving processor time for program types 'b' and 'c'. The optimum load
# balance is determined by the system resource requirements of each
# program and one's needs for throughput; adjusting .limits allows
# changes on the fly.

for i in $SHEPARD_SCRIPTS
do
    # count jobs actually running for each script, get associated job limit
    if (test -f "$HOME/.running") then
        rcnt=`awk 'BEGIN{n=0} "'$i'"==$1 {n++} END{print n}' $HOME/.running`
    else
        rcnt=0  # set rcnt to 0 if $HOME/.running is not present
    fi
    rlim=`awk '"'$iam'" == $1 && "'$i'" == $2 {print $3}' $HOME/.limits`
    if (test -z "$rlim") then
        rlim=1  # if no limit in $HOME/.limits, one job permitted
    fi
    # if the number of running jobs has reached the limit, continue to next script
    if (test "$rcnt" -ge "$rlim") then
        continue
    fi
    # loop to next script if no jobs waiting with priority=1
    script=`awk ' "'$i'" == $1 && $5 == "1" { print $1}' $HOME/.waiting`
    if (test "$script" != "$i") then
        continue
    fi
    # found one for current script, get the remaining values
    dataset=`awk ' "'$i'" == $1 && $5 == "1" { print $2}' $HOME/.waiting`
    origin=`awk ' "'$i'" == $1 && $5 == "1" { print $3}' $HOME/.waiting`
    datadir=`awk ' "'$i'" == $1 && $5 == "1" { print $4}' $HOME/.waiting`

    # put date/time in a single string format
    set - `date`
    day=$3 month=$2 year=$6 tm=$4
    datetime=$day-$month-$year@$tm
    # submit shepard_exec to the background, get its pid
    # note: shepard_queue does not wait for shepard_exec to finish
    nohup shepard_exec $script $dataset $origin $datadir >shepard_junk.log &
    errflag=$?  # status from launching the background job
    pid=$!      # process identification number - unique for job

    # shepard_exec did not initiate, for some reason
    # append the shepard_junk.log to shepard.err, alert the user
    # the job is placed in .restart
    if (test "$errflag" != "0") then
        #notify the user
        echo $script'('$dataset') did not start at '$datetime
        echo '       return code '$errflag
        echo '------ process error logfile contents -----'
        cat $HOME/shepard_junk.log
        echo '------ end of log from '$script'('$dataset') -----'
        echo ' '; echo 'check the contents of shepard.err for details'
        # update shepard.err
        echo $script'('$dataset') did not start at '$datetime >tmp
        echo '------ process error logfile contents -----' >>tmp
        cat shepard_junk.log >>tmp
        echo '------ end of log from '$script'('$dataset') -----' >>tmp
        cat tmp >> $HOME/shepard.err; rm tmp
        # remove from $HOME/.waiting, place in .restart
        awk ' $1 == "'$script'" && $2 == "'$dataset'" && $5 == 1 {
            print $0
        }' $HOME/.waiting >> $HOME/.restart
        awk '{
            if ($1 == "'$i'" && $5 == 1) next
            if ($1 == "'$i'") {
                $5 = $5 - 1
                printf "%s %s %s %s %s\n",$1,$2,$3,$4,$5
            }
            else print $0
        }'    $HOME/.waiting > $HOME/tmp
        mv $HOME/tmp $HOME/.waiting
        exit
    fi

    didit=YES
    # append job specifics to $HOME/.running
    echo $script $dataset $origin $datadir $pid >>$HOME/.running

    # append job info to shepard.log
    echo $script'('$dataset') started '$datetime >>$HOME/shepard.log

    # remove running job from $HOME/.waiting, update priority
    awk '{
        if ($1 == "'$i'" && $5 == 1) next
        if ($1 == "'$i'") {
            $5 = $5 - 1
            printf "%s %s %s %s %s\n",$1,$2,$3,$4,$5
        }
        else print $0
    }'    $HOME/.waiting > $HOME/tmp
    mv $HOME/tmp $HOME/.waiting
    #update .current on origin machine
    if (test "$iam" = "$origin") then
        run_update -r $script $dataset $pid
    else
        rsh $origin run_update -r $script $dataset $pid
    fi
    # if job is successfully started, notify user
    if (test "$didit" = "YES") then
        echo ' '
        echo $script'('$dataset') started on '$iam' at '$datetime
    fi
done

if (test "$didit" = "NO") then
    if (test "$mode" != "QUEUE") then
        echo ' '; echo 'no jobs were submitted'
    fi
fi
trap '' 1 2 3 15
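The heart of shepard_queue's throttle is a pair of awk one-liners: one counts the running jobs for a script, the other fetches that script's limit for this host. The sketch below exercises both against hypothetical stand-ins for .running and .limits:

```shell
# Throttle check from shepard_queue, run against hypothetical data.
script=bmin31lv
host=big
cat > /tmp/demo.running <<'EOF'
bmin31lv test1 larry /u/data 1201
bmin31lv test2 larry /u/data 1202
ampac job7 moe /u/data 1300
EOF
echo 'big bmin31lv 3' > /tmp/demo.limits

# count running jobs of this script; fetch this host's limit for it
rcnt=`awk 'BEGIN{n=0} "'$script'"==$1 {n++} END{print n}' /tmp/demo.running`
rlim=`awk '"'$host'" == $1 && "'$script'" == $2 {print $3}' /tmp/demo.limits`
if (test "$rcnt" -ge "$rlim") then
    echo "limit reached: $rcnt of $rlim $script jobs running"
else
    echo "room for more: $rcnt of $rlim $script jobs running"
fi
rm -f /tmp/demo.running /tmp/demo.limits
```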



[LISTING FOUR]

trap 'shepard -z $1 $2 $3 $$' 1 2 3 15

# shepard_exec - execution portion of shepard system -- B. E. Bauer, 1990
# passed args: 1: script, 2: dataset, 3: origin, 4: datadir

. $HOME/.shepard.ini      # source initialization file
. $SHEPARD_DIR/$1.script  # source application-specific definitions

# routine to move the required files into the execution environment
# sourcing vs separate shell obviates need to pass values
. $SHEPARD_DIR/$getdata_script

# run the program. Assumes here that stdin, stdout, and stderr are
# required (generally true for UNIX) during execution. All other data
# files were moved into the execution environment by $getdata_script
$exe < $2$inp 1> $2$log 2> $2$err

# source the script to return data back in its proper location
. $SHEPARD_DIR/$putdata_script

# clean up and update status files
shepard -z $1 $2 $3 $$   # pid of completing process returns as arg5

trap '' 1 2 3 15
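shepard_exec sources (with the `.` command) rather than spawns its helper scripts, so the definitions they make persist in the caller's shell. A small demonstration, with demo.script as a hypothetical stand-in for a file like bmin31lv.script:

```shell
# Sourcing a helper script keeps its definitions in the current shell.
# demo.script is a hypothetical stand-in for bmin31lv.script.
cat > /tmp/demo.script <<'EOF'
exe=bmin31lv
inp=.com
log=.log
EOF
. /tmp/demo.script        # definitions land in this shell
echo "will run: $exe < dataset$inp > dataset$log"
rm -f /tmp/demo.script
```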



[LISTING FIVE]

trap 'rm -f $HOME/tmp; exit' 1 2 3 15

#   run_update - update component of shepard system -- B.E. Bauer 1990
#   updates the .current file to reflect system activities

flag=$1 script=$2 dataset=$3 opt=$4
set - `date`
day=$3 month=$2 year=$6 tm=$4
datetime=$day-$month-$year@$tm

case $flag in
    -w) stat='WAITING';;
    -r) stat='RUNNING';;
    -g) stat='RESTART';;
    -b) stat='BUMPED';;
    -k) stat='KILLED';;
    -f) stat='DONE';;
    -d) stat='DELETED';;
    -t) stat='TERMINATED';;
esac

awk ' {
    if ("'$script'" == $1 && "'$dataset'" == $2) {
        printf "%s %s %s %s %s %s %s\n",$1,$2,$3,$4,$5,"'$datetime'","'$stat'"
    }
    else print $0
}' $HOME/.current >$HOME/tmp
mv $HOME/tmp $HOME/.current

trap '' 1 2 3 15
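run_update's whole job is the awk rewrite of .current shown above. The sketch below runs the same rewrite over a hypothetical two-line .current file; only the matching line receives the new timestamp and status:

```shell
# Sketch of the .current rewrite in run_update, over hypothetical data.
script=bmin31lv dataset=pla2 stat=RUNNING datetime=7-Dec-1990@12:00:00
cat > /tmp/demo.current <<'EOF'
bmin31lv pla2 big pla2dir 1201 6-Dec-1990@09:00:00 WAITING
ampac job7 moe ampdir 1300 6-Dec-1990@09:05:00 WAITING
EOF
awk ' {
    if ("'$script'" == $1 && "'$dataset'" == $2) {
        printf "%s %s %s %s %s %s %s\n",$1,$2,$3,$4,$5,"'$datetime'","'$stat'"
    }
    else print $0
}' /tmp/demo.current > /tmp/demo.new
changed=`grep -c RUNNING /tmp/demo.new`     # the rewritten line
untouched=`grep -c WAITING /tmp/demo.new`   # the line left alone
rm -f /tmp/demo.current /tmp/demo.new
```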



[LISTING SIX]

big bmin31lv     Batchmin large  (<2000 atoms)
big bmin31mv     Batchmin medium (<1000 atoms)
big spartan      ab initio electronic structure calculation
big amber        biological structure simulation
big smapps       Monte Carlo peptide simulation
big ampac        semi-empirical electronic structure calculation
big dspace       NMR distance -> structure
moe bmin31ls     Batchmin large  (<2000 atoms) use with caution!
moe bmin31ms     Batchmin medium (<1000 atoms) default
moe bmin31ss     Batchmin small  (<250 atoms)
moe amber        biological structure simulation
moe ampac        semi-empirical electronic structure calculation
moe spartan      ab initio electronic structure calculation
moe smapps       Monte Carlo peptide simulation
larry bmin31ss   Batchmin small (<250 atoms)
larry ampac      semi-empirical electronic structure calculation



[LISTING SEVEN]

larry SGI IRIS 4D/25TG (B-1-3-09)
curly SGI IRIS 4D/120GTX (B-8-3-22; CADD room)
moe SGI IRIS 4D/240s (B-8-3-22; CADD room)
big CONVEX 220 (B-20-B)



[LISTING EIGHT]

-x execute (submit) a job
-m monitor remote host process status (ps command)
-p probe running job
-s status of running jobs on all platforms
-r running job list
-k kill job (with extreme prejudice)
-t terminate job gracefully
-g restart job
-l log file on remote machine (tail -30)
-b bump waiting job to next
-d delete a waiting job
-f finished job list on host machine
-e error log on host machine (tail -30)
-c change hosts
-w waiting job list on host machine
-a restartable job list on host machine



[LISTING NINE]

# Loads values for script, dataset, host, and datadir last used by run.
# This file is recreated at the end of run.

defscript=bmin31lv
defdata=bmintest3
defhost=big
defdir=pla2



[LISTING TEN]

big bmin31lv 3
big bmin31mv 1
big spartan 1
big amber 2
big smapps 1
big ampac 3
moe bmin31ms 2
moe bmin31ss 4
moe amber 2
moe smapps 1
moe ampac 4
moe spartan 1
larry bmin31ss 1
larry ampac 1
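Listing Ten drives shepard_queue's limit lookup; when a host/script pair has no entry, the queue manager falls back to a limit of one job. A sketch of that lookup over hypothetical data:

```shell
# The .limits lookup from shepard_queue, with its default of one.
# demo.limits is a hypothetical stand-in for $HOME/.limits.
cat > /tmp/demo.limits <<'EOF'
big bmin31lv 3
moe ampac 4
EOF
iam=big scr=dspace      # dspace has no entry for host big
rlim=`awk '"'$iam'" == $1 && "'$scr'" == $2 {print $3}' /tmp/demo.limits`
if (test -z "$rlim") then
    rlim=1              # no entry in .limits: one job permitted
fi
echo "$iam allows $rlim concurrent $scr job(s)"
rm -f /tmp/demo.limits
```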



[LISTING ELEVEN]

# Definitions for runnable scripts, dataset network movement, and directories
# for various files. The runnable scripts must agree with the contents of
# .runscripts. Network behavior for the originating machine is set here.

# options for SHEPARD_NETWORK: server, nfs, remote
# SHEPARD_DIR is location of application-specific shepard scripts

# this file is sourced and executes directly in environment of script

case `hostname` in
    larry|curly) # SGI IRIS workstation definitions
        SHEPARD_SCRIPTS='bmin31ss ampac'
        SHEPARD_NETWORK=server
        SHEPARD_DIR=$HOME/shepard_dir;;
    moe) # SGI IRIS-240 compute server definitions
        SHEPARD_SCRIPTS='bmin31ss bmin31ms amber smapps ampac spartan'
        SHEPARD_NETWORK=server
        SHEPARD_DIR=$HOME/shepard_dir;;
    big) # CONVEX specific definitions
        SHEPARD_SCRIPTS='bmin31lv bmin31mv amber smapps ampac spartan dspace'
        SHEPARD_NETWORK=server
        SHEPARD_DIR=$HOME/shepard_dir;;
esac



[LISTING TWELVE]

# bmin31lv script for large vector (CONVEX) version of Batchmin v 3.1
exe=bmin31lv                         # the executable (in PATH)
inp=.com                             # extension for standard input
log=.log                             # extension for standard output
err=.err                             # extension for error output (channel 2)
getdata_script=bmin31lv.getdata      # get the input datafiles
putdata_script=bmin31lv.putdata      # put the output back
terminate_script=bmin31lv.terminate  # application-specific shutdown
probe_script=bmin31lv.probe          # conducts an application-specific probe



[LISTING THIRTEEN]

# bmin31lv.getdata: get data script. Sourced in shepard_exec
# shell args: 2: dataset, 3:origin, 4: datadir
# datadir is dependent on network choice:
#       server: host-$HOME/datadir (host-$HOME is prepended)
#          nfs: nfs path of data from host to origin machines
#       remote: origin-$HOME/datadir (origin-$HOME is prepended)

case $SHEPARD_NETWORK in
    server) cd $4;;     # data stays put on host machine
    remote) rsh $3 cat $4/$2.dat >$2.dat   # move data. This is a kluge
            rsh $3 cat $4/$2$inp >$2$inp;; # remote cat puts output on host
       nfs) cp $4/$2.dat .     # copy via remotely mounted nfs dir
            cp $4/$2$inp . ;;
esac
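Of Listing Thirteen's three branches, only the nfs case is easy to try in isolation: the dataset directory is already visible locally, so plain cp moves the files. A sketch with hypothetical paths standing in for $4 (datadir) and $2 (dataset):

```shell
# The nfs branch of a getdata script: the data directory is mounted
# locally, so cp suffices.  All paths here are hypothetical.
datadir=/tmp/demo_nfs
execdir=/tmp/demo_exec
dataset=pla2
inp=.com
mkdir -p $datadir $execdir
echo 'coordinates' > $datadir/$dataset.dat
echo 'input deck' > $datadir/$dataset$inp

cp $datadir/$dataset.dat $execdir     # copy via the mounted directory
cp $datadir/$dataset$inp $execdir

gotdata=no
if (test -f $execdir/$dataset.dat && test -f $execdir/$dataset$inp) then
    gotdata=yes
fi
rm -rf $datadir $execdir
```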



[LISTING FOURTEEN]


# bmin31lv.putdata: put data script. Sourced in shepard_exec
# shell args: 2: dataset, 3:origin, 4: datadir
# datadir is dependent on network choice:
#       server: host-$HOME/datadir (host-$HOME is prepended)
#          nfs: nfs path of data from host to origin machines
#       remote: origin-$HOME/datadir (origin-$HOME is prepended)


# all application-specific output files are moved, if necessary
case $SHEPARD_NETWORK in
    server) cd $HOME;;   # movement of files is not necessary
    remote) rsh $3 cat ">"$4/$2.out <$2.out     # another network kluge
            rsh $3 cat ">"$4/$2$log <$2$log     # remote cat with ">" writes
            rsh $3 cat ">"$4/$2$err <$2$err;;   # to remote. < read local.
       nfs) cp $2.out $4/$2.out
            cp $2$log $4/$2$log
            cp $2$err $4/$2$err;;
esac



[LISTING FIFTEEN]

# sourced from shepard
# batchmin terminates when it finds dataset.stp in execution dir

case $SHEPARD_NETWORK in
        server) echo 'help me, please help me' > $4/$2.stp;;
    remote|nfs) echo 'help me, please help me' > $2.stp;;
esac



[LISTING SIXTEEN]

# Sourced from shepard. $dset=jobname, $ddir=directory, $log=logfile ext
# Script prints the last 30 lines of the log file from the selected job

case $SHEPARD_NETWORK in
        server) tail -30 $ddir/$dset$log;;
    remote|nfs) tail -30 $dset$log;;
esac



[LISTING SEVENTEEN]

/* lockon.c - creates a lock file named argv[1] with no permission bits
   B. E. Bauer, 1990
 */
#include <stdio.h>

main (argc, argv)
int argc;
char *argv[];
{
    int fp, locked;

    locked = 0;
    if (argc != 2) {        /* expect exactly one argument: the lock file */
        printf ("\nusage: lockon lockfile\n");
        exit (1);
    }
    if ((fp = creat(argv[1], 0)) < 0) ++locked;
    else close(fp);
    return (locked);
}
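lockon works because creat() on an existing zero-mode file fails for an unprivileged process, giving an atomic test-and-set. If the C program is not built, a portable shell substitute (a different technique, relying on mkdir's atomic failure when the directory already exists) behaves the same way; the lock name below is hypothetical:

```shell
# mkdir-based mutual exclusion: mkdir fails atomically if the
# directory already exists, much as creat() fails on the zero-mode
# lock file that lockon creates.
lock=/tmp/demo.sheplock.d     # hypothetical lock name
rm -rf $lock
acquired=no second=no
if mkdir $lock 2>/dev/null; then
    acquired=yes              # first caller gets the lock
fi
if mkdir $lock 2>/dev/null; then
    second=yes                # would mean mutual exclusion failed
fi
rmdir $lock
```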










