PARALLEL EXTENSIONS TO C
Programming parallel networks
This article contains the following executables: ELLIS.LST
Graham K. Ellis
Ken is a graduate student working on parallel processing and controls applications in the Smart Materials and Structures Laboratory at Virginia Tech. He can be contacted at the Mechanical Engineering Department, Randolph Hall, Virginia Tech, Blacksburg, VA 24061.
All extensions to C compilers for transputers are based on the Occam concurrency model. However, there are two different approaches for extending C for concurrency; one that adds keywords to the C language definition, and another that implements parallelism using library routines. Arguments can be made for either case. Bjarne Stroustrup, for example, believes that concurrency should be implemented using libraries because the library approach provides the flexibility required to implement concurrency at a lower level, closer to the machine (Stevens, 1989). On the other hand, Narain Gehani, et al. (of Concurrent C fame), hold that extending C (and C++) by adding keywords is the best solution because extensions can more adequately provide and implement the necessary concurrency functions across a wide range of platforms (Gehani and Roome).
My subjective (and completely unsupported) opinion is a politically safe "somewhere in between." I like the use of C language extensions for readability, but I think libraries allow more flexibility for different parallel architectures (important portability issues notwithstanding). But because most of the programming I do is for small real-time control and signal processing applications, what's important to me isn't necessarily important to programmers working on large programs.
On a day-to-day basis I use Logical Systems' C (LSC) compiler, an ANSI C compiler that implements the transputer parallelism using library calls. The transputer-specific library functions can be grouped into seven categories: channel communication, channel status testing, transputer concurrency (CSP model), transputer concurrency (fork/join model), transputer semaphore support, transputer timing and scheduling, and miscellaneous routines (see Table 1). Because of space constraints, I'll discuss only the channel, channel status, and CSP concurrency functions.
Table 1: Partial list of LSC functions
Channel Communication Channel Status Concurrency (CSP Model) ----------------------------------------------------------------------- ChanAlloc ProcAlt ProcAlloc ChanFree ProcAltList ProcFree Chanin ProcSkipAlt ProcInit ChaninChanFail ProcSkipAltList ProcPar ChaninChar ProcTimerList ProcParam Chaninint ProcTimerAltList ProcParList ChaninTimeFail ProcPriPar ChanOut ProcRun ChanOutChanFail ProcRunHigh ChanOutChar ProcRunLow ChanOutint ProcStop ChanOutTimeFail ProcToHigh ChanReset ProcToLow ProcWait Concurrency (Fork/Join) Timing and Scheduling Miscellaneous ------------------------------------------------------------- PFork GetHiPriQ BitCnt PForkHigh GetLoPriQ BitRevNBits PForkinit ProcAfter BitRevWord PForkLow ProcCall Move2D PHalt ProcGetPriority Move2DNonZero PJoin ProcReschedule Move2DZero PRun SetHiPriQ restorefp PSetup SetLoPriQ _boot_chan_in PStop SetTime _boot_chan_out Time _node_number
Library Functions
Channel communication is implemented by first allocating a channel, then reading and/or writing from process to process. ChanAlloc() is used to allocate an internal channel by returning a pointer to the allocated channel. For channels assigned to actual hardware links, a pointer assignment to the hardware address is used instead of ChanAlloc(). These and other transputer specifics are defined in the conc.h include file. When an allocated channel is no longer needed, memory can be freed using ChanFree(). Code showing basic channel usage is listed in Figure 1.
There are three pairs of functions for channel communication. ChanIn()/ChanOut() transfer an arbitrary number of bytes of data; ChanInInt()/ChanOutInt() transfer a single integer; and ChanInChar()/ ChanOutChar() transfer a character. There are also channel routines for fault-tolerant applications that abort after a user-specified time-out period or by communication on another channel.
The alternative functions operate on lists of channels. ProcAlt() and ProcAltList() scan a list for inputs. The program blocks (waits) until one channel is ready for an input and then selects a channel. ProcSkipAlt() and ProcSkipAltList() scan a list of channels; however, process blocking is not performed. If no channels are ready for input, the function returns immediately. ProcTimerAlt() and ProcTimerAltList()block the current process until one of the channels is ready or until a user-specified time elapses. An example of the alternative process is shown in Figure 2.
There are several methods for running an arbitrary number of processes in parallel on a single chip. In any case, each process must be allocated -- a stack frame must be built for each process and memory created for a process structure. ProcAlloc() performs these tasks by returning a pointer to an initialized process structure. When the allocated processes are no longer needed, ProcFree() returns the allocated memory to the heap.
Once all processes have been allocated, one of the process creation functions -- ProcRun(), ProcRunHigh(), ProcRunLow(), ProcPar(), ProcParList(), and ProcPriPar() -- can be used. These functions take process pointers and spawn time-sliced processes according to the function used. Code for two processes is shown in Figure 3.
Concurrent Sorting
To examine the routines described above, I have developed a simple ASCII sort of a string of characters. The sort is implemented on a pipeline of parallel processes and is a modified version of the sorting example in the Inmos Transputer Development System manual. Single- and multiprocessor implementations of the sorting algorithm are shown in Listings Five and Six, respectively. The two programs utilize the following files (only the formats of the main programs differ). Listing One (page 124) lists proto.h, Listing Two (page 124) mytypes.h, Listing Three (page 124) multisort.nif, and Listing Four (page 124) buffers.c.
The single-processor implementation of the sorting routine (see sort.c, Listing Five, page 124) performs the actions in Example 1. In the schematic representation of this sort shown in Figure 4, the circles represent parallel processes and the lines represent the communication channels. Note that the data flows through the pipeline in sorted order. As a result, the further down the pipeline a process (or processor) is, the lower its utilization. This is acceptable for a single-processor implementation, but it is a waste of resources in a multiprocessor implementation because most of the processors would be idle most of the time.
Example 1: Actions performed by the single-processor implementation of the sorting routine.
Root Transputer: SEQUENTIAL - Read in line of text and store in character buffer. - Allocate channels and processes. PARALLEL - Send out a character at a time to the first pipe process. - In 256 parallel pipe processes: read in a character then read a new character, forward the character with the lowest ASCII value. - Read in one character at a time from last pipe buffer and write it to the screen. - Deallocate channels and processes.
Also note that the channel inputs and outputs match. For each byte (or word or array) output, data of the exact same length must be input. If the length of the data transfers does not match, the data transfer will deadlock. Since deadlock is probably the most common programming bug that occurs when using transputers, extra care must be taken to ensure that channel I/O matches. In addition, channel communication must be decoupled sufficiently to ensure that processes are not waiting for each other to communicate first.
The two-processor version (see multisort.c, Listing Six, page 126) performs the exact same sort, but the organization is slightly different. In this case, the root processor merely reads the character string, then forwards it out a hardware link to the next processor.
At the same time, the sorted data coming from the network is read in and displayed a character at a time. The block diagram of this is shown in Figure 5. In this case, a rectangle represents a transputer and the circles represent concurrent processes (see Example 2). A method for mapping the program from two processors onto a larger transputer network is illustrated in Figure 6. The only differences in the two-processor version are that the pipe processes are distributed among more processors, and some of the channels are mapped onto hardware links instead of using internally allocated channels.
Example 2: Actions performed by the two-processor implementation of the sorting routine.
Root Transputer: SEQUENTIAL - Read in line of text and store in character buffer. - Allocate channels and processes. PARALLEL - Send out a character at a time to the first buffer. - Read in a character at a time and write it to the screen. - Deallocate channels and processes. Transputer Node 1: SEQUENTIAL - Allocate channels and processes. PARALLEL - Read in data from link a character at a time, sort, forward to next pipe process. - In 254 parallel pipe processes: read in a character then read a new character, forward the character with the lowest ASCII value. - Read in data from pipe process sort, forward data to root transputer.
I've written the sorting program in a somewhat unusual manner: The code for each processor of the multitransputer version of the sort is identical. Normally, I would write a unique root transputer program and then another program to be used for all of the additional network nodes. However, because I have only two processors, I've set up the program to determine on which node the code is running and then perform the appropriate actions. The _node_number variable is supplied to each node by the network loader.
To modify the code for more processors, forward the data to another processor with more sorting processes instead of having Processor 1 above return the data to the host. The last processor in the network can then be connected back to one of the other root transputer links.
The sorting algorithm uses the source(), output(), and pipe() functions, which are the same for every version of the program. Only the format of the main program varies from the single- to multitransputer case.
Functions run in parallel require a Process pointer parameter as the first function in the parameter list. This pointer is used by the LSC process allocation function. Aside from its appearance as the first variable in the parameter list, its use is completely transparent to the programmer.
- source() takes as its parameters a pointer to an output channel and a pointer to the character buffer where the text to be sorted is stored. A single character at a time is output down the specified channel. When an ASCII NULL is encountered, the function exits.
- pipe() takes pointers to both an input and an output channel. Two characters are read in, and the one with the lowest ASCII value is forwarded to the next pipe process. Then a new character is read, a comparison is performed, and the character with the lowest ASCII value is forwarded. When a NULL is received, the stored character is forwarded, the NULL is forwarded, and the function exits.
- output() reads a character at a time from an input channel and displays it using the putchar()function. The function exits when a NULL is received.
Conclusion
This article provides a basic idea of how you can take advantage of the features available on transputers for building parallel processing networks. The ease with which parallel networks of transputers can be programmed and expanded, and the performance achieved by using this method, enable programmers to use transputers to solve problems that were previously relegated to large and expensive computing resources.
Acknowledgements
I would like to thank the NASA Graduate Student Researcher's Program (NGT-50392) for their support, and Kirk Bailey at Logical Systems for answering my questions about the compiler.
References
Hoare, C.A.R., "Communicating Sequential Processes," Communications of the ACM, Vol. 21, No. 8 (August), pp. 666-677, 1978.
May, David and Taylor, Richard, "Occam -- an Overview," Microprocessors and Microsystems, Vol. 8, no. 2 (March), pp. 73 - 79, 1984.
Stevens, Al, "From C to C++: Interviews with Dennis Ritchie & Bjarne Stroustrup," Dr. Dobb's C Sourcebook, Vol. 14, no. 159 (Winter), pp. 9 - 17, 1989.
Gehani, N. H. and Roome, W. D., Concurrent C, Computer Technology Research Laboratory Technical Reports, Concurrent C Project, AT&T Bell Laboratories, Murray Hill, NJ 07974.
Henderson, Brian, "Par.c System -- a C Compiler for Transputers," Mailshot April 1990, SERC/DTI Transputer Initiative, Rutherford Appleton Laboratory, Chilton, DIDCOT, Oxon, OX11 OQX, pp. 45 - 51.
Logical Systems Transputer Toolset (Version 89.1), C Library Description, Logical Systems, P.O. Box 1702, Corvalis, OR 97339.
Transputer Development System, INMOS, Ltd., Prentice-Hall, New York, 1988.
Transputer Architecture
A transputer is a microprocessor developed by Inmos and designed specifically for Multiple Instruction Multiple Data (MIMD) parallel-processing applications. All transputers have on-chip memory and serial links for connecting one transputer to another. Transputers come in several flavors: 16-bit, 16-bit with disk control logic, 32-bit, and 32-bit with an on-chip floating-point unit. The current generation of transputers has 4 Kbytes of on-chip RAM and four bidirectional serial links that can operate at 20, 10, or 5 Mbits/sec.
In addition, transputers have a microcoded scheduler with two priority levels, which allows concurrent (time-sliced) processes to share a processor's time. There are also two sets of timers, one for each priority level. A block diagram of the Inmos T800 floating-point transputer is shown in Figure 7. A T800 transputer offers approximately the same performance as the Intel 80386/80387 chip combination at around half the cost because you buy one chip instead of two.
Transputer Concurrency Model
The transputer concurrency model is based on C.A.R. Hoare's "Communicating Sequential Processes" (CSP) (Hoare, 1978). The "native" transputer language Occam (named after the 14th-century philosopher William of Occam) is based on the CSP idea of concurrency and communication (May and Taylor, 1984). A process performs a sequence of actions and then terminates. Concurrent processes can be thought of as multiple black boxes, each of which has its own internal state. Communication between processes is performed through point-to-point channels (CHAN). Each channel provides a one-way connection between two processes. It is not possible to share a channel between more than two processes, nor is there shared memory between processes. Information is updated through channel communication, thus alleviating the need for memory locks such as those used on shared memory multiprocessors. Communication between processes is synchronous and unbuffered. One process is blocked (descheduled) until the other is ready to communicate. When both the sending and receiving processes are ready, they proceed.
The communication channels can be virtual channels maintained internally by a single transputer, or they can be mapped onto one of the hardware links for true parallel processing applications. The transparent nature of virtual and physical channel/link communication enables the programmer to develop many programs on a single transputer and, with minor modification, map the program onto a network of processors.
Another channel construct is the alternative. The alternative (ALT) construct provides a method of choosing the first available channel from a set of channels and is implemented on the input side of channel communication. ALT is often used to control the flow of data through a network.
To generate concurrent (or virtual) concurrent) processes, a parallel construct (PAR) is used. As previously mentioned, parallel processes can only communicate through channels. The transputer supports high priority (used for communication processes) and low priority (for computation processes). While this may seem backward, it is important in a multiprocessor environment to keep as many processors as busy as possible by providing them data to process. Low-priority processes get time-sliced approximately every 1 millisecond. High-priority processes do not get time-sliced. They will, however, deschedule while waiting for channel communication.
Transputer Environments
Transputer networks generally consist of a root transputer that interfaces directly with a host such as a PC, Macintosh, or Sun. All other transputers in the network are connected in some fashion to the root transputer using the built-in serial links. Even though these network nodes may reside in the host, they rely on the host only for power and ground. Only the root transputer communicates with the host computer.
Typically, the root transputer is connected to an adapter chip that converts a transputer serial link to a bus interface. The host runs some sort of file server program to service the transputers. In the case of the C compiler, the server provides access to the keyboard, monitor, and disk storage of the host. The LSC compiler has the hooks in its library so that the standard C library functions communicate correctly with the host server. However, at this point in time, only the root transputer can use stdio routines to the host without extra programming effort. --G.K.E.
_PARALLEL EXTENSIONS TO C_ by Graham K. Ellis
[LISTING ONE]<a name="01af_0011">
/**** File: proto.h -- Contains the function protocols ****/
/* include mytypes.h first....I don't do any checking */
source(Process *, Channel *, UBYTE *);
output(Process *, Channel *);
pipe(Process *, Channel *, Channel *);
/* end proto.h */
<a name="01af_0012"><a name="01af_0012">
<a name="01af_0013">
[LISTING TWO]<a name="01af_0013">
/**** mytypes.h ****/
#define BUFF_SIZE 256 /* the max. char buffer size */
#define NUM_PIPES BUFF_SIZE /* the number of sorting processes */
#define PIPE_STACK 128 /* stack space for the processes*/
#define INPUT_STACK 4096
#define OUTPUT_STACK 4096
#define TRUE 1 /* some simple defines */
#define FALSE 0
typedef int BOOL;
typedef unsigned char UBYTE;
<a name="01af_0014"><a name="01af_0014">
<a name="01af_0015">
[LISTING THREE]<a name="01af_0015">
; multisort.nif -- This is the file the transputer uses to load the network.
; The parameters are:
;----------------------------------------------------------------------
; node number | processor reset | link 0 | link1 | link 2| link 3 |
; | from processor |---------------------------------|
; | number Rxx | The number here is the processor|
; | | number we connect to. |
;----------------------------------------------------------------------
;
1, multisort, R0, 0, , , 2;
2, multisort, R1, , ,1, ;
<a name="01af_0016"><a name="01af_0016">
<a name="01af_0017">
[LISTING FOUR]<a name="01af_0017">
/**** File: buffers.c -- Contains the funtions:
** source(Process *, Channel *, UBYTE *)
** output(Process *, Channel *)
** pipe(Process *, Channel *, Channel *)
****/
#include <stdio.h>
#include <conc.h>
#include "mytypes.h" /* include in this order... */
#include "proto.h"
/**** source(Process *p, Channel *out, UBYTE *ch)
** Output a single character at a time from the buffer pointed to by ch
** on the channel out. Terminate when '\0' or whole buffer is sent.
** Process *p is used by the ProcAlloc() routine. You must have this
** in the parameter list for a parallel process.
****/
source(Process *p, Channel *out, UBYTE *ch)
{
BOOL inline = TRUE;
int i;
for(i=0; i<BUFF_SIZE; i++) {
if(inline)
switch(ch[i]) {
case '\0': /* this is the EOL for us */
ChanOutChar(out, ch[i]);
inline = FALSE;
break;
default:
ChanOutChar(out, ch[i]);
}
}
}
/**** output(Process *p, Channel *in)
** Read a single character at at time on channel in and send it to stdout.
** Terminate when '\0' is received. Process *p is used by the ProcAlloc()
** routine. You must have this in the parameter list for a parallel process.
****/
output(Process *p, Channel *in)
{
UBYTE ch;
BOOL running = TRUE;
while(running) {
ch = ChanInChar(in);
switch(ch) {
case '\0':
putchar('\n');
running = FALSE;
break;
case '\n':
break;
default:
putchar(ch);
}
}
}
/**** pipe(Process *p, Channel *in, Channel *out)
** Reads a single character on channel in and falls into a loop where next
** character is read and compared in an ASCII sense to the current charater.
** Character with the lowest ASCII value is forwarded on channel out. Loop
** terminates on a '\0' character. The "stored" character is sent before the
** '\0' is propagated. Process *p is used by the ProcAlloc() routine.
** You must have this in the parameter list for a parallel process.
****/
pipe(Process *p, Channel *in, Channel *out)
{
UBYTE highest, next;
BOOL running;
BOOL ns_node;
if(_node_number == 1) /* If we are the root node only do the */
ns_node = FALSE; /* main loop once. If we are a non- */
else /* root node, loop forever. */
ns_node = TRUE;
do {
running = TRUE;
highest = ChanInChar(in);
while(running) {
next = ChanInChar(in);
switch(next) {
case '\0' :
ChanOutChar(out, highest);
running = FALSE;
break;
default :
if(next > highest) {
ChanOutChar(out, highest);
highest = next;
}
else
{
ChanOutChar(out, next);
}
}
}
ChanOutChar(out, '\0');
} while(ns_node);
}
<a name="01af_0018"><a name="01af_0018">
<a name="01af_0019">
[LISTING FIVE]<a name="01af_0019">
/*************************************************************************
** sort.c -- Compiles under Logical Systems Transputer C compiler Versions
** 88.4, 89.1. A single transputer, multi-process ASCII sorting routine. This
** shows how to allocate and deallocate parallel (multi-tasking) processes on
** a single transputer. This algorithm is modelled after the sorting example
** in the INMOS occam Transputer Development System (TDS). The algorithm
** relies on having as many parallel processes as the maximum character
** buffer size. Data is sent to a FIFO-like array of input-ouput buffers. If
** the next letter read in is less than (in an ASCII sense) the current letter
** in the buffer, it is forwarded to the next buffer, otherwise the current
** data is forwarded. The program will shut down when a ~ is the first
** character in any input line.
** Must link in file buffers.c (after compilation of course..)
** Programed by: G.K.Ellis, Smart Materials and Structures Laboratory,
** Mechanical Engineering Department, Virginia Tech, Blacksburg, VA 24061
***************************************************************************/
#include <stdio.h>
#include <conc.h> /* the LSC defines are here */
#include "mytypes.h" /* include in this order... */
#include "proto.h"
/**** MAIN PROGRAM ****/
void main()
{
UBYTE ch[BUFF_SIZE]; /* string buffer */
BOOL active = TRUE;
int i;
Channel *thru[NUM_PIPES + 1]; /* internal channels */
Process *p[NUM_PIPES + 3]; /* process pointers */
while(active) {
fgets(ch, BUFF_SIZE-1, stdin); /* get the input line */
/* stop char is ~ as first letter */
if(ch[0] == '~') active = 0;
/* allocate the channels, NULL is returned if no memory */
for(i = 0; i< NUM_PIPES + 1; i++) {
if((thru[i] = ChanAlloc()) == NULL)
printf("No memory for Channel thru[%d]\n", i);
}
/* allocate the processes, NULL is returned if no memory */
p[0] = ProcAlloc(source, INPUT_STACK, 2, thru[0], ch);
for(i = 1; i < NUM_PIPES+1; i++)
p[i] = ProcAlloc(pipe, PIPE_STACK, 2, thru[i-1], thru[i]);
p[NUM_PIPES+1] = ProcAlloc(output, OUTPUT_STACK, 1, thru[NUM_PIPES]);
p[NUM_PIPES+2] = NULL;
/* check to see if all processes were allocated successfully */
for(i = 0; i < NUM_PIPES + 2; i++)
if(!p[i])
printf("No memory for Process p[%d]\n", i);
ProcParList(p); /* Run a NULL terminated list of pointers */
/* to processes. These are all low-pri */
/* Since we allocate these each time through the loop, we needto
** deallocate them here otherwise, we will run out of memory */
for(i = 0; i < NUM_PIPES + 1; i++)
ChanFree(thru[i]);
for(i = 0; i < NUM_PIPES + 2; i++)
ProcFree(p[i]);
}
printf("done!\n"); /* all done... */
}
<a name="01af_001a"><a name="01af_001a">
<a name="01af_001b">
[LISTING SIX]<a name="01af_001b">
/*************************************************************************
** multisort.c -- A two transputer version of a simple parallel ASCII sorting
** routine. This program will work on either node. Developed for Logical
** Systems Transputer C compiler (LSC) versions 88.4 or 89.1. Must link with
** buffers.c. Programmed by: G.K.Ellis, Smart Materials and Structures
** Laboratory, Mechanical Engineering Department, Virginia Tech,
** Blacksburg, VA 24061
**************************************************************************/
#include <stdio.h>
#include <conc.h>
#include "mytypes.h"
#include "proto.h"
/**** MAIN PROGRAM ****/
void main()
{
UBYTE ch[BUFF_SIZE]; /* string buffer */
BOOL active = TRUE;
int i;
Channel *in, *out; /* link channels */
Process *feeder, *sink; /* hi-priority processes */
Channel *thru[NUM_PIPES - 1]; /* internal soft channels */
Process *p[NUM_PIPES - 2]; /* low-priority processes */
if(_node_number == 1) { /* We are the root node */
in = LINK3IN; /* these are our links */
out = LINK3OUT;
while(active) {
fgets(ch, BUFF_SIZE-1, stdin); /* get the input line */
/* stop char is ~ as first letter */
if(ch[0] == '~') active = 0;
/* set up the processes */
feeder = ProcAlloc(source, INPUT_STACK, 2, out, ch);
sink = ProcAlloc(output, OUTPUT_STACK, 1, in);
/* Check that ProcAlloc() doesn't return a NULL, but
** since we KNOW from the example sort.c this is ok */
/* run the feeder and sink processes in parallel */
ProcPar(feeder, sink, NULL);
/* free the previously allocated processes */
/* note we do this each time through the loop */
/* hey, it's only an example */
ProcFree(feeder);
ProcFree(sink);
}
printf("done!\n");
exit(0); /* finis! */
} else { /* if we are the non-root */
in = LINK2IN; /* node, run this as main() */
out = LINK2OUT; /* these are our links */
for(i = 0; i < NUM_PIPES - 1; i ++) { /* allocate soft channels */
thru[i] = ChanAlloc();
}
/* allocate soft channel pipe processes */
for(i = 0; i < NUM_PIPES - 2; i++)
p[i] = ProcAlloc(pipe, PIPE_STACK, 2, thru[i], thru[i+1]);
/* allocate the processes with the links */
feeder = ProcAlloc(pipe, PIPE_STACK, 2, in, thru[0]);
sink = ProcAlloc(pipe, PIPE_STACK, 2, thru[NUM_PIPES-2], out);
/* If we want to check the pointer returned from ProcAlloc(),
** that is extra programming on a non-root node. Therefore,
** don't try to printf() from a non-root node. There is a
** non-server node _ns_printf() that does simple ASCII
** transfers out a channel (link), but this generally requires
** extra effort to communicate back to the server */
/* Let's allocate these asynchronously */
ProcRunHigh(sink); /* run the hardware links at high pri */
ProcRunHigh(feeder);
for(i = 0; i < NUM_PIPES - 2; i++)
ProcRunLow(p[i]); /* and soft channels at low pri */
_ns_exit(); /* we must exit like this on non-server*/
/* node or we will exit from the */
/* main() program */
/* ProcRunHigh() and ProcRunLow() spawn new processes asyncronously
** from main(). The _ns_exit() does a STOPP on the main process
** leaving the two spawned processes. The exit() routine trys
** to send a message back to the server running on the PC. If it
** isn't there, i.e. you're not the root, you've got problems.
** Note also, if we are not the root node, we never terminate
** and restart the pipe processes... we just run the processes ! */
}
}
[Example 1: Actions performed by the single-processor
implementation of the sorting routine]
Root Transputer:
SEQUENTIAL
- Read in line of text and store in character buffer.
- Allocate channels and processes.
PARALLEL
- Send out a character at a time to the first pipe process.
- In 256 parallel pipe processes: read in a character then
read a new character, forward the character with the
lowest ASCII value.
- Read in one character at a time from last pipe buffer
and write it to the screen.
- Deallocate channels and processes.
[Example 2: A rectangle represents a transputer and the
circles represent concurrent processes]
Root Transputer:
SEQUENTIAL
- Read in line of text and store in character buffer.
- Allocate channels and processes.
PARALLEL
- Send out a character at a time to the first buffer.
- Read in a character at a time and write it to the screen.
- Deallocate channels and processes.
Transputer Node 1:
SEQUENTIAL
- Allocate channels and processes.
PARALLEL
- Read in data from link a character at a time, sort, forward to
next pipe process.
- In 254 parallel pipe processes: read in a character then
read a new character, forward the character with the
lowest ASCII value.
- Read in data from pipe process sort, forward data to
root transputer.
[Figure 1: Channel Communication ]
buffer(Process *p, Channel *in, *out) /* the Procees *p is used by */
{ /* ProcAlloc() */
int data;
while(1) {
data = ChanInInt(in);
ChanOutInt(out, data);
}
}
[Figure 2: Multiplexer using the channel alternative]
mux(Process *p, Channel *in1, *in2, *out)
{
int data, index;
while(1) {
index = ProcAlt(in1, in2, NULL); /* return offset into chan list */
switch(index) {
case 0: /* channel in1 */
data = ChanInInt(in1);
ChanOutInt(out, data);
break;
case 1: /* channel in2 */
data = ChanInInt(in2);
ChanOutInt(out, data);
break;
}
}
}
[Figure 3: Two parallel processes code fragment]
Process *p1, *p2; /* Process and Channel are defined in */
Channel *in, *thru, *out; /* conc.h */
in = ChanAlloc(); /* internal channel allocation */
out = ChanAlloc(); /* we should check the return values here */
thru = ChanAlloc(); /* NULL returned if allocation unsuccessful */
p1 = ProcAlloc(buffer, STACK, 2, in, thru); /* allocate space for the */
p2 = ProcAlloc(buffer, STACK, 2, thru, out); /* processes p1, p2 */
ProcPar(p1, p2, NULL); /* run them in parallel */