Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

C/C++

Parallel Extensions to C


AUG90: PARALLEL EXTENSIONS TO C

PARALLEL EXTENSIONS TO C

Programming parallel networks

This article contains the following executables: ELLIS.LST

Graham K. Ellis

Ken is a graduate student working on parallel processing and controls applications in the Smart Materials and Structures Laboratory at Virginia Tech. He can be contacted at the Mechanical Engineering Department, Randolph Hall, Virginia Tech, Blacksburg, VA 24061.


All extensions to C compilers for transputers are based on the Occam concurrency model. However, there are two different approaches for extending C for concurrency; one that adds keywords to the C language definition, and another that implements parallelism using library routines. Arguments can be made for either case. Bjarne Stroustrup, for example, believes that concurrency should be implemented using libraries because the library approach provides the flexibility required to implement concurrency at a lower level, closer to the machine (Stevens, 1989). On the other hand, Narain Gehani, et al. (of Concurrent C fame), hold that extending C (and C++) by adding keywords is the best solution because extensions can more adequately provide and implement the necessary concurrency functions across a wide range of platforms (Gehani and Roome).

My subjective (and completely unsupported) opinion is a politically safe "somewhere in between." I like the use of C language extensions for readability, but I think libraries allow more flexibility for different parallel architectures (important portability issues notwithstanding). But because most of the programming I do is for small real-time control and signal processing applications, what's important to me isn't necessarily important to programmers working on large programs.

On a day-to-day basis I use Logical Systems' C (LSC) compiler, an ANSI C compiler that implements the transputer parallelism using library calls. The transputer-specific library functions can be grouped into seven categories: channel communication, channel status testing, transputer concurrency (CSP model), transputer concurrency (fork/join model), transputer semaphore support, transputer timing and scheduling, and miscellaneous routines (see Table 1). Because of space constraints, I'll discuss only the channel, channel status, and CSP concurrency functions.

Table 1: Partial list of LSC functions

  Channel Communication    Channel Status         Concurrency (CSP Model)
  -----------------------------------------------------------------------
  ChanAlloc                ProcAlt                ProcAlloc
  ChanFree                 ProcAltList            ProcFree
  Chanin                   ProcSkipAlt            ProcInit
  ChaninChanFail           ProcSkipAltList        ProcPar
  ChaninChar               ProcTimerList          ProcParam
  Chaninint                ProcTimerAltList       ProcParList
  ChaninTimeFail                                  ProcPriPar
  ChanOut                                         ProcRun
  ChanOutChanFail                                 ProcRunHigh
  ChanOutChar                                     ProcRunLow
  ChanOutint                                      ProcStop
  ChanOutTimeFail                                 ProcToHigh
  ChanReset                                       ProcToLow
                                                  ProcWait

  Concurrency (Fork/Join)  Timing and Scheduling  Miscellaneous
  -------------------------------------------------------------
  PFork                    GetHiPriQ              BitCnt
  PForkHigh                GetLoPriQ              BitRevNBits
  PForkinit                ProcAfter              BitRevWord
  PForkLow                 ProcCall               Move2D
  PHalt                    ProcGetPriority        Move2DNonZero
  PJoin                    ProcReschedule         Move2DZero
  PRun                     SetHiPriQ              restorefp
  PSetup                   SetLoPriQ              _boot_chan_in
  PStop                    SetTime                _boot_chan_out
                           Time                   _node_number

Library Functions

Channel communication is implemented by first allocating a channel, then reading and/or writing from process to process. ChanAlloc() is used to allocate an internal channel by returning a pointer to the allocated channel. For channels assigned to actual hardware links, a pointer assignment to the hardware address is used instead of ChanAlloc(). These and other transputer specifics are defined in the conc.h include file. When an allocated channel is no longer needed, memory can be freed using ChanFree(). Code showing basic channel usage is listed in Figure 1.

There are three pairs of functions for channel communication. ChanIn()/ChanOut() transfer an arbitrary number of bytes of data; ChanInInt()/ChanOutInt() transfer a single integer; and ChanInChar()/ ChanOutChar() transfer a character. There are also channel routines for fault-tolerant applications that abort after a user-specified time-out period or by communication on another channel.

The alternative functions operate on lists of channels. ProcAlt() and ProcAltList() scan a list for inputs. The program blocks (waits) until one channel is ready for an input and then selects a channel. ProcSkipAlt() and ProcSkipAltList() scan a list of channels; however, process blocking is not performed. If no channels are ready for input, the function returns immediately. ProcTimerAlt() and ProcTimerAltList()block the current process until one of the channels is ready or until a user-specified time elapses. An example of the alternative process is shown in Figure 2.

There are several methods for running an arbitrary number of processes in parallel on a single chip. In any case, each process must be allocated -- a stack frame must be built for each process and memory created for a process structure. ProcAlloc() performs these tasks by returning a pointer to an initialized process structure. When the allocated processes are no longer needed, ProcFree() returns the allocated memory to the heap.

Once all processes have been allocated, one of the process creation functions -- ProcRun(), ProcRunHigh(), ProcRunLow(), ProcPar(), ProcParList(), and ProcPriPar() -- can be used. These functions take process pointers and spawn time-sliced processes according to the function used. Code for two processes is shown in Figure 3.

Concurrent Sorting

To examine the routines described above, I have developed a simple ASCII sort of a string of characters. The sort is implemented on a pipeline of parallel processes and is a modified version of the sorting example in the Inmos Transputer Development System manual. Single- and multiprocessor implementations of the sorting algorithm are shown in Listings Five and Six, respectively. The two programs utilize the following files (only the formats of the main programs differ). Listing One (page 124) lists proto.h, Listing Two (page 124) mytypes.h, Listing Three (page 124) multisort.nif, and Listing Four (page 124) buffers.c.

The single-processor implementation of the sorting routine (see sort.c, Listing Five, page 124) performs the actions in Example 1. In the schematic representation of this sort shown in Figure 4, the circles represent parallel processes and the lines represent the communication channels. Note that the data flows through the pipeline in sorted order. As a result, the further down the pipeline a process (or processor) is, the lower its utilization. This is acceptable for a single-processor implementation, but it is a waste of resources in a multiprocessor implementation because most of the processors would be idle most of the time.

Example 1: Actions performed by the single-processor implementation of the sorting routine.

  Root Transputer:

  SEQUENTIAL
    - Read in line of text and store in character buffer.     
    - Allocate channels and processes.
    PARALLEL
      - Send out a character at a time to the first pipe process.
      - In 256 parallel pipe processes: read in a character then
         read a new character, forward the character with the
         lowest ASCII value.
      - Read in one character at a time from last pipe buffer
         and write it to the screen.     
    - Deallocate channels and processes.

Also note that the channel inputs and outputs match. For each byte (or word or array) output, data of the exact same length must be input. If the length of the data transfers does not match, the data transfer will deadlock. Since deadlock is probably the most common programming bug that occurs when using transputers, extra care must be taken to ensure that channel I/O matches. In addition, channel communication must be decoupled sufficiently to ensure that processes are not waiting for each other to communicate first.

More Details.

The two-processor version (see multisort.c, Listing Six, page 126) performs the exact same sort, but the organization is slightly different. In this case, the root processor merely reads the character string, then forwards it out a hardware link to the next processor.

At the same time, the sorted data coming from the network is read in and displayed a character at a time. The block diagram of this is shown in Figure 5. In this case, a rectangle represents a transputer and the circles represent concurrent processes (see Example 2). A method for mapping the program from two processors onto a larger transputer network is illustrated in Figure 6. The only differences in the two-processor version are that the pipe processes are distributed among more processors, and some of the channels are mapped onto hardware links instead of using internally allocated channels.

Example 2: Actions performed by the two-processor implementation of the sorting routine.

  Root Transputer:

  SEQUENTIAL
     - Read in line of text and store in character buffer.      
     - Allocate channels and processes.
     PARALLEL
        - Send out a character at a time to the first buffer.
        - Read in a character at a time and write it to the screen.
     - Deallocate channels and processes.

  Transputer Node 1:

  SEQUENTIAL
     - Allocate channels and processes.
     PARALLEL
        - Read in data from link a character at a time, sort, forward to
            next pipe process.
        - In 254 parallel pipe processes: read in a character then
            read a new character, forward the character with the
            lowest ASCII value.
        - Read in data from pipe process sort, forward data to
            root transputer.

I've written the sorting program in a somewhat unusual manner: The code for each processor of the multitransputer version of the sort is identical. Normally, I would write a unique root transputer program and then another program to be used for all of the additional network nodes. However, because I have only two processors, I've set up the program to determine on which node the code is running and then perform the appropriate actions. The _node_number variable is supplied to each node by the network loader.

To modify the code for more processors, forward the data to another processor with more sorting processes instead of having Processor 1 above return the data to the host. The last processor in the network can then be connected back to one of the other root transputer links.

The sorting algorithm uses the source(), output(), and pipe() functions, which are the same for every version of the program. Only the format of the main program varies from the single- to multitransputer case.

Functions run in parallel require a Process pointer parameter as the first function in the parameter list. This pointer is used by the LSC process allocation function. Aside from its appearance as the first variable in the parameter list, its use is completely transparent to the programmer.

  • source() takes as its parameters a pointer to an output channel and a pointer to the character buffer where the text to be sorted is stored. A single character at a time is output down the specified channel. When an ASCII NULL is encountered, the function exits.
  • pipe() takes pointers to both an input and an output channel. Two characters are read in, and the one with the lowest ASCII value is forwarded to the next pipe process. Then a new character is read, a comparison is performed, and the character with the lowest ASCII value is forwarded. When a NULL is received, the stored character is forwarded, the NULL is forwarded, and the function exits.
  • output() reads a character at a time from an input channel and displays it using the putchar()function. The function exits when a NULL is received.
In the single-processor case, each time a line of characters is sorted, the channels and processes are allocated, used, and then deallocated. For multiprocessors, only the root node allocates and deallocates each time through the loop. The network pipe buffer processes are allocated once, and terminate only as the root processor exits to the host. Actually, the network nodes deadlock due to lack of data from the root node, but this is okay as long as the root node shuts down gracefully. This implementation is used to keep the network programs as simple as possible and to circumvent the lack of support for multiple processes on the root to being able to communicate using the stdio stream.

Conclusion

This article provides a basic idea of how you can take advantage of the features available on transputers for building parallel processing networks. The ease with which parallel networks of transputers can be programmed and expanded, and the performance achieved by using this method, enable programmers to use transputers to solve problems that were previously relegated to large and expensive computing resources.

Acknowledgements

I would like to thank the NASA Graduate Student Researcher's Program (NGT-50392) for their support, and Kirk Bailey at Logical Systems for answering my questions about the compiler.

References

Hoare, C.A.R., "Communicating Sequential Processes," Communications of the ACM, Vol. 21, No. 8 (August), pp. 666-677, 1978.

May, David and Taylor, Richard, "Occam -- an Overview," Microprocessors and Microsystems, Vol. 8, no. 2 (March), pp. 73 - 79, 1984.

Stevens, Al, "From C to C++: Interviews with Dennis Ritchie & Bjarne Stroustrup," Dr. Dobb's C Sourcebook, Vol. 14, no. 159 (Winter), pp. 9 - 17, 1989.

Gehani, N. H. and Roome, W. D., Concurrent C, Computer Technology Research Laboratory Technical Reports, Concurrent C Project, AT&T Bell Laboratories, Murray Hill, NJ 07974.

Henderson, Brian, "Par.c System -- a C Compiler for Transputers," Mailshot April 1990, SERC/DTI Transputer Initiative, Rutherford Appleton Laboratory, Chilton, DIDCOT, Oxon, OX11 OQX, pp. 45 - 51.

Logical Systems Transputer Toolset (Version 89.1), C Library Description, Logical Systems, P.O. Box 1702, Corvalis, OR 97339.

Transputer Development System, INMOS, Ltd., Prentice-Hall, New York, 1988.

Transputer Architecture

A transputer is a microprocessor developed by Inmos and designed specifically for Multiple Instruction Multiple Data (MIMD) parallel-processing applications. All transputers have on-chip memory and serial links for connecting one transputer to another. Transputers come in several flavors: 16-bit, 16-bit with disk control logic, 32-bit, and 32-bit with an on-chip floating-point unit. The current generation of transputers has 4 Kbytes of on-chip RAM and four bidirectional serial links that can operate at 20, 10, or 5 Mbits/sec.

In addition, transputers have a microcoded scheduler with two priority levels, which allows concurrent (time-sliced) processes to share a processor's time. There are also two sets of timers, one for each priority level. A block diagram of the Inmos T800 floating-point transputer is shown in Figure 7. A T800 transputer offers approximately the same performance as the Intel 80386/80387 chip combination at around half the cost because you buy one chip instead of two.

Transputer Concurrency Model

The transputer concurrency model is based on C.A.R. Hoare's "Communicating Sequential Processes" (CSP) (Hoare, 1978). The "native" transputer language Occam (named after the 14th-century philosopher William of Occam) is based on the CSP idea of concurrency and communication (May and Taylor, 1984). A process performs a sequence of actions and then terminates. Concurrent processes can be thought of as multiple black boxes, each of which has its own internal state. Communication between processes is performed through point-to-point channels (CHAN). Each channel provides a one-way connection between two processes. It is not possible to share a channel between more than two processes, nor is there shared memory between processes. Information is updated through channel communication, thus alleviating the need for memory locks such as those used on shared memory multiprocessors. Communication between processes is synchronous and unbuffered. One process is blocked (descheduled) until the other is ready to communicate. When both the sending and receiving processes are ready, they proceed.

The communication channels can be virtual channels maintained internally by a single transputer, or they can be mapped onto one of the hardware links for true parallel processing applications. The transparent nature of virtual and physical channel/link communication enables the programmer to develop many programs on a single transputer and, with minor modification, map the program onto a network of processors.

Another channel construct is the alternative. The alternative (ALT) construct provides a method of choosing the first available channel from a set of channels and is implemented on the input side of channel communication. ALT is often used to control the flow of data through a network.

To generate concurrent (or virtual) concurrent) processes, a parallel construct (PAR) is used. As previously mentioned, parallel processes can only communicate through channels. The transputer supports high priority (used for communication processes) and low priority (for computation processes). While this may seem backward, it is important in a multiprocessor environment to keep as many processors as busy as possible by providing them data to process. Low-priority processes get time-sliced approximately every 1 millisecond. High-priority processes do not get time-sliced. They will, however, deschedule while waiting for channel communication.

Transputer Environments

Transputer networks generally consist of a root transputer that interfaces directly with a host such as a PC, Macintosh, or Sun. All other transputers in the network are connected in some fashion to the root transputer using the built-in serial links. Even though these network nodes may reside in the host, they rely on the host only for power and ground. Only the root transputer communicates with the host computer.

Typically, the root transputer is connected to an adapter chip that converts a transputer serial link to a bus interface. The host runs some sort of file server program to service the transputers. In the case of the C compiler, the server provides access to the keyboard, monitor, and disk storage of the host. The LSC compiler has the hooks in its library so that the standard C library functions communicate correctly with the host server. However, at this point in time, only the root transputer can use stdio routines to the host without extra programming effort. --G.K.E.

_PARALLEL EXTENSIONS TO C_ by Graham K. Ellis

[LISTING ONE]

<a name="01af_0011">

/**** File: proto.h -- Contains the function protocols ****/
/* include mytypes.h first....I don't do any checking */

source(Process *, Channel *, UBYTE *);
output(Process *, Channel *);
pipe(Process *, Channel *, Channel *);

/* end proto.h */



<a name="01af_0012"><a name="01af_0012">
<a name="01af_0013">
[LISTING TWO]
<a name="01af_0013">

/**** mytypes.h ****/

#define BUFF_SIZE   256      /* the max. char buffer size */
#define NUM_PIPES   BUFF_SIZE  /* the number of sorting processes */
#define PIPE_STACK   128        /* stack space for the processes*/
#define INPUT_STACK    4096
#define OUTPUT_STACK    4096

#define TRUE      1      /* some simple defines */
#define FALSE      0

typedef int            BOOL;
typedef unsigned char   UBYTE;




<a name="01af_0014"><a name="01af_0014">
<a name="01af_0015">
[LISTING THREE]
<a name="01af_0015">

;  multisort.nif -- This is the file the transputer uses to load the network.
;  The parameters are:
;----------------------------------------------------------------------
; node number   |   processor reset | link 0 | link1 | link 2| link 3 |
;               |   from processor  |---------------------------------|
;               |   number Rxx      | The number here is the processor|
;               |                   | number we connect to.           |
;----------------------------------------------------------------------
;
1, multisort, R0, 0, , , 2;
2, multisort, R1,  , ,1,  ;



<a name="01af_0016"><a name="01af_0016">
<a name="01af_0017">
[LISTING FOUR]
<a name="01af_0017">

/**** File: buffers.c -- Contains the funtions:
**      source(Process *, Channel *, UBYTE *)
**   output(Process *, Channel *)
**   pipe(Process *, Channel *, Channel *)
****/

#include <stdio.h>
#include <conc.h>
#include "mytypes.h"      /* include in this order...   */
#include "proto.h"

/**** source(Process *p, Channel *out, UBYTE *ch)
**    Output a single character at a time from the buffer pointed to by ch
**    on the channel out.  Terminate when '\0' or whole buffer is sent.
**    Process *p is used by the ProcAlloc() routine. You must have this
**    in the parameter list for a parallel process.
****/
source(Process *p, Channel *out, UBYTE *ch)
{
   BOOL inline = TRUE;
   int i;
   for(i=0; i<BUFF_SIZE; i++) {
      if(inline)
         switch(ch[i]) {
            case '\0':   /* this is the EOL for us */
               ChanOutChar(out, ch[i]);
               inline = FALSE;
               break;
            default:
               ChanOutChar(out, ch[i]);
         }
   }
}

/**** output(Process *p, Channel *in)
**   Read a single character at at time on channel in and send it to stdout.
**   Terminate when '\0' is received. Process *p is used by the ProcAlloc()
**   routine. You must have this in the parameter list for a parallel process.
****/
output(Process *p, Channel *in)
{
   UBYTE ch;
   BOOL running = TRUE;

   while(running) {
      ch = ChanInChar(in);
      switch(ch) {
         case '\0':
            putchar('\n');
            running = FALSE;
            break;
         case '\n':
            break;
         default:
            putchar(ch);
      }
   }
}

/**** pipe(Process *p, Channel *in, Channel *out)
**  Reads a single character on channel in and falls into a loop where next
**  character is read and compared in an ASCII sense to the current charater.
**  Character with the lowest ASCII value is forwarded on channel out. Loop
**  terminates on a '\0' character.  The "stored" character is sent before the
**  '\0' is propagated. Process *p is used by the ProcAlloc() routine.
**  You must have this in the parameter list for a parallel process.
****/
pipe(Process *p, Channel *in, Channel *out)
{
   UBYTE highest, next;
   BOOL running;
   BOOL ns_node;

   if(_node_number == 1)    /* If we are the root node only do the */
      ns_node = FALSE; /* main loop once.  If we are a non-   */
   else          /* root node, loop forever.           */
      ns_node = TRUE;
   do {
      running = TRUE;
      highest = ChanInChar(in);
      while(running) {
         next = ChanInChar(in);
         switch(next) {
            case '\0' :
               ChanOutChar(out, highest);
               running = FALSE;
               break;
            default :
               if(next > highest) {
                  ChanOutChar(out, highest);
                  highest = next;
               }
               else
               {
                  ChanOutChar(out, next);
               }
         }
      }
      ChanOutChar(out, '\0');
   } while(ns_node);
}




<a name="01af_0018"><a name="01af_0018">
<a name="01af_0019">
[LISTING FIVE]
<a name="01af_0019">

/*************************************************************************
**  sort.c -- Compiles under Logical Systems Transputer C compiler Versions
**  88.4, 89.1. A single transputer, multi-process ASCII sorting routine. This
**  shows how to allocate and deallocate parallel (multi-tasking) processes on
**  a single transputer. This algorithm is modelled after the sorting example
**  in the INMOS occam Transputer Development System (TDS). The algorithm
**  relies on having as many parallel processes as the maximum character
**  buffer size. Data is sent to a FIFO-like array of input-ouput buffers. If
**  the next letter read in is less than (in an ASCII sense) the current letter
**  in the buffer, it is forwarded to the next buffer, otherwise the current
**  data is forwarded. The program will shut down when a ~ is the first
**  character in any input line.
**  Must link in file buffers.c (after compilation of course..)
**  Programed by: G.K.Ellis, Smart Materials and Structures Laboratory,
**  Mechanical Engineering Department, Virginia Tech, Blacksburg, VA 24061
***************************************************************************/

#include <stdio.h>
#include <conc.h>      /* the LSC defines are here */
#include "mytypes.h"           /* include in this order... */
#include "proto.h"

/**** MAIN PROGRAM  ****/
void main()
{
   UBYTE ch[BUFF_SIZE];    /* string buffer */
   BOOL active = TRUE;
   int i;
   Channel *thru[NUM_PIPES + 1];   /* internal channels */
   Process *p[NUM_PIPES + 3];      /* process pointers */
   while(active) {
      fgets(ch, BUFF_SIZE-1, stdin);    /* get the input line */
      /* stop char is ~ as first letter */
      if(ch[0] == '~') active = 0;
      /* allocate the channels, NULL is returned if no memory   */
      for(i = 0; i< NUM_PIPES + 1; i++) {
         if((thru[i] = ChanAlloc()) == NULL)
            printf("No memory for Channel thru[%d]\n", i);
      }

   /* allocate the processes, NULL is returned if no memory */
   p[0] = ProcAlloc(source, INPUT_STACK, 2, thru[0], ch);
   for(i = 1; i < NUM_PIPES+1; i++)
       p[i] = ProcAlloc(pipe, PIPE_STACK, 2, thru[i-1], thru[i]);
   p[NUM_PIPES+1] = ProcAlloc(output, OUTPUT_STACK, 1,  thru[NUM_PIPES]);
       p[NUM_PIPES+2] = NULL;
   /* check to see if all processes were allocated successfully   */
      for(i = 0; i < NUM_PIPES + 2; i++)
         if(!p[i])
            printf("No memory for Process p[%d]\n", i);
      ProcParList(p);       /* Run a NULL terminated list of pointers */
             /* to processes. These are all low-pri   */
        /* Since we allocate these each time through the loop, we needto
        ** deallocate them here otherwise, we will run out of memory */
      for(i = 0; i < NUM_PIPES + 1; i++)
         ChanFree(thru[i]);
      for(i = 0; i < NUM_PIPES + 2; i++)
          ProcFree(p[i]);
    }
    printf("done!\n");      /* all done...   */
}




<a name="01af_001a"><a name="01af_001a">
<a name="01af_001b">
[LISTING SIX]
<a name="01af_001b">

/*************************************************************************
** multisort.c -- A two transputer version of a simple parallel ASCII sorting
** routine. This program will work on either node. Developed for Logical
** Systems Transputer C compiler (LSC) versions 88.4 or 89.1. Must link with
** buffers.c. Programmed by: G.K.Ellis, Smart Materials and Structures
** Laboratory, Mechanical Engineering Department, Virginia Tech,
** Blacksburg, VA 24061
**************************************************************************/

#include <stdio.h>
#include <conc.h>
#include "mytypes.h"
#include "proto.h"

/**** MAIN PROGRAM ****/
void main()
{
   UBYTE ch[BUFF_SIZE];   /* string buffer */
   BOOL active = TRUE;
   int i;
   Channel *in, *out;      /* link channels */
   Process *feeder, *sink;      /* hi-priority processes */
   Channel *thru[NUM_PIPES - 1];    /* internal soft channels */
   Process *p[NUM_PIPES - 2];    /* low-priority processes */
   if(_node_number == 1) {      /* We are the root node */
      in  = LINK3IN;      /* these are our links */
      out = LINK3OUT;
      while(active) {
         fgets(ch, BUFF_SIZE-1, stdin);   /* get the input line */
         /* stop char is ~ as first letter */
         if(ch[0] == '~') active = 0;
         /* set up the processes   */
         feeder = ProcAlloc(source, INPUT_STACK, 2, out, ch);
         sink   = ProcAlloc(output, OUTPUT_STACK, 1,  in);
         /* Check that ProcAlloc() doesn't return a NULL, but
                        ** since we KNOW from the example sort.c this is ok */

         /* run the feeder and sink processes in parallel */
          ProcPar(feeder, sink, NULL);

          /* free the previously allocated processes */
          /* note we do this each time through the loop */
          /* hey, it's only an example */
          ProcFree(feeder);
          ProcFree(sink);
       }
       printf("done!\n");
       exit(0);         /* finis! */
    } else {            /* if we are the non-root */
       in = LINK2IN;         /* node, run this as main() */
       out = LINK2OUT;         /* these are our links */
       for(i = 0; i < NUM_PIPES - 1; i ++) { /* allocate soft channels */
          thru[i] = ChanAlloc();
       }
           /* allocate soft channel pipe processes */
          for(i = 0; i < NUM_PIPES - 2; i++)
           p[i] = ProcAlloc(pipe, PIPE_STACK, 2, thru[i], thru[i+1]);
           /* allocate the processes with the links */
           feeder = ProcAlloc(pipe, PIPE_STACK, 2, in, thru[0]);
               sink   = ProcAlloc(pipe, PIPE_STACK, 2, thru[NUM_PIPES-2], out);
       /* If we want to check the pointer returned from ProcAlloc(),
       ** that is extra programming on a non-root node. Therefore,
       ** don't try to printf() from a non-root node.  There is a
       ** non-server node _ns_printf() that does simple ASCII
       ** transfers out a channel (link), but this generally requires
       ** extra effort to communicate back to the server */

       /* Let's allocate these asynchronously */
       ProcRunHigh(sink);   /* run the hardware links at high pri */
       ProcRunHigh(feeder);
       for(i = 0; i < NUM_PIPES - 2; i++)
          ProcRunLow(p[i]); /* and soft channels at low pri */
       _ns_exit();   /* we must exit like this on non-server*/
            /* node or we will exit from the */
            /* main() program */
          /* ProcRunHigh() and ProcRunLow() spawn new processes asyncronously
          ** from main().  The _ns_exit() does a STOPP on the main process
          ** leaving the two spawned processes. The exit() routine trys
          ** to send a message back to the server running on the PC.  If it
          ** isn't there, i.e. you're not the root, you've got problems.
          ** Note also, if we are not the root node, we never terminate
          ** and restart the pipe processes... we just run the processes ! */
    }
}


[Example 1: Actions performed by the single-processor
implementation of the sorting routine]


     Root Transputer:

     SEQUENTIAL
        - Read in line of text and store in character buffer.
        - Allocate channels and processes.
        PARALLEL
            - Send out a character at a time to the first pipe process.
            - In 256 parallel pipe processes: read in a character then
                read a new character, forward the character with the
                lowest ASCII value.
            - Read in one character at a time from last pipe buffer
              and write it to the screen.
        - Deallocate channels and processes.



[Example 2: A rectangle represents a transputer and the
circles represent concurrent processes]


    Root Transputer:

    SEQUENTIAL
        - Read in line of text and store in character buffer.
        - Allocate channels and processes.
        PARALLEL
            - Send out a character at a time to the first buffer.
            - Read in a character at a time and write it to the screen.
        - Deallocate channels and processes.

    Transputer Node 1:

    SEQUENTIAL
        - Allocate channels and processes.
        PARALLEL
            - Read in data from link a character at a time, sort, forward to
                next pipe process.
            - In 254 parallel pipe processes: read in a character then
                read a new character, forward the character with the
                lowest ASCII value.
            - Read in data from pipe process sort, forward data to
               root transputer.




[Figure 1: Channel Communication  ]

    buffer(Process *p, Channel *in, *out)  /* the Procees *p is used by */
    {                                      /* ProcAlloc()               */
      int data;

      while(1) {
        data = ChanInInt(in);
        ChanOutInt(out, data);
      }
    }




[Figure 2: Multiplexer using the channel alternative]

    mux(Process *p, Channel *in1, *in2, *out)
    {
      int data, index;

      while(1) {
        index = ProcAlt(in1, in2, NULL); /* return offset into chan list */
        switch(index) {
          case 0:                        /* channel in1 */
            data = ChanInInt(in1);
            ChanOutInt(out, data);
            break;
          case 1:                        /* channel in2 */
            data = ChanInInt(in2);
            ChanOutInt(out, data);
            break;
        }
      }
    }


[Figure 3: Two parallel processes code fragment]


    Process *p1, *p2;            /* Process and Channel are defined in     */
    Channel *in, *thru, *out;    /* conc.h                                 */

    in = ChanAlloc();   /* internal channel allocation                     */
    out = ChanAlloc();  /* we should check the return values here          */
    thru = ChanAlloc(); /* NULL returned if allocation unsuccessful        */

    p1 = ProcAlloc(buffer, STACK, 2, in, thru);  /* allocate space for the */
    p2 = ProcAlloc(buffer, STACK, 2, thru, out); /* processes p1, p2       */
    ProcPar(p1, p2, NULL);                       /* run them in parallel   */










Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.