Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

Embedded Systems

A Process Group Manager for OS-9


September 1996: A Process Manager For OS/9

A Process Group Manager for OS-9

Adapting the I/O system to provide non-I/O services

Peter C. Dibble

Peter is the author of OS-9 Insights and can be contacted at [email protected].


A few years ago, I wrote a file manager called "PCF" for Microware's OS-9 real-time operating system. PCF was designed to let most OS-9 programs use DOS-format disks as if they were OS-9-format disks. While I was proud of my work, and used it as the basis for discussing file-manager design and construction in the first edition of my book OS-9 Insights (Microware, 1994), it was clearly too complicated. In retrospect, publishing PCF was a mistake.

Mapping back and forth between the MS-DOS FAT file system and the OS-9 RBF file system takes a lot of code, and the OS-9 community looked at my file manager and said, "I'm glad Peter did this for us. I'd rather not try to write one of these." If they limited their concern to PCF they'd be right on target. It's a mess. Unfortunately, the impression has spread to cover all file managers. I didn't intend that, and I'd like to try again. Consequently, in this article, I present a simpler file manager-the program group manager (PGM).

Modular OS Design

Although OS-9 was designed when operating systems were monolithic (and proud of it), OS-9 has always been a collection of independent entities that are linked at run time. This is similar to the microkernel architecture used in many newer operating systems. Where the OS components in a microkernel are full-strength processes, however, OS-9's components are more like dynamically linked functions contained in named blocks of memory called "modules."

The I/O System

Modular structure is heavily used in the I/O system. This allows you to change a system between I/O-less and I/O-rich simply by varying the loaded modules. The OS-9 I/O system is built from four types of modules:

  • One system module: the I/O manager.
  • Any number of device descriptors: data modules containing constants that describe a device.
  • Any number of file managers: executable modules that manage the logical structure of a device.
  • Any number of device drivers: executable modules that interface similar hardware devices.

IOMan

I/O support is rooted in IOMan-the I/O manager. This module installs the standard I/O system calls and assembles the resources for I/O devices and paths.

When it initializes a device, the I/O manager first asks the kernel for the address of a device-descriptor module with the same name as the device. This device descriptor contains all the constant values associated with the device. For instance, the device descriptor called "term" on my system starts with these values:

file manager name: scf device driver name: sccd2401 port address: $fff45000 It contains many more constants that are used by the file manager and device driver. The I/O manager finds the file-manager module and device-driver module for the device and builds a system data structure called a "device table entry." It then calls the device driver to initialize the hardware.

When the I/O manager receives an open request, it finds the device table entry for the requested device and creates a path descriptor for the path. For this, and nearly every other operation, it passes control to the file manager for the device. The I/O manager does not actually know anything about I/O. It processes operations exactly the same way, whether they are for a null device or an ATM networking stack.

File Manager

The file manager is responsible for the logical structure of I/O; it contains all the I/O code that does not depend on the particular hardware used for the device. If a file manager is designed correctly, it can be used with a wide range of I/O hardware. This leads to such interesting twists as an Ethernet chip mated to the file manager normally used with serial and parallel ports.

The I/O manager was designed to manage path descriptors and device table entries, and the interfaces for file managers and device drivers are well suited to their roles. However, if you forget the names attached to things, the I/O manager and its children are actually a resource-management system-a flexible system for associating resources with processes and manipulating those resources. I/O is just a well-supported special case.

For example, if the kernel's memory-management system is not a good fit for some special requirement, a memory device can add a new memory allocator. Once a process opens a path to the memory device, it gains access to the services of a new memory manager located in what the I/O manager thinks is a file manager. If there is some special hardware (an MMU, for instance) involved with the memory, its control can be abstracted into a device driver. Provided you get by the I/O-related names of the module types and system calls, the resource-manager view of the I/O system opens a world of possibilities.

By now, experienced OS-9 programmers are thinking this might be easier with a "P2" module. Maybe, but have you thought about releasing resources when a process terminates? What if the process is killed? Here's the issue: You can add new system calls to OS-9. Most standard system modules add a call or two, and many programmers have developed custom system calls for their personal systems. Unfortunately, it is hard to ensure that resources allocated to a process will be returned to the system when the process terminates. This is a sore point for many OS-9 programmers.

Problem Statement

Termination is a problem for some complex OS-9 systems. OS-9 allows applications to intercept almost everything that could kill them. This allows the application to clean up before termination, which is important to processes that are part of a larger system. They may take some significant final action such as releasing locks, saving exit information in a file, or passing control to another process. However, OS-9 reserves one thing that applications cannot intercept-the kill signal. The prospect of a process that could only be stopped by rebooting the system is not acceptable.

When a group of processes cooperate, they may share resources: locks, data modules, files, service processes. If they share resources, it is immensely convenient for the processes to know the names of the other processes in the group.

Requiring processes to ask to join a group is reasonable and implementable. It's also reasonable to ask processes to give notice when they leave the group, and this could be implemented with the standard OS-9 system-except that OS-9 can kill the process without notice. This problem drives system designers to either add operating-system support for multiprocess systems, or find a single-process approach.

Proposed Solution

I decided to build a tool to help maintain process groups arranged in a client/server architecture, a simple and popular way of organizing a group of processes. Lines of communication in a client/server organization all go between a client and the server. The server in this structure wants to know which processes are its clients, and it would be nice if the clients can be told if the server goes away. Under OS-9, this is done in at least three ways:

  • Make the server a system service, probably part of the I/O system.
  • Build a system service that notifies processes when other processes terminate.
  • Use a watchdog trick. For instance, you have each process update a time stamp in shared memory every few seconds. If a process's time stamp falls significantly behind, it's probably dead.

The first alternative appeals to programmers who like the sense of power that comes from writing system-state software. It is the only good alternative when the server needs to control preemption and interrupts, or when there are strenuous performance requirements.

Watchdog tricks are attractive, too, if you want to stay out of system state, but they are an unhappy choice. System designers who choose to update time stamps frequently invest a lot of CPU time; those who choose to update the time stamps infrequently won't notice a process's death for a long time.

OS-9 has seen a couple of good stabs at an exit-notification service. People have installed an accounting-system call and used it to signal processes when other processes die. That is a good solution, but it costs performance on every fork and exit, even those that have nothing to do with the process group. It's also inflexible. OS-9 likes to have all its kernel extensions installed at boot time. And there is a conflict problem if several modules all install an accounting extension.

My solution is to use a file manager. The important fact here is that the process-termination code in the kernel calls the I$Close I/O service for each path the terminating process had open. I require that each client open a path to my file manager. That gives my system a mechanism that reliably gives the file manager control when any client exits.

I call the file manager PGM (short for process group manager). It implements the following services:

    Create. A function called from the server. A process that creates a PGM device defines itself as a server on that device.

    Open. A function called by each client. The PGM manager queues a NEW_PATH message for the server, and only returns to the client when the server responds.

    Close. When a client closes a path that was opened (as opposed to a duplicate of another path), the server is notified with a signal and a message. When the server closes its path, PGM notifies any clients that requested notification on server death.

    SetStat. Only SS_SSig is supported. For clients, this tells the file manager to send the caller a signal if the server exits. It also tells the file manager which signal value to use. If the server uses the SS_SSig setstat, it changes the value of the signal PGM will send it when a client opens or closes a PGM path.

    GetStat. Only SS_ Ready is supported. It returns success if there is at least one message pending for the path. It returns E_EOF (end of file) if it is called by a client, or if it is called by the server and there are no messages ready for reading.

    Read and ReadLn. These calls are only permitted from the server and return messages with a defined structure:

int message_type process_id client_process_id

    If the message is NEW_PATH, it means that a new client just opened a path to PGM. The server should decide whether it will permit the connection, and write a message to PGM telling it to accept or reject the request.

    If the message is CLOSED_PATH, it specifies the process ID of a client that just closed its PGM path, either explicitly or by dying.

    Write and WriteLn. These are used by the server process to tell PGM how to handle a client connection request. There are two possible message types: ACCEPT_ CONNECTION and REJECT_CONNECTION. PGM allows the corresponding Open request to return with a suitable return code.

Sample Use

PGM is designed to sit above a client/server process group and publish obituaries. Each process in the group must join by opening a path to the group's PGM device. The entire file manager contains about 900 lines of C, plus assorted header files. It is available at http://www.microware.com and from DDJ. It is freely usable with the usual restrictions about credit and warranty.

The server must join the group first. It uses the I$Create system call, see Listing One, to identify itself as the server and initialize the group.

The typical event-driven server will install a signal-intercept routine using code like that shown in Listing Two, and ask PGM to send signals for new-client and client-death events.

Listing Three shows one way to maintain the server's list of clients. This code reads a message from the PGM path, then executes different routines depending on whether it reads a NEW_PATH message or a CLOSED_PATH message. CLOSED_ PATH messages just cause manage_clients to update its client list. NEW_PATH messages require a response, so manage_ clients writes a message back to the PGM path telling it how to respond to the client's I$Open request.

The client code in Listing Four is simpler than the server. It must open a path to PGM and deal with the response from the server. It may ask to be told if the server exits.

Data Structures

The I/O manager maintains three data structures that might interest the file manager:

  • A device table entry that contains pointers to the modules and structures associated with the device.
  • Device static storage that contains data associated with the device. Device static storage contains three sections, one each for the I/O manager, the file manager, and the device driver.
  • A path descriptor. The I/O manager creates one of these for each open path. There can be many path descriptors associated with each device. The path descriptor contains sections for the I/O manager and the file manager.

PGM doesn't need to refer to the device table entry, and we only use the device static storage as a fast way to find the server's path descriptor.

The server's path descriptor is the base for three singly linked lists. The client list contains the path descriptors from the clients. The message queue contains path descriptors that are trying to open, but haven't yet been seen by the server. The reply-pending list contains path descriptors that the server has seen but has not accepted or rejected. Figure 1 shows how PGM structures are connected.

Implementing PGM

PGM executes Listing Five when a client opens a path. First, it checks a field in the device static storage to see if the device has a server. If it does not, open() returns the E_PNNF error immediately. If a server is registered, open() initializes the path descriptor. Stamp_path marks the path descriptor so that no matter how the path is duplicated and inherited by other processes, PGM will know where it was originally opened and will not bother the server with false alarms as duplicates of the path are closed. Send_msg signals the server that there is a NEW_PATH message waiting to be read.

Once the server has been notified, open() needs to wait for the server to respond. If the server was the only thing that could awaken the client, this would be simple, but we have no control over this. PGM must appropriately handle other signals that might be sent to the client process during this interval. Some file managers are sloppy here; they won't stop waiting until they get the response they want. These are easy to write, but they are not responsive in a signal-driven system.

The _os_sleep(zero) function returns when the process gets a signal. Depending on the signal and the state of the path, open() chooses one of three options: Ignore the signal and sleep some more, note that the operation is complete and return, or note that the operation is incomplete and return anyway.

If a path descriptor is in the client list, the server has accepted the connection, and open() can return. If the process is condemned, it has been killed, and should leave as soon as possible. Since the path is not yet in the client list, we can just clean up and leave. If it is in the client list, the server will need to be told about this death; that's why the loop doesn't check the status of the process until it has checked the status of the connection. If open() returns SUCCESS, the path will be marked open and the process-termination code will call close(). If open() returns an error, the path will not be marked as open, and it will not be closed.

"I/O deadly" signals are supposed to break out of I/O operations, so any I/O code that takes significant time should look for them. Open() follows this rule. It looks for deadly signals every time it becomes active, and returns to the caller-reporting an error-if it gets one.

Listing Six is executed when a client closes a PGM path. By the time control gets here, PGM has already determined that this is the client that opened the path.

First the client-close routine checks to see whether the path has become a dead end. That would signify that the server is gone. Client paths are still open, but they are useless. These paths can be closed, but any other operation on them will fail.

The close routine removes the client from the server's client list and sends the server a message telling it that the client has closed its path. This Send_msg() function call is the reason PGM exists. This is where the server is notified about client deaths. After it has sent the message, the open() function sleeps stubbornly. Normally, a file manager should be willing to return quickly on some signals. PGM's open() code demonstrates good citizenship in this respect. Close() is another story: PGM will not return until it has delivered its message to the server. The function keeps looping through the unbounded sleep until PGM indicates that the server has received the CLOSED_PATH message by changing the preset return code.

The server-exit code is similar, except that it includes a substantial amount of code that disassembles the linked lists rooted at the server's path descriptor, notifies all the clients that care about its death, and generally destroys everything that the file manager has built for the server.

The Read() entry point in PGM does some error checking, then calls the function in Listing Seven.

Read_action first pulls a path descriptor out of the message queue (where open() or close() put it). It builds a message in the caller's buffer using the action and the process ID from that path descriptor. If the message is a NEW_ PATH message, read_action moves the path descriptor to the reply-pending list, then returns to the server. The process waiting in its open() call will not be released until the server writes a message telling PGM how to dispose of its request.

When the server reads a CLOSED_ PATH message, PGM can wake the process waiting in close() and return. There's no need to handshake on a close request.

OS-9 turns preemption off for file managers. Some of them turn it back on, but PGM is so simple that it seems pointless. With preemption off, the kernel will only timeslice PGM when it executes F$Sleep. Since PGM doesn't have interrupt-driven device drivers, it doesn't even have to worry about sharing data structures with them. This simplifies life greatly. PGM needs to leave data structures consistent when it returns or sleeps, but other than that, it can ignore the usual concerns about atomic operations.

For demonstration, I've started to make PGM preemptable by masking interrupts during a few critical routines. Up to about 200 clock cycles, masking interrupts imposes no important penalty on a 68000 (slow instructions already impose that much delay on interrupt service). Unless there are a lot of clients, the interrupt masking in PGM probably stays within the 200-cycle boundary.

On the 68xxx (which I used for this), the interface to file managers is specified in assembly language. I chose to write the bulk of the code in C, so I wrote a little assembly language to bind to the interface. It replaces the usual _cstart code; (available electronically).

This glue reflects my simplistic approach to C-language file managers. I provide only parameters and automatic storage. There is no support for stack checking, and static memory of all kinds is passed in through parameters. It's possible to give the file managers a type of static storage by doing the right thing with the a6 register, but I find that opening that door leads to trouble.

On the other hand, some of my earlier file managers include glue for the system calls they use. The system-call bindings in the Ultra C library (the Ultra C/C++ compilers are included in the OS-9 development toolset) no longer use static or global memory such as errno, so PGM uses the standard bindings.

Putting it Together

PGM needs a device driver and a device descriptor. Since PGM doesn't deal with a real device, the driver and descriptor are trivial, but IOMan insists on them.

For the device driver, I use null (the device driver that looks like a serial device but does nothing). Since it does nothing, it's safe for this kind of file manager. The only concern is that null thinks it is being called by the SCF file manager. It leaves enough space in device static storage for SCF's file-manager static storage. Since PGM uses much less memory than SCF, we can live with this.

The device descriptor (available electronically) contains enough to make IOMan happy. Each PGM device needs a unique name and port address. They all use the same file manager and device driver. The rest of the fields are pro forma.

Room for Improvement

PGM can be improved by making it preemptable, especially if it stops masking interrupts, and the restriction to one server per PGM device could be dropped if PGM accepted path lists with at least one level after the device name. Features can be added to PGM to better suit it to particular servers. There might even be cases where PGM can do some common-resource management that the server can't handle.

It would be interesting to make PGM even more general. The original problem was that processes couldn't reliably get control when killed. Perhaps a PGM-like file manager could execute a carefully constrained set of commands that would let processes leave something like a will.

Figure 1: PGM data structures.

For More Information

Microware Systems

1900 NW 114th Street

Des Moines, IA 50325

515-224-1929

http://www.microware.com

Listing One


#define PGM_DEVICE_NAME /SERVER/DESC
   if((err = _os9_create(PGM_DEVICE_NAME,
             FAM_READ | FAM_WRITE, &path, S_IPRM)) != SUCCESS){
      fprintf(stderr, "%s: failed in create. Error %u\n",
         argv[0], err);
      exit(err);
   }

Listing Two



void catcher(signal_code sig)
{
    if(sig == PGM_SIGNAL)
        manage_clients();

    _os_rte();
}
    /* Install catcher() to catch any signals */
    _os_intercept(catcher, _glob_data);
    if((err = _os_ss_sendsig(path, PGM_SIGNAL)) != SUCCESS){
        fprintf(stderr, "%s: ..._sendsig(path, %u) got error %u\n",
                argv[0], PGM_SIGNAL, err);
        exit(err);
    }


Listing Three



void manage_clients()
{
    u_int32 size;
    pgm_message_t msg;
    error_code err;
    size = sizeof(msg);
    if((err = _os_read(path, &msg, &size)) != SUCCESS){
        /* Probably a logic error */
    } else switch(msg.action_code){
        case NEW_PATH:
           if(client_disposition(msg.process) == REJECT){
                msg.action_code = REJECT_CONNECTION;
                if((err = _os_write(path, &msg, &size)) != SUCCESS){
                    /* Probably a logic error */
                }
            } else {
                msg.action_code = ACCEPT_CONNECTION;
                if((err = _os_write(path, &msg, &size)) != SUCCESS){
                    /* Probably a logic error */
                }
                /* Take note that a client just joined us */
                break;
            case CLOSED_PATH:
                /* Take note that the client left us */
                break;
            default:
                /* Another logic error. There are only two possible messages /* 
                break;
        }
}

Listing Four



/* Sample Client code */
path_id path;
void catcher(signal_code sig)
{
   if(sig == PGM_SIGNAL){
      /* The server exited! */
   }
   _os_rte();
}
void main(int argc, char **argv)
{
   error_code err;
   extern void *_glob_data;
   _os_intercept(catcher, _glob_data);
   if((err = _os_open(PGM_DEVICE_NAME, 0, &path)) != SUCCESS){
      if(err == ECONNREFUSED){
          /* The server won't talk with us! */
      } else {
          /*  Something else went wrong. Check whether client is running. */
      }
   }
    /* Ask to be signaled if the server exits */
   if((err = _os_ss_sendsig(path, PGM_SIGNAL)) != SUCCESS){
      /* The server's probably not running  yet */
   }
   printf("Client: %u completed open...\n", _procid);

Listing Five



status_type open(const Pathdesc pd,
    REGISTERS * const regs,
    const procid * const procd,
    const sysglobs * const globals)
{
    pgmstatic * const devstatic =
            (pgmstatic * const)pd->path.pd_dev->V_stat;
    Pathdesc server_pd;
    error_code err;
    signal_code sig;
    u_int32 zero;

    if(devstatic->server_path == NULL)
        return E_PNNF; /* Not found */
    pd->pgmpvt.flags =  PGM_CLIENT;
    stamp_path(pd, procd);

    server_pd = devstatic->server_path;
    pd->pgmpvt.server_path = server_pd; 

    /* set default error message */
    pd->pgmpvt.rval = E_SIGNAL;

     /* Send message to server */
    Send_msg(NEW_PATH, pd);
    do{
        zero = 0;
        _os9_sleep(&zero);
        if(in_client_lst(pd)){
            pd->pgmpvt.rval = 0;
            break;
        }
        if(condemned_proc(procd)){
            clean_me_up(server_pd, pd);
            return 1;             /* Any value will do, we're dead */
        }
        if((sig = get_signal(procd)) != 0)  /* Wasn't a wakeup sig */
         if(sig < SIGDEADLY){
             clean_me_up(server_pd, pd);
             return sig;
         }
    }while(pd->pgmpvt.rval == E_SIGNAL);
    skip_name(regs);
    return(pd->pgmpvt.rval);
}

Listing Six



status_type ClientClose(Pathdesc client_pd,
    REGISTERS * const regs,
    const procid * const procd,
    const sysglobs * const globals)
{
    u_int32 ticks;      /* For _os9_sleep() */

    if(client_pd->pgmpvt.flags & PGM_DEADEND)
        /* The server is dead. Leave quietly */
        return(SUCCESS);

    remove_from_client_list(client_pd);
    /*===============================================================
        There will be no messages or pendings for this path in the
        server's lists.  If there were, this process would be
        sleeping somewhere else in PGM, and not busy closing.
    ================================================================*/

    /* set default error message */
    client_pd->pgmpvt.rval = E_SIGNAL;
    /* Message and wake server */
    Send_msg(CLOSED_PATH, client_pd);
    do{ /*
            Don't let the closing process escape
            until the server sees it (or exits).
        */
        ticks = 0;         _os9_sleep(&ticks);     /* Wait for server read */
    }while(client_pd->pgmpvt.rval == E_SIGNAL);

    return(client_pd->pgmpvt.rval);
}

Listing Seven



status_type read_action(Pathdesc server_pd,
        REGISTERS * const regs)
{
    Pathdesc client_pd;
    pgm_message_t * const msg_buffer = regs->a[0];

    client_pd = dequeue_msg_q(server_pd);

    /* send the waiting message to the server */
    msg_buffer->action_code = client_pd->pgmpvt.action;
    msg_buffer->process = client_pd->path.pd_cpr;

    if(client_pd->pgmpvt.action == NEW_PATH)
        add_to_pending_lst(server_pd, client_pd);
    else if(client_pd->pgmpvt.action == CLOSED_PATH){
        client_pd->pgmpvt.rval = 0;
        /* Wake the closing process so it can finish the close */
        _os_send(client_pd->path.pd_cpr, SIGWAKE);
    }
    regs->d[1] = sizeof(pgm_message_t);
   return SUCCESS;
}


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.