Embedded Systems

TERSE: A Tiny Real-Time Operating System

By Barry Kauler, December 01, 1995

Barry presents an 8051-implementation of TERSE, a "Tiny Embedded Real-time Software Environment" that's only about 260 bytes in size.

DEC95: TERSE: A Tiny Real-Time Operating System

TERSE: A Tiny Real-Time Operating System

A signature-scheduled, dataflow OS for distributed embedded systems

Barry Kauler

Barry is a member of the computer and communication engineering department at Edith Cowan University. He can be contacted at [email protected].

Since their introduction in the 1970s, microcontrollers have found their way into embedded systems ranging from washing machines to airplanes. Today's complex systems use multiple microcontrollers--each dedicated to a specific control and/or monitoring application--distributed across a network.

Because many microcontrollers may have only 1024 bytes of ROM and 64 bytes of RAM, shoehorning both a real-time operating system and an application onto a single device can be a challenge, to say the least. Clearly, a small, reliable real-time operating system is a fundamental requirement for many embedded-system projects. To that end, I've developed TERSE (short for "Tiny Embedded Real-time Software Environment"), an operating system that's only about 260 bytes in size without network support, and only 450 bytes with it. Although originally written for the 8051 family of microcontrollers (specifically, the Phillips 87C750), TERSE can easily be ported to other controllers. In this article, I'll focus on TERSE's design, implementation, and use. The 8051 assembly-language source code for TERSE51 Version 1.13 is available electronically from DDJ (see "Availability," page 3), ftp://scorpion.cowan.edu.au/pub/terse, or http://scorpion.cowan.edu.au/science/terse/terse.htm.

TERSE takes care of all message interaction between nodes on a distributed system. TERSE fires (executes) a node when all required messages have arrived (and in accordance with a schedule calculated at run time). It carries the output messages to other nodes, even those on different processors.

Consider Figure 1(a), where only one processor unit (PU) is necessary. After node 1 finishes, messages go off to nodes 2 and 3, which are then eligible to fire (this is concurrency). However, since both nodes are on the same PU, they can't really execute concurrently. Since TERSE doesn't support timeslicing, one node would execute, then the other. However, you can simulate timeslicing between nodes 2 and 3 by splitting each node into a serial string of nodes using the scheduling-table setup to jump back and forth between each concurrent path (or thread). Node 4 would then be eligible to fire.

Before calling a node, TERSE puts the input messages into registers, so the node knows exactly where they are. Specifically, the 8051 has registers R0 to R7, and TERSE puts the input messages into R1 to R4, along with a Flags variable that indicates whether or not the messages have arrived.

Just before a node exits to the operating system, it pushes the output messages onto the stack. Each message contains the destination node, terminal, and data so that TERSE will know where to send messages.

Figure 1(b) illustrates a more complicated problem with three interacting processors. PU3 is physically connected to a device ("GIZMO A") on which mutual exclusion must be enforced. With TERSE, you instantiate nodes 1 and 2 (or more) internally with identical code. TERSE automatically allows only one to execute.

How TERSE Works

TERSE uses one interrupt on each PU. This works a bit like the Microsoft Windows message queue--when a node exits, its messages are placed in the buffer (or queue). This buffer lives in the stack, so messages are posted by the PUSH instruction. Despite this, the stack can still be used in the normal fashion, within each node.

When a message arrives from another PU, it is received by an interrupt routine, which either puts it in the stack/buffer or forwards it to the next PU in the ring.

To post messages to destination nodes, the messages are simply popped offthe stack, and TERSE automatically sends off any messages addressed to other PUs.

Some dataflow operating systems employ a static scheduling technique with a fixed node-firing sequence. Since this is far too restrictive for a real-time reactive embedded application, TERSE uses a combination of static and run-time calculation of scheduling. In Figure 1(b), after node 1 exits it's logical to fire node 2 first (so it can send a message off to PU3) and then fire node 3.

TERSE examines currently active "arcs" (data being output from one node and sent to the next), particularly those that have messages on them, as well as the order of the messages' arrival. With this information, TERSE calculates a single 16-bit signature, then looks up this signature in a lookup table to determine which node to fire next. Thus, the operating system is completely deterministic.

Remember that the messages are all sitting in the stack in order of their arrival. TERSE simply looks through this string of messages and calculates a code (similar in concept to the CRC code used in disk storage) unique to that string. This "signature-scheduling" technique is one of TERSE's more-attractive features.

Consider again PU2 in Figure 1(b). If node 1 posts a message to node 2 and to node 3 (in that order) when node 1 exits, then those two messages on the stack form a unique pattern; hence a unique 16-bit signature. TERSE will see in the lookup table that node 2 is to be fired next. If node 1 posts the message to node 3 before pushing the message to node 2, a different pattern is formed and a different signature is generated. For node 1 to control whether node 2 or node 3 fires next, both signatures must be in the lookup table. This scenario illustrates how TERSE allows run-time variations on the execution path--but it can only happen if you specifically put those signatures into the lookup table.

Another possibility is that node 1 is a rogue that erroneously outputs messages due to buggy internal code, (which may be in the wrong order, only one message, or nonexistent). "Wrong" message patterns generate signatures that are not in the lookup table, leading TERSE's signature scheduling capabilities to automatically call node 0, which is always reserved as the error handler. Node 0 can get a good idea of what is wrong by looking at the messages on the stack. (TERSE avoids "signature explosion" due to messages arriving asynchronously from other PUs.)

In short, TERSE allows run-time scheduling, but only over paths that you have allowed. If a node "goes wrong," TERSE knows this as soon as the node exits. If a node gets hung up internally, the processor will need a different timeout mechanism (a watchdog timer, for example); alternatively, TERSE's network interrupt routine can have a timeout check. (TERSE acceptably handles any other interrupts, too--they simply go to an ISR inside a node.)

TERSE also supports "notifications"--messages that take no part in the signature calculation, and thus do not affect node firing even if they go to a node. Notifications are an important extension of the classic dataflow model for real-time reactive engineering applications.

In terms of a classic dataflow diagram like Figure 2, a TERSE diagram corresponds to the unbroken-line arcs and circles, while TERSE's signature-scheduling table corresponds to the dashed arcs and circle. Notifications, however, are not covered in the classic model of Figure 2. These can be thought of as global variables, but without the disadvantages. A notification, in global-variable terms, is only written to at one place in the diagram, but can be read from many places--this is very important for stable designs.

Putting it Together

To put the TERSE design to work, my prototype testbed was the Phillips DS750 development kit that includes a board with the 87C750 and 87C752 EPROM microcontrollers, an assembler, and a simulator that lets you test programs on a PC or download them to the board over a serial cable. The 87C750 has 1 KB of EPROM, 64 bytes of RAM, and various I/O.

I designed TERSE51 to work with up to eight processors (connected in a simple ring) using a parallel port--split into two nybbles, one going each way. The 87C750 does not have a UART, so I left that option out. (I'd like to see someone modify TERSE for other network designs and topologies. Microcontrollers with built-in network processors, Motorola's Neuron, and Charles Moore's F21 are likely candidates.)

Designing a TERSE-based system begins with scribbling flow diagrams on the back of an envelope. The target system affects the design process. Basically, a microcontroller is associated with a piece of hardware. Microcontroller A might control Gizmo A, while Microcontroller B might monitor Widget B, and so on. In some cases, two or more microcontrollers might be involved with one physical entity; or microcontrollers of different speeds, instruction sets, I/O, or memory may be required.

The top layer of software functionality, therefore, tends to follow the physical functionality of the system, and this is how I recommend you design the top-layer diagram. Remember that the internal bus of a processor is much faster than the network, and it is usually easy to select the functionality of each processor to meet speed/resource requirements.

Within a microcontroller, the flow-diagram should usually require only one decomposition step. The top layer is a single node to represent the entire PU, which decomposes to a large-grain dataflow diagram.

Various rules can be followed for this decomposition, such as a shared function being a clone node, a shared resource, and so on. The objective is no more than 32 nodes per PU, since that's all this version of TERSE supports. A PU may have two or more isolated diagrams (no arcs between them), though TERSE designs are based upon identifiable start and end nodes in each PU, so that there is a clearly defined cycle.

Ideally, decomposition should proceed such that all communication between parts of the program, within the PU or to another PU, takes place at the dataflow messaging level. Interprocess communication should not take place directly from within a node, to another, with shared variables--this is a last resort!

If nodes have many pages of source code, decomposition should proceed until each node contains small code segments. One printed page of source text, or shorter, is reasonable.

Decomposition can proceed in other ways. You could delay the division of software functionality onto specific processors and take a more-abstract software approach first.

Listing One illustrates how to write code for each node. Input messages are in registers R1-R4, and Flags contains flags indicating the presence or absence of a message. Normally, a message would be there, but TERSE allows the nodes to fire even if there isn't. An example of this is the Notification message (it also allows TERSE to perform a GOTO, if required). The POSTMSG macro makes it easy to post messages in the correct format.

A node has full use of the stack, but the second, third, and fourth register banks must not be directly used because they hold the stack (though it could be shifted). All of the registers from 20h upwards are free, except for five reserved locations, one of which must be bit-addressable.

Apart from writing the code for each node, you must also fill in the signature-scheduling lookup table; see Listing Two. The first two rows show the signature and the corresponding node that TERSE will execute. TERSE locates the column with the correct signature and uses that index value to perform a programmed jump via the nodeptr: array to go to the node.

Working out the signatures is fairly straightforward. The program SIG-CALC.ASM (Listing Three) prompts you to enter the destination node and terminal of each active message and returns the signature. (SIG-CALC.EXE, the executable version, is available electronically.) Go through the flow diagram step by step, plotting all allowable paths and entering a corresponding signature for each.

It is easy to change TERSE51 to work with signatures smaller than 16 bits, even down to eight bits; I recommend the 8-bit signature only for single-processor designs due to the probability of "chaos" in a distributed system.

I'm currently working on a GUI that will simulate designs and generate code. In the meantime, I use National Instruments' LabView for timing analysis. With LabView, I string the nodes of each PU along a time axis, with execution times entered to each node. The network is treated as a "virtual PU," with its own time axis. The network nodes uncover any timing problems, as the cycle time of the receiver diagram must always be less than that of the sender diagram, which can be checked by the network nodes. The LabView solution is temporary but useful.

Conclusion

There are numerous possibilities for taking TERSE to the next level. For instance, TERSE51 supports up to 32 nodes per processor. On a more powerful system--say, an 80x86-based platform--TERSE's capabilities could be expanded. You could timestamp messages and use an algorithm that enables all processors to automatically synchronize their internal clocks, allowing two or more processors to perform some operation at "exactly" the same time.

You could also implement a faster scheduling-table lookup. TERSE51 1.13 uses a linear search, but you could presort entries in order of numerically ascending signature value for faster lookup.

When scaling up, having a single lookup table per processor becomes a problem. Consequently, you might want to use multiple independent diagrams on the one processor, each with its own cycle times. These could be considered "logical processors."

The first network designed for TERSE51 is admittedly primitive. Porting TERSE to a processor like the Standard Microsystems Corp. COM20051 chip with inbuilt Network Coprocessor, will also improve performance.

Finally, it would be very useful to develop a program that analyzes the scheduling tables and determines whether the diagram will always complete, detect deadlocks, determine reachability of all nodes, and--in conjunction with the diagrams--perform timing analysis.

The electronic clearinghouse for TERSE-related information is at ftp://scorpion.cowan.edu.au/pub/terse. Updates will be posted as TERSE undergoes development. To participate in the further development of TERSE, please upload any contributions to the /incoming directory and notify me via e-mail.

Figure 1: (a) One-processor problem; (b) three-processor problem.

Figure 2: A conventional flow diagram, showing dataflow and control flow.

Listing One

node0:                      ;node-0 is the error-handler
;test for DEADLINE/SHUTDOWN/RESTART....
    jnb flags.0,nodead
    jnb flags.1,nosh2
      ;SHUTDOWN. If responding remote or local broadcast msg, it has
      ;already been forwarded, so now do local response...
      ;...PUT ALL I/O INTO A SHUTDOWN STATE
myself:   ajmp myself
nosh2:  jnb flags.2,nore2
      ;RESTART. If responding remote or local broadcast msg, it
      ;has already been forwarded, so now do local response...
      mov  sp,#emptyst
      ajmp start1               ;restart program.
nore2:   ;DEADLINE OVERRUN.
     ;could maybe broadcast a global shutdown, or some error msg...
     ajmp  nodereturn        ;do this to carry-on regardless.
nodead:
;test for wait-state....
    jnb   flags.5,notwait
    ;can perform deadline timeout if required, else just return...
    ajmp  nodereturn
notwait:
    ;...process more msgs
;default behaviour of node-0...
    mov  sp,#emptyst        ;dump any msgs left on stack.
;for consistency, error-handler should *not* post any msgs.  
;By emptying stack, after exit from here execution will restart 
;from whatever node corresponds to signature=0 in the lookup table.
     ajmp nodereturn
;...................................................................
node1:                          
;assuming this is starting-node, it will not have any i/p msgs.
    ;...code...
    POSTMSG 1,2,1,0,#34h ;pu-1,node-2,term-1,not-notif.,immediate-data.
    POSTMSG 1,3,1,0,#00h    ;pu1,node3,term1,not-notif,immediate-data.
    ajmp nodereturn
;...................................................................
node3:
    jnb   flags.1,n31   ;jump if no msg on terminal-1.
    ;...msg is in r1...process it
n31:
   POSTMSG 1,4,1,0,32h     ;pu1,node4,term1,notnotif,direct-data.
    ajmp nodereturn
;..................................................................
node2:
    jnb   flags.1,n21    ;jump if no msg on terminal-1.
    ;... msg is in r1...process it
n21:
    POSTMSG 1,4,2,0,#00h    ;pu1,node4,term2,0,immediate-data.
    ajmp nodereturn
;...................................................................
node4:
    jnb   flags.1,n41       ;jump if no msg on terminal-1.
    ;...msg is in r1...process it
n41: jnb   flags.2,n42    ;jump if no msg on terminal-2.
    ;...msg is in r2...process it
n42:
    POSTMSG 2,1,1,0,#56h    ;remote message!
    POSTMSG 1,2,2,1,#11     ;notification, back to node-2,term-2.
    ajmp nodereturn

Listing Two

signatures: DB 255,00,00h,0A0h,0C8h, 012 ;starting node has signature=0.
sighigh:    DB 255,00,03h, 00h, 00h, 034
nodenum:    DB   0, 1,  3,   2,   4,WAIT     
nodeptr:                    ;these occupy 2 bytes each.
    ajmp node0        ;1st node in table is error-handler(always node-0).
    ajmp node1              ;this must be in ascending order
    ajmp node2              ;of node number.
    ajmp node3              ;
    ajmp node4

Listing Three

.MODEL SMALL
.STACK
.DATA
    DB   10 DUP(0)
asciitbl DB  7 DUP(0)
    DB   "$","$"
intromsg DB 0Ah,0Dh
     DB "Type-in active-set of messages, crtl-z to clear, crtl-x to quit"
     DB 0Ah,0Dh
intr2    DB "Do not enter remote-output, nor Notification messages.",0Ah,0Dh
intro3   DB "FOR TERSE51: Node range = 1 - 32. Terminal range = 1 - 4."
     DB 0Ah,0Dh
     DB "FOR TERSE51: Signature-calc starts from latest msg on stack."
     DB 0Ah,0Dh,0Ah,0Dh,"$"
destnode DB "Destination-node: $"
destterm DB " Destination-terminal: $"
sigtxt   DB "  SIGNATURE:$"
newline  DB 0Dh,0Ah,"$"
signature DW 0          ;16-bit signature
input     DB 0          ;binary input
header    DB 0          ;used for sig-calc
.CODE
start:
    mov ax,@DATA
    mov ds,ax
start2:
;intro message...
    mov ah,9
    lea dx,intromsg
    int 21h
start3:
;Ask for destination node....
    mov ah,9
    lea dx,destnode
    int 21h
    call getinput           ;comes back with "input", in binary...
    mov al,input
    and  al,00011111b
    shl  al,1
    shl  al,1
    shl  al,1
    mov  header,al
;ask for destination terminal...
    mov ah,9
    lea dx,destterm
    int 21h
    call getinput
    mov  al,input
    dec  al                 ;as terminal no. stored as 0 - 3.
    and  al,011b
    shl  al,1
    or   header,al
;calculate the signature....
    mov ax,signature
    test ax,8000h
    jz   sig1
    xor  ax,0100010000010001b
sig1:   xor  al,header
    rol  ax,1
    mov  signature,ax
    
;convert it to ascii....
    mov ax,signature
    mov dx,0
    call bin2dec
    mov  BYTE PTR asciitbl+7,"$"    ;hex o/p stuffs this up.
;display result...
    mov ah,9
    lea dx,sigtxt
    int 21h
    mov ah,9
    lea dx,asciitbl
    int 21h
      ;also display signature in hex...
     mov bx,signature
      mov cl,4
      rol bx,cl
      mov ax,bx
      and ax,000Fh
        cmp al,9
        jbe xx
        add al,7
xx:   add al,30h
      mov asciitbl+1,al
      rol bx,cl
      mov ax,bx
      and ax,000Fh
        cmp al,9
        jbe yy
        add al,7
yy:   add al,30h
      mov asciitbl+2,al
      rol bx,cl
      mov ax,bx
      and ax,000Fh
        cmp al,9
        jbe zz
        add al,7
zz:   add al,30h
      mov asciitbl+3,al
      rol bx,cl
      mov ax,bx
      and ax,000Fh
        cmp al,9
        jbe mm
        add al,7
mm:   add al,30h
      mov asciitbl+4,al
      mov BYTE PTR asciitbl+5,"h"
      mov BYTE PTR asciitbl+6,0Ah
      mov BYTE PTR asciitbl+7,0Dh
      mov BYTE PTR asciitbl+8,"$"
      mov ah,9
      lea dx,asciitbl
      int 21h
    jmp start3
getout:
    mov ax,4C00h
    int 21h
getinput:
    mov input,0
getinput2:
    mov ah,0
    int 16h
    cmp ax,2C1Ah    ;crtl-z
    jne notcrtlz
    mov signature,0 
    mov ah,9
   lea dx,newline
    int 21h
    pop ax          ;dump return address
    jmp start2
notcrtlz:
    cmp ax,2D18h    ;crtl-x
    jne notcrtlx
    mov ah,9
    lea dx,newline
    int 21h
    jmp getout
notcrtlx:
;test if outside range 0-9...
    cmp al,30h
    jb  below0
    cmp al,39h
    ja  above9
inrange:
;echo it to screen...
      mov ah,0Eh
      push ax
      mov bx,02
      int 10h
      pop ax
    and al,0Fh      ;convert to binary
     mov ah,input
     and ah,0Fh    ;see if 2nd char already there
     jnz second
     ;mov cl,4
     ;shl al,cl
    mov input,al    ;save in lo-nibble.
    jmp getinput2
second:
    mov cl,4
    shl input,cl
    mov ah,input
     or al,ah      ;combine them
    mov input,al    ;save. ...value is in bcd format.
below0:                ;any char outside range, terminates entry
above9:                ;        /
;convert input to binary....
    mov al,input
    and al,0F0h
    mov cl,4
    ror al,cl
    mov dl,10
    mul dl          ;result --> ax
    mov bl,input
    and bl,0Fh
    add al,bl       ;now have binary value.
    mov input,al    ;       /
    ret    
;...................................................................
bin2dec PROC            ;requires binary number in dx:ax...
   lea di,asciitbl
    mov bx,di
    add di,6  ;8
    ;mov BYTE PTR [di],0
    ;dec di
ssss:
    mov cx,10
    div cx
    add dl,30h
sssss:
    mov [di],dl
    dec di
    mov dx,0
    cmp ax,0
    jne ssss
    mov dl," "
    cmp bx,di
    jbe sssss
    ret
bin2dec ENDP
    END  start

Previous 4 5 6 7 8 9

More Insights

INFO-LINK


	To upload an avatar photo, first complete your Disqus profile. \| View the list of supported HTML tags you can use to style comments. \| Please read our commenting policy.

Embedded Systems

TERSE: A Tiny Real-Time Operating System

TERSE: A Tiny Real-Time Operating System

A signature-scheduled, dataflow OS for distributed embedded systems

Barry Kauler

How TERSE Works

Putting it Together

Conclusion

Listing One

Listing Two

Listing Three

Related Reading

More Insights

Currently we allow the following HTML tags in comments:

Single tags

Matching tags

Embedded Systems Recent Articles

Most Popular

This month's Dr. Dobb's Journal

Upcoming Events

Featured Reports

Featured Whitepapers

Most Recent Premium Content

Embedded Systems

TERSE: A Tiny Real-Time Operating System

TERSE: A Tiny Real-Time Operating System

A signature-scheduled, dataflow OS for distributed embedded systems

Barry Kauler

Related Reading

More Insights

White Papers

Reports

Webcasts

Currently we allow the following HTML tags in comments:

Single tags

Matching tags

Embedded Systems Recent Articles

Most Popular

This month's Dr. Dobb's Journal

Upcoming Events

Featured Reports

Featured Whitepapers

Most Recent Premium Content