Design

Software Synthesis for OS-Independent Coding

By Bob Zeidman, April 01, 2005

Software synthesis lets you hide low-level implementation details from programmers.

April, 2005: Software Synthesis For OS-Independent Coding

Bob is a senior member of the IEEE and president of Zeidman Technologies, a developer of hardware/software codesign tools for embedded systems. He can be contacted at [email protected].

A few years back, I grew frustrated with the firmware development for a consumer product at a large company for which I was consulting. The project was very cost sensitive, using an 8-bit processor and tightly written firmware to control the device. As with most low-end processors, there was no real-time operating system (RTOS) available for it. The project leader and I created a specification for our own RTOS, a simple cyclic executive that allocated time to each task. As the number of tasks in the system grew, the complexity of the software grew and problems occurred.

When a task in the system needed to execute in a new thread, certain data structures needed to be set up for the operating system before the thread could start executing. Because our system was proprietary and is still in use today, I use examples from Linux to illustrate my points. Listing One shows how multiple tasks are set up to run concurrently in different threads in Linux. The global variable count is initialized to 0. The valPrint() routine is defined to increment count and print its value along with an argument passed to the routine. In the main() routine, a thread structure is created. First the valPrint() task is started in the thread with argument 1 passed to it, and then the valPrint() task is run in the current thread with argument 0 passed to it.

Since both threads can access the global variable count at the same time, the exact results of running the two tasks depends on the order of execution. The outcome is not deterministic. For our project, we defined mutexes for each task. Our specification defined how a mutex was to be created and accessed for shared resources. Before a task could access a shared resource, such as the serial port or a global variable, the task needed to check the resource's mutex to see if the resource was in use. Listing Two shows one way that a semaphore—a generalized mutex—is used with Linux when accessing a global variable. The main routine is the same as before, but the valPrint() routine now incorporates a semaphore. When execution reaches the P() routine, execution does not proceed to the next statement until no other task is executing the code. The V() routine signals to other routines that it is safe to begin executing the section of code between the P() and V() routines. In this way, only one thread executes the code between P() and V() at any one time, making the output deterministic after all the threads have executed.

In our proprietary system, data were passed between tasks using global data structures that were defined for each task. When a task completed execution, the last thing it would do was to store its results into a global buffer. When one task started a thread for a second task, the first task needed to first store any input data into a global buffer. This process is called "message passing" and the data comprised the message. The buffers needed to be created and initialized. Semaphores were usually needed to protect the messages from being accessed by a thread at the wrong time. The code got more complex as more tasks made use of a common resource, such as an fprint() routine that sent strings to the parallel port.

The process for creating and using message queues in Linux can get fairly complicated. The global data structures of the queues need to be defined by users. Then, routines need to be defined to send messages to a queue and receive messages from a queue. Mailboxes, which are global variables that act as simple message queues, are not as complex but also not as flexible. Listing Three is an example (from Linux) of routines for sending messages to a queue, reading messages from a queue, and removing a queue.

While the specification for our proprietary system took some time to develop, the requirements were fairly straightforward. There were, of course, other rules that ensured, for example, that each task would be allocated time according to its priority, no task would dominate shared resources, and each task would cooperate nicely with other tasks by giving control back to the RTOS at appropriate times. Writing the code for the RTOS was not particularly difficult, though adding the code for each new task took a little time. Where the problems arose were with the mutexes and message queues within each task. When a programmer wrote code for a task and accidentally forgot to set or reset a mutex or semaphore, or forgot to store a value in a mailbox or message queue, dangerous, unpredictable behavior resulted. Sometimes the behavior crashed the system immediately. Other times, the system continued running and was only noticeably erratic sometime later. In all cases, these problems could not be discovered until the code was compiled, loaded onto the hardware platform, and run long enough with enough stimuli to produce an illegal state that was noticeable.

After finishing the project, I continued to think about this problem. The rules we had developed were fairly straightforward, but they did get more complex as the number of tasks in the system grew. Due to rushed schedules and normal human error, mistakes grew more common as the project progressed. The schedule slipped as more time was spent debugging the code than writing it in the first place. I realized, though, that our rules were deterministic and could easily be automated. Why not just have one task call another task or resource? I could create a tool that would go through the source code and insert all of the additional source code needed for manipulating semaphores, mutexes, mailboxes, and message queues. This same tool could write the cyclic executive that scheduled the tasks. I began writing this tool that grew in complexity until it was creating data structures and making significant modifications to the user's code to allow multitasking, eliminate race conditions, and check for potential deadlock situations. Thus was born software synthesis.

Software Synthesis

The main concept of software synthesis is to hide the implementation details from programmers. When programmers write high-level C code, they do not need to worry about how many registers the processor has, which registers are available for which variables, how to increment/decrement the stack pointers, and other low-level issues. Why then should they have to worry about operating-system-dependent issues? Software synthesis allows you to write OS-independent code. A synthesis tool examines all of the source code and rewrites the code to work with a specific operating system. Just as a compiler lets you place values in variables without knowing how those variables are implemented in hardware, a synthesizer lets you control intertask communication and message passing without knowing the specific mechanisms used to accomplish it.

Software synthesis uses "primitives"—statements that can be thought of as higher level abstractions of C code. These primitives describe complex operations. To run a task in multiple threads as was done in Linux in Listing One using software synthesis, the programmer might write the code in Listing Four. You simply specify a high-level primitive, SynthesizeThread(), that looks like a simple C function call. There are other ways that this could be accomplished including having the user create a configuration file that informs the synthesis tool that the valPrint() routine is a task that always runs in its own thread. The software-synthesis tool goes through the code and replaces all thread primitives with the OS-specific code needed to start the specified task in its own thread.

Besides simplifying the thread execution, there is another benefit. With software synthesis, there is no need to specifically code semaphores because the software-synthesis tool creates the code needed to create, set, reset, and check semaphores. The software-synthesis tool can easily find all global variables that could potentially be accessed by multiple threads and surround the access to these variables with semaphore checking code. For those global variables that cannot be accessed by multiple threads, such code is not necessary and is not synthesized.

Message passing using software synthesis is as easy as returning values from a function. To programmers, it is exactly that. Listing Five shows how simple it is to pass messages in this way. The return value of the function in the thread is the message that is passed. The message can be a simple variable, or as in the example, a complex message structure. But the values are passed between threads just like they are passed between ordinary C functions.

Once the code is operating-system independent, the software-synthesis tool can perform another important function—that is, to synthesize the operating system itself. Synthesis of the OS can only be accomplished if the synthesis tool is aware of all of the source code for the system. The tool then knows all of the threads that can be executing as well as their interaction. This synthesized operating system has only the functionality needed by the system and no more. Because the tool must know all of the source code, synthesizing the OS won't work for desktop systems where users can add applications and drivers to run on top of the OS at any time. For true embedded systems, where the system firmware is not changed except occasionally by knowledgeable field engineers, OS synthesis is an attractive possibility. Of course, systems such as cellphones and PDAs are not good candidates because they are not truly embedded systems—users can add applications and drivers to these systems. But for traditional embedded systems—automobile control systems, toys, smart consumer appliances, and network devices—OS synthesis has many benefits in terms of reducing code complexity for programmers, increasing system reliability, decreasing development time, and reducing memory footprint.

SynthOS: Software Synthesis for Embedded Systems

SynthOS is a tool from Zeidman Technologies that performs software synthesis for embedded systems. The specific primitives used by SynthOS are described later. Example 1 is a SynthOS primitive and the resulting synthesized C source code. Example 1(a) shows what looks like a simple function call. However, a configuration file defines taskLed() as a task running in its own thread. Thus, SynthOS creates the code in Example 1(b) to pass the parameters as messages to the taskLed() routine through a message queue.

The learning curve for SynthOS is extremely short because there are only five primitives, as shown in Table 1. Each of the primitives gets synthesized into C code calls to an RTOS that is automatically generated in C by SynthOS. With the five primitives, you can create most of the functionality required by complex embedded systems.

SynthOS_call is used to begin execution of another task (task2 in the example in Table 1) while suspending execution of the current task until the called task has completed. In the first example, the called task does not return a value. In the second example, the called task returns a value. Execution of the called task is started and execution of the current task is suspended until the called task has completed.
SynthOS_check checks whether a task (task2 in the example in Table 1) is currently executing and returns True if the task is executing, False if the task is suspended.
SynthOS_sleep is used to give control back to the RTOS and resume execution at a later time as determined by the RTOS. Execution of the current task is paused. At some later point, the operating system will restart execution of that task at the point in the code immediately after the SynthOS_sleep primitive. Exactly when this resumption occurs is determined by the operating system according to its scheduling algorithm and the priority of other tasks waiting to execute.
SynthOS_start is used to begin execution of another task (task2 in the example in Table 1) without suspending execution of the current task. Execution of the called task is started and execution of the current task continues immediately after the primitive is executed.
SynthOS_wait suspends execution of the current task and waits for another task (task2 in the example in Table 1) to finish executing or for a condition to be true. The condition can be any legal C condition, such as (x == i*j + 5). Execution of the current task is suspended until the specified task is idle or the condition becomes true. Note that this statement does not begin execution of task2.

The Synthesized RTOS

You can also create a configuration file that specifies the parameters of each task, such as the task's priority and its period, and specifies the requirements of the operating system, such as the scheduling algorithm. SynthOS is then run on all of the source code. SynthOS creates the appropriate semaphores for each task and inserts the appropriate code at the appropriate points in the task. SynthOS also creates the RTOS to manage the tasks. Two possible results of RTOS synthesis are illustrated in Figure 1. The code written by the programmer is shown in yellow. The code generated automatically by SynthOS is shown in blue.

The resulting RTOS is optimized because functionality that is not needed is not created. This is different than modular off-the-shelf operating systems. The granularity of functions that can be included or left out of a SynthOS-generated OS is at the C statement level rather than the predefined module level. So a SynthOS-generated RTOS can be much more finely optimized for size and functionality than an off-the-shelf RTOS.

Because the output of SynthOS is in the same language as the input, the engineer has complete visibility to everything that is going on in the operating system. All of the tools that are used to compile and debug the tasks can also be used to compile and debug the operating system.

SynthOS can detect most potential race conditions and eliminate them. Race conditions occur in global variables that are being modified by two or more tasks that are running simultaneously, such that one task modifies a global variable and before it can check its value, another task modifies it again. SynthOS is aware of all global variables in the system and protects them from modification by another task at the wrong time.

A tradeoff for SynthOS is that it relies on the compiler to optimize task switching rather than using any specialized processor hardware. SynthOS creates C code data structures for each task in order to save its state during a task switch. While dedicated hardware inside a microprocessor for task switching is relatively fast, such hardware must swap many registers into and out of memory for each task switch. This is necessary for a desktop system where it is unknown at compile time which applications will be running and when they will be running. SynthOS, however, knows that a certain task only uses four registers while another task uses 16 registers and only swaps those specific registers. A SynthOS-generated RTOS uses slower software mechanisms to swap out task states, but because of its detailed knowledge of each task, it minimizes the amount of state information to swap. Because of this knowledge, for many applications, the SynthOS-generated RTOS will perform task switching much faster than an off-the-shelf RTOS.

The reliance of the SynthOS-generated RTOS on the compiler rather than specialized hardware means that SynthOS immediately supports any processor for which a C compiler exists. Like any RTOS, SynthOS needs to link into a board support package (BSP). But SynthOS does not require a team of engineers to port an RTOS to each new processor architecture.

In an RTOS created for a system using SynthOS, there is no extra hardware functionality required such as memory management hardware and context switching hardware. Almost every off-the-shelf RTOS includes this functionality and must run on a complex processor that supports these functions. In many applications, the embedded system does not need this functionality. SynthOS can generate an RTOS that can be run on a smaller, simpler processor that does not include these complex hardware functions. The smaller, simpler processor is typically much less expensive and much less power hungry than the larger, complex processor.

The SynthOS-generated RTOS is optimized for size and thus requires a very small memory footprint. An off-the-shelf RTOS includes functionality that is not specifically needed for the system and thus requires more memory.

Altera Demo

To demonstrate the effectiveness of SynthOS, we applied it to an Altera development board that contained the Altera Cyclone EP1C20 FPGA with a NIOS 32-bit soft processor (http://www.altera.com/). The development kit came with C source code for a simple web server. The system was a single task with no operating system. The code included a TCP/IP stack so that a computer could be connected to the board using Ethernet. The computer could call up one set of web pages, some static and some dynamic.

To test SynthOS on this system, we added tasks to turn the single-task system into a multitask system. One task let a user press a button to alternately set or clear any of 4 bits in a register. This register was then used to drive a set of single LEDs to display the value of the register in binary. The register was also used to drive two seven-segment LEDs to show the contents in hex. Another task rotated the value of the single LEDs from left to right like an electronic billboard. Another task was added so that when users clicked on a link in a web page, a short description of the page appeared on the attached LCD display.

The new software required only the creation of the new tasks and the insertion in the code of several SynthOS primitives. The resulting RTOS was synthesized and worked as specified, using a mere 3 KB of memory.

Conclusion

Software synthesis has exciting potential for future development, particularly in the area of embedded system development. Because source code is produced for the entire system, analysis, optimization, and experimentation can be performed to an extent that is not otherwise possible. Specific areas of potential benefits include:

Static code analysis. Because a synthesized embedded system consists of source code, including the RTOS, analysis tools can find best-case and worst-case timing for all tasks. Vendors can supply timing numbers for their processors with regard to how fast they execute particular lines of code. For exact timing, this analysis can be done after compiling the system code. The timing numbers would then be plugged into the code and a timing analysis would be performed to determine such things as worst-case latency for task switching or interrupt handling. This kind of "static code analysis" is analogous to static timing analysis for digital circuits. Other kinds of static code analysis are possible, too, including memory usage analysis and resource usage, before the code is ever run on real hardware.
Optimization for the hardware platform. Software synthesis can optimize the resulting source code for particular hardware platforms. For example, if the target processor has hardware for assisting with task switching, software synthesis can introduce code to take advantage of it. If the target processor does not have such built-in hardware, software synthesis can create its own C code data structures for task switching. This is only one example of how synthesis can create code that works best with hardware specified by the user. Because of this ability to target different hardware, the user is not tied to specific hardware when writing code. The resulting code can then be targeted to different hardware platforms and performance can be evaluated before the hardware system has actually been designed.
Analysis of potential deadlock situations. Software synthesis can be used to find which tasks are sharing resources. Sometimes this situation is obvious, such as when two tasks use a shared hardware device. Sometimes this situation is more complex as when tasks invoke other tasks that use a shared hardware device. Synthesized code can be examined to find all call trees that share resources. This is not possible when using compiled libraries or off-the-shelf operating systems. In a typical embedded system today, detecting hazard conditions occurs at runtime. Many conditions can be missed because the runtime tests do not cover all possibilities. With software synthesis, hazard detection can become a deterministic static analysis of the system source code that can potentially find all possible problems before the code is compiled or executed.
Experimentation with algorithms. Software synthesis has the potential to allow users to experiment with different algorithms and see their effects during static code analysis or at runtime. For embedded systems, different algorithms can be specified for the RTOS scheduling algorithm or for assigning task priorities. A static code analysis can be performed to see how the different algorithms affect performance. And of course, the different algorithms can be tried at runtime to determine real-world performance numbers.

Acknowledgments

I would like to thank a number of people who contributed to the development of SynthOS and to the writing of this article. These people are Loc Le, Huong Le, Ronen Keren-Gill, Dan Hafeman, and Michael Barr.

DDJ

Listing One

int count = 0;
void valPrint(int p)
{
    count++;
    printf("%d : count = %d\n", p, count);
}
void main()
{
    // Create two threads for the valPrint() task
    //  Create a new thread
    Thread *the_thread = new Thread("child");
    // Start the new thread running valPrint()
    the_thread->Fork(valPrint, 1);
    // Run the valPrint() task in the current thread
   valprint (0);
}

Back to article

Listing Two

int count = 0;
Semaphore *the_semaphore;

void valPrint(int p)
{
    int local_count;        // Local variable used to hold global count
    // Wait until no other task is executing the following code
    the_semaphore->P(); 
    count++;                // Increment count
    local_count = count;    // Store a local copy of count

    // Allow other tasks to proceed now that we're done executing the code
    the_semaphore->V();
    printf("%d : count = %d\n", p, local_count);
}
void main()
    // Create two threads for the valPrint() task
    //  Create a new thread
    Thread *the_thread = new Thread("child");
    // Create a semaphore
    the_semaphore = new Semaphore("the_semaphore", 1);
    // Start the new thread running valPrint()
    the_thread->Fork(valPrint, 1);
    // Run the valPrint() task in the current thread
    valPrint(0);
}

Back to article

Listing Three

void send_message(int qid, struct mymsgbuf *qbuf, long type, char *text)
{
    // Send a message to the queue
    qbuf->mtype = type;
    strcpy(qbuf->mtext, text);

    if((msgsnd(qid, (struct msgbuf *)qbuf, strlen(qbuf->mtext)+1,0)) ==-1)
    {
        perror("msgsnd");
        exit(1);
    }
}
void read_message(int qid, struct mymsgbuf *qbuf, long type)
{
    // Read a message from the queue
    printf("Reading a message ...\n");
    qbuf->mtype = type;
    msgrcv(qid, (struct msgbuf *)qbuf, MAX_SEND_SIZE, type, 0);
}
void remove_queue(int qid)
{
    // Remove the queue
    msgctl(qid, IPC_RMID, 0);
}

Back to article

Listing Four

int count = 0;

void valPrint(int p)
{
    count++;
    printf("%d : count = %d\n", p, count);
}
void main()
{
    // Create two threads for the valPrint() task
    // Start the new thread running valPrint()
    SynthesizeThread(valPrint(1));
    // Run the valPrint() task in the current thread
    valprint (0);
}

Back to article

Listing Five

int count = 0;
struct StringMsg        // Create a structure for passing messages
{
    char StringBuf[20];
} Message;
struct StringMsg valPrint(int p)
{
    struct StringMsg Message;       // Buffer to hold the message
    count++;
    printf("%d : count = %d\n", p, count);
    // Create the message and pass it to the calling routine
    strcpy(Message.StringBuf, "message #");
    _itoa(count, &Message.StringBuf[9], 10);
    return (Message);
}
void main()
{
    struct StringMsg ThreadMsg1;    // Pointer to the message from thread 1
    struct StringMsg ThreadMsg2;    // Pointer to the message from thread 2
    // Create two threads for the valPrint() task
    // Start the new thread running valPrint()
    SynthesizeThread(ThreadMsg1 = valPrint(1));
    // Run the valPrint() task in the current thread
    ThreadMsg2 = valPrint (0);
    printf("First message = %s\n", ThreadMsg1.StringBuf);
    printf("Second message = %s\n", ThreadMsg2.StringBuf);
}

Back to article

1 2 3 4 Next

More Insights

INFO-LINK


	To upload an avatar photo, first complete your Disqus profile. \| View the list of supported HTML tags you can use to style comments. \| Please read our commenting policy.

Design

Software Synthesis for OS-Independent Coding

Software Synthesis

SynthOS: Software Synthesis for Embedded Systems

The Synthesized RTOS

Altera Demo

Conclusion

Acknowledgments

Related Reading

More Insights

Currently we allow the following HTML tags in comments:

Single tags

Matching tags

Design Recent Articles

Most Popular

This month's Dr. Dobb's Journal

Upcoming Events

Featured Reports

Featured Whitepapers

Most Recent Premium Content

Design

Software Synthesis for OS-Independent Coding

Software Synthesis

SynthOS: Software Synthesis for Embedded Systems

The Synthesized RTOS

Altera Demo

Conclusion

Acknowledgments

Related Reading

News

Commentary

Slideshow

Video

Most Popular

More Insights

White Papers

Reports

Webcasts

Currently we allow the following HTML tags in comments:

Single tags

Matching tags

Design Recent Articles

Most Popular

This month's Dr. Dobb's Journal

Upcoming Events

Featured Reports

Featured Whitepapers

Most Recent Premium Content