Embedded Systems

DOS for Embedded Systems: Interrupt Latencies

By Shai Vaingast and Ehud Cohen, January 01, 2001

With the emergence of low-power, low-cost, high-processing PC-based embedded-systems solutions, DOS has turned into a serious alternative for embedded applications. However, you still have to deal with the problem of interrupt latencies.

Shai is a software engineering manager at BioControl Medical and can be contacted at [email protected]. Ehud is the COO for BioControl Medical and can be contacted at [email protected].

With the emergence of low-power, low-cost, high-processing PC-based embedded-systems solutions, DOS has turned into a serious alternative for embedded applications. For instance, in developing a real-time data logging device to measure electrophysiological signals, our company opted to use a PC/104-based solution comprised of a 486 CPU, data-acquisition multi-I/O board (DACQ), and custom-made PC/104 form-factor board. We then started looking for a suitable operating system.

We initially considered the leading operating systems for the embedded market, including VxWorks, QNX, and others. Then it struck us: why not DOS? When you don't need TCP/IP and 640 KB of memory is more than enough, DOS is perhaps the most cost-effective solution. And even if you do need these features, there's almost unlimited information about DOS on the Internet. Everyone knows DOS. Of course, DOS has its share of problems, ranging from the 640-KB memory limit to outdated development tools. For example, to compile a 16-bit program for DOS, you must use obsolete tools such as Microsoft Visual C++ 1.52 (the one we're using) or Borland C++ (up to Version 4.5, to the best of our knowledge).

Still, the main problem we encountered in developing our device involved interrupt latencies in DOS. This is the topic we'll address in this article.

System Architecture

We chose to use a PC/104-based solution primarily because we wanted to have a working device up and running without delay. PC/104 parts are commercial off-the-shelf (COTS) components with a solid availability and performance history. PC/104 is essentially the same architecture as ordinary PCs. Its mechanical dimensions are smaller (the form factor is 3.8×3.6 inches), which makes it lightweight and portable, and therefore ideal for embedded devices. Plus, PC/104 components draw much less power than desktop computers. They can also be stacked on top of each other and connected through a 104-pin bus (hence the name "PC/104"; see http://www.pc104.org/).

The device we developed (see Figure 1) consisted of three major off-the-shelf components, two of which were PC/104 components.

The first component is a 486 100-MHz PC module from Ampro (http://www.ampro.com/). This is the computational heart of the device that we built. It also has a built-in socket for a flash device.
The second component is a 12-bit, 8-channel data-acquisition board with eight digital I/O ports from Diamond (http://www.diamondsys.com/).
The third component is the Disk On Chip 2000 (DOC) flash disk with 144 MB from M-Systems (http://www.m-sys.com/). We used this instead of the OEM flash device on the CPU board. The DOC serves both as a hard disk for data logging and a boot device for the CPU. Data stored on the DOC can be downloaded via parallel port. Future versions of our device will incorporate the higher capacities of the DOC.

The main consideration in choosing these specific PC/104 boards was that, given the system requirements, these devices draw the least power, allowing the device to work on batteries for a longer duration. Besides the COTS products, our company also developed a custom PC/104 form-factor board to amplify electrophysiological signals to levels acceptable to the data-acquisition board. The custom board also handles user interaction (switches and LEDs). We also developed a power module driven by lithium batteries.

System Design

System requirements are straightforward. We needed to sample three channels at 5 kHz, filter and decimate the data to 2.5 kHz, and store it to a flash disk. To achieve this, we hook the interrupt generated by the DACQ and store the sampled data in 64-KB buffers. Once the buffers are full, we filter, decimate, and flush them to disk. We use 64-KB buffers (and no larger) for two reasons:

You can't statically allocate an object larger than 64 KB in 16-bit DOS. (Actually, you can, but it requires some tricks and complicates the code.)
Data is saved to disk every few seconds, ensuring that if the power fails or some unforeseen hardware-related problems occur, data is still available for download.

Accessing the disk is performed from outside the interrupt, so you might think of this task as a background task. Figures 2 and 3 illustrate a timing diagram and state chart of the system. (For information on timing the PC family under DOS, see http://pm1.contactor.se/~daniel/links/Pctim003.txt/.)

Interrupt Latencies

The main problem we encountered was that whenever we accessed the disk, we'd lose interrupts (that is, interrupts were disabled). We determined this using the DACQ board, which sends both the channel number and the sampled value. If the Interrupt Service Routine (ISR) receives nonconsecutive channel numbers, some other task is disabling interrupts for a duration longer than the sampling interval. Since we sample three channels at 5 kHz, the longest acceptable interrupt-disabling duration is 67 µSec (1/5000/3 seconds). We saw that we lost interrupts for a duration of approximately 300 µSec, which is clearly unacceptable.
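The arithmetic and the detection idea above can be sketched in a few lines of portable C. This is only an illustration, not the code from our device: the function names are hypothetical, and it assumes the DACQ cycles through channels numbered 0 to n-1 in round-robin order.

```c
/* Interval between DACQ interrupts, in microseconds, when `channels`
   channels are each sampled at rate_hz. */
double sample_interval_us(unsigned channels, double rate_hz)
{
    return 1.0e6 / (rate_hz * (double)channels);
}

/* Number of samples skipped between two consecutively received channel
   numbers. Assumes (hypothetically) a round-robin scan 0,1,...,channels-1.
   A loss of an exact multiple of `channels` samples is not detectable. */
unsigned missed_between(unsigned prev_ch, unsigned cur_ch, unsigned channels)
{
    unsigned expected = (prev_ch + 1u) % channels;
    return (cur_ch + channels - expected) % channels;
}
```

For three channels at 5 kHz, sample_interval_us(3, 5000.0) gives about 66.7, which is the 67 µSec budget quoted above. A nonzero result from missed_between() inside the ISR indicates that interrupts were held off for longer than that.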

The CTC

To further investigate, we wrote a program that uses the Counter/Timer Chip (CTC). The CTC on PCs is an Intel 8253/4 or equivalent (see http://support.intel.com/support/controllers/peripheral/231164.htm). The CTC has three independent channels: 0-2.

Channel 0 is connected to IRQ0 and is triggered once the counter reaches 0, invoking interrupt 8, the timer tick interrupt (IRQ0 is connected to interrupt 8).
Channel 1 is used for DRAM refresh and is therefore generally left alone by programs.
Channel 2 is connected to the speaker and can be easily used for other purposes. This makes Channel 2 attractive to real-time programmers as an accurate time base, independent of software and operating system. It can also be used for profiling code fragments, testing various time-related issues, and as an accurate delay clock. The CTC uses a BUS-derived clock of 1.193182 MHz, which yields a time tick count of 0.838 microseconds.

Each CTC channel can be programmed for one of six modes of operation. Mode 2, which is also known as the "rate generator," is commonly used. In this mode, the relevant CTC channel takes the CTC clock and divides the frequency by a 16-bit divisor value. On boot, for example, most PCs set CTC Channel 0 to work at Mode 2 with a divisor of 65536. The BIOS then handles interrupts generated by CTC Channel 0 at intervals of approximately 0.838 µSec × 65536 = 54.9 mSec (about 18.2 times per second). The BIOS uses the timer-tick interrupt to maintain an accurate system clock. Accessing the CTC is done using the I/O ports listed in Table 1.
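The relation between the divisor and the interrupt interval can be checked with a small helper. The sketch below is portable C for illustration only; the names are hypothetical. It relies on the 8253/4 convention that a programmed divisor of 0 stands for 65536.

```c
#define CTC_CLOCK_HZ 1193182.0   /* CTC input clock, 1.193182 MHz */

/* Convert a number of CTC clock ticks to microseconds. */
double ctc_ticks_to_us(unsigned long ticks)
{
    return (double)ticks * 1.0e6 / CTC_CLOCK_HZ;
}

/* Period of a Mode 2 (rate generator) channel for a given 16-bit divisor.
   A divisor value of 0 is interpreted by the chip as 65536. */
double ctc_period_us(unsigned divisor)
{
    unsigned long effective = (divisor == 0u) ? 65536UL : (unsigned long)divisor;
    return ctc_ticks_to_us(effective);
}
```

With these definitions, ctc_period_us(0) is roughly 54,925 µSec (the 54.9 mSec BIOS tick), and ctc_period_us(80) is roughly 67 µSec, the rate used later to imitate the DACQ.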

To set up CTC channels, we access the Mode/Command Register (MCR); see Table 2. (Intel's datasheets refer to this register as the "Control Register.") Normally, you'd use: Channel=0 or 2, Access mode=low and high byte (11), Operating mode=2 (rate generator), and count=binary. (Refer to the literature on the CTC for other values of the MCR.) Programming the CTC requires issuing the MCR command and then writing the divisor value of the rate generator to the port associated with the channel (low byte followed by high byte). Once Channel 0 is set, it generates interrupts. Channel 2, however, needs to be enabled after programming it. This is done by setting the Timer 2 Gate bit (the LSB of "Port B," located at 0x61).
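As a rough illustration of how the MCR byte is assembled from these fields, the following portable C sketch builds the command byte and splits a divisor into the two bytes that are written to the channel port. The function names are hypothetical; the bit layout follows the description above (channel in bits 7-6, access mode in bits 5-4, operating mode in bits 3-1, binary/BCD in bit 0).

```c
/* Build an MCR command byte from its fields. */
unsigned char ctc_mcr(unsigned channel, unsigned access,
                      unsigned mode, unsigned bcd)
{
    return (unsigned char)(((channel & 3u) << 6) |
                           ((access  & 3u) << 4) |
                           ((mode    & 7u) << 1) |
                           (bcd & 1u));
}

/* Split a 16-bit divisor into the low byte (written first) and the
   high byte (written second). */
void ctc_split_divisor(unsigned divisor, unsigned char *lo, unsigned char *hi)
{
    *lo = (unsigned char)(divisor & 0xFFu);
    *hi = (unsigned char)((divisor >> 8) & 0xFFu);
}
```

For example, ctc_mcr(0, 3, 2, 0) yields 0x34 and ctc_mcr(2, 3, 2, 0) yields 0xB4, which are the values the PROGRAM_CTC macro in Listing Two produces for Channels 0 and 2.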

To read the count in progress, you latch the channel you want to read by issuing a latch command to the CTC using the MCR. The latch command is the channel value (2 MSB) followed by six 0s; see Table 3. Afterwards, the count in progress can be read by reading the port associated with the channel (again, low byte followed by a high byte).
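The latch command and the reassembly of the two bytes can be sketched as follows. This is portable C with hypothetical function names; on real hardware the command would be written to the MCR and the two bytes read from the channel port, as Listing Two does with outp() and inp().

```c
/* Latch command for a channel: channel number in the two MSBs,
   followed by six 0 bits. */
unsigned char ctc_latch_cmd(unsigned channel)
{
    return (unsigned char)((channel & 3u) << 6);
}

/* Combine the low byte (read first) and the high byte (read second)
   into the 16-bit count. Unsigned arithmetic keeps counts of 0x8000
   and above from turning negative. */
unsigned ctc_combine(unsigned char lo, unsigned char hi)
{
    return ((unsigned)hi << 8) | (unsigned)lo;
}
```

For Channel 2, ctc_latch_cmd(2) is 0x80, matching LATCH_CTC(2) in Listing Two.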

Tying It All Together

To test the system, we started by hooking the timer-tick interrupt (interrupt 0x8) to our own ISR. This essentially disables the BIOS interrupt handler. We then set CTC Channel 0 to generate interrupts at a given rate. This lets us simulate the rate at which the DACQ generates interrupts as data is received. We also program CTC Channel 2 as our accurate time base. We set its divisor value to 65536 so it wraps at the lowest possible rate. Now, every time an interrupt is issued (by CTC Channel 0), we accurately time it by reading the count in progress in CTC Channel 2, and store this time in a buffer.

After both CTC channels are enabled and interrupts are generated by Channel 0, we access the disk using the fwrite() function. To make sure that fwrite() is issued after interrupts are hooked, we wait until a quarter of the buffer is full, then call fwrite() with dummy data. Since Channel 2 is not interrupt driven, its value won't be affected by interrupt disabling done by fwrite() (implemented using DOS int 21h). The program exits once the buffer is full. Before exiting, it unhooks the ISR, sets CTC Channel 0 back to its default values, and writes the buffer to disk for later analysis. See Listings One and Two for details.

Analyzing the Results

Analyzing the results produced by the system requires a computational and visualization tool. For various reasons (mainly cost), we use Octave (http://www.che.wisc.edu/octave/), a freely available Matlab-like system that has the same syntax as Matlab and can run most Matlab M-files. Much effort has gone into porting many Matlab toolboxes to Octave. Octave runs on various platforms, including Linux, Win32 (using Cygwin; see http://sourceware.cygnus.com/cygwin/), and more.

Analyzing the results is straightforward. First, we read the results from INT_LATE (results are stored in ASCII format in file INT_LATE.DAT). The CTC counts downwards, but we want an increasing time base, so the first thing we do is to subtract the results from 65536 as follows:

load INT_LATE.DAT

t=65536-INT_LATE;

CTC Channel 2 wraps at 65536, so the next thing to do is to unwrap it. Listing One (unwrap.m) shows how to do this. Next, plot the difference of times at which interrupts are received, by issuing the command:

plot((diff(unwrap(t))-80)*0.838, '+');

The value 0.838 translates ticks into µSec (80 ticks is 67 µSec). What we expect to see is an (almost) steady line located close to zero for a quarter of the graph, then some values above and below zero and then again, a steady line located around zero. As long as interrupt disabling is performed for a duration shorter than 67 µSec, we'd expect to see a symmetrical graph, where a jitter around the zero line should be clearly visible for the times at which the disk is accessed. More than that, after every point above zero, there should be a point below zero.

However, if interrupts are disabled for a longer duration, we expect to see a different behavior. There should be values above 67 µSec, but the graph won't be symmetrical. It can be easily shown that negative values are at most minus 67 µSec. So to sum things up, an OS that has graph values smaller than 67 µSec is the one we're looking for.
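One way to convince yourself of the symmetry argument is a small simulation. The portable C sketch below uses a deliberately simplified model that is an assumption on our part, not something measured: interrupt k is raised at time k×period, the first request raised while interrupts are disabled is held pending and serviced when they are re-enabled, and any further requests raised in the same window are lost. The function name is hypothetical.

```c
#include <stddef.h>

/* Simplified model: interrupts are disabled during [start, start+duration).
   Writes the time at which each serviced interrupt runs to out[] (which
   must hold at least n_irq entries) and returns the number serviced. */
size_t simulate_service_times(double period, double start, double duration,
                              size_t n_irq, double *out)
{
    size_t k, n_out = 0;
    int pending = 0;                 /* a request is already held pending */
    double end = start + duration;

    for (k = 0; k < n_irq; k++) {
        double t = (double)k * period;
        if (t >= start && t < end) {
            if (!pending) {
                out[n_out++] = end;  /* serviced once interrupts return */
                pending = 1;
            }
            /* otherwise the request is lost */
        } else {
            out[n_out++] = t;        /* serviced on time */
        }
    }
    return n_out;
}
```

With a 67 µSec period and a 30 µSec window starting at 120 µSec, the differences between successive service times deviate from 67 by +16 and then -16, the symmetrical jitter described above. Stretching the window to 300 µSec gives deviations of +286 and -18, with five interrupts lost: the positive excursion exceeds 67 µSec while the negative one stays above -67 µSec.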

The maximum interrupt disabling an OS does while accessing the disk is evaluated using the following expression:

max_int_disable=max((diff(unwrap(t))- 80)*0.838)
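If Octave is not at hand, the same figure can be computed directly from the raw counts. The following portable C sketch mirrors the expression above under the same constants (80 nominal ticks, 0.838 µSec per tick); the function names are hypothetical, and the modulo-65536 subtraction stands in for the subtract-and-unwrap steps.

```c
#include <stddef.h>

#define CTC_TICK_US 0.838   /* microseconds per CTC tick */

/* Ticks elapsed between two successive raw readings of a down-counter
   that wraps at 65536. */
unsigned long ctc_elapsed_ticks(unsigned prev_raw, unsigned cur_raw)
{
    return ((unsigned long)prev_raw - (unsigned long)cur_raw) & 0xFFFFUL;
}

/* Largest deviation (in microseconds) of the interval between successive
   interrupts from the nominal interval. Returns 0.0 if n < 2. */
double max_int_disable_us(const unsigned *raw, size_t n,
                          unsigned long nominal_ticks)
{
    double max_us = 0.0;
    size_t i;

    for (i = 1; i < n; i++) {
        double dev = ((double)ctc_elapsed_ticks(raw[i - 1], raw[i])
                      - (double)nominal_ticks) * CTC_TICK_US;
        if (i == 1 || dev > max_us)
            max_us = dev;
    }
    return max_us;
}
```

For instance, the raw readings 1000, 920, 840, 520, 440 contain one interval of 320 ticks instead of 80, so the function reports 240 × 0.838, or about 201 µSec.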

See Figures 4 and 5 for results of different DOS implementations.

We tested our program on various PCs and got basically the same results. Our conclusion was that the MS-DOS implementation of disk access (interrupt 21H) is poorly written. We tried the program with several flavors of DOS (see Table 4). They all exhibited the same behavior: an interrupt latency above 200 µSec while accessing the disk. (Could it be that all of these operating systems use similar code to access the disk?) However, there is a remedy. FreeDOS (http://www.freedos.org/) performed much better, at 31 µSec.

BIOS Disk Access

To determine the lowest interrupt latency we can achieve, we wrote a modified version of INT_LATE.C that uses the BIOS (interrupt 13H) to access the disk using the _bios_disk() function. Here we got excellent results. During disk access, there is an interrupt latency that causes jitter around the 67 µSec line. This does not necessarily mean that interrupts are disabled. More likely, what we measure here is the interrupt latency of the system: probably the time it takes for the programmable interrupt controller to signal the CPU, plus the time it takes the CPU to service the interrupt (finishing the current operation, storing registers, and so on). Figure 6 illustrates the results.

Conclusion

All in all, FreeDOS, Octave, and Linux are excellent tools for individual programmers and the software industry as a whole. Not only are they free (some of them GPL), but they compete with commercial tools in terms of performance and usability.

DDJ

Listing One

function y=unwrap(x)
% function y=unwrap(x)
% returns vector y unfolded around 65536

d=diff(x(:));
I=find(d<0);
d(I)=d(I)+65536;
y=cumsum([x(1); d]);

Listing Two

/* INT_LATE.C. Measures interrupt latencies in DOS during disk access.
  The program hooks interrupt 0x8 (CTC channel 0) to generate interrupts at a 
  rate of approximately 67 uSec (80 ticks).  It then measures the time at 
  which interrupts are received (using CTC channel 2 as an accurate timing 
  mechanism) while writing a file to disk.
  Notes:
  1. Must be compiled with 16 bit DOS compiler.
     It was tested on Borland C++ 3.1 and Visual C++ 1.5.
  2. You can NOT compile this program with Test Stack Overflow turned on and 
     get an executable file which will operate correctly. 
  3. The BIOS time is dependent upon interrupt 0x8 and CTC channel 0 values. 
     Therefore system time may be inaccurate after running this program.  
     To maintain an accurate clock, set the correct time after running this 
     program.
*/
#include <stdio.h>
#include <dos.h>

/* interrupt related definitions */
#define CLK_TICK_INT  0x08    /* The clock tick interrupt               */
#define PIC           0x20    /* Programmable Interrupt Controller port */
#define EOI           0x20    /* End Of Interrupts for PIC              */

/* CTC related definitions and macros */
#define MCR           0x43    /* CTC Mode/Command Register port         */
#define PORTB         0x61    /* "Port B" for enabling Channel 2        */
/* a macro that builds the MCR command for latching a channel's count;  */
/* the count in progress is then read from the channel's data port      */
#define LATCH_CTC(channel) ((channel)<<6)

/* PROGRAM_CTC macro
   0x34 = 0b00110100 meaning (from MSB to LSB): 
   00 : channel
   11 : access mode=low and high byte,
   010: mode=rate generator
   0  : count in binary
   Refer to 8253/4 documentation for other values
*/
#define PROGRAM_CTC(channel) (((channel)<<6) | 0x34)

#define CTC_PORT(channel) (0x40|(channel))

#define BUF_SIZE   ((int)4096)

/* 'iCount_G' is accessed in the ISR and in main() so it should be volatile */
int volatile iCount_G=0;

/* buffers to store high and low values of CTC channel 2 */
unsigned char auchLoWord_G[BUF_SIZE], auchHiWord_G[BUF_SIZE];

void interrupt handler()
{
  if(iCount_G<BUF_SIZE) 
    {
      /* read channel 2 counter*/
      outp(MCR, LATCH_CTC(2));
      auchLoWord_G[iCount_G] = inp(CTC_PORT(2));
      auchHiWord_G[iCount_G] = inp(CTC_PORT(2));
    }
  iCount_G++;

  outp(PIC,EOI);
  _enable();
}
int main()
{
  FILE *fOutData;
  unsigned char ucIntRate=80;
  int iIndex;
  void (_interrupt *oldhandler)();

  /* save the old interrupt handler and install our interrupt handler */
  oldhandler = _dos_getvect(CLK_TICK_INT);
  _dos_setvect(CLK_TICK_INT, handler);

  /* program CTC channel 0 to generate interrupts at 'ucIntRate' */
  _disable();

  outp(MCR, PROGRAM_CTC(0));
  outp(CTC_PORT(0), ucIntRate);
  outp(CTC_PORT(0), 0);
  
  /* program CTC channel 2 to count wrap at 65536 */
  outp(MCR, PROGRAM_CTC(2));
  outp(CTC_PORT(2), 0x0);
  outp(CTC_PORT(2), 0x0);

  _enable();

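  /* enable CTC channel 2: set the Timer 2 Gate bit (bit 0) and keep the  */
  /* speaker output (bit 1) off, leaving the other Port B bits unchanged  */
  outp(PORTB, (inp(PORTB) & 0xFC) | 0x01);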

  /* loop until a quarter of the buffer is full */
  while(iCount_G<(BUF_SIZE>>2));

  /* access disk to measure interrupt latencies by writing a dummy file */
  if(NULL==(fOutData=fopen("INT_LATE.TMP","wb")))
    {
      /* don't return here: the ISR and CTC channel 0 must be restored first */
      printf("Can't create file INT_LATE.TMP.\n");
    }
  else
    {
      fwrite(auchLoWord_G, BUF_SIZE, 1, fOutData);
      fclose(fOutData);
    }

  /* loop until buffer is full */
  while(iCount_G<BUF_SIZE);

  /* restore the CTC channel 0 value */
  _disable();
  outp(MCR, PROGRAM_CTC(0));
  outp(CTC_PORT(0), 0);
  outp(CTC_PORT(0), 0);
  _enable();

  /* restore the old interrupt handler */
  _dos_setvect(CLK_TICK_INT, oldhandler);
  
  /* write buffer to file */
  if(NULL==(fOutData=fopen("INT_LATE.DAT","wt")))
    {
      printf("Can't create file INT_LATE.DAT.\n");
      return 0;
    }
  for(iIndex=0; iIndex<BUF_SIZE; iIndex++)
    fprintf(fOutData,"%u\n",(((unsigned)auchHiWord_G[iIndex])<<8) | 
        auchLoWord_G[iIndex]);
  fclose(fOutData);
  printf("Data written to file INT_LATE.DAT\n");
  return 0;
}
