Don C. Bradley is the technical manager for Melville Software, a software consulting firm specializing in the development of custom data acquisition, analysis, simulation, and process control applications. Don has seven years' experience on IBM PCs and VAXs using C and assembly language. His current projects include a Sleeping Disorder Analyzer and a Universal Instrument Interface. Don can be reached at Melville Software, Suite 1007, 350 Sparks St., Ottawa, Ontario, Canada, K1R 7S8, (613) 238-1840, or at CompuServe ID 70410,405.
Direct Memory Access (DMA) has long been thought of as the best method to transfer data to and from a computer's memory. This notion held true as long as the data transfer was simple. As data transfers became complex, DMA quickly faded out of the picture, and interrupts became the favored method. Complex data transfers were implemented only by the best system developers if permitted by the hardware. With the release of the latest "smart" cards (peripheral boards programmable completely by software), developers are choosing DMA for complex data transfers. Since these boards give the developer complete flexibility in controlling the hardware, what was once left to seasoned developers is now easily attainable by most.
This article shows how to implement DMA transfers in C that will handle both simple and complex data transfers from a smart card called the Lab Master AD Data Acquisition board from Scientific Solutions. I chose this device, in part, for the variety of simple and complex data transfers imposed on a data acquisition system. The routines presented here form a complete high-speed data acquisition application and can easily be ported to other devices that support DMA.
The simplest method of transferring data is called polling. In this method the CPU repeatedly checks whether data is ready to be transferred and, when it is, moves the data from the I/O device to memory. Polling is the most versatile of the three methods since it is the least dependent on hardware design, but it requires the CPU to check regularly for data. At high transfer frequencies, no other tasks can be performed. If other tasks (e.g., saving or displaying data) are executed during the checking phase, valid data points might be missed altogether. Missing data is known as a data overrun condition.
Interrupts allow the CPU to switch temporarily to processing the transfer and then return to the interrupted task. The CPU can perform other tasks without missing a transfer during the time the task executes. Data overruns will occur if the transfer interrupt service routine doesn't complete before the next interrupt occurs. Interrupts are, however, more restrictive than polling in their ability to handle conditions imposed on the transfer. As with polling, all tasks run through the CPU.
Direct Memory Access (DMA) transfers data from an I/O device to the computer's memory. This method bypasses the CPU, using instead a device on the system board called a DMA controller. To transfer data without executing commands through the CPU, the card that implements the DMA signals the DMA controller when data is ready for transfer. During the actual transfer, the DMA controller acquires control over the CPU's data and address busses. To signal the end of the actual transfer, the DMA controller is programmed with the amount of data (number of words) to be transferred. The transfer can be prematurely terminated by initializing either the I/O device or the DMA controller during the transfer.
There are three limiting factors in DMA transfers: no program intervention by the CPU, long transfer latency times, and 64KB transfer boundaries. The first limiting factor, lack of program intervention by the CPU, is the reason that DMA is utilized in the first place. For conditions that affect how data is selected for transfer to be implemented, the peripheral card must be "smart" enough to detect when to change these selections. The second limiting factor, transfer latency time, occurs because the CPU must finish its current instruction before yielding the bus to the DMA controller. If the CPU is executing a string or I/O instruction with a rep prefix, up to 128K memory and/or I/O cycles could occur before the DMA controller gains access to the bus. Empirical testing has found some "AT" style computers with latencies between the DMA request and DMA cycle of up to 16 microseconds, thus limiting instantaneous transfer rates to about 62 kHz. This limitation can be overcome with peripheral FIFOs, allowing an average A/D transfer rate up to one million samples/second on an AT and up to four million samples/second on EISA computers. The third limiting factor, 64KB transfer boundaries, can be overcome by programming techniques described later in this article.
DMA is the best method of transferring data since it is the most efficient and, with a properly designed peripheral card, the least likely to cause a data overrun. Since DMA can run in the background, the CPU can handle other tasks in an uninterrupted way. On the other hand, the transfer must be handled within the limitations of the hardware. In most circumstances, the transfer conditions are very restrictive. With the introduction of smart cards and their software-controlled operations, the limitations imposed by the hardware are becoming less restrictive.
Smart cards offer almost full programmability. All the card's possible configurations are achieved through software, rather than DIP switches.
The simplest DMA transfer is one where a device sends the same type of data. A data acquisition card, for example, would transfer one channel's data at a fixed frequency and gain for the duration of the transfer. Most data acquisition cards, however, allow DMA data transfers from a series of channels, starting at the lowest channel and ending at a programmed channel number, at a fixed rate.
Transfers become complex when transfer conditions do not adhere to the design of the DMA transfer in the hardware. Data acquisition sessions where the channels to be sampled are not in a simple sequence, and where the gain varies between channels, are very common.
The DMA controller implemented on the PC and compatibles (8237A or its equivalent) transfers a maximum of 64K words using DMA channels 5 to 7. This controller does not address memory in the same manner as the CPU. Figure 1 illustrates the two methods of addressing memory. The CPU addresses memory in real mode by shifting the Base register four bits to the left and adding the Offset register, forming a 20-bit address. The DMA controller addresses memory by appending the 16-bit Offset register to the four bits of the Page register. This addressing scheme can cause problems when allocating a buffer for the DMA transfer. The allocated buffer should not span more than one DMA page. For example, a 16KB buffer allocated at address 0x4D1A4 would cover two DMA pages, pages 4 and 5, since the buffer would stretch from 0x4D1A4 to 0x511A3.
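The two addressing schemes and the page check can be sketched in a few lines of C. This is a minimal illustration, not code from the listings: the function names are hypothetical, and it follows the byte-addressed 64KB page scheme used in the example above (the 16-bit channels count in words, so details may differ there).

```c
#include <assert.h>
#include <stdint.h>

/* Real-mode CPU address: segment shifted left four bits plus offset,
 * truncated to 20 bits. */
static uint32_t linear_address(uint16_t segment, uint16_t offset)
{
    return (((uint32_t)segment << 4) + offset) & 0xFFFFFu;
}

/* DMA view of the same 20-bit address: a 4-bit page and a 16-bit offset. */
static uint8_t dma_page(uint32_t linear)
{
    return (uint8_t)((linear >> 16) & 0x0Fu);
}

static uint16_t dma_offset(uint32_t linear)
{
    return (uint16_t)(linear & 0xFFFFu);
}

/* Nonzero if a buffer of `size` bytes starting at `linear` spans two pages. */
static int crosses_dma_page(uint32_t linear, uint32_t size)
{
    assert(size > 0);
    return dma_page(linear) != dma_page(linear + size - 1);
}
```

With the 16KB buffer from the text, crosses_dma_page(0x4D1A4, 0x4000) is nonzero because the first byte lies in page 4 and the last byte, 0x511A3, lies in page 5.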
The 8237 DMA controller transfers data in three main modes: single, block, and demand. In single mode, the controller transfers one word each time it receives a transfer signal from the I/O device, until the word count rolls over from zero to 0xFFFF. Single mode allows at least one full CPU cycle between DMA transfers. In block mode, the controller sends the entire amount of requested data before the CPU regains access to the data and address busses. Finally, in demand mode, the controller transfers data for as long as the I/O device has data ready. As soon as there is no data, the CPU regains access to the data and address busses, and the transfer automatically resumes when more data is ready. All three modes terminate in the same manner, as soon as the word count rolls over. Each mode can also be programmed to repeat once the word count rolls over. Thus the DMA controller can transfer an unlimited amount of data to a limited region of memory.
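As an illustration of how these choices are expressed to the controller, the sketch below assembles a value for the 8237 mode register. The bit layout is the one given in the 8237A data sheet as I understand it and should be checked against your reference; the names are hypothetical and do not come from the listings. The channel number is relative to one controller (0 to 3), so system channel 5 is channel 1 of the second controller.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 7-6 of the mode register select the transfer mode. */
enum dma_mode { DMA_DEMAND = 0x00, DMA_SINGLE = 0x40, DMA_BLOCK = 0x80 };

/* Bits 3-2 select the direction, named from the memory's point of view. */
enum dma_dir { DMA_VERIFY = 0x00, DMA_WRITE_MEM = 0x04, DMA_READ_MEM = 0x08 };

/* Bit 4 makes the controller reload address and count at rollover. */
#define DMA_AUTOINIT 0x10

/* Build the mode byte for one channel of one controller. */
static uint8_t dma_mode_byte(unsigned channel, enum dma_mode mode,
                             enum dma_dir dir, int repeat)
{
    assert(channel <= 3);
    return (uint8_t)(mode | dir | (repeat ? DMA_AUTOINIT : 0) | channel);
}
```

A repeating single-mode transfer from an A/D card into memory on controller channel 1 would then use dma_mode_byte(1, DMA_SINGLE, DMA_WRITE_MEM, 1), which evaluates to 0x55.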
The Lab Master AD is one of the first smart cards for data acquisition available for the IBM ISA or EISA bus computers. It is equipped with 16 single or eight differential fully-programmable Analog-to-Digital (AD) channels with 12-bit resolution, two Digital-to-Analog (DA) channels, eight digital-input and eight digital-output channels, eight digital-expansion channels, and five 16-bit programmable timer/counters tied into a 4MHz base frequency. This card has a 2048 word FIFO buffer for AD and one 1024 word FIFO buffer shared by both the DA channels. The Lab Master makes use of both the Single and Demand transfer modes for AD and DA DMA transfers. The number of AD channels can be expanded to 64 single or 32 differential AD channels.
The routines in this article use DMA to create a working, high-speed data-acquisition system. I designed the routines to make it easy to port them to any type of data acquisition application. You can easily modify the code to provide variable channel, frequency, and gain sequences, a pseudo real-time display, and some signal processing capabilities.
As previously described, you can implement a DMA transfer in two ways: a one-time transfer of a specified amount of data, or a cyclic transfer of a specified amount of data to the same memory buffer. However, your decision must take into account the amount of data being requested and the amount of free memory available for the allocation of a buffer.
Most DMA transfers for data acquisition consist of cyclic transfers to a circular buffer divided into two equal sections. During the transfer, a buffer is written to disk when it is full (see Figure 2). This method does not allow any flexibility in performing real-time tasks with the data, however. Figure 3 displays a more flexible approach where the user can access the data at two levels. The first allows some real-time signal processing, such as a moving average filter or basic signal conversions. The second makes the most recent signal for each channel available for a real-time display, or further processing or testing.
The routines presented here take the second approach. The length of time a routine requires to execute a task becomes more critical as the speed and length of the transfer increases. With this in mind I designed the routines so they could be easily optimized for any type of data acquisition process.
The actual application is called ADCTEST. It is run from the PC-DOS command line by typing the command
ADCTEST <number of channels> <number of samples> <frequency>
where <number of channels> is the number of channels to sample, starting from channel zero, <number of samples> is the number of conversions performed for each channel, and <frequency> is the frequency at which each sample will be taken.
For example, the command
ADCTEST 10 1000 100.0
causes the program to sample channels 0 to 9, 1000 times each, at a rate of 100.0 Hz per sample, storing the raw data in the file DATAFILE.DAT. A sample is defined as a series of channels.
The main routine in Listing 1 takes the command line arguments and initializes both the DMA and Lab Master AD for a collection process. Before initializing these devices, main validates the command line arguments. main then verifies that a Lab Master AD is present. If present, the Lab Master AD is reset and enabled. The main routine then allocates memory for a file buffer and opens the file DATAFILE.DAT for writing.
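The argument validation step might look something like the following. This is only a sketch under stated assumptions, not the code in Listing 1: the structure, the function name, and the limit of 16 channels (the card's unexpanded single-ended inputs) are all hypothetical.

```c
#include <stdlib.h>

#define MAX_CHANNELS 16  /* assumed limit: 16 single-ended inputs, unexpanded */

struct adc_args {
    int    channels;   /* channels 0 .. channels-1 are sampled */
    long   samples;    /* conversions per channel */
    double frequency;  /* samples per second */
};

/* Parse "ADCTEST <channels> <samples> <frequency>".
 * Returns 0 on success, -1 if an argument is missing or out of range. */
static int parse_adc_args(int argc, char *argv[], struct adc_args *out)
{
    char *end;
    long channels, samples;
    double frequency;

    if (argc != 4)
        return -1;

    channels = strtol(argv[1], &end, 10);
    if (end == argv[1] || *end != '\0' || channels < 1 || channels > MAX_CHANNELS)
        return -1;

    samples = strtol(argv[2], &end, 10);
    if (end == argv[2] || *end != '\0' || samples < 1)
        return -1;

    frequency = strtod(argv[3], &end);
    if (end == argv[3] || *end != '\0' || !(frequency > 0.0))
        return -1;

    out->channels = (int)channels;
    out->samples = samples;
    out->frequency = frequency;
    return 0;
}
```

Rejecting trailing characters (the *end check) catches slips such as "100Hz" before any hardware is touched.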
The DMA buffer allocation routine alloc_dma_buffer (Listing 4) allocates the memory buffer such that it contains no DMA page boundaries. You can increase the buffer's maximum size of 32KB to the maximum DMA transfer limit of 128KB by allocating a huge buffer.
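One common way to obtain a buffer free of page boundaries is to allocate twice the required size and then pick a starting address inside that block. The sketch below shows only the address arithmetic, assuming 64KB pages and linear addresses; it is not the alloc_dma_buffer routine from Listing 4, and the name dma_safe_start is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DMA_PAGE_SIZE 0x10000u  /* 64KB pages, as described in the text */

/* Given the linear address of a block at least 2 * size bytes long,
 * return a start address for a size-byte buffer that stays in one page. */
static uint32_t dma_safe_start(uint32_t base, uint32_t size)
{
    uint32_t last;

    assert(size > 0 && size <= DMA_PAGE_SIZE);
    last = base + size - 1;

    if (base / DMA_PAGE_SIZE == last / DMA_PAGE_SIZE)
        return base;                      /* already within one page */

    /* Otherwise start at the page boundary the buffer would have crossed. */
    return (last / DMA_PAGE_SIZE) * DMA_PAGE_SIZE;
}
```

For the 16KB buffer at 0x4D1A4 discussed earlier, the function returns 0x50000, which still lies inside a 32KB block starting at 0x4D1A4. The cost of the technique is that up to half of the allocation goes unused.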
The main routine then initializes both the DMA and Lab Master AD by calling init_adc_dma. This routine (Listing 2) consists of allocating the special DMA buffer (Listing 4), setting up the frequency (Listing 6), initializing the channel gain array, and calling the DMA initializing routines (Listing 4). init_adc_dma allows you to pass the address of the function that will process your data. In this application, I pass the address of the function dma_handler (Listing 1). dma_handler is called from within get_next_adc_values, which gets each value from the DMA buffer. get_next_adc_values retains control until the DMA buffer contains no data.
Once everything has been initialized, the program waits for the user to press the space bar to start the data acquisition process. The application then enters the main loop. This loop is exited once the data acquisition process has finished. Each cycle through this loop calls the get_next_adc_values routine, which checks for DMA buffer overruns, calls the dma_handler routine, and keeps track of the number of conversions processed. Overruns are detected by writing an impossible value into the buffer after each value is retrieved. This value is checked the next time the get_next_adc_values loop is executed. If the value is no longer the same, a data overrun error has occurred.
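The marker technique can be sketched as follows. This is one possible reading of the scheme, not the get_next_adc_values routine itself: the names are hypothetical, and the marker 0x7FFF is an assumption that holds only if the 12-bit conversions never produce that word (true for both unsigned and sign-extended 12-bit data).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EMPTY_SLOT 0x7FFFu  /* assumed impossible for 12-bit A/D data */

enum ring_status { RING_EMPTY, RING_VALUE, RING_OVERRUN };

struct dma_ring {
    volatile uint16_t *buf;  /* written in the background by the DMA */
    size_t len;              /* number of words in the buffer */
    size_t next;             /* next slot the program will read */
};

/* Mark every slot as empty before the transfer is started. */
static void ring_init(struct dma_ring *r, uint16_t *storage, size_t len)
{
    size_t i;

    assert(len >= 2);
    for (i = 0; i < len; i++)
        storage[i] = EMPTY_SLOT;
    r->buf = storage;
    r->len = len;
    r->next = 0;
}

/* Fetch one value if available. The slot read most recently was marked
 * empty; if it now holds data, the DMA has wrapped around past the reader. */
static enum ring_status ring_take(struct dma_ring *r, uint16_t *value)
{
    size_t prev = (r->next + r->len - 1) % r->len;

    if (r->buf[prev] != EMPTY_SLOT)
        return RING_OVERRUN;
    if (r->buf[r->next] == EMPTY_SLOT)
        return RING_EMPTY;

    *value = r->buf[r->next];
    r->buf[r->next] = EMPTY_SLOT;      /* mark the slot as consumed */
    r->next = (r->next + 1) % r->len;
    return RING_VALUE;
}
```

The check is deliberately conservative: a buffer that is exactly full is reported as an overrun even though no word has yet been overwritten.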
dma_handler, as it is presented here, transfers the value passed into the file buffer, writes the file buffer if full, and transfers the value into the appropriate channel's storage location in the channel data buffer. The channel data buffer contains the most recent value obtained from each channel.
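A handler of this shape could be organized as shown below. The sketch is illustrative only and is not the dma_handler of Listing 1: the structure, the names, and the tiny buffer sizes are assumptions, and a NULL output file simply discards the data so that the bookkeeping can be exercised without touching the disk.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define NUM_CHANNELS   4  /* hypothetical: channels in one sample */
#define FILE_BUF_WORDS 8  /* hypothetical: kept tiny for illustration */

struct collector {
    FILE *out;                          /* data file, or NULL to discard */
    uint16_t file_buf[FILE_BUF_WORDS];  /* values waiting to go to disk */
    size_t used;                        /* words currently in file_buf */
    uint16_t latest[NUM_CHANNELS];      /* most recent value per channel */
    unsigned channel;                   /* channel of the next value */
    unsigned long flushes;              /* number of buffer writes so far */
};

/* Write out whatever the file buffer holds. Returns 0, or -1 on error. */
static int collector_flush(struct collector *c)
{
    if (c->used == 0)
        return 0;
    if (c->out != NULL &&
        fwrite(c->file_buf, sizeof c->file_buf[0], c->used, c->out) != c->used)
        return -1;
    c->flushes++;
    c->used = 0;
    return 0;
}

/* Accept one converted value; values arrive in channel order. */
static int collector_put(struct collector *c, uint16_t value)
{
    c->file_buf[c->used++] = value;
    c->latest[c->channel] = value;
    c->channel = (c->channel + 1) % NUM_CHANNELS;
    return (c->used == FILE_BUF_WORDS) ? collector_flush(c) : 0;
}
```

Calling collector_flush once more after the last value writes out any partly filled buffer, which corresponds to the final write performed when the acquisition completes.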
If time permits during the data acquisition process, the main loop will be executed many times. Tasks that require only the most recent channel's data should be executed within the main loop. In the example, I have presented a section of code that has been commented out. This section would be used for printing the most recent values from the channel data buffer.
Once the data acquisition process is complete, any data remaining in the file buffer is written to disk, the data file is closed, the results of the process displayed to the user, and the Lab Master AD board is disabled. The actual DMA and data acquisition sections of the board are not disabled since the data acquisition process is self-terminating.
The results displayed after the collection process contain the number of times the main loop was executed, the status of the Lab Master AD, and the number of samples collected. If the number of samples collected is less than the number requested, an error message is displayed indicating a data overrun error condition.
The frequency used for multiple channels is actually the specified frequency multiplied by the number of channels to sample. This method causes a lag in the data within each sample. This lag can be virtually eliminated by setting the timer/counter to a burst mode (outlined in ). The frequency initialization routine returns the actual frequency to which the timer was initialized, which is the closest approximation possible given the base frequency present at the timer/counter chip.
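The arithmetic behind the frequency setup can be sketched as below. It is a simplification and not the routine from Listing 6: it assumes a single 16-bit counter dividing the 4MHz base frequency directly, ignores the prescaled sources the timer chip also offers, and uses hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define TIMER_BASE_HZ 4000000.0  /* 4MHz base frequency on the card */

/* Divisor for a 16-bit counter so that each of `channels` channels is
 * converted at roughly `per_channel_hz`. Out-of-range results are clamped. */
static uint16_t timer_divisor(double per_channel_hz, unsigned channels)
{
    double conversions_hz = per_channel_hz * channels;
    double divisor;

    assert(conversions_hz > 0.0);
    divisor = TIMER_BASE_HZ / conversions_hz + 0.5;  /* round to nearest */
    if (divisor < 1.0)
        divisor = 1.0;
    if (divisor > 65535.0)
        divisor = 65535.0;
    return (uint16_t)divisor;
}

/* Per-channel frequency actually produced by a given divisor. */
static double actual_frequency(uint16_t divisor, unsigned channels)
{
    assert(divisor > 0 && channels > 0);
    return TIMER_BASE_HZ / divisor / channels;
}
```

For the earlier example of 10 channels at 100.0 Hz, the conversion rate is 1000 Hz and the divisor is 4000, which reproduces the requested frequency exactly. A request of 3000 Hz on one channel gives a divisor of 1333 and an actual frequency slightly above 3000 Hz, which is why the routine reports the frequency it really achieved.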
The present setup limits the sampling frequency to the speed at which the computer can transfer information to disk. If your program requires a disk transfer, you should maximize the size of your disk buffers and try to reduce the rate at which the data is collected. The DMA initializing routines automatically set up the DMA controller for repetitive cycling if you request an amount of data that is greater than the size of the DMA buffer. You will have to experiment with different sizes of the DMA buffer and the file buffer in order to optimize performance if you request a large amount of data at high frequencies. You should pay special attention to reducing processing times throughout the data acquisition routines wherever possible.
The major drawback to the DMA method implemented here is that an overrun can still occur in the data. The best solution is to tie the DMA buffer full signal to an interrupt that calls a routine that writes the DMA buffer to disk.
If you desire a series of channels that does not conform to a simple series of sampling (channel zero to number of channels minus one), or if you want to use the on-board amplifier at different gains for a channel, you could modify the channel gain array by reading parameters for the data acquisition process from an ASCII configuration file.
DMA applications are among the most difficult programs to debug, since all important actions occur in the background, without the CPU. This situation presents a formidable task when tracking down a bug.
One of the best ways to debug is to check the buffer for changes in values. To determine if the DMA transfer occurs at all, you should fill the DMA buffer with an impossible value and check the first memory location in the buffer for any change once everything is initialized and started. If your bug is located here, your best strategy is to familiarize yourself with the hardware as much as possible. You can then trace and verify all steps that occur in initializing and starting the process. If possible, write subprograms that test the individual functionality of a routine. For example, a test procedure could be written to verify that the timer routine is working properly.
You can track down problems that arise during the process fairly easily by analyzing the contents of the DMA buffer. Unpredictable errors most likely mean that the DMA controller is not writing to the proper locations. Recognizing the DMA controller's addressing method and the possibility of pageframe boundary errors is also important. The DMA method implemented here allows for easy trapping of overrun errors in the DMA buffer.
As with any application, porting routines specific to hardware can be difficult. Keeping this in mind, I have tried to layer them so that porting is as easy as possible. The program's general functionality should not have to change when porting. The most important routines are found in Listing 4 and Listing 8. These DMA controller routines will work with most I/O devices that support DMA and are specific to the 8237 type DMA controller. The routines that control the Lab Master AD are hardware specific and must be modified to work with other types of I/O devices. You should keep the main procedure in Listing 1 the same as you port the application.
DMA is by far the best method of transferring data from an I/O device to memory. With the introduction of new smart cards, the restrictions usually associated with this type of data transfer are disappearing. The routines presented here create a fully functional data acquisition application that is easy to modify, debug, and port. I hope that this article will help anyone considering implementing a DMA transfer process.
Operations Manual, Lab Master Advanced Design, Scientific Solutions, Solon, Ohio.
Microsystem Components Handbook, Microprocessors Volume 1, Intel, 1986, pp. 2-52 to 2-65.
The Programmer's PC Sourcebook, Thom Hogan, Microsoft Press, 1988, pp. 398, 469-470.
AMD 9513 Data Book, 1985.