*[Editor's note: For an intro to floating-point math, see Tutorial: Floating-point arithmetic on FPGAs. For a comparison of fixed- and floating-point hardware, see Fixed vs. floating point: a surprisingly hard choice.*

*This paper was presented at ESC Boston 2006. For more papers from this conference, see Embedded.com.]*

**Introduction**

This paper covers many concepts at a high level. The objective is to familiarize the reader with new concepts and provide a framework for existing knowledge. An in-depth (and often rigorous) presentation of these topics can be found distributed across several chapters of any comprehensive DSP (digital signal processing) textbook. This level of coverage is intended to motivate the reader to pursue a more in-depth understanding of specific technical details.

The primary differences between conventional and DSP processors involve optimization for specific arithmetic operations and data handling. DSP processors are designed to execute a small set of arithmetic operations efficiently, which in turn allows signal processing algorithms to be implemented efficiently. The source of these signals can be audio, image-based or simply numerical. Many of these specialized DSP algorithms require repetitive use of one operation group: a multiplication whose product is added to a running total.

This operation group, a multiply followed by an addition, is known as a multiply and accumulate. It is so common that DSP processors have been optimized to perform one or more MAC (multiply and accumulate) operations during each processor instruction cycle. In general, DSP processor bus structures and architectures have been highly optimized to perform specialized types of arithmetic operations and the associated data manipulations as quickly as possible.
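As a rough illustration of the multiply and accumulate pattern, the sketch below sums the products of two sample arrays in C. The function name `mac_sum`, the 16-bit operand width and the 64-bit accumulator are assumptions made for this example; they are not taken from any particular DSP processor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum of products: acc = acc + x[i] * c[i] for each i.
 * Assumption: 16-bit operands, so every product fits in 32 bits.
 * The 64-bit accumulator stands in for the extra guard bits that a
 * hardware accumulator may provide. */
int64_t mac_sum(const int16_t *x, const int16_t *c, size_t n)
{
    assert(x != NULL && c != NULL);

    int64_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        /* One multiply and one add per iteration: a single MAC. */
        acc += (int32_t)x[i] * (int32_t)c[i];
    }
    return acc;
}
```

On a general-purpose processor this loop typically compiles to separate multiply, add and load instructions. A DSP processor aims to complete the whole loop body, including the operand fetches, in one instruction cycle.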

DSP data handling has also been given significant design and architecture attention. Extra buses have been added to processors to allow them to handle internal and external data transfers more efficiently. Pipelines and additional data paths and registers have also been added to speed up and automate arithmetic operations and data transfers.

**Fixed Point vs. Floating Point**

DSP processors fall into two major categories based on the way they represent numerical values and implement numerical operations internally. These two major formats are fixed point and floating point. The differences between fixed and floating point processors are significant enough that they require very different internal implementations, instruction sets and approaches to algorithm implementation.

Fixed point processors represent and manipulate numbers as integers. Floating point processors primarily represent numbers in floating point format, although they can also support integer representation and calculations. Floating point format represents a numerical value as a combination of a mantissa (or fractional part) and an exponent.
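Because a fixed point processor sees only integers, fractional values are handled by convention: the programmer decides where the binary point sits. The sketch below assumes the common Q15 convention (a 16-bit integer read as a fraction of 2^15), which the text above does not name. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define Q15_ONE 32768.0  /* 2^15: scale factor implied by the Q15 convention */

/* Encode a real value in [-1.0, 1.0) as a 16-bit integer,
 * rounding to the nearest representable step. */
int16_t q15_from_double(double v)
{
    assert(v >= -1.0 && v < 1.0);

    double scaled = v * Q15_ONE;
    scaled += (scaled >= 0.0) ? 0.5 : -0.5;   /* round half away from zero */
    if (scaled > 32767.0) {
        scaled = 32767.0;                     /* largest positive Q15 code */
    }
    return (int16_t)scaled;                   /* cast truncates toward zero */
}

/* Decode a Q15 integer back to the real value it stands for. */
double q15_to_double(int16_t q)
{
    return (double)q / Q15_ONE;
}
```

For example, `q15_from_double(0.5)` yields the integer 16384. The processor itself stores and manipulates only that integer; the factor of 2^15 exists solely in the designer's bookkeeping.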

Developing an understanding of which applications are appropriate for floating point processors is worthwhile. The inherently large dynamic range available in floating point designs means that dynamic range limitations can be practically ignored in a design. Floating point processors can implement both floating point and integer operations, making them more flexible. Floating point processors tend to be more expensive because they implement more functionality (complexity) in silicon and have wider buses (typically 32-bit). Floating point capability is appropriate in systems where gain coefficients change with time, or where coefficients have large dynamic ranges. Floating point processors tend to be friendlier to high-level languages, and thus can be easier to develop code for. The code development process also requires less awareness of the underlying architecture. Thus, when considering a floating point implementation, relative ease of development and schedule advantage are being traded off against higher cost and hardware complexity.

The typically lower cost and higher speed of fixed point DSP implementations are traded off against added design effort for algorithm implementation analysis, and data and coefficient scaling to avoid accumulator overflow. The remainder of this paper focuses on the details of algorithm implementation with fixed point DSP processors.
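One simple form of the scaling mentioned above is to shift each product right before it is accumulated, so that even worst-case inputs cannot overflow. The sketch below assumes Q15 operands and a 32-bit accumulator with no guard bits; both are assumptions for illustration, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Smallest shift s such that 2^s >= n, i.e. enough headroom
 * for n worst-case products to be summed without overflow. */
unsigned headroom_bits(size_t n)
{
    unsigned s = 0;
    while (((size_t)1 << s) < n) {
        s++;
    }
    return s;
}

/* Sum of Q15 x Q15 products in a 32-bit accumulator.
 * Each product has magnitude at most 2^30; after shifting right by s
 * bits, n of them sum to at most 2^30, which fits in an int32_t.
 * The result is scaled down by 2^s relative to the unscaled sum. */
int32_t scaled_mac(const int16_t *x, const int16_t *c, size_t n)
{
    assert(x != NULL && c != NULL);

    unsigned s = headroom_bits(n);
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        int32_t p = (int32_t)x[i] * (int32_t)c[i];
        /* Right-shifting a negative value is implementation-defined in C;
         * an arithmetic (sign-preserving) shift is assumed here. */
        acc += p >> s;
    }
    return acc;
}
```

The cost of this headroom is precision: the low `s` bits of every product are discarded. Choosing how much to scale, and where, is part of the added design effort that fixed point implementations require.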

**Basic DSP system**

This section presents a high-level overview of a typical DSP system and its critical elements. Figure 1 shows a typical DSP system implementation. The digital portion of the system extends from the output of the ADC, through the DSP processor, to the input of the DAC. The remainder of the system is in the analog domain.

*Figure 1. Typical DSP System*

The ADC (analog-to-digital converter) is responsible for converting the system input signal from analog to digital representation. Due to the relationship between sampling speed and frequency detailed by the Nyquist sampling theorem, the ADC must be preceded by an LPF (low pass filter). The LPF is required to limit the maximum frequency presented to the ADC to less than half of the ADC's sampling rate. This pre-filtering is known as anti-aliasing, since it prevents the ambiguous data relationships known as aliasing from being translated into the digital domain.
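To make the ambiguity concrete, the sketch below computes the frequency at which a pure input tone appears after sampling when no anti-aliasing filter is present. It assumes an ideal sampler and whole-number frequencies in hertz; the function name `alias_hz` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Apparent frequency, in Hz, of a tone at f Hz sampled at fs Hz. */
uint32_t alias_hz(uint32_t f, uint32_t fs)
{
    assert(fs > 0);

    uint32_t r = f % fs;               /* fold into [0, fs) */
    return (r > fs / 2) ? fs - r : r;  /* reflect into [0, fs/2] */
}
```

With a sampling rate of 8000 Hz, tones at 3000 Hz and 5000 Hz both give `alias_hz(...) == 3000`. Once sampled, the two inputs are indistinguishable, which is why the 5000 Hz component has to be removed by the LPF before it reaches the ADC.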

The output of the ADC is a stream of sampled values which represent the analog input signal at the discrete sample points determined by the ADC's sampling frequency. Each of these data samples is represented by a fixed-length binary word. The resolution of these samples is limited by the output data word width of the ADC and by the data representation width internal to the DSP processor. The ADC outputs are quantized representations of the sampled analog input values. This simply means that an analog value, which could take any one of an infinite number of possible values in an infinite word length system, must now be represented by one of a limited number of values in a finite word length system.
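A minimal sketch of this quantization, assuming an idealized bipolar converter with input range [-vref, +vref) and signed output codes, is shown below. The transfer function, the parameter names and the rounding rule are assumptions for illustration and do not describe any specific ADC.

```c
#include <assert.h>
#include <stdint.h>

/* Size of one quantization step (1 LSB) in input units. */
double adc_step(double vref, unsigned bits)
{
    assert(vref > 0.0 && bits >= 2 && bits <= 24);
    return 2.0 * vref / (double)((int32_t)1 << bits);
}

/* Map an analog value to the nearest signed code of the given width.
 * Inputs outside the converter's range clip to the end codes. */
int32_t adc_quantize(double v, double vref, unsigned bits)
{
    assert(vref > 0.0 && bits >= 2 && bits <= 24);

    int32_t half = (int32_t)1 << (bits - 1);   /* codes span -half .. half-1 */
    double scaled = v / vref * (double)half;

    if (scaled > (double)(half - 1)) {
        scaled = (double)(half - 1);           /* clip at positive full scale */
    }
    if (scaled < (double)(-half)) {
        scaled = (double)(-half);              /* clip at negative full scale */
    }
    scaled += (scaled >= 0.0) ? 0.5 : -0.5;    /* round to nearest code */
    return (int32_t)scaled;                    /* cast truncates toward zero */
}
```

With 8 bits and `vref` of 1.0, the inputs 0.5 and 0.503 both map to code 64. The difference between them is smaller than one step and is lost in the conversion.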