This article contains the following listings: BITTMAN.ARC
Jim is the president and founder of Bittware Research Systems. He has a BSEE from MIT and an MSEE from the University of Maryland. He can be reached at Bittware Research Systems, 400 East Pratt Street, Baltimore, MD 21202.
When confronted with the challenges presented by Digital Signal Processing (DSP), programmers usually come up with questions such as, "Why would I want to use DSP in the first place?" "How hard is DSP development, and what is required to integrate a DSP processor into my PC-based system?" and, most importantly, "Is it worth the trouble?"
The answer to the first question is simple: The main reason for adding DSP power to your application is to speed up the program to achieve real-time processing speeds--speeds typically in the range of 25 Mflops. The response to the second question isn't quite so straight-forward because DSP programming varies from board to board and processor to processor. In general, the development packages supplied with most commercial DSP add-in boards make the process relatively simple. The main purpose of this article, in fact, is to describe how to add the power of DSP processing to your PC application using some of these off-the-shelf tools. As for the the third question, we'll return to it at the end of this article, when we evaluate the relative performance of some of the algorithms presented here.
I'll use as an example the classic DSP algorithm known as the Fast Fourier Transform (FFT). The hardware used in the example is a DSP coprocessor board from CAC, which is built around an AT&T DSP32c chip (32-bit floating point DSP processor) with 256K of 25ns SRAM and dual 16-bit analog input and output channels. The DSP32c runs at 50 MHz and is capable of 25 Mflops.
Fast Fourier Transform
For comparison, I'll implement the FFT in three different ways: as a C program running on the PC, as C code running on the add-in DSP board, and as DSP assembly language running on the add-in board.
The examples will utilize FFTs from several different sources. One of the FFTs comes from the book Numerical Recipes in C: The Art of Scientific Computing, an excellent reference that provides C source code and detailed information on many different algorithms. The source code is also provided on a diskette that can be compiled for the DSP32c without modification. We'll test this routine on both the PC and the DSP board.
I'll also try a PC-based FFT from the MathPak 87 library by Precision Plus Software. Finally, to demonstrate programming the DSP in assembler, I will use the assembly-code FFT routine provided by the AT&T application library. We will invoke the assembly language FFT from both a main module written in C as well as one written in DSP assembler.
I'll run the examples from within the Digital Signal Processing HeadQuarters (DspHq) environment. DspHq is my company's DSP development software package that integrates libraries of functions for the PC and DSP cards, along with graphics and data management. Please note that there is no intrisic requirement for DspHq; it is used as a matter of convenience. However, the listings as shown here do rely on DspHq. It should not be difficult to modify them for standalone operation. Note also that the electronic version of the listings contain additional DspHq-related files that could not be presented here for lack of space.
Testing an FFT requires a number of steps:
- Synthesize a cosine wave form with a given amplitude, dc offset, frequency and sample rate.
- Calculate the real-valued FFT of the cosine wave form.
- Rederive the original waveform by using an inverse real-FFT.
- Calculate log-magnitude of FFT.
- Display all waveforms graphically (as shown in Figure 1 through Figure 4).
FFT On the PC
The PC-based executable file is called ddj.exe and incorporates both the PC-based FFT routines from Numerical Recipes and the functions necessary to invoke binaries that will execute on the add-in board.
The main module is called ddj.c and is shown in Listing One (page 90). The comments in ddj.c contain information such as the makefile used to create ddj.exe. The heart of this routine is a switch statement that selects from among the different functions necessary for our test. The first four case blocks invoke PC-based routines. Case 1 calls synth_cos to generate a cosine; case 2 calls the routine realft, which is the algorithm from Numerical Recipes optimized for real values; case 3 also calls realft, but invokes it as the inverse transform; finally, case 4 calls a function to calculate the log-magnitude. The rest of the case blocks are used to invoke functions residing on the DSP board and are discussed later in this article.
For purposes of timing comparison, I also implemented a host-resident C program that calls an FFT routine optimized for the 80x87. This routine comes from Precision Plus' MathPak87 library. The driving source file is called ddj_mp87.c and is shown in Listing Two (page 90).
C Code Running On the DSP
Binary files that will execute on the DSP32c have the suffix .32c. Those for the older DSP32 have the suffix .32. Although the two devices are source compatible (the DSP32c being a superset), their binary formats are not.
The C source file for the example is ddj_32c.c, and its corresponding executable is ddj_32c.32c. Executable files are produced by d3cc, the AT&T C-compiler for the DSP32c. This compiler produces object files with a .o file extension. The command d3cc -c realft.c compiles the algorithm from Numerical Recipes in C without modification. For example, the resulting object file, realft.o, is then linked with the other files and libraries shown in Example 1.
Example 1: Compile/link command to produce a DSP32c executable is from the C language source. I use the math library first, even though it is slower, because the error checking is more complete, and the routines accept a wider range of inputs.
d3cc -lm -lap -0 ddj_32c.32c ddj_32c.o four1.o realft.o \ -s startup.o -m memory.map d3cc : AT&T DSP32c 'C' compiler -lm : Use math library first -lap : Use application library second -0 ddj_32c.32c : output file name ddj_32c.32c ddj_32c.o : object file for linker four1.o : object file for linker realft.o : object file for linker -s startup.o : assembly startup file (hardware dependent) -m memory.map : memory configuration file (hardware dependent)
Downloading and running this executable is accomplished by the code fragment shown in case 6 in ddj.c (Listing One). You can also download and execute a binary from the DOS command line with the AT&T d3dl utility. You can choose the most appropriate blend of working in an integrated environment (such as DspHq) versus working with standalone utilities (such as d3dl); d3emu, a symbolic debugger that comes with the add-in board; and d3hex, a utility to produce output suitable for PROM burners.
In this case, I have combined several functions into one executable. The DSP-resident program waits for a function number sent from the PC and then executes the proper routine, clears the function number, and waits for the next instruction.
The main module for the target-based executable, ddj_32c.c, is shown in Listing Three (page 92). This is analogous to the host-based ddj.c shown earlier in that it basically consists of a switch statement that dispatches to the requested functions. The functions, in the order shown in the case statement, are:
- Convert from IEEE format to DSP format.
- Convert from DSP to IEEE.
- Synthesize a cosine waveform.
- Invoke realft for a forward FFT.
- Invoke realft for an inverse FFT.
- Calculate the log-magnitude.
- Invoke rffta, the assembly-language implementation of the FFT.
Assembly Code on the DSP
For testing assembly-code, we'll use the FFT routine rfft, provided by the AT&T assembler application library. (Note the similarity between DSP32c assembly language and C, a similarity that makes it relatively easy to become proficient in DSP32c assembler.) Invoking rfft is similar to invoking the C routine realft, with the exception of having to download the number of stages of the FFT prior to execution. This is shown in case 9 of main( ) in ddj.c Listing One).
I've also implemented the equivalent of ddj_32c.c in assembler. Because assembly-language source files have the suffix .s, our file is called ddj_32s.s and is shown in Listing Four (page 94). AT&T provides its own assembler/linker program called d3make used with assembly language programs only. Example 2 shows how this would be invoked in the source code file.
Example 2: Invoking the AT&T assembler/linker program called d3make in the source code file.
d3make -o ddj_32s.32c -M6 -Q -W ddj_32s.s d3make : AT&T assembler/linker make -o ddj_32s.32c : output file name -M6 : memory mode, hardware dependent -Q : DSP32c (as opposed to DSP32) target device -W : turn off warnings ddj_32s.s : source file input
The Results
As I stated at the beginning of this article, the main reason for adding DSP power to your application is to speed it up and achieve real-time processing speeds. The timing results for the examples developed here are shown in Table 1. Except where noted, the DSP32c processing times include data upload/download and conversion. As you can see, realft running on the DSP32c is about four times faster than the same code running on a 25-MHz 386/387 computer, and the assembly-language rffta is another factor of four faster than that. If you subtract all overhead times (download, upload, and conversion), you end up with a 80-fold speed-up over straightforward hostbased C code, and a 40-fold improvement over the optimized PC-based routines from the MathPak library.
Table 1: Timing tests were calculated on a 25-MHz 386 PC(25-MHz 387). 1024 point real-valued FFT, input sine wave. 200 Hz sine sampled at kHz, amplitude = 1 dc offset = 0, phase = 0.
Routine Name Computation Execution time (mS) ------------------------------------------------------------------ Numerical Recipes, realft PC 165.8 MathPak 87, rvfft PC 86.3 Numerical Recipes, realft DSP32c 45.6* AT&T Application Library, rffta DSP32c 9.8* AT&T Application Library, rffta DSP32c 2.1** Generate Cosine PC 94.0 Generate Cosine DSP32c 38.4 Download and convert n/a 2.2 Convert, upload, and convert n/a 3.3 Upload rate = 2.7 Mbytes/sec, Download rate = 2.9 Mbytes/sec *Includes data download, upload, and conversion **Algorithm only, no overhead.
Products Mentioned
DspHq Signal Processing Development Software BittWare Research Systems Inner Harbor Center, 8th Floor 400 East Pratt Street Baltimore, MD 21202 800-848-0435
D3emu Debugger Board Interface Library Communications, Automation, and Control 1642 Union Blvd. Allentown, PA 18103 800-367-6735
Numerical Recipes in C: The Art of Scientific Computing William H. Press, Brian P. Flannery, Saul A. Teukolsky, and William T. Vetterling ISBN 0-521-35465-X Cambridge University Press New Rochelle, NY 10801
DSP32c Assembler DSP32c C Compiler DSP32c Application Library AT&T Microelectronics Dept. 52AL330240 555 Union Blvd. Allentown, PA 18103 800-372-2447
MathPak 87 Precision Plus Software 939 Griffith Street London, Ontario N6K 3S2 Canada 519-657-0633
_ADDING THE POWER OF DSP TO YOUR APPLICATIONS_ by Jim Bittman[LISTING ONE]
<a name="0115_000e"> /*-------------------- DDJ.C----------------------------------------- | PC-Resident program for FFT and for controlling DSP co-processor | The Makefile for this program is: | .c.obj: | cl /AL -c -DANSI $< | ddj.exe: ddj.obj dspif.obj dspif.h four1.obj realft.obj | cl /AL ddj.obj dspif.obj four1.obj realft.obj \ | -link dsputill.lib hqmcil.lib | This program assumes the DspHq environment and requires additional | support files (menu definition and function specification files) | not listed in the Makefile. However, it should not be difficult to | excerpt the relevant portions of this code for standalone execution. +------------------------------------------------------------------*/ #include <stdio.h> /* include for NULL define */ #include <math.h> /* include for math functions */ #include "hq_ci_.h" /* include for dsphq interface */ #include "nr.h" /* include nr function prototypes */ #include "nrutil.h" /* include nr utility prototypes */ #include "dsputil.h" /* include CAC prototypes */ #include "dspif.h" /* dsp interface function header */ /*---------- define the menu parameter types ----------------------*/ typedef unsigned long parm1_type; /* Buffer 1 */ typedef unsigned long parm2_type; /* Buffer 3 */ typedef menu_float parm3_type; /* Cosine Amplitude */ typedef menu_float parm4_type; /* Cosine DC-offset */ typedef menu_float parm5_type; /* Cosine Frequency */ typedef menu_float parm6_type; /* Cosine Sammple Rate */ /*------------ constant definitions -------------------------------*/ #define SQR(a) ((a)*(a)) #define PI (float) 3.141592654 #define BAD_FUNC_NUM (int) -1 /* user error code */ /*----- Function Constants For Dsp32 'C' & Assembly Code ------*/ #define DSP32_X 1 /* converts IEEE ==> DSP numbers */ #define IEEE32_X 2 /* converts DSP ==> IEEE numbers */ /*----- Function Constants For Dsp32 'C' Code ----------------*/ #define GENCOS_C 3 /* generate test cosine */ #define REALFT_C 4 /* performs realft, from numerical recipes */ #define IREALFT_C 5 /* performs inv_realft, from numerical recipes */ #define LOGMAG_C 6 /* calculates log-magnitude */ #define RFFTA_C 7 /* calculates real fft using app-lib */ /*----- Function Constants For Dsp32 Assembly Code ------------*/ #define RFFT_S 3 /* performs rfft, from AT&T application library */ #define MAG_S 4 /* calculates magnitude-squared */ #define LOG_S 5 /* calculates log */ /*----- Function Prototypes ----------------------------------*/ int ilog2 (unsigned int); void log_mag (float far * i1, float far * o1, long bs); void scale_data (float far *output1, float scale_val, long np); void synth_cos (float far *data1, long np, float a, float d, float f, float s); main(int argc, char *argv[]) { long indx; /* used for loop index */ int func_num; /* for function number from dsphq */ long np; /* for block size from dsphq */ float far * input1; /* array address */ float far * output1; /* array address */ long bs; /* address of DSP blocksize var */ long flag; /* address of DSP funcnum flag */ parm1_type b1; /* DSP Buffer #1 */ parm2_type b2; /* DSP Buffer #2 */ parm3_type amp; /* cosine amplitude */ parm4_type dco; /* cosine DC offset */ parm5_type freq; /* cosine frequency */ parm6_type samprate; /* cosine sample rate */ init_intfc(argc, argv); /* init dsphq interface */ func_num = get_func_num(); /* get the function number */ np = get_block_size(); /* get the block size */ /* get menu parameters */ b1 = get_parm(1); /* DSP Buffer #1 */ b2 = get_parm(2); /* DSP Buffer #2 */ amp = get_parm(3); /* cosine amplitude */ dco = get_parm(4); /* cosine DC offset */ freq = get_parm(5); /* cosine frequency */ samprate = get_parm(6); /* cosine sample rate */ /* get array addresses */ input1 = get_data_in_ptr(1); /* base address of input #1 */ output1 = get_data_out_ptr(1); /* base address of output #1 */ /* perform selected function */ switch (func_num) { case 1 : /*--- Synthesize Cosine Using PC ---------------------*/ synth_cos(output1, np, amp, dco, freq, samprate); break; case 2 : /*---- Numerical Recipes Forward Real FFT Using PC ----*/ output1--; /* NR funcs index at 1 */ realft(output1, np>>1, 1); /* forward real fft */ break; case 3 : /*---- Numerical Recipes Inverse Real FFT Using PC ----*/ output1--; /* NR funcs index at 1 */ realft(output1, np>>1, -1); /* inverse real fft */ output1++; /* reallign address */ scale_data(output1,1.0/(np >> 1),np); /* restore original ampl. */ break; case 4 : /*------ Calculate LOG(10)-[MAGNITUDE] using PC --------*/ if (input1) log_mag(input1, output1, np); /* take logmag of input */ else log_mag(output1, output1, np); /* perform logmag in-place */ break; case 5 : /*--------- Synthesize Cosine Using DSP-32C -------------*/ init_dsp("ddj_32c.32c",&flag,&bs,np); /* download dsp code & init */ dsp_dl_fp(get_addr("amp"),amp); /* download floats */ dsp_dl_fp(get_addr("dco"),dco); dsp_dl_fp(get_addr("freq"),freq); dsp_dl_fp(get_addr("samprate"),samprate); set_dspbuf("o1", b1); /* set dsp buffer address */ dsp_dl_int(flag,GENCOS_C); /* invoke function on dsp */ wait4dsp(flag); /* wait for dsp to finish */ ul_float(output1,np,flag,b1); /* upload results */ break; case 6 : /*---- Numerical Recipes Forward Real FFT Using DSP-32C ----*/ init_dsp("ddj_32c.32c",&flag,&bs,np); /* download dsp code & init */ set_dspbuf("o1", b1); /* set dsp buffer address */ dl_float(input1,np,flag,b1); /* download float array */ dsp_dl_int(flag,REALFT_C); /* execute "realft" on dsp */ wait4dsp(flag); /* wait for dsp to finish */ ul_float(output1,np,flag,b1); /* upload results */ break; case 7 : /*---- Numerical Recipes Inverse Real FFT Using DSP-32C -----*/ init_dsp("ddj_32c.32c",&flag,&bs,np); /* download dsp code & init */ dl_float(input1,np,flag,b1); /* download float array */ set_dspbuf("o1", b1); /* set dsp buffer address */ dsp_dl_int(flag,IREALFT_C); /* execute "inv_realft" on dsp */ wait4dsp(flag); /* wait for dsp to finish */ ul_float(output1,np,flag,b1); /* upload results */ break; case 8 : /*----- Calculate LOG(10)-[MAGNITUDE] using DSP-32C ----*/ init_dsp("ddj_32c.32c",&flag,&bs,np); /* download dsp code & init */ dl_float(input1,np,flag,b1); /* download float array */ set_dspbuf("i1", b1); /* set dsp buffer address */ set_dspbuf("o1", b2); /* set dsp buffer address */ dsp_dl_int(flag,LOGMAG_C); /* execute "inv_realft" on dsp */ wait4dsp(flag); /* wait for dsp to finish */ ul_float(output1,np,flag,b2); /* upload results */ break; case 9 : /*------- Forward Real FFT Using DSP-32C App-Lib --------*/ init_dsp("ddj_32s.32c",&flag,&bs,np); /* download dsp code & init */ dl_float(input1,np,flag,b1); /* download float array */ set_dspbuf("o1", b1); /* set dsp buffer address */ dsp_dl_int(get_addr("stages"),ilog2(np)); /* download int */ dsp_dl_int(flag,RFFT_S); /* execute "rfft" on dsp */ wait4dsp(flag); /* wait for dsp to finish */ ul_float(output1,np,flag,b1); /* upload results */ break; case 10: /*-------- Download Data To DSP-32C ---------------------*/ init_dsp("ddj_32s.32c",&flag,&bs,np); /* download dsp code & init */ dl_float(input1,np,flag,b1); /* download data from pc to dsp */ break; case 11: /*--------- Upload Data From DSP-32C ---------------------*/ init_dsp("ddj_32s.32c",&flag,&bs,np); /* download dsp code & init */ ul_float(output1,np,flag,b1); /* upload results */ break; default : set_err_return(BAD_FUNC_NUM); break; } } /*--- This function returns the integer part of the log base 2 of x. ---*/ int ilog2(unsigned int x) { return( x >> 1 ? 1 + ilog2(x >> 1) : 0); } /*--- This function scales np elements of data1[] by scale_val. ---*/ void scale_data(float far * data1, float scale_val, long np) { long i; for (i = 0; i < np; i++) { data1[i] *= scale_val; } } /*--- Function generates cosine data: data[i] = A cos((2 pi f/s) i) + d ---*/ void synth_cos(float far * data1, long np, float a, float d, float f, float s) { long i; float theta, angle_step; angle_step = 2.0 * PI * f / s; theta = 0.0; for (i = 0; i < np; i++) { data1[i] = (a * cos(theta)) + d; theta += angle_step; } } /*--- log_mag ---*/ void log_mag(float far * i1, float far * o1, long bs) { long i; long n; n = bs >> 1; o1[0] = log10(SQR(i1[0])); for (i = 1; i < n; i++) { o1[i] = log10(SQR(i1[2*i]) + SQR(i1[2*i+1])); } for (i = n; i < bs; i++) { o1[i] = 0.0; } } <a name="0115_000f"> <a name="0115_0010">[LISTING TWO]
<a name="0115_0010"> /*-----------------DDJ_MP87.C------------------------------------------- | MathPak 87 FFT Function Group Execution Source File | The makefile is: ddj_mp87.exe: ddj_mp87.c | cl /AL ddj_mp87.c -link hqmcil.lib mpak87.lib +----------------------------------------------------------------------*/ #include <hq_ci_.h> /* include function prototypes */ #include <mpak87.h> /* include math pak header */ #define BAD_BLOCK_SIZE -1 /* user defined error codes */ #define BAD_FUNC_NUM -2 int fft_stages(long fft_size); /*function prototype*/ void main(int argc, char *argv[]) { int m; /* number of fft stages */ far_array_of_double o1; /* data pointer */ init_intfc(argc, argv); /* MUST do this before other interface functions */ o1 = get_data_out_ptr(1); /* get address of data */ if ((m = fft_stages(get_block_size())) == 0) set_err_return(BAD_BLOCK_SIZE); /* won't happen if .fnc file correct */ else switch(get_func_num()) { case 1: rvfft(o1, m); break; /* real fft from MathPak library */ case 2: irvfft(o1, m); break; /* inverse real fft from MathPak */ default: set_err_return(BAD_FUNC_NUM); /* won't happen if .fnc file correct */ break; } } /* return the log(base 2) of the input, or 0 if input is not a power of 2 */ int fft_stages(long fft_size) { int rtn; int sw_fft; sw_fft = fft_size; switch(sw_fft) { case 8 : rtn = 3; break; case 16 : rtn = 4; break; case 32 : rtn = 5; break; case 64 : rtn = 6; break; case 128 : rtn = 7; break; case 256 : rtn = 8; break; case 512 : rtn = 9; break; case 1024 : rtn = 10; break; case 2048 : rtn = 11; break; case 4096 : rtn = 12; break; case 8192 : rtn = 13; break; default : rtn = 0; break; } return(rtn); } <a name="0115_0011"> <a name="0115_0012">[LISTING THREE]
<a name="0115_0012"> /*--------------------DDJ_32C.C-------------------------------------- | This file is will run on the DSP add-in board after compilation | by the AT&T C compiler d3cc. The makefile is: | .c.o: | d3cc -c $< | .s.o: | d3as -W -Q $< | ddj_32c.32c: ddj_32c.o startup.o memory.map four1.o realft.o | d3cc -lm -lap -o ddj_32c.32c ddj_32c.o four1.o realft.o \ | -s startup.o -m memory.map +--------------------------------------------------------------------*/ #include <stdio.h> #include <math.h> #include <nr.h> #define PI 3.1415926 #define SQR(a) ((a)*(a)) asm(".global i1, i2, o1, o2"); asm(".global funcnum, bs"); asm(".global amp, dco, freq, samprate"); short funcnum; short bs; float *i1, *i2, *o1, *o2; float amp, dco, freq, samprate; /*------------------------------------------------------------*/ short fft_stages(fft_size) short fft_size; { short rtn; switch(fft_size) { case 32 : rtn = 5; break; case 64 : rtn = 6; break; case 128 : rtn = 7; break; case 256 : rtn = 8; break; case 512 : rtn = 9; break; case 1024 : rtn = 10; break; case 2048 : rtn = 11; break; case 4096 : rtn = 12; break; case 8192 : rtn = 13; break; default : rtn = 0; break; } return(rtn); } /*------------------------------------------------------------*/ main() { register float scal; short n; float *data1, *data2, temp; register short i,j; while (1) { funcnum=0; while (!(funcnum)) ; /* Wait for PC to Download Function Number */ n = bs >> 1; switch(funcnum) { case 1: /*--------- Convert to DSP Format ------------*/ dsp32(bs,o1); /*IEEE-->DSP */ break; case 2: /*--------- Convert to IEEE format -----------*/ ieee32(bs,o1); /*DSP-->IEEE*/ break; case 3: /*--------- Synthesize Cosine ----------------*/ scal = 2.0 * PI * freq / samprate; j = 0; data1 = o1; for (i=bs; i-- > 0; j++) { *data1++ = (amp * cos(scal * j)) + dco; } break; case 4: /*------- Forward FFT ------------------------*/ data1 = o1; data1--; realft(data1,n,1); /*from numerical recipes*/ break; case 5: /*------- Inverse FFT ------------------------*/ data1 = o1; data1--; realft(data1,n,-1); /*from numerical recipes*/ /* scale by 1/n to retain original amplitude*/ data1 = o1; scal = 1.0 / n; for (i=bs; i-- > 0; data1++) { *data1 = *data1 * scal; } break; case 6: /*----- Calc LOG-MAGNITUDE (data output from NR-realft)--*/ o1[0] = log10(SQR(i1[0])); temp = log10(SQR(i1[1])); for (i=1;i<n;i++) { o1[i]=log10(SQR(i1[2*i])+SQR(i1[2*i+1])); } o1[n] = temp; for (i=n+1; i<bs; i++) { o1[i] = 0.0; } break; case 7: /*------ Forward FFT ----------------------------------*/ data1 = o1; rffta(bs,fft_stages(bs),data1); /*uses AT&T app lib*/ break; default : break; } } } <a name="0115_0013"> <a name="0115_0014">[LISTING FOUR]
<a name="0115_0014"> /*------ file DDJ_32S.S----------------------------------------------- | Assembly language version of FFT test. The makefile is as follows: | ddj_32s.32c: ddj_32s.s | d3make -o ddj_32s.32c -M6 -Q -W ddj_32s.s +---------------------------------------------------------------------*/ #include <dspregs.h> /*--------------------------------------------------------------------*/ .global i1,o1,o2 .global funcnum, bs, stages .global endofcode /*--------------------------------------- initialization -------------*/ r5 = 0x0003 pcw = r5 ioc = 0x30CC0 /*--------------------------------------- wait until funcnum != 0 ----*/ begin: r5 = *funcnum nop if (eq) goto begin nop /*--------------- switch on funcnum, which is set to function number --*/ /* func 1: IEEE-->DSP */ r5 = r5 - 1 if (eq) goto do_dsp32 /* func 2: DSP-->IEEE */ r5 = r5 - 1 if (eq) goto do_ieee32 /* func 3: invoke rffta, from AT&T app library */ r5 = r5 - 1 if (eq) goto rffta /* func 4: calc magnitude squared */ r5 = r5 - 1 if (eq) goto do_mag nop /* func 5: calc log */ r5 = r5 - 1 if (eq) goto do_log10 nop /* illegal function number */ goto finished nop /*---------------------------------call to _rffta -----------------*/ rffta: r2e = *o1 r1 = *bs r3 = *stages *fft_b = r2e *fft_n = r1e *fft_m = r3e call _rffta (r14) nop fft_lv: int24 localV fft_n: int24 0 fft_m: int24 0 fft_b: int24 0 .align 4 goto finished nop /*---------------------------------calc magnitude------------------*/ do_mag: r8 = *bs r10e = *i1 r8 = r8 - 2 r11e = *o1 nop a0 = *r10++ /* DC value */ nop a2 = *r10++ /* Nyquist */ *r11++ = a0 = a0*a0 /* save DC mag */ magloop: a0 = *r10++ nop a1 = *r10++ a0 = a0*a0 nop a1 = a1*a1 nop nop *r11++ = a1 = a0 + a1 if (r8-->=0) goto magloop nop *r11++ = a0 = a2*a2 /* save Nyquist mag */ goto finished nop /*---------------------------------calc log10------------------*/ do_log10: r12e = *i1 r11e = *o1 r13 = *bs r10e = in r9e = out r13 = r13 - 2 logloop: *r10 = a3 = *r12++ call _log10 (r14) nop int24 localV int24 in, out .align 4 *r11++ = a3 = *r9 if (r13-->=0) goto logloop nop goto finished nop /*--------------------------------------call _ieee32 converter-----*/ do_ieee32: r1e = *o1 r2 = *bs *o1_ieee32 = r1e *bs_ieee32 = r2e call _ieee32 (r14) nop int24 localV bs_ieee32: int24 0 o1_ieee32: int24 0 .align 4 goto finished nop /*--------------------------------------call _dsp32 converter-----*/ do_dsp32: r1e = *o1 r2 = *bs *o1_dsp32 = r1e *bs_dsp32 = r2e call _dsp32 (r14) nop bs_dsp32: int24 0 o1_dsp32: int24 0 .align 4 goto finished nop /*-------------------------------finished, set funcnum=0 -----------*/ finished: r1=0 goto begin *funcnum=r1 bs: int 256 stages: int 8 funcnum: int 0 .align 4 i1: int24 0x2000 o1: int24 0x2000 o2: int24 0x3000 .align 4 localV: 2*float 0.0 in: float 0.0 out: float 0.0 max: float 0.0 scalefac: float 0.0 #include <_rffta.asm> #include <_log10.asm> #include <_ieee32.asm> #include <_dsp32.asm> /*------------------------------------- mark end of code------------*/ endofcode: int 0xDEAD, 0xC0DE
Copyright © 1991, Dr. Dobb's Journal