Dr. Dobb's | Software Implementation of Trigonometric Functions Using CORDIC Algorithm

Software Implementation of Trigonometric Functions Using CORDIC Algorithm

How to implement fixed point trigonometric routines in software for use in a variety of drive control apps using the Coordinate Rotation Digital Computer (CORDIC) algorithm.

December 23, 2006
URL:http://www.drdobbs.com/architecture-and-design/software-implementation-of-trigonometric/196700769

Trigonometric functions are often used in embedded systems. Motor drive control applications use them extensively, for example in the Park transform, the Clarke transform, and PWM generation. Various methods exist to compute trigonometric functions, including Taylor series, curve-fitting algorithms, and the CORDIC algorithm.

This tutorial describes a software implementation of the following fixed-point trigonometric routines using the CORDIC algorithm on Infineon's XC164CS microcontroller with MAC unit, and examines the implementation for accuracy and efficiency:

Complex Magnitude
Sine
Cosine

Routines are provided for signed two's complement arithmetic. First, a brief description of the theory behind the algorithm is presented. The theory is then extended to the implementation of the algorithm on the XC164CS processor, after which the numerical errors that occur in the fixed-point implementation are discussed.

The CORDIC Arithmetic Technique
The Coordinate Rotation Digital Computer (CORDIC) algorithm is an iterative technique proposed by Volder in 1956. It can be a very powerful tool in areas where arithmetic or trigonometric function evaluation is heavily used, such as digital signal processing and motor control.

The general vector rotational transform rotates a plane vector [X, Y] by an angle Θ to produce a new vector [X_i+1, Y_i+1], as in Equations (1) and (2). The CORDIC rotation works on the same principle, but rotates the point [X, Y] in a series of steps, each smaller than Θ.

Each step may be in the anticlockwise direction (increasing Θ) or in the clockwise direction. For example, to achieve a total rotation of 35°, we may rotate the point 30° anticlockwise, then 10° anticlockwise, then 5° clockwise.

The idea is to break the rotation Θ down into many steps of decreasing size, with each step chosen so that its tangent is a power of 2 (tan Θ_i = 2^-i, where Θ_i is the rotation angle of step i). The first seven steps of the set of rotations are shown in Table 1 below.

Table 1

This allows the multiplication by tan Θ to be implemented as a simple bit shift operation (2^-i). Hence Equations (3) and (4) reduce to shift-and-add operations.
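Written out in the standard CORDIC notation, the steps from the general rotation to the shift-and-add form look as follows. This is a sketch of the textbook derivation, not a reproduction of the numbered equations; the direction symbol d_i = ±1 and the per-step factor K_i are names assumed here, with the sign convention chosen to match the pseudo code later in this article.

```latex
\begin{align*}
% General rotation of [X_i, Y_i] by an angle \Theta
X_{i+1} &= X_i \cos\Theta - Y_i \sin\Theta \\
Y_{i+1} &= Y_i \cos\Theta + X_i \sin\Theta \\[4pt]
% Factor out \cos\Theta
X_{i+1} &= \cos\Theta \,\bigl(X_i - Y_i \tan\Theta\bigr) \\
Y_{i+1} &= \cos\Theta \,\bigl(Y_i + X_i \tan\Theta\bigr) \\[4pt]
% Restrict each step to \tan\Theta_i = d_i \, 2^{-i}, \quad d_i = \pm 1
X_{i+1} &= K_i \,\bigl(X_i - d_i \, Y_i \, 2^{-i}\bigr) \\
Y_{i+1} &= K_i \,\bigl(Y_i + d_i \, X_i \, 2^{-i}\bigr) \\
Z_{i+1} &= Z_i - d_i \arctan\bigl(2^{-i}\bigr) \\[4pt]
K_i &= \cos\bigl(\arctan 2^{-i}\bigr) = \frac{1}{\sqrt{1 + 2^{-2i}}}
\end{align*}
```

Because K_i does not depend on d_i, the product of all K_i is a constant that can be applied once, outside the iteration loop.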

Continuing the sequence in Table 1, the rotation steps sum to about 99.88°. Since each step can be clockwise or anticlockwise, these steps can approximate angles between +99.88° and -99.88°. For mathematical simplicity the rotation angles are limited to the range -90° to +90°; for rotation angles outside this range an additional rotation is required. The product of the cos Θ terms (K) is a constant which approaches 0.6073 as the number of iterations 'n' grows. The angle Θ is accumulated in Z_i+1.

The CORDIC rotator operates in one of two modes:

1) Rotation Mode
2) Vectoring Mode

The rotation mode rotates the input vector by a specified angle. The vectoring mode rotates the input vector to the x axis while recording the angle required to make that rotation (i.e. the direction of each step is decided by the sign of the remaining angle in rotation mode and by the sign of Y in vectoring mode). The complex magnitude is computed using vectoring mode; the sine and cosine of the input angle are computed using rotation mode.

Rotation Mode.

The CORDIC equation for rotation mode is

(Equation 7)

Vectoring Mode.

The CORDIC equation for vectoring mode is

(Equation 8)

After 'n' Iterations

(Equation 9)
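For orientation, the two modes are commonly summarized as below in the CORDIC literature. This is the standard formulation rather than a reproduction of Equations (7) to (9); it assumes the iteration X_{i+1} = X_i - d_i Y_i 2^{-i}, Y_{i+1} = Y_i + d_i X_i 2^{-i}, Z_{i+1} = Z_i - d_i arctan(2^{-i}) with the gain left uncompensated, and the symbol A_n for the accumulated gain is introduced here only for illustration.

```latex
\begin{align*}
% Rotation mode: each step drives Z towards zero
d_i &= \begin{cases} -1, & Z_i < 0 \\ +1, & \text{otherwise} \end{cases}
& X_n &= A_n \,(X_0 \cos Z_0 - Y_0 \sin Z_0) \\
& & Y_n &= A_n \,(Y_0 \cos Z_0 + X_0 \sin Z_0) \\
& & Z_n &= 0 \\[6pt]
% Vectoring mode: each step drives Y towards zero
d_i &= \begin{cases} +1, & Y_i < 0 \\ -1, & \text{otherwise} \end{cases}
& X_n &= A_n \sqrt{X_0^2 + Y_0^2} \\
& & Y_n &= 0 \\
& & Z_n &= Z_0 + \arctan(Y_0 / X_0) \\[6pt]
A_n &= \prod_{i=0}^{n-1} \sqrt{1 + 2^{-2i}} \approx 1.6468,
& 1 / A_n &\approx 0.6073
\end{align*}
```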

Computation of Complex magnitude

The aim of the algorithm is to calculate the magnitude of a complex number C = X + jY. The magnitude of this complex number is given by $|C| = \sqrt{X^2 + Y^2}$.

The CORDIC rotator rotates the input vector through whatever angle Z is needed to align the result vector with the x axis (Figure 1). The result of the operation is the rotation angle and the scaled magnitude of the original vector.

Figure 1

To extend the region of convergence beyond ±90°, the input is first rotated by -90° if Y is positive and by +90° if Y is negative. The CORDIC rotation is then done with successively smaller angle steps, starting with 45°. At each step the sign of Y decides whether the step is added or subtracted. To compensate for the rotation gain, C is multiplied by the inverse CORDIC gain (K) of 0.6073.
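The procedure just described can be modelled in floating point before moving to fixed point. The sketch below is an illustrative model, not the library routine: the function name, the iteration count of 15 and the use of `double` are assumptions made for readability.

```c
#include <math.h>

#define CORDIC_ITERATIONS 15
#define CORDIC_INV_GAIN   0.6072529350088813   /* limit of the cos(atan(2^-i)) product */

/* Magnitude of x + jy using CORDIC vectoring mode (floating-point model). */
double cordic_magnitude(double x, double y)
{
    double xr, yr;

    /* Pre-rotate by -90 degrees (y > 0) or +90 degrees (otherwise) so xr >= 0. */
    if (y > 0.0) { xr = y;  yr = -x; }
    else         { xr = -y; yr = x;  }

    for (int i = 0; i < CORDIC_ITERATIONS; i++) {
        double step = ldexp(1.0, -i);   /* 2^-i, a right shift in fixed point */
        double xt = xr;
        if (yr >= 0.0) {                /* above the x axis: rotate clockwise */
            xr += yr * step;
            yr -= xt * step;
        } else {                        /* below the x axis: rotate anticlockwise */
            xr -= yr * step;
            yr += xt * step;
        }
    }
    return xr * CORDIC_INV_GAIN;        /* remove the accumulated rotation gain */
}
```

In this model the gain is removed after the loop; removing it before the loop gives the same result in exact arithmetic and, in fixed point, has the side benefit of shrinking the inputs.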

Implementation.

The library is a C-callable, hand-coded assembly routine written for Infineon's XC164CS microcontroller with MAC unit. The Tasking toolchain is used for compilation. For the implementation we assume that the inputs are in 1Q15 format.

The input data are scaled down by a factor of 2 to avoid overflow. The rotational gain K needs to be compensated at some stage (i.e. compensation can be done before or after the iterations). In order to scale down the input further, the input X is multiplied by the inverse of the gain before the iterations.

In the Tasking toolchain, the first four arguments of a function are passed in registers R12 to R15. Using the C166S V2 instruction set, the micro rotations according to Equation (8) are given below as a reference implementation. A fixed-point number representation is used. The registers R1, R2 and R13 hold X, Y and the shift value, respectively.

Label3: 
    MOV     R12,#0h
    MOV     R3,R1               ; I_tmp = I
    MOV     R5,R2
    ASHR    R5,#0fh             ; R5 = 0 if Q >= 0, -1 if Q < 0
    CMP     R5,#0
    JMPA    cc_NZ,Label1        ; Q negative: micro rotation 2

; Micro rotation 1 

;I=I+(Q>>K) 
    MOV     R7,R2
    ASHR    R7,R13
    MOV     MAH,R1
    CoADD   R12,R7
    CoSTORE R1,MAS
;Q=Q-(I_tmp>>K) 
    ASHR    R3,R13
    MOV     MAH,R2
    CoSUB   R12,R3
    CoSTORE R2,MAS
    JMPA    cc_UC,Label2        ; skip micro rotation 2

; Micro rotation 2

Label1: 
;I=I-(Q>>K) 
    MOV     R7,R2
    ASHR    R7,R13
    MOV     MAH,R1
    CoSUB   R12,R7
    CoSTORE R1,MAS
;Q=Q+(I_tmp>>K) 
    ASHR    R3,R13
    MOV     MAH,R2
    CoADD   R12,R3
    CoSTORE R2,MAS

Label2: 
    ADD     R13,#1h             ; next shift value
    CMPD1   R6,#0h              ; loop counter
    JMPR    cc_NZ,Label3

The number of iterations is fixed at 15, and the direction of rotation depends only on Y, so there is no need to record the angle of rotation (i.e. the value of Z). This reduces the latency from n + 25 to n + 19 cycles, where 'n' is the number of iterations. Figure 2 (in the attached PDF file) shows the program flowchart of the full assembler program for the complex magnitude.
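For readers who do not work with the C166 instruction set, the same loop can be sketched in portable C. This is an illustrative model under stated assumptions, not the author's routine: the function names are hypothetical, the inverse gain is applied to both components (so that the vector is scaled uniformly), `sat16` stands in for reading the accumulator through MAS, and `>>` on negative values is assumed to be an arithmetic shift, as it is on mainstream compilers.

```c
#include <stdint.h>

#define CORDIC_ITER 15
#define K_INV_Q15   19898            /* 0.60725 in 1Q15 (4DBAh) */

/* Saturate a wider intermediate to 16 bits. */
static int16_t sat16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Magnitude of x + jy; inputs are 1Q15, the result is |C| / 2 in 1Q15. */
int16_t cordic_mag_q15(int16_t x, int16_t y)
{
    /* Halve the inputs for headroom, then remove the CORDIC gain up front. */
    int32_t xs = ((int32_t)(x / 2) * K_INV_Q15) / 32768;
    int32_t ys = ((int32_t)(y / 2) * K_INV_Q15) / 32768;

    /* Pre-rotate by -90 or +90 degrees so that the I component is >= 0. */
    int16_t i, q;
    if (ys > 0) { i = (int16_t)ys;    q = (int16_t)(-xs); }
    else        { i = (int16_t)(-ys); q = (int16_t)xs;    }

    for (int k = 0; k < CORDIC_ITER; k++) {
        int16_t i_tmp = i;
        if (q >= 0) {                       /* micro rotation 1 */
            i = sat16(i + (q >> k));
            q = sat16(q - (i_tmp >> k));
        } else {                            /* micro rotation 2 */
            i = sat16(i - (q >> k));
            q = sat16(q + (i_tmp >> k));
        }
    }
    return i;                               /* Z is never needed, so it is not tracked */
}
```

For example, the inputs 0.6 and 0.8 (19661 and 26214 in 1Q15) have magnitude 1.0, so the routine returns a value close to 16384, which is 0.5 in 1Q15.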

Table 2 Output

Table 3 Cycle count

Table 4 Code Size (Bytes)

Computation of Sine and Cosine for an input angle

The sine and cosine of the input angle are calculated using CORDIC in rotation mode. If the initial Y component of the rotation transform is set to zero, the rotation mode reduces to:

$$X_n = K X_0 \cos Z_0, \qquad Y_n = K X_0 \sin Z_0$$

where K is the CORDIC gain. Setting the initial X component to 0.60725 (the inverse of the gain) cancels K, so the rotation produces the sine and cosine terms directly, with no further scaling. Since the rotation angle is limited to the range -90° to +90°, additional rotation is required for larger angles. This is done by exploiting the symmetry of the sine wave.

The values in the other quadrants are computed using the relations sin(-Z) = -sin(Z) and sin(180° - Z) = sin(Z). First the absolute value of the normalized input is calculated. If the input is negative (quadrant III/IV), then sign = 1. If the absolute value of the input is greater than 1/2 (quadrant II/III), it is subtracted from 1. If sign = 1, the result is negated to give the final sine result.

Implementation.

The input vector Z contains the angle in radians in the range [-π, π], normalized to (-1, 1) in 1Q15 format (Z = Z_rad/π). For example, 45° = π/4 is equivalent to Z = 1/4 = 0.25 (8192 in 1Q15 format). Denormalization is done in the algorithm.
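The normalization and its inverse can be sketched as two small helpers. Both function names are hypothetical, and the constant for π in 4Q12 is an assumption of this sketch, computed as round(π · 4096) = 12868 (3244h).

```c
#include <stdint.h>

#define PI_Q12 12868    /* assumed: round(pi * 4096), pi in 4Q12 */

/* Normalize an angle in degrees, in [-180, 180), to 1Q15 units of pi. */
int16_t angle_deg_to_q15(double deg)
{
    double scaled = deg / 180.0 * 32768.0;
    if (scaled >= 32767.0)  return INT16_MAX;   /* saturate at the format limits */
    if (scaled <= -32768.0) return INT16_MIN;
    return (int16_t)(scaled < 0.0 ? scaled - 0.5 : scaled + 0.5);
}

/* Denormalize: 1Q15 fraction of pi -> radians in 4Q12. */
int16_t q15_to_rad_q12(int16_t z_q15)
{
    return (int16_t)(((int32_t)z_q15 * PI_Q12) / 32768);
}
```

With these helpers, 45° maps to 8192 in 1Q15, and denormalizing 8192 gives 3217, which is π/4 in 4Q12.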

The algorithm is presented in 'C'-like pseudo code. Note that the cos Θ constant for this algorithm is 0.60725. We also assume that the 12 values of tan^-1(1/2^i) are stored as a lookup table in 4Q12 format.

Using the C166S V2 instruction set, the micro rotations according to Equation (7) are given below as a reference implementation. A fixed-point number representation is used. The registers R1, R2, R11 and R13 hold X, Y, Z and the shift value, respectively.

Label3: 
    MOV     R12,#0h
    MOV     R3,R1               ; I_tmp = I
    MOV     R5,R11              ; rotation mode: direction depends on Z
    ASHR    R5,#0fh             ; R5 = 0 if Z >= 0, -1 if Z < 0
    CMP     R5,#0
    JMPA    cc_Z,Label1         ; Z non-negative: micro rotation 2

; Micro rotation 1 (Z negative)

;I=I+(Q>>K) 
    MOV     R7,R2
    ASHR    R7,R13
    MOV     MAH,R1
    CoADD   R12,R7
    CoSTORE R1,MAS
;Q=Q-(I_tmp>>K) 
    ASHR    R3,R13
    MOV     MAH,R2
    CoSUB   R12,R3
    CoSTORE R2,MAS
;Z=Z+atan(K[L]) 
    MOV     MAH,R11
    CoADD   R12,[R4+]
    CoSTORE R11,MAS
    JMPA    cc_UC,Label2        ; skip micro rotation 2

; Micro rotation 2 (Z non-negative)

Label1: 
;I=I-(Q>>K) 
    MOV     R7,R2
    ASHR    R7,R13
    MOV     MAH,R1
    CoSUB   R12,R7
    CoSTORE R1,MAS
;Q=Q+(I_tmp>>K) 
    ASHR    R3,R13
    MOV     MAH,R2
    CoADD   R12,R3
    CoSTORE R2,MAS
;Z=Z-atan(K[L]) 
    MOV     MAH,R11
    CoSUB   R12,[R4+]
    CoSTORE R11,MAS

Label2: 
    ADD     R13,#1h             ; next shift value
    CMPD1   R6,#0h              ; loop counter
    JMPR    cc_NZ,Label3

Unlike the complex magnitude routine, the angle information cannot be neglected here, since the direction of rotation depends on the Z parameter. To denormalize, the input is multiplied by π in 4Q12 format (3244h). The lookup table values (tan^-1(1/2^i)) therefore also have to be in 4Q12 format, which reduces the number of iterations to 12.

Pseudo code.

Input vector Z is initialized to the desired angle, Y = 0 and X = 0.60725. Initializing X to the constant 0.60725 accounts for the cos Θ term.

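 {
    short L;
    short I, Q;         // I = X component, Q = Y component
    short Z;
    short tmp_I;
    short sign = 0;     // 1 if the input angle is negative
    short fold = 0;     // 1 if the angle was folded into the first quadrant

    Q = 0;
    I = 0.60725;        // inverse CORDIC gain

    Z = P;              // normalized input, P = angle/pi

    // If the input is in quadrant III/IV
    if (Z < 0) {
        sign = 1;
        Z = -Z;
    }

    // If the input is in quadrant II/III (extension of region of convergence)
    if (Z > 0.5) {
        Z = 1 - Z;      // sin(180 - Z) = sin(Z)
        fold = 1;
    }

    Z = Z * pi;         // denormalization of input

    // CORDIC rotation
    for (L = 0; L < 12; L++) {
        tmp_I = I;
        if (Z < 0.0) {
            I += Q >> L;
            Q -= tmp_I >> L;
            Z = Z + atan(2^-L);   // value of atan(2^-L) is stored in lookup table
        } else {
            I -= Q >> L;
            Q += tmp_I >> L;
            Z = Z - atan(2^-L);
        }
    }

    if (sign == 1) {
        Q = -Q;         // sin(-Z) = -sin(Z)
    }
    if (fold == 1) {
        I = -I;         // cos(180 - Z) = -cos(Z)
    }
}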

The cosine of the desired angle is now present in the variable I and the sine of the desired angle is in the variable Q. These outputs are within the integer range -32768 to +32767.
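A compilable fixed-point version of the pseudo code might look like the sketch below. It is an illustration under stated assumptions rather than the library routine: the function name and signature are hypothetical, the lookup table holds round(atan(2^-i) · 4096) for i = 0..11, π in 4Q12 is taken as 12868 (3244h), 32-bit intermediates are used with saturation only at the end, and `>>` on negative values is assumed to be an arithmetic shift.

```c
#include <stdint.h>

#define CORDIC_ITER 12
#define K_INV_Q15   19898     /* 0.60725 in 1Q15 (4DBAh) */
#define PI_Q12      12868     /* assumed: round(pi * 4096), pi in 4Q12 */

/* atan(2^-i) in radians, 4Q12 format, for i = 0..11. */
static const int16_t atan_q12[CORDIC_ITER] = {
    3217, 1899, 1003, 509, 256, 128, 64, 32, 16, 8, 4, 2
};

static int16_t sat16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* p is the angle divided by pi in 1Q15; outputs are sine and cosine in 1Q15. */
void cordic_sincos_q15(int16_t p, int16_t *sin_out, int16_t *cos_out)
{
    int negate_sin = 0, negate_cos = 0;
    int32_t z = p;

    if (z < 0)     { negate_sin = 1; z = -z; }         /* sin(-Z) = -sin(Z) */
    if (z > 16384) { negate_cos = 1; z = 32768 - z; }  /* fold about 90 degrees */

    z = (z * PI_Q12) / 32768;        /* denormalize to radians in 4Q12 */

    int32_t i = K_INV_Q15;           /* X starts at the inverse gain */
    int32_t q = 0;                   /* Y starts at zero */
    for (int l = 0; l < CORDIC_ITER; l++) {
        int32_t tmp_i = i;
        if (z < 0) {
            i += q >> l;
            q -= tmp_i >> l;
            z += atan_q12[l];
        } else {
            i -= q >> l;
            q += tmp_i >> l;
            z -= atan_q12[l];
        }
    }

    *sin_out = sat16(negate_sin ? -q : q);
    *cos_out = sat16(negate_cos ? -i : i);
}
```

For an input of 8192 (45°) both outputs land within a few counts of 23170, which is 0.7071 in 1Q15.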

Table 5 Output

Table 6 Cycle count

Table 7 Code Size (Bytes)

Numerical Error in CORDIC

The error in CORDIC has two components: approximation error and truncation error (Equation 1). A theoretical realization of CORDIC has infinitely many iterations, which produces the exact result, but a practical implementation has a finite number of iterations.

This is the cause of the approximation error. In general, the CORDIC algorithm produces one additional bit of accuracy per iteration. The truncation error is due to the finite word length. For example, consider a fixed-point representation of 5 bits, with the lower-order 3 bits after the binary point. If x_i = 1.2345678, then the approximate representation of this number is 01001. Hence Q[x_i] = 0 * 2^1 + 1 * 2^0 + 0 * 2^-1 + 0 * 2^-2 + 1 * 2^-3 = 1 + 0.125 = 1.125.

Hence the quantization error due to finite word length is

E_i = 1.2345678 - 1.125 = 0.1095678 < 2^-3 (0.125).

Due to these errors the precision of CORDIC is affected. A Sine wave of 21 samples with input frequency 50 Hz and sampling frequency of 4000 Hz is generated using Ideal CORDIC and the implemented CORDIC.

Figure 3: The Magnitude of the difference between Ideal CORDIC and implemented CORDIC

The magnitude of the difference between the Ideal CORDIC and the implemented CORDIC is shown in Figure 3 above. The X axis represents the input angle which is normalized and the Y axis represents the approximation and the truncation error.

Acknowledgements

Thanks to Samuel Ginsberg and Richard Armstrong for helping me to understand the concept. Thanks to Manoj Palat, Raghunath lolur, and Sonali Nath for their valuable ideas and code review.

References
1) Ray Andraka, "A Survey of CORDIC Algorithms for FPGA Based Computers," Proceedings of the 1998 ACM/SIGDA Sixth International Symposium on Field Programmable Gate Arrays, pp. 191-200, 1998.

2) Y. H. Hu, "The quantization effects of the CORDIC algorithm," IEEE Trans. Signal Processing, pp. 834-844, Apr. 1992.

3) Sang Yoon Park and Nam Ik Cho, "Fixed point error analysis of CORDIC processor based on the variance propagation," IEEE Transactions on Circuits and Systems I-Fundamental Theory and Applicat, vol. 51 no. 3 pp.573-584, Mar. 2004

4) Samuel Ginsberg, "Compact and Efficient Generation of Trigonometric Functions using a CORDIC algorithm"

5) CORDIC FAQ

6) Infineon Technologies, C166S V2 User manual, 16-Bit Microcontroller V 1.7

Christober Rayappan is a software engineer at Infineon Technologies, with a particular focus on the design and development of DSP Algorithms in embedded processors and FPGAs, and finite word length effects in signal processing. He is currently involved in the development of DSP libraries for Infineon 16-bit microcontrollers.