# Optimization & Fixed-Point Iteration

Dec01: Algorithm Alley

Shehrzad is a software engineer for picoLiter Inc. He can be contacted at shehrzad_q@hotmail.com.

Matlab is a programming language and IDE that I find ideal for developing numerical algorithms. However, programmers accustomed to C-style arrays and C idioms need to adjust their mindset when writing efficient Matlab code. In particular, the use of vector operations is paramount for optimal performance. In this article, I share my experiences implementing numerical methods in Matlab, describe how I transformed the code from a C/C++-style algorithm into vectorized Matlab code, and show the performance gains resulting from this transformation. The complete source code (in Matlab .m files) illustrating these points is available electronically; see "Resource Center," page 5.

### Mathematical Preliminaries

The solution to a system of nonlinear equations can be approximated through the use of iterative numerical techniques. A system such as this has the form of Figure 1(a). This system can be transformed by solving the ith equation for xi, obtaining Figure 1(b).

One method for solving systems such as this is known as "fixed-point iteration" (also known as "successive approximation"). Given an initial guess to the solution, this scheme works by successively feeding refined estimates to the solution back into the system. The generated sequence should eventually converge to the solution. Of course, the system in question must satisfy certain prerequisites, such as continuity and differentiability, to name two. (For a more in-depth discussion of this numerical technique, see Numerical Analysis, Fifth Edition, by Richard L. Burden and J. Douglas Faires, Brooks/Cole, 1993.)
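As a minimal one-dimensional sketch of the idea (in Python rather than Matlab, and not the article's two-variable system), the equation x = cos(x) can be solved by feeding each estimate back into cos until successive estimates agree:

```python
import math

def fixed_point(g, x0, tol=1e-10, max_it=1000):
    """Iterate x_{k+1} = g(x_k) until successive estimates agree to within tol."""
    x = x0
    for _ in range(max_it):
        x_next = g(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("fixed-point iteration did not converge")

# x = cos(x) has a unique fixed point near 0.739 (the Dottie number)
root = fixed_point(math.cos, 1.0)
```

The tolerance test on successive estimates is the usual stopping condition for schemes like this; convergence here is guaranteed because |cos'(x)| < 1 near the fixed point.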

### Fixed-Point Iteration

Recently, I came across just such a system that needed to be solved in a "quasi real-time" fashion; that is, it needed to run as fast as possible. I typically use Matlab from The MathWorks (http://www.mathworks.com/) to prototype numerical algorithms. The system in question had two free variables; call them x and y. The system is described in Figures 2(a) and 2(b), where Fij, an element of an N×M matrix, represents a sampled function; xi, an element of a 1×N vector, represents the x-axis; and yj, an element of a 1×M vector, represents the y-axis.

To solve for x and y, I utilized the fixed-point iteration scheme, using the initial guess in Figures 3(a) and 3(b). Given F, x, and y, my first cut at a Matlab implementation was coded up in a C-style fashion, forgoing the use of any vectorized operations. Listing One presents the M-file fixed_pt_iteration_1.m. Typically, you use the difference between two successive iterations as a stopping condition. That is, if the difference between the current expression for the approximate solution and the result of the earlier iteration is less than some tolerance, the iterative procedure terminates. Since I am really interested in code optimization and profiling, for the purpose of this discussion, I simply let the algorithm iterate 100 times.

One of the first operations this function performs is to use the Matlab load command to read from the file variables.mat the three Matlab variables: F, x_axis, and y_axis.

• F is a 9×9 matrix representing a sampled function of two variables.
• x_axis is a 1×9 vector representing the "row dimension" of F.
• y_axis is a 1×9 vector representing the "column dimension" of F.

Next, the starting point is computed, and fixed-point iteration begins in earnest. The doubly nested loop evaluates the summation expressions given by the equations in Figures 2(a) and 2(b). The subfunction gaussian evaluates the two-dimensional gaussian expression in Figure 4.
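For readers outside Matlab, the gaussian subfunction's computation can be sketched in Python (a hypothetical analog; the function and variable names are mine, with sigma_term playing the same role as in Listing One):

```python
import math

def gaussian_weight(xy, axis_pts, sigma_term):
    """Separable 2-D Gaussian: product over both axes of exp(-d^2 / sigma)."""
    term = 1.0
    for k in range(2):
        d = xy[k] - axis_pts[k]
        term *= math.exp(-d * d / sigma_term)
    return term
```

The weight peaks at 1.0 when the current estimate coincides with the grid point and falls off with the squared distance along each axis.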

This implementation looks like a line-by-line translation of a C snippet. Nowhere does it take advantage of Matlab vector operations, which can greatly reduce total computation time.

### An Optimized Implementation

Listing Two is my optimized Matlab code. The fixed_pt_iteration_2 function starts out identical to the first version (the computation of the initial guess can be optimized as well, but since the performance bottleneck is in the main loop, I focus attention there). The first thing to notice is the elimination of the double for loop. The previous implementation cycled through each term of the summation expression given by Figures 2(a) and 2(b) one by one, continually incrementing the numerator and denominator variables. This is analogous to the += operator in C. What happens in the Matlab-optimized function is that matrices are formed, which are then multiplied together in an element-by-element fashion (not matrix multiplication), and finally the Matlab sum function is used to add up all of their elements.

The gaussian subfunction is no longer needed, as the exponential expressions are now evaluated in a vectorized fashion (the built-in Matlab function exp, like almost every other Matlab function, accepts a matrix as well as a scalar argument). To compute the expression given by the equation in Figure 4 without for loops, matrices are formed using the Matlab function repmat, which replicates a matrix in a tiling fashion. The matrix variables x_grid and y_grid replace the variables x_axis and y_axis. Likewise, the variable that stored the current approximate solution (fixed_pt_solution) has been replaced with the two matrices x_approx and y_approx, which are nothing more than 9×9 matrices filled with the current approximated values of x and y, respectively. With these variables in place, evaluating the summation terms in the equations of Figures 2(a), 2(b), 3(a), and 3(b) is straightforward: the matrices are multiplied together element by element, which is very different from matrix multiplication. Hence, the use of the ".*" operator is important; Figures 5(a) and 5(b) illustrate the difference between the "*" and ".*" operators.

Once the exponential terms have been evaluated, the sum function adds up the elements of the resultant matrices, the approximated values of x and y are updated accordingly, and the fixed-point iteration begins anew.
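The same loop-versus-vectorized contrast can be sketched in Python with NumPy, whose np.tile plays the role of repmat and whose * operator on arrays is element-wise, like Matlab's ".*" (matrix multiplication is the separate @ operator). The 3×3 data below is an arbitrary stand-in of my own, not the article's variables.mat:

```python
import numpy as np

# Small stand-ins for the article's data (values are arbitrary)
F = np.arange(1.0, 10.0).reshape(3, 3)
x_axis = np.array([0.0, 0.5, 1.0])   # "row dimension" of F
y_axis = np.array([0.0, 1.0, 2.0])   # "column dimension" of F
sigma = 0.125
x0, y0 = 0.4, 0.9                    # current approximate solution

# Loop version: the C-style approach of Listing One
num_x = num_y = den = 0.0
for i in range(3):
    for j in range(3):
        w = F[i, j] * np.exp(-(x0 - x_axis[i])**2 / sigma) \
                    * np.exp(-(y0 - y_axis[j])**2 / sigma)
        den += w
        num_x += x_axis[i] * w
        num_y += y_axis[j] * w

# Vectorized version: np.tile mirrors Matlab's repmat
x_grid = np.tile(x_axis[:, None], (1, 3))   # like repmat(x_axis', [1 N])
y_grid = np.tile(y_axis[None, :], (3, 1))   # like repmat(y_axis, [M 1])
w = F * np.exp(-(x0 - x_grid)**2 / sigma) * np.exp(-(y0 - y_grid)**2 / sigma)
vec = (np.sum(x_grid * w) / np.sum(w), np.sum(y_grid * w) / np.sum(w))
```

Both paths produce the same numerator and denominator sums; only the bookkeeping differs.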

### Performance Profiling

To illustrate the performance gains that fixed_pt_iteration_2 provides, I use the Matlab profile function, which, as its name suggests, collects profiling data. I have also provided a simple helper function that, given an integer (1 or 2) indicating which implementation to run and a repetition count N, calls the desired function N times and then spits out instrumentation information. Tables 1(a) and 1(b) are example runs of both functions on my aging 400-MHz PII laptop. What is astounding is that on my platform, fixed_pt_iteration_2 was on average 20 times faster than fixed_pt_iteration_1.
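Python's standard-library profiler follows the same instrument-then-report pattern; here is a sketch (the work function is a made-up toy kernel, not the article's code):

```python
import cProfile
import io
import pstats

def work(n=200):
    # toy numeric kernel to profile: partial sum of 1/i^2
    total = 0.0
    for i in range(1, n + 1):
        total += 1.0 / (i * i)
    return total

pr = cProfile.Profile()
pr.enable()          # analogous to Matlab's "profile on"
result = work()
pr.disable()         # analogous to "profile off"

# analogous to "profile report": dump the top entries sorted by cumulative time
buf = io.StringIO()
pstats.Stats(pr, stream=buf).sort_stats("cumulative").print_stats(5)
report = buf.getvalue()
```

As in Matlab, the report attributes time to each function call, which is how a hot spot like the doubly nested loop shows up.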

### Conclusion

This example speaks volumes for the importance of avoiding for loops wherever possible when programming in Matlab. Matlab makes extensive use of BLAS routines, which are highly tuned to an individual machine's memory hierarchy. As such, vector operations are always preferred over for loops. In fact, the performance gains from these vector operations are so significant that they can often drive algorithm selection. For example, the fixed-point iteration algorithm described in this article is technically an example of Jacobi fixed-point iteration. As we have seen, this algorithm is easily vectorized. An alternative method, Gauss-Seidel iteration, can be shown theoretically to converge faster than Jacobi iteration. However, because Gauss-Seidel iteration cannot utilize vector arithmetic, its total computation time is actually much longer (see Numerical Methods with MATLAB, by G.J. Borse, PWS Publishing, 1997). Examples such as this abound in numerical analysis and scientific programming. For C programmers such as myself, vector code may appear strange at first, but clearly, the performance gains are significant.
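To make the Jacobi-versus-Gauss-Seidel distinction concrete, here is a pure-Python sketch on a small diagonally dominant linear system (my own toy example, not the article's nonlinear system). A Jacobi sweep reads only the previous iterate, so the whole sweep can be expressed with vector arithmetic; Gauss-Seidel consumes each freshly updated component immediately, which makes it inherently sequential:

```python
# Solve A x = b for a diagonally dominant A (both methods converge here).
# Exact solution: x = [1/6, 1/3].
A = [[4.0, 1.0], [2.0, 5.0]]
b = [1.0, 2.0]

def jacobi_step(x):
    # every component is computed from the OLD iterate -> whole sweep vectorizes
    return [(b[i] - sum(A[i][j] * x[j] for j in range(2) if j != i)) / A[i][i]
            for i in range(2)]

def gauss_seidel_step(x):
    # each new component is used immediately -> inherently sequential
    x = list(x)
    for i in range(2):
        x[i] = (b[i] - sum(A[i][j] * x[j] for j in range(2) if j != i)) / A[i][i]
    return x

xj = [0.0, 0.0]
xg = [0.0, 0.0]
for _ in range(50):
    xj = jacobi_step(xj)
    xg = gauss_seidel_step(xg)
```

Both reach the same answer; the point is only that the Jacobi update has no within-sweep data dependence, which is exactly the property the vectorized Matlab implementation exploits.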

### Acknowledgment

Thanks to my colleague Mohan Bodduluri for proofreading a draft of this article, and to the fine people at Accuray who gave me the opportunity to explore interesting subjects such as this when I worked there.

DDJ

#### Listing One

```
function fixed_pt_iteration_1
% FIXED_PT_ITERATION_1   un-optimized fixed-point iteration
num_it = 100;     % # of iterations
num_vars = 2;
sigma_term = .125;

load variables.mat; % get F, x & y axes

% get starting point for fixed-pt iteration
fixed_pt_solution = get_initial_guess(x_axis, y_axis, F);

length_x_axis = length(x_axis);
length_y_axis = length(y_axis);

for n = 1:num_it
    denominator = 0;   % denominator of expression given by equations 2(a) & 2(b)
    numerator = [0 0]; % numerator of expression given by equations 2(a) & 2(b)

    for ii = 1:length_x_axis
        for jj = 1:length_y_axis
            xy = fixed_pt_solution; % [ xi yi ]
            axis_pts = [x_axis(ii) y_axis(jj)];
            term = F(ii,jj) * gaussian(xy, axis_pts, sigma_term);
            denominator = denominator + term;
            numerator(1) = numerator(1) + x_axis(ii) * term;
            numerator(2) = numerator(2) + y_axis(jj) * term;
        end
    end
    fixed_pt_solution = numerator ./ denominator;
end

function initial_guess = get_initial_guess(x_axis, y_axis, F)
%
% GET_INITIAL_GUESS - returns starting point for fixed point iteration
%                     evaluates expression given by 3(a) & 3(b)
x_initial = 0;
y_initial = 0;
[M N] = size(F);

for ii = 1:M
    for jj = 1:N
        x_initial = x_initial + x_axis(ii) * F(ii,jj);
        y_initial = y_initial + y_axis(jj) * F(ii,jj);
    end
end
initial_guess = [ x_initial y_initial ] ./ sum(F(:));

function term = gaussian(xy, axis_pts, sigma_term)
%
% GAUSSIAN - evaluates 2-dimensional gaussian given by equation in Fig. 4

term = 1.0;
for dim = 1:2
    term = term * exp(-(xy(dim) - axis_pts(dim)) * ...
                       (xy(dim) - axis_pts(dim)) / sigma_term);
end
```

#### Listing Two

```
function fixed_pt_iteration_2
% FIXED_PT_ITERATION_2   optimized fixed-point iteration

num_it = 100;     % # of iterations
num_vars = 2;
sigma_term = .125;

load variables.mat; % get F, x & y axes
[M N] = size(F);

% form x_grid by tiling x' along the column dimension
x_grid = repmat(x_axis', [1 N]);
% same for the y dimension, except tile along the row dimension
y_grid = repmat(y_axis, [M 1]);

% get starting point for fixed-pt iteration
fixed_pt_solution = get_initial_guess(x_axis, y_axis, F);

for n = 1:num_it
    % replicate the current estimates across the MxN (here 9x9) grid
    x_approx = repmat(fixed_pt_solution(1), [M N]);
    y_approx = repmat(fixed_pt_solution(2), [M N]);

    x_exp = exp( -((x_approx-x_grid).*(x_approx-x_grid)) ./ sigma_term );
    y_exp = exp( -((y_approx-y_grid).*(y_approx-y_grid)) ./ sigma_term );

    term1 = F .* x_exp .* y_exp;
    denominator = sum(term1(:));

    term2_x = x_grid .* term1;
    term2_y = y_grid .* term1;

    numerator = [ sum(term2_x(:)) sum(term2_y(:)) ];

    fixed_pt_solution = numerator ./ denominator;
end

function initial_guess = get_initial_guess(x_axis, y_axis, F)
%
% GET_INITIAL_GUESS - returns starting point for fixed point iteration
%                     evaluates expression given by 3(a) & 3(b)
%
x_initial = 0;
y_initial = 0;
[M N] = size(F);

for ii = 1:M
    for jj = 1:N
        x_initial = x_initial + x_axis(ii) * F(ii,jj);
        y_initial = y_initial + y_axis(jj) * F(ii,jj);
    end
end
initial_guess = [ x_initial y_initial ] ./ sum(F(:));
```

#### Listing Three

```
function profile_fixed_pt(which_implementation, N)
% PROFILE_FIXED_PT   demonstrates use of the Matlab profiler
%    PROFILE_FIXED_PT(which_implementation, N) runs the function indicated
%    by which_implementation N times.
%    Use '1' for fixed_pt_iteration_1 (slow implementation)
%        '2' for fixed_pt_iteration_2 (optimized implementation)

if which_implementation == 1
    fun_str = 'fixed_pt_iteration_1';
elseif which_implementation == 2
    fun_str = 'fixed_pt_iteration_2';
else
    error('Must specify 1 or 2');
end

profile on;
for ii = 1:N
    feval(fun_str);
end
profile off;
profile report;
```
