
Efficiently Raising Matrices to an Integer Power



Avoiding redundant computations

Victor J. Duvanenko

Victor is a member of the technical staff at TrueVision, where he develops videographics products for PCs. He can be reached at 7340 Shadeland Station, Indianapolis, IN 46256 or by e-mail, [email protected].


Once in a great while, a matrix, a polynomial, or something else complicated must be raised to an integer power. Let's say the value of the power is N. Let's also give a name to the item that needs to be raised to a power--call it M. A straightforward way to do this would be to perform (N-1) matrix multiplications by M (that is, M*M*M*...*M*M). It is possible, however, to get the same answer in at most 2 log2(N) matrix multiplications.

If you parenthesize pairs of Ms, you can see that much of the work is redundant (that is, (M*M)*(M*M)*(M*M)*...*(M*M)). In other words, the (M*M) term could be computed once and then used repeatedly. It also follows that ((M*M)*(M*M)) terms, or pairs of pairs, could be computed once and then used. The point is that there is much redundant work present when straightforward multiplication is used.

For the sake of argument, let's assume that N = 16. Working backwards, we get the results in Example 1(a). In other words, the answer can be obtained by squaring M (a matrix multiplication), then squaring the result, squaring the result again, and squaring it once more. This requires only 4 matrix multiplications instead of the 15 needed in the straightforward method. It is easy to see that the growth rate of the squaring method is logarithmic: computing M^32 requires 5 matrix multiplies (squarings), while computing M^64 requires 6. Therefore, the larger the power, the greater the computational savings.
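
Written out, the chain of squarings for N = 16 is:

    M^16 = (M^8)^2 = ((M^4)^2)^2 = (((M^2)^2)^2)^2

Reading from the innermost parentheses outward: square M to get M^2, square that to get M^4, then M^8, then M^16 -- four squarings in all.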

The example above worked out nicely because N was a power of 2. A bit more work is needed when N is not a power of 2 (that is, N = 5 or 7 or 9). These cases are simple to handle once you realize that N can be expressed as a sum of numbers that are all powers of 2. For example, five equals one plus four, and both one and four are powers of 2 (5 = 2^0 + 2^2). The same holds for seven (7 = 2^0 + 2^1 + 2^2) and nine (9 = 2^0 + 2^3). The equations in Example 1(b) show this principle applied to matrix squaring.
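
In the same spirit, each of these decompositions uses only factors that come out of one chain of repeated squarings of M:

    M^5 = M^4 * M
    M^7 = M^4 * M^2 * M
    M^9 = M^8 * M

with M^2 = M*M, M^4 = (M^2)^2, and M^8 = (M^4)^2.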

So, when N is not a power of 2, the answer can be obtained by multiplying together several terms, each of which is M raised to a power of 2. This procedure needs to be performed systematically, in the form of a program.

If you consider N as a binary value, it is easy to see that the bit positions with a 1 in them indicate the presence of the corresponding power of 2, and 0s indicate the absence of that power of 2. For example, a nine can be represented in binary as 1001, which states that 2^0 and 2^3 are present, and 2^1 and 2^2 are not. Therefore, it is possible to look at the bit pattern of N and determine when multiplications need to be done.
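
A few lines of C make the bit scan concrete (this fragment is illustrative only and is not part of Listing One; the variable names are arbitrary):

    #include <stdio.h>

    /* Print the powers of 2 that sum to n by examining its bits. */
    int main( void )
    {
       int n = 9;                       /* 9 is 1001 in binary */
       int bit, first = 1;

       printf( "%d =", n );
       for( bit = 0; n != 0; bit++, n >>= 1 )
          if ( n & 1 ) {                /* bit 'bit' is set, so 2^bit is present */
             printf( first ? " 2^%d" : " + 2^%d", bit );
             first = 0;
          }
       printf( "\n" );                  /* prints: 9 = 2^0 + 2^3 */
       return 0;
    }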

The procedure of raising a matrix to an integer power is then as follows:

  1. Initialize the intermediate result matrix to an identity matrix (a 1 in the matrix world). Determine the number of bits that N can possibly have set by computing ceil(log2(N + 1)). This bounds the number of squarings that must be done, because it locates the highest bit of N that is set to 1.
  2. If bit 0 of N is set, move M into the intermediate result matrix.
  3. Square M, and place it back in M. If bit 1 of N is set, multiply the intermediate result matrix by M (which is now M^2).
  4. Square M again, and place it back in M. If bit 2 of N is set, multiply the intermediate result matrix by M (which is now M^4). Continue until you reach the highest bit of N set to 1 (at most, ceil(log2(N + 1)) times).
  5. The intermediate result matrix holds the answer.
This algorithm is easily implemented. The algorithm runs in about 2 log2(N) time, because it takes about log2(N) squarings (each squaring being a matrix multiplication), plus at most log2(N) multiplications with the intermediate result (as N can have at most about log2(N) bits set to 1).
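
As a sketch (not part of Listing One), the five steps can be packaged into a general routine. It assumes the matrix_mult() and matrix_copy() helpers from Listing One and, for simplicity, reuses that listing's fixed dimension N for its scratch space:

    /* Raise the n x n matrix m to the integer power 'power' (power >= 0),
       leaving the answer in 'result'.  Assumes n <= N from Listing One. */
    void matrix_power( m, power, result, n )
    double *m, *result;
    int power, n;
    {
       double base[N * N], square[N * N], product[N * N];
       int i, j;

       /* Step 1: start the result as the identity matrix. */
       for( i = 0; i < n; i++ )
          for( j = 0; j < n; j++ )
             *( result + i * n + j ) = ( i == j ) ? 1.0 : 0.0;

       matrix_copy( m, base, n );        /* base holds M, then M^2, M^4, ... */

       while( power > 0 ) {
          if ( power & 1 ) {             /* this power of 2 is present */
             matrix_mult( result, base, product, n );
             matrix_copy( product, result, n );
          }
          matrix_mult( base, base, square, n );   /* square for the next bit */
          matrix_copy( square, base, n );
          power >>= 1;                   /* move on to the next bit */
       }
    }

Listing One's fibonacci_log() does the same thing, except that it copies M into the result for bit 0 instead of multiplying by the identity, and it counts the bits of N up front so that no squaring is wasted after the highest set bit.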

Application

One application of this method is shown in Computer Algorithms: Introduction to Design and Analysis (see References), which describes a method of computing Fibonacci numbers using matrix multiplication. Fibonacci numbers are defined as in Example 1(c). A lesser-known matrix-based Fibonacci method is shown in Example 1(d).
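
In outline, the recurrence is the familiar F(0) = 0, F(1) = 1, F(n) = F(n-1) + F(n-2) for n >= 2, and the matrix relation it leads to (with M initialized as in fibonacci_log of Listing One) is:

    | 1  1 |^n     | 1 |     | F(n+1) |
    |      |    *  |   |  =  |        |
    | 1  0 |       | 0 |     | F(n)   |

Equivalently, the upper-left element of M^(n-1) is F(n), which is the entry fibonacci_log() actually returns.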

In other words, any Fibonacci number can be computed by first raising the matrix M to the appropriate power and then multiplying it by the column vector (1, 0). Because a matrix can be raised to a power in 2 log2(N) time, the algorithm to compute a Fibonacci number runs in order log2(N) time.

Three ways to compute Fibonacci numbers have been implemented and are shown in Listing One. First, the fibonacci procedure implements the straightforward method, which simply remembers the last two Fibonacci numbers and computes the next. This method takes (n - 1) additions to compute fibonacci(n), which is linear running time. Double-precision numbers are used because Fibonacci numbers grow very quickly and overflow even long integers. The second method is concise, recursive, and inefficient (see procedure fibonacci_recursive); its running time grows exponentially, on the order of φ^n, where φ ≈ 1.6. The third method is the matrix-raised-to-a-power method described above (shown in procedure fibonacci_log).

The main portion of the program allows you to compute Fibonacci numbers using the three available methods, and to experience the running time differences. To prove that the matrix method is the fastest, the three methods were timed on an 8-MHz PC/AT with a math coprocessor. Table 1 shows the running times to compute a Fibonacci number 1000 times.

Table 1: Running times to compute a Fibonacci number 1000 times

      Fibonacci        Linear         Matrix
       Number         Algorithm      Algorithm
                    (time in sec)  (time in sec)
   ----------------------------------------------
        1400            67.57          19.95
         700            33.56          17.63
         350            16.86          15.38
         175             8.52          14.18
          88             4.40          11.86
          44             2.20           9.73
          22             1.27           7.53

The recursive method was not timed because its running time grew so rapidly as to make it impractical. For example, computing Fibonacci(25) just once took the recursive method over 30 seconds, whereas the other methods were instantaneous. Table 1 shows that the running time of the linear algorithm is indeed linear. It is also clear that the running time of the matrix algorithm grows much more slowly, and it wins above n = 350 or so. Therefore, for computing large Fibonacci numbers, the matrix method is fastest.

A word of caution: Fibonacci numbers grow very rapidly. In fact, Fibonacci(1500) overflows the math coprocessor's capabilities and causes a math overflow error. One curious fact about Fibonacci numbers is that the ratio of two successive Fibonacci numbers, F(n)/F(n-1), approaches a number near 1.6 as n grows. This number, also called the Golden Ratio (φ), appears in many places such as architecture and recursive function theory. For instance, the number of recursive calls to compute F(n) (a Fibonacci number) is more than φ^n. Thus, computing Fibonacci numbers quickly may be useful for determining an accurate value of φ to 10,000 decimal places.
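
For reference, the number these ratios approach has the well-known closed form

    φ = (1 + sqrt(5)) / 2 ≈ 1.6180339887

which is why the recursive method's call count, and the Fibonacci numbers themselves, grow like φ^n.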

References

Baase, S. Computer Algorithms: Introduction to Design and Analysis. 2nd edition. Reading, Mass.: Addison-Wesley, 1988.

Hill, F.S. Jr. Computer Graphics. New York, N.Y.: Macmillan, 1990.




[LISTING ONE]

#include <stdio.h>
#include <math.h>
#include <string.h>

#define N        2

/* Procedure to multiply two square matrices */
void matrix_mult( mat1, mat2, result, n )
double *mat1, *mat2;   /* pointers to the matrices to be multiplied */
double *result;        /* pointer to the result matrix */
int  n;                /* n x n matrices */
{
   register int i, j, k;
   for( i = 0; i < n; i++ )
      for( j = 0; j < n; j++, result++ ) {
         *result = 0.0;
         for( k = 0; k < n; k++ )
            *result += *( mat1 + i * n + k ) * *( mat2 + k * n + j );
          /* result[i][j] += mat1[i][k] * mat2[k][j]; */
      }
}
/* Procedure to copy square matrices quickly.  Assumes the elements are
   double precision floating-point type. */
void matrix_copy( src, dest, n )
double *src, *dest;
int  n;
{
   memcpy( (void *)dest, (void *)src, (unsigned)( n * n * sizeof( double )));
}
/* Procedure to compute Fibonacci numbers in linear time */
double fibonacci( n )
int n;
{
   register int i;
   double fib_n, fib_n1, fib_n2;

   if ( n <= 0 )   return( 0.0 );
   if ( n == 1 )   return( 1.0 );
   fib_n2 = 0.0;                   /* initial value of Fibonacci( n - 2 ) */
   fib_n1 = 1.0;                   /* initial value of Fibonacci( n - 1 ) */
   for( i = 2; i <= n; i++ ) {
      fib_n = fib_n1 + fib_n2;
      fib_n2 = fib_n1;
      fib_n1 = fib_n;
   }
   return( fib_n );
}
/* Procedure to compute Fibonacci numbers recursively (inefficiently) */
double fibonacci_recursive( n )
int n;
{
   if ( n <= 0 )   return( 0.0 );
   if ( n == 1 )   return( 1.0 );
   return( (double)( fibonacci_recursive(n - 1) + fibonacci_recursive(n - 2)));
}
/* Procedure to compute Fibonacci numbers in logarithmic time (fast!) */
double fibonacci_log( n )
int n;
{
   register int k, bit, num_bits;
   double a[2][2], b[2][2], c[2][2], d[2][2];
   if ( n <= 0 )   return( 0.0 );
   if ( n == 1 )   return( 1.0 );
   if ( n == 2 )   return( 1.0 );

   n--;                             /* need only a^(n-1) */
   a[0][0] = 1.0;  a[0][1] = 1.0;   /* initialize the Fibonacci matrix */
   a[1][0] = 1.0;  a[1][1] = 0.0;
   c[0][0] = 1.0;  c[0][1] = 0.0;   /* initialize the result to identity */
   c[1][0] = 0.0;  c[1][1] = 1.0;

   /* need to convert log bases as only log base e (or ln) is available */
   num_bits = ceil( log((double)( n + 1 )) / log( 2.0 ));

   /* Result will be in matrix 'c'. Result (c) == a if bit0 is 1. */
   bit = 1;
   if ( n & bit )   matrix_copy( a, c, N );
   for( bit <<= 1, k = 1; k < num_bits; k++ )   /* Do bit1 through num_bits. */
   {
      matrix_mult( a, a, b, N );    /* square matrix a; result in matrix b */
      if ( n & bit ) {              /* adjust the result */
         matrix_mult( b, c, d, N );
         matrix_copy( d, c, N );
      }
      matrix_copy( b, a, N );
      bit <<= 1;                    /* next bit */
   }
   return( c[0][0] );
}
int main()
{
   int  n;
   for(;;) {
      printf( "Enter the Fibonacci number to compute ( 0 to exit ): " );
      scanf( "%d", &n );
      if ( n == 0 )   break;
      printf("\nMatrix    method: Fibonacci( %d ) = %le\n",n,fibonacci_log(n));
      printf("\nLinear    method: Fibonacci( %d ) = %le\n",n,fibonacci(n));
      printf("\nRecursive method: Fibonacci( %d ) = %le\n",n,fibonacci_recursive(n));
   }
   return(0);
}


Copyright © 1991, Dr. Dobb's Journal

