
Efficiently Raising Matrices to an Integer Power



Avoiding redundant computations

Victor J. Duvanenko

Victor is a member of the technical staff at TrueVision, where he develops videographics products for PCs. He can be reached at 7340 Shadeland Station, Indianapolis, IN 46256 or by e-mail, [email protected].


Once in a great while, a matrix, a polynomial, or something else complicated must be raised to an integer power. Let's say the value of the power is N. Let's also give a name to the item that needs to be raised to a power--call it M. A straightforward way to do this would be to perform (N-1) matrix multiplications by M (that is, M*M*M*...*M*M). It is possible, however, to get the same answer in at most 2 log2(N) matrix multiplications.

If you parenthesize pairs of Ms, you can see that much of the work is redundant (that is, (M*M)*(M*M)*(M*M)*...*(M*M)). In other words, the (M*M) term could be computed once and then used repeatedly. It also follows that ((M*M)*(M*M)) terms, or pairs of pairs, could be computed once and then used. The point is that there is much redundant work present when straightforward multiplication is used.

For the sake of argument, let's assume that N = 16. Working backwards, we get the results in Example 1(a). In other words, the answer can be obtained by squaring M (a matrix multiplication), then squaring the result, squaring the result again, and squaring it once more. This requires only 4 matrix multiplications instead of the 15 needed in the straightforward method. It is easy to see that the growth rate of the squaring method is logarithmic: computing M^32 requires 5 matrix multiplies (squarings), while computing M^64 requires 6. Therefore, the larger the power, the greater the computational savings.
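
Written out, the chain of squarings for N = 16 is:

    M^16 = (M^8)^2 = ((M^4)^2)^2 = (((M^2)^2)^2)^2

Reading from the innermost parentheses outward: square M to get M^2, square that to get M^4, then M^8, then M^16 -- four squarings in all.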

The example above worked out nicely because N was a power of 2. A bit more work is needed when N is not a power of 2 (that is, N = 5 or 7 or 9). These cases are simple to handle once you realize that N can be expressed as a sum of numbers that are all powers of 2. For example, five equals one plus four, and both one and four are powers of 2 (5 = 2^0 + 2^2). The same holds for seven (7 = 2^0 + 2^1 + 2^2) and nine (9 = 2^0 + 2^3). The equations in Example 1(b) show this principle applied to matrix squaring.
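
In the same spirit, each of these decompositions uses only factors that come out of one chain of repeated squarings of M:

    M^5 = M^4 * M
    M^7 = M^4 * M^2 * M
    M^9 = M^8 * M

with M^2 = M*M, M^4 = (M^2)^2, and M^8 = (M^4)^2.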

So, when N is not a power of 2, the answer can be obtained by multiplying together several terms, each of which is M raised to a power of 2. This procedure needs to be performed systematically, in the form of a program.

If you consider N as a binary value, it is easy to see that the bit positions with a 1 in them indicate the presence of the corresponding power of 2, and 0s indicate the absence of that power of 2. For example, a nine can be represented in binary as 1001, which states that 2^0 and 2^3 are present, and 2^1 and 2^2 are not. Therefore, it is possible to look at the bit pattern of N and determine when multiplications need to be done.
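
A few lines of C make the bit scan concrete (this fragment is illustrative only and is not part of Listing One; the variable names are arbitrary):

    #include <stdio.h>

    /* Print the powers of 2 that sum to n by examining its bits. */
    int main( void )
    {
       int n = 9;                       /* 9 is 1001 in binary */
       int bit, first = 1;

       printf( "%d =", n );
       for( bit = 0; n != 0; bit++, n >>= 1 )
          if ( n & 1 ) {                /* bit 'bit' is set, so 2^bit is present */
             printf( first ? " 2^%d" : " + 2^%d", bit );
             first = 0;
          }
       printf( "\n" );                  /* prints: 9 = 2^0 + 2^3 */
       return 0;
    }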

The procedure of raising a matrix to an integer power is then as follows:

  1. Initialize the intermediate result matrix to an identity matrix (a 1 in the matrix world). Determine the number of bits that N can possibly have set by computing ceil(log2(N + 1)). This bounds the number of squarings that must be done, because it locates the highest bit of N that is set to 1.
  2. If bit 0 of N is set, move M into the intermediate result matrix.
  3. Square M, and place it back in M. If bit 1 of N is set, multiply the intermediate result matrix by M (which is now M^2).
  4. Square M again, and place it back in M. If bit 2 of N is set, multiply the intermediate result matrix by M (which is now M^4). Continue until you reach the highest bit of N set to 1 (at most, ceil(log2(N + 1)) times).
  5. The intermediate result matrix holds the answer.
This algorithm is easily implemented. The algorithm runs in about 2 log2(N) time, because it takes about log2(N) squarings (each squaring being a matrix multiplication), plus at most log2(N) multiplications with the intermediate result (as N can have at most about log2(N) bits set to 1).
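
As a sketch (not part of Listing One), the five steps can be packaged into a general routine. It assumes the matrix_mult() and matrix_copy() helpers from Listing One and, for simplicity, reuses that listing's fixed dimension N for its scratch space:

    /* Raise the n x n matrix m to the integer power 'power' (power >= 0),
       leaving the answer in 'result'.  Assumes n <= N from Listing One. */
    void matrix_power( m, power, result, n )
    double *m, *result;
    int power, n;
    {
       double base[N * N], square[N * N], product[N * N];
       int i, j;

       /* Step 1: start the result as the identity matrix. */
       for( i = 0; i < n; i++ )
          for( j = 0; j < n; j++ )
             *( result + i * n + j ) = ( i == j ) ? 1.0 : 0.0;

       matrix_copy( m, base, n );        /* base holds M, then M^2, M^4, ... */

       while( power > 0 ) {
          if ( power & 1 ) {             /* this power of 2 is present */
             matrix_mult( result, base, product, n );
             matrix_copy( product, result, n );
          }
          matrix_mult( base, base, square, n );   /* square for the next bit */
          matrix_copy( square, base, n );
          power >>= 1;                   /* move on to the next bit */
       }
    }

Listing One's fibonacci_log() does the same thing, except that it copies M into the result for bit 0 instead of multiplying by the identity, and it counts the bits of N up front so that no squaring is wasted after the highest set bit.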

Application

One application of this method is shown in Computer Algorithms: Introduction to Design and Analysis (see References), which describes a method of computing Fibonacci numbers using matrix multiplication. Fibonacci numbers are defined as in Example 1(c). A lesser-known matrix-based Fibonacci method is shown in Example 1(d).
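
In outline, the recurrence is the familiar F(0) = 0, F(1) = 1, F(n) = F(n-1) + F(n-2) for n >= 2, and the matrix relation it leads to (with M initialized as in fibonacci_log of Listing One) is:

    | 1  1 |^n     | 1 |     | F(n+1) |
    |      |    *  |   |  =  |        |
    | 1  0 |       | 0 |     | F(n)   |

Equivalently, the upper-left element of M^(n-1) is F(n), which is the entry fibonacci_log() actually returns.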

In other words, any Fibonacci number can be computed by first raising the matrix M to the appropriate power and then multiplying it by the column vector (1, 0). Because a matrix can be raised to a power in 2 log2(N) time, the algorithm to compute a Fibonacci number runs in order log2(N) time.

Three ways to compute Fibonacci numbers have been implemented and are shown in Listing One. First, the fibonacci procedure implements the straightforward method, which simply remembers the last two Fibonacci numbers and computes the next. This method takes (n - 1) additions to compute fibonacci(n), which is linear running time. Double-precision numbers are used because Fibonacci numbers grow very quickly and overflow even long integers. The second method is concise, recursive, and inefficient (see procedure fibonacci_recursive); its running time grows exponentially, on the order of φ^n, where φ ≈ 1.6. The third method is the matrix-raised-to-a-power method described above (shown in procedure fibonacci_log).

The main portion of the program allows you to compute Fibonacci numbers using the three available methods, and to experience the running time differences. To prove that the matrix method is the fastest, the three methods were timed on an 8-MHz PC/AT with a math coprocessor. Table 1 shows the running times to compute a Fibonacci number 1000 times.

Table 1: Running times to compute a Fibonacci number 1000 times

      Fibonacci        Linear         Matrix
       Number         Algorithm      Algorithm
                    (time in sec)  (time in sec)
   ----------------------------------------------
        1400            67.57          19.95
         700            33.56          17.63
         350            16.86          15.38
         175             8.52          14.18
          88             4.40          11.86
          44             2.20           9.73
          22             1.27           7.53

The recursive method was not timed because its running time grew so rapidly as to make it impractical. For example, computing Fibonacci(25) just once took the recursive method over 30 seconds, whereas the other methods were instantaneous. Table 1 shows that the running time of the linear algorithm is indeed linear. It is also clear that the running time of the matrix algorithm grows much more slowly, and it wins above n = 350 or so. Therefore, for computing large Fibonacci numbers, the matrix method is fastest.

A word of caution: Fibonacci numbers grow very rapidly. In fact, Fibonacci(1500) overflows the math coprocessor's capabilities and causes a math overflow error. One curious fact about Fibonacci numbers is that the ratio of two successive Fibonacci numbers, F(n)/F(n-1), approaches a number near 1.6 as n grows. This number, also called the Golden Ratio (φ), appears in many places such as architecture and recursive function theory. For instance, the number of recursive calls to compute F(n) (a Fibonacci number) is more than φ^n. Thus, computing Fibonacci numbers quickly may be useful for determining an accurate value of φ to 10,000 decimal places.
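
For reference, the number these ratios approach has the well-known closed form

    φ = (1 + sqrt(5)) / 2 ≈ 1.6180339887

which is why the recursive method's call count, and the Fibonacci numbers themselves, grow like φ^n.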

References

Baase, S. Computer Algorithms: Introduction to Design and Analysis. 2nd edition. Reading, Mass.: Addison-Wesley, 1988.

Hill, F.S. Jr. Computer Graphics. New York, N.Y.: Macmillan, 1990.




[LISTING ONE]

#include <stdio.h>
#include <math.h>
#include <string.h>

#define N        2

/* Procedure to multiply two square matrices */
void matrix_mult( mat1, mat2, result, n )
double *mat1, *mat2;   /* pointers to the matrices to be multiplied */
double *result;        /* pointer to the result matrix */
int  n;                /* n x n matrices */
{
   register int i, j, k;
   for( i = 0; i < n; i++ )
      for( j = 0; j < n; j++, result++ ) {
         *result = 0.0;
         for( k = 0; k < n; k++ )
            *result += *( mat1 + i * n + k ) * *( mat2 + k * n + j );
          /* result[i][j] += mat1[i][k] * mat2[k][j]; */
      }
}
/* Procedure to copy square matrices quickly.  Assumes the elements are
   double precision floating-point type. */
void matrix_copy( src, dest, n )
double *src, *dest;
int  n;
{
   memcpy( (void *)dest, (void *)src, (unsigned)( n * n * sizeof( double )));
}
/* Procedure to compute Fibonacci numbers in linear time */
double fibonacci( n )
int n;
{
   register int i;
   double fib_n, fib_n1, fib_n2;

   if ( n <= 0 )   return( 0.0 );
   if ( n == 1 )   return( 1.0 );
   fib_n2 = 0.0;                   /* initial value of Fibonacci( n - 2 ) */
   fib_n1 = 1.0;                   /* initial value of Fibonacci( n - 1 ) */
   for( i = 2; i <= n; i++ ) {
      fib_n = fib_n1 + fib_n2;
      fib_n2 = fib_n1;
      fib_n1 = fib_n;
   }
   return( fib_n );
}
/* Procedure to compute Fibonacci numbers recursively (inefficiently) */
double fibonacci_recursive( n )
int n;
{
   if ( n <= 0 )   return( 0.0 );
   if ( n == 1 )   return( 1.0 );
   return( (double)( fibonacci_recursive(n - 1) + fibonacci_recursive(n - 2)));
}
/* Procedure to compute Fibonacci numbers in logarithmic time (fast!) */
double fibonacci_log( n )
int n;
{
   register int k, bit, num_bits;
   double a[2][2], b[2][2], c[2][2], d[2][2];
   if ( n <= 0 )   return( 0.0 );
   if ( n == 1 )   return( 1.0 );
   if ( n == 2 )   return( 1.0 );

   n--;                             /* need only a^(n-1) */
   a[0][0] = 1.0;  a[0][1] = 1.0;   /* initialize the Fibonacci matrix */
   a[1][0] = 1.0;  a[1][1] = 0.0;
   c[0][0] = 1.0;  c[0][1] = 0.0;   /* initialize the result to identity */
   c[1][0] = 0.0;  c[1][1] = 1.0;

   /* need to convert log bases as only log base e (or ln) is available */
   num_bits = ceil( log((double)( n + 1 )) / log( 2.0 ));

   /* Result will be in matrix 'c'. Result (c) == a if bit0 is 1. */
   bit = 1;
   if ( n & bit )   matrix_copy( a, c, N );
   for( bit <<= 1, k = 1; k < num_bits; k++ )   /* Do bit1 through num_bits. */
   {
      matrix_mult( a, a, b, N );    /* square matrix a; result in matrix b */
      if ( n & bit ) {              /* adjust the result */
         matrix_mult( b, c, d, N );
         matrix_copy( d, c, N );
      }
      matrix_copy( b, a, N );
      bit <<= 1;                    /* next bit */
   }
   return( c[0][0] );
}
int main()
{
   int  n;
   for(;;) {
      printf( "Enter the Fibonacci number to compute ( 0 to exit ): " );
      scanf( "%d", &n );
      if ( n == 0 )   break;
      printf("\nMatrix    method: Fibonacci( %d ) = %le\n",n,fibonacci_log(n));
      printf("\nLinear    method: Fibonacci( %d ) = %le\n",n,fibonacci(n));
      printf("\nRecursive method: Fibonacci( %d ) = %le\n",n,fibonacci_recursive(n));
   }
   return(0);
}


Copyright © 1991, Dr. Dobb's Journal

